Fonts, Privacy, and Not Breaking the Web

Abstract

Examining the list of fonts installed on a user's system can be used to fingerprint the user. Opinions vary on the scope and severity of such fingerprinting, and efforts to mitigate it must be balanced against other factors such as functionality, or even basic legibility.

Flash Player

In the early days of the Web, Adobe Flash Player (and before that, Macromedia Flash Player) would by default create a complete list of all installed fonts. Third-party libraries could, and did, take this list and upload it to a website. This was a fast and highly effective fingerprinting method.

It was in theory possible to disable this, but few end users were both aware of the option and motivated enough to use it.

JavaScript libraries such as Font Detective v1 relied on Flash Player to obtain this list.

Flash Player End of Life, announced in July 2017, heralded the end of this privacy violation, and from January 2021 Flash content was blocked from running in Flash Player. Thus, it can no longer be used for font-based fingerprinting.

The non-English Web

The original, 1990s-era Web was World Wide in name and aspiration but Europe-and-America centric in practice. Content was defined to be in the Latin-1 encoding, and Unicode was in its infancy.

Even popular languages such as Greek were often encoded by (mis)using fonts such as Symbol to map Latin characters to Greek glyphs.

For lesser-supported languages, websites relied on users installing one of a small number of fonts, which would be referred to by name. This usage pre-dated the deployment of Web Fonts.

While this was a poor technical solution, it had the advantage of actually working in the 1990s, and it continues to work today, 30 years later.

Better Internationalization

In general, over the intervening decades, support for other languages has improved greatly:

Font detection

As described by Browserleaks, modern font fingerprinting is slow, requiring a list of named fonts to try one by one; fonts are detected based on whether the width of some text (which may not be visible) changes when the font is applied.
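The width-comparison technique described above can be sketched roughly as follows. This is a minimal illustration, not the code of any particular library: `detectFonts` and the `MeasureWidth` hook are hypothetical names, and the sketch assumes that the caller supplies a function returning the rendered width of a fixed test string for a given CSS `font-family` value.

```typescript
// Hypothetical measurement hook (an assumption of this sketch): returns the
// rendered width of a fixed test string styled with the given font-family.
type MeasureWidth = (fontFamily: string) => number;

// Generic families used as fallbacks when a candidate font is not installed.
const GENERIC_FALLBACKS = ["monospace", "sans-serif", "serif"] as const;

function detectFonts(
  candidates: readonly string[],
  measure: MeasureWidth
): string[] {
  // Baseline: the width of the test string in each generic fallback alone.
  const baseline = new Map<string, number>();
  for (const generic of GENERIC_FALLBACKS) {
    baseline.set(generic, measure(generic));
  }

  const detected: string[] = [];
  // Each candidate must be tried one by one, which is why this is slow.
  for (const family of candidates) {
    // If the candidate is installed, the text renders in it, so the width
    // differs from the baseline of at least one generic fallback.
    const present = GENERIC_FALLBACKS.some(
      (generic) => measure(`"${family}", ${generic}`) !== baseline.get(generic)
    );
    if (present) {
      detected.push(family);
    }
  }
  return detected;
}
```

In a browser, the `measure` hook might wrap `CanvasRenderingContext2D.measureText` or read the width of an off-screen element. Note that a font whose metrics happen to match every fallback for the test string would go undetected by this sketch.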

JavaScript libraries such as Font Detective v2 are examples of this approach:

Detects available system fonts with JavaScript, from a list of common fonts.

Another example is Cover Your Tracks by EFF which, once again, tests only for a list of common system fonts.

The list of fonts you have installed on your machine is generally consistent and linked to a particular operating system. If you install just one font which is unusual for your particular browser, this can be a highly identifying metric.

Brute force examination of potential fonts is slow, unless the number of fonts examined is fairly modest (a few hundred). From the readme of fingerprintjs2:

By default, JS font detection will only detect up to 65 installed fonts. If you want to improve the font detection, you can pass extendedJsFonts: true option. This will increase the number of detectable fonts to ~500.
The default FP process takes about 80-100ms. If you use extendedJsFonts option this time will increase up to 2000ms (cold font cache).

So these approaches don't even look for non-system fonts, and only look for common system fonts. Unusual fonts for minority languages are simply not tested, because they would almost never be present, yet testing for them would slow fingerprinting for everyone.

Primarily, this approach just identifies the Operating System, information which is already available more rapidly by other means such as the User Agent string.

Intelligent Tracking Protection

On 17 September 2018, Safari 12.0 (release notes) enabled "Intelligent Tracking Protection" which primarily affected tracking cookies. In addition (not mentioned in the release notes) it disabled access to all locally-installed fonts, except those installed by the OS itself. Developers were quick to notice this:

The Web of Minority Languages

Readers of less common, and thus less well supported, languages often rely on locally-installed fonts. People in that linguistic community are aware of, and willing to install, one of a (typically small) number of well-known fonts.

Web content, in the Javanese language, rendered in the Jogjakartaip font designed by Aditya Bayu Perdana

An obvious solution is to upgrade such content to use Web Fonts, rather than User-Installed fonts. However, this may not be feasible for various reasons:

As a consequence of disabling use of locally-installed fonts, Web content which has been available for two or three decades suddenly stopped working.

It didn't become less beautiful, it became completely unreadable!

Trade-offs: Balancing Privacy and Functionality

The CSS Fonts 4 specification explicitly leaves undefined the set of installed fonts available to the font matching algorithm.

This allows (but does not require) a user agent to ignore User-Installed Fonts, for the purpose of the Font Matching Algorithm. Several existing user agents already do this, either by default or in an opt-in "resist fingerprinting" or "incognito" mode.
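As a rough illustration of that latitude, the sketch below filters the set of fonts visible to family matching according to a policy flag. It is a heavy simplification under stated assumptions: `FontEntry`, `MatchPolicy`, and `matchFamily` are hypothetical names, and real font matching also considers style, weight, and per-character fallback, none of which is modelled here.

```typescript
// Where a font came from. These labels are assumptions of this sketch.
type FontSource = "os" | "user" | "web";

interface FontEntry {
  family: string;
  source: FontSource;
}

// Hypothetical policy switch: whether User-Installed Fonts take part in matching.
interface MatchPolicy {
  exposeUserInstalled: boolean;
}

function matchFamily(
  requested: readonly string[],
  available: readonly FontEntry[],
  policy: MatchPolicy,
  fallback: string = "serif"
): string {
  // Build the set of families the page is allowed to match against.
  // OS-installed fonts and Web Fonts are always visible in this sketch.
  const visible = new Set(
    available
      .filter((f) => f.source !== "user" || policy.exposeUserInstalled)
      .map((f) => f.family.toLowerCase())
  );

  // Walk the requested list in order; family names compare case-insensitively.
  for (const family of requested) {
    if (visible.has(family.toLowerCase())) {
      return family;
    }
  }
  return fallback;
}
```

With `exposeUserInstalled` set to `false`, a page that names only a user-installed font falls through to the next family or the fallback, which is exactly the loss of functionality described earlier for minority-language content.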

The Privacy Considerations section suggests ways to offset the loss of functionality caused by this anti-fingerprinting mitigation:

The possibility of a configurable, per-user opt-in to exposing some or all User-Installed Fonts, or a per-origin opt-in, is being discussed.
The possibility of a privacy budget, which would penalize or disable a malicious web page which tested a large number of fonts, but allow a harmless page which tested a much smaller number, has also been discussed.
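One way such a privacy budget might work is sketched below. This is purely illustrative, since the idea is only under discussion: the `FontProbeBudget` class, its method name, and the choice of counting distinct family names per page are all assumptions of this sketch, and any concrete limit would be a policy decision for the user agent.

```typescript
// Hypothetical per-page budget on lookups of user-installed font families.
class FontProbeBudget {
  // Distinct (case-insensitive) family names this page has asked about.
  private readonly probed = new Set<string>();
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Records the lookup and reports whether the page is still within budget.
  // Once the number of distinct families exceeds the limit, every later
  // lookup is refused, including lookups of families that were allowed before.
  allowUserFontLookup(family: string): boolean {
    this.probed.add(family.trim().toLowerCase());
    return this.probed.size <= this.limit;
  }
}
```

A page that names a handful of fonts for one minority language stays under a small limit, while a script that walks through hundreds of candidate names exhausts the budget almost immediately.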

Control over user-installed fonts should ultimately rest with the user. CSS Fonts 4 says:

The default set of installed fonts will vary by UA, platform, and locale; it is important that users be able to customise which installed fonts are available for rendering web pages and to which generic font families, if any, these fonts are mapped.