1. Introduction
Today, user agents generally identify themselves to servers by sending a User-Agent HTTP request
header field along with each request (defined in Section 5.5.3 of [RFC7231]). Ideally, this header
would give servers the ability to perform content negotiation, sending down exactly those bits that
best represent the requested resource in a given user agent, optimizing both bandwidth and user
experience. In practice, however, this header’s value exposes far more information about the user’s
device than seems appropriate as a default, on the one hand, and intentionally obscures the true
user agent in order to bypass misguided server-side heuristics, on the other.
For example, a recent version of Chrome on iOS identifies itself as:
User-Agent: Mozilla/5.0 (iPhone; CPU iPhone OS 12_0 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) CriOS/69.0.3497.105 Mobile/15E148 Safari/605.1
While a recent version of Edge identifies itself as:
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.2704.79 Safari/537.36 Edge/18.014
There’s quite a bit of information packed into those strings (along with a fair number of lies). Version numbers, platform details, model information, etc. are all broadcast along with every request, and form the basis for fingerprinting schemes of all sorts. Individual vendors have taken stabs at altering their user agent strings, and have run into a few categories of feedback from developers that have stymied historical approaches:
-
Brand and version information (e.g. "Chrome 69") allows websites to work around known bugs in specific releases that aren’t otherwise detectable. For example, implementations of Content Security Policy have varied wildly between vendors, and it’s difficult to know what policy to send in an HTTP response without knowing what browser is responsible for its parsing and execution.
-
Developers will often negotiate what content to send based on the user agent and platform. Some application frameworks, for instance, will style an application on iOS differently from the same application on Android in order to match each platform’s aesthetic and design patterns.
-
Similarly to #1, OS revisions and architecture can be responsible for specific bugs which can be worked around in a website’s code, and are narrowly useful for things like selecting appropriate executables for download (32 vs 64 bit, ARM vs Intel, etc).
-
Sophisticated developers use model/make to tailor their sites to the capabilities of the device (e.g. [FacebookYearClass]) and to pinpoint performance bugs and regressions which sometimes are specific to model/make.
This document proposes a mechanism which might allow user agents to be a bit more aggressive about
removing entropy from the User-Agent string generally by giving servers that really need some
specific details about the client the ability to opt into receiving them. It introduces five new
Client Hints ([I-D.ietf-httpbis-client-hints]) that can provide the client’s branding and version
information, the underlying operating system’s branding and major version, as well as details about
the underlying device. Rather than broadcasting this data to everyone, all the time, user agents can
make reasonable decisions about how to respond to given sites' requests for more granular data,
reducing the passive fingerprinting surface area exposed to the network.
1.1. Examples
A user navigates to https://example.com/ for the first time. Their user agent sends the following
header along with the HTTP request:
Sec-CH-UA: "Examplary Browser 73"
The server is interested in rendering content consistent with the user’s underlying platform, and
asks for a little more information by sending an Accept-CH header (Section 2.2.1 of
[I-D.ietf-httpbis-client-hints]) along with the initial response:
Accept-CH: UA, Platform
In response, the user agent includes more detailed version information, as well as information about the underlying platform in the next request:
Sec-CH-UA: "Examplary Browser 73.3R8.2H.1"
Sec-CH-Platform: "Windows 10"
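The exchange above can be sketched from the user agent's side roughly as follows. This is only an illustration under stated assumptions: the function names `parse_accept_ch` and `hints_for_request` are hypothetical, the Accept-CH value is treated as a simple comma-separated token list, and the brand, version, and platform values are the ones used in the example.

```python
from __future__ import annotations


def parse_accept_ch(value: str) -> set[str]:
    """Split an Accept-CH value such as 'UA, Platform' into a set of hint names."""
    return {token.strip() for token in value.split(",") if token.strip()}


def hints_for_request(opted_in: set[str]) -> dict[str, str]:
    """Return the hint headers a hypothetical user agent would attach.

    Sec-CH-UA is always present; it carries the full version only once the
    server has opted into 'UA'. Sec-CH-Platform is sent only after opt-in.
    """
    brand, major, full = "Examplary Browser", "73", "73.3R8.2H.1"
    version = full if "UA" in opted_in else major
    headers = {"Sec-CH-UA": f'"{brand} {version}"'}
    if "Platform" in opted_in:
        headers["Sec-CH-Platform"] = '"Windows 10"'
    return headers


# First request: the server has not opted into anything yet.
first = hints_for_request(set())
# Next request: the server replied with "Accept-CH: UA, Platform".
second = hints_for_request(parse_accept_ch("UA, Platform"))
print(first)
print(second)
```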
2. User Agent Hints
The following sections define a number of HTTP request header fields that expose detail about a given user agent, which servers can opt into receiving via the Client Hints infrastructure defined in [I-D.ietf-httpbis-client-hints]. The definitions below assume that each user agent has defined a number of properties for itself:
-
brand (for example: "cURL", "Edge", "The World’s Best Web Browser")
-
major version (for example: "72", "3", or "28")
-
full version (for example: "72.0.3245.12", "3.14159", or "297.70E04154A")
-
platform brand and version (for example: "Windows NT 6.0", "iOS 15", or "AmazingOS 17G")
-
platform architecture (for example: "ARM64", or "ia32")
-
model (for example: "", or "Pixel 2 XL")
-
mobileness (for example: ?0 or ?1)
User agents SHOULD keep these strings short and to the point, but servers MUST accept arbitrary values for each, as they are all values constructed at the user agent’s whim.
2.1. The 'Sec-CH-Arch' Header Field
The Sec-CH-Arch request header field gives a server information about
the architecture of the platform on which a given user agent is executing. It is a Structured Header whose value MUST be a string [I-D.ietf-httpbis-header-structure].
The header’s ABNF is:
Sec-CH-Arch = sh-string
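As an illustration of what an sh-string looks like on the wire, the following sketch serializes a value as a quoted string with backslash escaping. It reflects one reading of [I-D.ietf-httpbis-header-structure] (printable ASCII only, with `"` and `\` escaped) and is not a normative implementation; the function name `serialize_sh_string` is hypothetical.

```python
def serialize_sh_string(value: str) -> str:
    """Serialize text as a quoted, backslash-escaped Structured Header string."""
    for ch in value:
        # Assumption: only printable ASCII (0x20 through 0x7E) is permitted.
        if not 0x20 <= ord(ch) <= 0x7E:
            raise ValueError(f"character {ch!r} is not allowed in an sh-string")
    # Escape backslashes first so the escapes added for quotes are not doubled.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


arch_header = f"Sec-CH-Arch: {serialize_sh_string('ARM64')}"
print(arch_header)
```

The same helper would apply to the other string-valued header fields defined in this section.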
2.2. The 'Sec-CH-Model' Header Field
The Sec-CH-Model request header field gives a server information about
the device on which a given user agent is executing. It is a Structured Header whose value MUST
be a string [I-D.ietf-httpbis-header-structure].
The header’s ABNF is:
Sec-CH-Model = sh-string
Perhaps Sec-CH-Mobile is enough, and we don’t need to expose the model?
2.3. The 'Sec-CH-Platform' Header Field
The Sec-CH-Platform request header field gives a server information about
the platform on which a given user agent is executing. It is a Structured Header whose value
MUST be a string [I-D.ietf-httpbis-header-structure].
The header’s ABNF is:
Sec-CH-Platform = sh-string
2.4. The 'Sec-CH-UA' Header Field
The Sec-CH-UA request header field gives a server information about a
user agent’s branding and version. It is a Structured Header whose value MUST be a list [I-D.ietf-httpbis-header-structure].
The header’s ABNF is:
Sec-CH-UA = sh-list
Unlike most Client Hints, the Sec-CH-UA header will be sent with all requests, whether or not the
server opted into receiving the header via an Accept-CH header. Prior to an opt-in, however, it
will include only the user agent’s branding information, and the major version number (both of which
are fairly clearly sniffable by "examining the structure of other headers and by testing for the
availability and semantics of the features introduced or modified between releases of a particular
browser" [Janc2014]).
To return the Sec-CH-UA value for a request, given a client hints set (set),
user agents MUST:
-
Let value be a Structured Header object whose value is a list.
-
Let version be the user agent’s full version if set contains UA, and the user agent’s major version otherwise.
-
Let ua be a string whose value is the concatenation of the user agent’s brand, a U+0020 SPACE character, and version.
Should we split the version out into a separate Sec-CH-UA-Version header? Or keep it here? <https://github.com/wicg/ua-client-hints/issues/7>
-
Append ua to value.
-
The user agent MAY execute the following steps:
-
Append additional items to value containing arbitrary brand and version combinations.
-
Randomize the order of the items in value.
Note: See § 5.2 GREASE-like UA Strings for more details on why these steps might be appropriate.
-
Return value.
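The steps above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, the client hints set is modelled as a plain collection of strings, the brand and version values are taken from the examples in Section 2, and the serializer assumes items contain no quotes or backslashes.

```python
import random
from typing import Iterable, List, Optional


def sec_ch_ua_items(
    brand: str,
    major_version: str,
    full_version: str,
    hint_set: Iterable[str],
    grease: Iterable[str] = (),
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Build the list of items for Sec-CH-UA, following the steps above."""
    value: List[str] = []  # Let value be a list.
    opted_in = "UA" in set(hint_set)
    # Full version only if the set contains UA, major version otherwise.
    version = full_version if opted_in else major_version
    ua = f"{brand} {version}"  # brand, U+0020 SPACE, version
    value.append(ua)
    extra = list(grease)
    if extra:
        # Optional steps: add arbitrary entries, then randomize the order.
        value.extend(extra)
        (rng or random.Random()).shuffle(value)
    return value


def serialize_items(items: Iterable[str]) -> str:
    """Join items as quoted strings (assumes no quotes or backslashes)."""
    return ", ".join(f'"{item}"' for item in items)


before_opt_in = serialize_items(sec_ch_ua_items("Edge", "72", "72.0.3245.12", set()))
after_opt_in = serialize_items(sec_ch_ua_items("Edge", "72", "72.0.3245.12", {"UA"}))
greased = sec_ch_ua_items("Edge", "72", "72.0.3245.12", set(), grease=["NotBrowser 12"])
print(before_opt_in)
print(after_opt_in)
```

Passing a seeded `random.Random` as `rng` makes the shuffled order reproducible, which can be convenient in tests.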
2.5. The 'Sec-CH-Mobile' Header Field
The Sec-CH-Mobile request header field gives a server information about
whether or not a user agent prefers a "mobile" user experience. It is a Structured Header whose value MUST be a boolean [I-D.ietf-httpbis-header-structure].
The header’s ABNF is:
Sec-CH-Mobile = sh-boolean
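For illustration, an sh-boolean is carried as `?1` (true) or `?0` (false), matching the mobileness examples in Section 2. The helper names below are hypothetical, and the sketch covers only the bare boolean item.

```python
def serialize_sh_boolean(flag: bool) -> str:
    """Serialize a boolean as a Structured Header boolean ('?1' or '?0')."""
    return "?1" if flag else "?0"


def parse_sh_boolean(text: str) -> bool:
    """Parse '?1' or '?0'; anything else is rejected."""
    stripped = text.strip()
    if stripped == "?1":
        return True
    if stripped == "?0":
        return False
    raise ValueError(f"not an sh-boolean: {text!r}")


mobile_header = f"Sec-CH-Mobile: {serialize_sh_boolean(True)}"
print(mobile_header)
```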
2.6. Integration with Fetch
Fetch integration of this specification is defined as part of the Client Hints infrastructure specification.
3. Interface
[Exposed=Window]
interface NavigatorUAData {
  readonly attribute DOMString brand;
  readonly attribute DOMString version;
  readonly attribute DOMString platform;
  readonly attribute DOMString architecture;
  readonly attribute DOMString model;
  readonly attribute boolean mobile;
};

interface mixin NavigatorUA {
  [SecureContext] Promise<NavigatorUAData> getUserAgent();
};
Navigator includes NavigatorUA;
3.1. Processing model
The getUserAgent() method MUST run these steps:
-
Let p be a new promise.
-
Run the following steps in parallel:
-
Let UAData be a new NavigatorUAData object whose values are initialized as follows:
brand
-
The user agent’s brand.
platform
-
The user agent’s platform brand and version.
architecture
-
The user agent’s platform architecture.
model
-
The user agent’s model.
mobile
-
The user agent’s mobileness.
version
-
The user agent’s full version.
-
Resolve p with UAData.
-
Return p.
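The shape of this processing model can be approximated outside a browser. The sketch below uses Python's asyncio as a stand-in for promises: the coroutine plays the role of p, and awaiting it yields an immutable record with the same fields as NavigatorUAData. The function name `get_user_agent` and the property values (borrowed from the examples in Section 2) are assumptions for illustration only.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class NavigatorUAData:
    """Read-only record mirroring the attributes of the interface above."""
    brand: str
    version: str
    platform: str
    architecture: str
    model: str
    mobile: bool


async def get_user_agent() -> NavigatorUAData:
    """Resolve asynchronously with the user agent's properties."""
    # Yield to the event loop once, standing in for work done "in parallel".
    await asyncio.sleep(0)
    return NavigatorUAData(
        brand="Edge",
        version="72.0.3245.12",   # full version
        platform="Windows NT 6.0",
        architecture="ia32",
        model="",
        mobile=False,
    )


ua_data = asyncio.run(get_user_agent())
print(ua_data)
```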
Provide a method to only access the UA’s major version.
4. Security and Privacy Considerations
4.1. Secure Transport
Client Hints will not be delivered to non-secure endpoints (see the secure transport requirements in Section 2.2.1 of [I-D.ietf-httpbis-client-hints]). This means that user agent information will not be leaked over plaintext channels, reducing the opportunity for network attackers to build a profile of a given agent’s behavior over time.
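A deliberately simplified sketch of that check follows. It treats only `https` URLs as secure, which is narrower than the secure transport definition in [I-D.ietf-httpbis-client-hints]; the function name `may_send_client_hints` is hypothetical.

```python
from urllib.parse import urlsplit


def may_send_client_hints(url: str) -> bool:
    """Return True only for https URLs (a simplification of 'secure transport')."""
    return urlsplit(url).scheme.lower() == "https"


secure = may_send_client_hints("https://example.com/")
insecure = may_send_client_hints("http://example.com/")
print(secure, insecure)
```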
4.2. Delegation
Client Hints will be delegated from top-level pages via Feature Policy. This reduces the likelihood that user agent information will be delivered along with subresource requests, which reduces the potential for passive fingerprinting.
That delegation is defined as part of append client hints to request.
4.3. Access Restrictions
The Client Hints defined above reveal quite a bit of information about the user agent and the platform/device upon which it runs. User agents ought to exercise judgement before granting access to this information, and MAY impose restrictions above and beyond the secure transport and delegation requirements noted above. For instance, user agents could choose to reveal platform architecture only on requests for resources they intend to download, giving the server the opportunity to serve the right binary. Likewise, they could offer users control over the values revealed to servers, or gate access on explicit user interaction via a permission prompt or via a settings interface.
5. Implementation Considerations
5.1. The 'User-Agent' Header
User agents SHOULD deprecate the User-Agent header in favor of the Client Hints model described in
this document. The header, however, is likely to be impossible to remove entirely in the near-term,
as existing sites' content negotiation code will continue to require its presence (see [Rossi2015] for a recent example of a new browser’s struggles in this area).
One approach which might be advisable could be for each user agent to lock the value of its User-Agent
header, ensuring backwards compatibility by maintaining the crufty declarations of
"like Gecko" and "AppleWebKit/537.36" on into eternity. This can ratchet over time, first freezing
the version number, then shifting platform and model information to something reasonably generic in
order to reduce the fingerprint the header provides.
5.2. GREASE-like UA Strings
History has shown us that there are real incentives for user agents to lie about their branding in order to thread the needle of sites' sniffing scripts. While I’m optimistic that we can reset expectations around sniffing by freezing the thing that’s sniffed-upon today, and creating a sane set of options for developers, it’s likely that this is hopelessly naive. It’s reasonable to ponder what we should do to encourage sniffing in the right way, if we believe it’s going to happen one way or another.
User agents may choose to model UA as a set, rather than a single entry.
Randomly including additional, intentionally incorrect, comma-separated entries with arbitrary
ordering (similar conceptually to [I-D.ietf-tls-grease]) could encourage standardized processing
of the UA string by servers, and reduce the chance that we ossify on a few required strings.
For example, Chrome 73’s Sec-CH-UA header might be "Chrome 73", "NotBrowser 12", or
"BrowsingIsFun Version 12b", "Chrome 73", or something completely different.
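On the server side, "standardized processing" could look something like the sketch below: parse every entry in the list, ignore ordering and unknown brands, and look up only the brand of interest. The function name `parse_sec_ch_ua` is hypothetical, and splitting brand from version at the last space is an assumption based on the header values shown in this document.

```python
import re
from typing import Dict


def parse_sec_ch_ua(header_value: str) -> Dict[str, str]:
    """Map each brand in a Sec-CH-UA value to its version string."""
    # Find each quoted string, allowing backslash-escaped characters inside.
    entries = re.findall(r'"((?:[^"\\]|\\.)*)"', header_value)
    brands: Dict[str, str] = {}
    for entry in entries:
        text = re.sub(r"\\(.)", r"\1", entry)  # undo backslash escapes
        brand, _, version = text.rpartition(" ")
        if brand:  # skip entries with no space-separated version
            brands[brand] = version
    return brands


one = parse_sec_ch_ua('"Chrome 73", "NotBrowser 12"')
two = parse_sec_ch_ua('"BrowsingIsFun Version 12b", "Chrome 73"')
print(one.get("Chrome"), two.get("Chrome"))
```

Because the lookup is by brand rather than by position, the extra entries and their ordering do not affect the result.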
5.3. The 'Sec-CH-' prefix
Based on some discussion in https://github.com/w3ctag/design-reviews/issues/320, it seems
reasonable to forbid access to these headers from JavaScript, and demarcate them as
browser-controlled client hints so they can be documented and included in requests without
triggering CORS preflights. A Sec-CH- prefix seems like a viable approach, but this bit might
shift as the broader Client Hints discussions above coalesce into something more solid that lands
in specs.
6. IANA Considerations
This document intends to define the Sec-CH-Arch, Sec-CH-Model, Sec-CH-Platform, Sec-CH-UA, and
Sec-CH-Mobile HTTP request header fields, and register them in the permanent message header
field registry ([RFC3864]).
It also intends to deprecate the User-Agent header field.
6.1. 'Sec-CH-Arch' Header Field
Header field name:
- Sec-CH-Arch
Applicable protocol:
- http
Status:
- standard
Author/Change controller:
- IETF
Specification document:
- this specification (§ 2.1 The 'Sec-CH-Arch' Header Field)
6.2. 'Sec-CH-Model' Header Field
Header field name:
- Sec-CH-Model
Applicable protocol:
- http
Status:
- standard
Author/Change controller:
- IETF
Specification document:
- this specification (§ 2.2 The 'Sec-CH-Model' Header Field)
6.3. 'Sec-CH-Platform' Header Field
Header field name:
- Sec-CH-Platform
Applicable protocol:
- http
Status:
- standard
Author/Change controller:
- IETF
Specification document:
- this specification (§ 2.3 The 'Sec-CH-Platform' Header Field)
6.4. 'Sec-CH-UA' Header Field
Header field name:
- Sec-CH-UA
Applicable protocol:
- http
Status:
- standard
Author/Change controller:
- IETF
Specification document:
- this specification (§ 2.4 The 'Sec-CH-UA' Header Field)
6.5. 'Sec-CH-Mobile' Header Field
Header field name:
- Sec-CH-Mobile
Applicable protocol:
- http
Status:
- standard
Author/Change controller:
- IETF
Specification document:
- this specification (§ 2.5 The 'Sec-CH-Mobile' Header Field)
6.6. 'User-Agent' Header Field
Header field name:
- User-Agent
Applicable protocol:
- http
Status:
- deprecated
Author/Change controller:
- IETF
Specification document:
- this specification (§ 5.1 The 'User-Agent' Header), and Section 5.5.3 of [RFC7231]