ARM pointer authentication

By Jonathan Corbet
April 5, 2017

Many exploits come down to convincing code (kernel or otherwise) to access a pointer that was crafted by the attacker. Buffer-overflow exploits and return-oriented programming, for example, both rely on placing a pointer where a return address is expected; when the processor "returns" to that address, the attacker takes control. Much of the hardening work over the years has focused on making it harder to overwrite addresses in this way. But, as demonstrated by a recent kernel patch set, there may be another way: using a new ARM processor feature to detect and reject crafted addresses.

In particular, the ARMv8.3 architecture added a feature called "pointer authentication"; its purpose is to detect pointers created by an external entity. In essence, it attaches a cryptographic signature to pointer values; those signatures can be verified before a pointer is used. An attacker, lacking the key used to create the signatures, is unlikely to be able to create valid pointers for use in an exploit.

Contemporary processors use 64-bit pointer values, but not all of those bits are actually significant. On an ARM64 Linux system using three-level page tables, only the bottom 40 bits are used, while the remaining 24 are equal to the highest significant bit; the 40-bit address is sign-extended to 64 bits, in other words. (For the curious, Documentation/arm64/memory.txt describes the virtual address space layout on ARM64 systems.) Those uppermost bits (or a subset thereof) could be put to other uses, including holding an authentication code.

That code is calculated from three values: the pointer itself, a secret key hidden in the process context, and a third value like the current stack pointer. The secret key is intended to make it impossible for an attacker to generate valid codes, while the stack pointer (or some other environmental value) can help prevent the reuse of a valid, signed pointer should one leak to the attacker. The new PAC instruction can be used to calculate the authentication code and insert it into a pointer value.

The value containing the authentication code cannot be dereferenced directly, since, without the sign-extension bits, it is no longer recognized as a valid address. Regaining a usable pointer requires using the AUT instruction, which will recalculate the authentication code and compare it to what is found in the authenticated pointer value. If the two match, the authentication code will be removed; otherwise, the pointer will be modified to ensure a fault should it be dereferenced. Thus, any attempt to use a pointer that lacks a proper authentication code will lead to a crash.

ARMv8.3 provides five separate keys that can be used to authenticate pointers: two for executable (instruction) pointers, two for data, and one "general" key. The RFC patch set from Mark Rutland only uses one of the instruction keys, though, reserving the other keys for future use. For the time being, the feature is only provided for user space; it is not yet used within the kernel itself. Whenever a process is created, the kernel will generate a random key and store it in that process's context; the process will then be able to use that key to sign and authenticate pointers, but it cannot read the key itself.

Actually making use of this feature to, for example, block buffer-overflow exploits is left to user space. The good news here is that the GCC 7 compiler will include basic support for pointer authentication in the form of the -msign-return-address option. Turning this option on will cause code to be added to function prologues to sign the return address; that address will then be authenticated before returning to it. Options exist to apply authentication only to non-leaf functions (those that call other functions) or to all functions in the compilation unit.

If this feature works as advertised, this return-address authentication should be enough to block basic buffer-overrun attacks. An attacker may be able to overwrite a function's return address, but they cannot generate the proper authentication code, so a jump to that address will never be taken. The code itself is not large, so the potential for brute-force attacks exists, but those attacks cannot be performed without causing the target process to crash multiple times — an outcome that should attract attention in most settings.

The patch posting is a first-round request for comments, so it is likely to see some changes before being considered for merging. There is some room for future work, including deciding what to do with the other available key values and, perhaps, protecting the kernel as well. There are ways this feature could be used beyond protecting return addresses. Structures containing function pointers are a common target, for example; these, too, could be protected using authentication. Pointer authentication will not solve all of our security problems but it will, with luck, make our systems that much more resistant to attack.

(More information about this feature can be found in this Qualcomm white paper [PDF].)

Posted Apr 6, 2017 2:01 UTC (Thu) by smoogen (subscriber, #97)

The code itself is not large, so the potential for brute-force attacks exists, but those attacks cannot be performed without causing the target process to crash multiple times — an outcome that should attract attention in most settings.
====
Or you think the Twitter app on the phone has just gone out to lunch again. [I am coming at this from a phone point of view, where an app will be restarted automatically if it crashes because it may be prone to crashing anyway.] I wonder if there is a way for a watcher program in the OS to see these crashes and alert the user that this is something more malicious than Pokemon Go falling over again.

Posted Apr 6, 2017 14:27 UTC (Thu) by khim (subscriber, #9252)

On Android there is already a watchdog, and a program that crashes all the time will be stopped automatically.

This was done mainly to save battery rather than to make the system more secure: an application that starts crashing in a loop, for any reason, could drain the battery to zero pretty quickly.

Posted Apr 6, 2017 11:39 UTC (Thu) by MarkRutland (subscriber, #74197)

Great article! Especially given how sparse information is on this topic today.

A couple of minor clarifications:

The "general" key should be the "generic" key.
PAC* and AUT* are families of instructions, rather than specific instructions. For example, GCC uses PACIASP and AUTIASP for authenticating the return address.

Another thing that may be worth noting is that the instructions used by GCC are treated as NOPs by existing CPUs. Libraries and applications using those instructions will function on existing hardware (albeit without authentication). At some point, distributions might consider building with authentication unconditionally.

Posted Apr 6, 2017 13:03 UTC (Thu) by patrick_g (subscriber, #44470)

Do you know how this new "ARM Pointer Authentication" compares to Grsecurity RAP?

Posted Apr 7, 2017 6:35 UTC (Fri) by yootis (subscriber, #4762)

Right now there are 24 spare bits on addresses, but presumably that number will shrink over time as processors need to address more physical memory. x86-64 uses 48 bits in virtual addresses, and presumably ARM64 will eventually follow.

Also, is there any estimate of the performance impact of adding this protection?

Posted Apr 7, 2017 13:27 UTC (Fri) by alonz (subscriber, #815)

As the encryption/decryption primitives are implemented in hardware as part of the CPU, I would estimate that the PACIASP instruction would often add just a single cycle to the function prologue - its true latency can be pipelined.
The AUTIASP instruction, on the other hand, must fully complete before returning from a protected function; per the published paper on the QARMA cipher used by this mechanism, the core may be synthesized at anything from 1 cycle per operation up to 16. Assuming the 16-cycle version is chosen (to allow for high frequencies), I would guess that AUTIASP will add around this number of cycles to the function epilogue.
Overall - I would estimate the performance costs to be on par (or better) with all existing schemes.

Regarding the tag size - Qualcomm's paper discusses this quite comprehensively. 24-bit tags are currently "best case" for userspace; some systems will have to give up some bits (e.g. if they use tagged pointers, or if they use a larger virtual size) so the worst-case is just a measly 3-bit tag. On the other hand, more sensitive environments can choose to restrict their virtual address space to (e.g.) 32 bits, and thus have up to 31-bit tags.

Posted Apr 9, 2017 11:37 UTC (Sun) by anton (subscriber, #25547)

Function returns are typically predicted very well by a branch predictor using a CPU-internal return stack, so any latency added by the AUT instruction just increases the time until the prediction is verified; on a modern out-of-order implementation that should not be noticeable.

Concerning the number of bits, I wonder if such instructions could not use all bits in the pointer, producing a kind of encrypted pointer with PAC, and decrypting it with AUT. That would then also work if all 64 bits are used for pointers; admittedly, in that case you don't notice that the pointer is fishy when decrypting it, and there is a small chance of it hitting mapped memory, but even then the result is unlikely to be something that the attacker can exploit.

Posted Apr 13, 2017 7:50 UTC (Thu) by Mity (guest, #85011)

"Whenever a process is created, the kernel will generate a random key and store it in that process's context."

So, if I understand it correctly, values created by PAC before fork(2) cannot be authenticated by AUT in the child process. Right?

Posted Apr 13, 2017 14:32 UTC (Thu) by corbet (editor, #1)

My fault, that wasn't expressed quite right. The key is assigned at exec() time, not at process creation. So threads share a key, and parent and child will share a key after fork() until one of them calls exec().