26/04/2016

Exploring Qualcomm's Secure Execution Environment

Welcome to a new series of blog posts!

In this series, we'll dive once more into the world of TrustZone, and explore a new chain of vulnerabilities and corresponding exploits which will allow us to elevate privileges from zero permissions to code execution in the TrustZone kernel.

This may sound familiar to those of you who have read the previous series - but let me reassure you; this series will be much more exciting!

First of all, this exploit chain features a privilege escalation which is universal across all Android versions and phones (and which requires zero permissions) and a TrustZone exploit which affects a very wide variety of devices. Secondly, we will dive deep into an as-of-yet unexplored operating system - QSEE - Qualcomm's Secure Execution Environment. Lastly, we'll see some interesting TrustZone payloads, such as directly extracting a real fingerprint from TrustZone's encrypted file-system.

In case you would like to follow along with the symbols and disassembled binaries, I will be using my own Nexus 6 throughout this series, with the following fingerprint:
    google/shamu/shamu:5.1.1/LMY48M/2167285:user/release-keys 

You can find the exact factory image here.

 

 

Oh say can QSEE


In this blog post, we'll explore Qualcomm's Secure Execution Environment (QSEE).

As we've previously discussed, one of the main reasons for the inclusion of TrustZone on devices is the ability to provide a "Trusted Execution Environment" (TEE) - an environment which should theoretically allow computation which cannot be interfered with from the regular operating system, and is therefore "trusted".

This is achieved by creating a small operating system which operates solely in the "Secure World" facilitated by TrustZone. This operating system provides a small number of services directly in the form of system calls which are handled by the TrustZone kernel (TZBSP) itself. However, in order to allow for an extensible model where "trusted" functionality can be added, the TrustZone kernel can also securely load and execute small programs called "Trustlets", which are meant to provide a secure service to the insecure ("Normal World") operating system (in our case, Android).




There are several such Trustlets commonly used on devices:
  • keymaster - Implements the key management API provided by the Android "keystore" daemon. It can securely generate and store cryptographic keys and allow the users to operate on data using these keys.
  • widevine - Implementation of Widevine DRM, which allows "secure" playback of media on the device.
In fact, there are many more DRM related trustlets, depending on the OEM and the device, but these two trustlets are universally used.

Where do we start?


Naturally, one place to start would be to look at a trustlet of our choice, and to try and understand what makes it tick. Since the "widevine" module is one of the most ubiquitous, we'll focus on it.

Searching briefly for the widevine trustlet itself in the device's firmware reveals the following:


Apparently the trustlet is split into a few different files... Opening the files reveals a jumbled up mess - some files contain what looks like code, others contain ELF headers and metadata. In any case, before we can start disassembling the trustlet, we need to make some sense out of this format. We can either do this by opening each of the files and guessing the meaning of each blob, or by following the code-paths responsible for loading the trustlet - let's try a little of both.

Loading a Trustlet


In order to load a trustlet from the "Normal World", applications can use the libQSEECom.so shared object, which exports the function "QSEECom_start_app":


Unfortunately this library's source code is not available, so we'll have to reverse engineer the function's implementation to find out what it does. Doing so reveals that it performs the following operations:
  • Opens the /dev/qseecom device and calls some ioctls to configure it
  • Opens the ".mdt" file associated with the trustlet and reads the first 0x34 bytes from it
  • Calculates the number of ".bXX" files using the 0x34 bytes from the ".mdt"
  • Allocates a physically contiguous buffer (using "ion") and copies the ".mdt" and ".bXX" files into it
  • Finally, calls an ioctl to load the trustlet itself, using the allocated buffer
So, still no luck on exactly how the images are loaded, but we're getting there.

First of all, the number 0x34 might look familiar - this is the size of a (32 bit) ELF header. Opening the MDT file reveals that the first 0x34 bytes are indeed a valid ELF header:


Moreover, the "QSEECom_start_app" function we just had a look at used the word at offset 0x2C in order to calculate the number of ".bXX" files. As you can see above, this corresponds to the "e_phnum" field in the ELF header.
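To make the header arithmetic concrete, here's a minimal sketch of how the number of ".bXX" files could be derived from the first 0x34 bytes of the ".mdt". The function names are hypothetical, a little-endian ELF is assumed, and the two-digit decimal naming (".b00", ".b01", ...) is an assumption based on the file names mentioned in this post:

```python
import struct

ELF32_HEADER_SIZE = 0x34  # size of a 32-bit ELF header
E_PHNUM_OFFSET = 0x2C     # offset of the 16-bit e_phnum field


def count_segment_files(mdt_header: bytes) -> int:
    """Return e_phnum, i.e. the expected number of ".bXX" files."""
    if len(mdt_header) < ELF32_HEADER_SIZE:
        raise ValueError("need at least 0x34 bytes of the .mdt")
    if mdt_header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # Assumption: little-endian ELF.
    (e_phnum,) = struct.unpack_from("<H", mdt_header, E_PHNUM_OFFSET)
    return e_phnum


def segment_file_names(base: str, mdt_header: bytes) -> list:
    """Build the list of segment file names (naming scheme is assumed)."""
    count = count_segment_files(mdt_header)
    return [f"{base}.b{i:02d}" for i in range(count)]
```

In practice one would read the header with something like open("widevine.mdt", "rb").read(0x34) and pass it to these helpers.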

Since the "e_phnum" field is usually used to specify the number of program headers, this hints that perhaps each of the ".bXX" files contains a single segment of the trustlet. Indeed, opening each of the files reveals content that seems like it may be a segment of the program being loaded... But in order to make sure, we'll need to find the program headers themselves (and see if they match the ".bXX" files).

Looking further, the next few chunks in the ".mdt" file are in fact the program headers themselves, one for each of the ".bXX" files present.


And, confirming our earlier suspicion, their sizes match the sizes of the ".bXX" files exactly. Great!

Note that the first two program headers above look a little strange - they are both NULL-type headers, meaning they are "reserved" and should not be loaded into the resulting ELF image. Strangely, opening the corresponding ".bXX" files reveals that the first block contains the same ELF header and program headers present in the ".mdt", and the second block contains the rest of the ".mdt" file.
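If you'd rather check the size match programmatically, here's a rough sketch that parses the ELF32 program headers out of the ".mdt" and compares each "p_filesz" with the size of the corresponding ".bXX" file. It assumes a little-endian ELF32 layout; the helper names are made up for illustration:

```python
import struct
from collections import namedtuple

# Field order of an ELF32 program header (8 little-endian 32-bit words).
ProgramHeader = namedtuple(
    "ProgramHeader",
    "p_type p_offset p_vaddr p_paddr p_filesz p_memsz p_flags p_align",
)


def parse_program_headers(mdt: bytes) -> list:
    """Parse the program header table referenced by the ELF header."""
    (e_phoff,) = struct.unpack_from("<I", mdt, 0x1C)
    e_phentsize, e_phnum = struct.unpack_from("<HH", mdt, 0x2A)
    headers = []
    for i in range(e_phnum):
        fields = struct.unpack_from("<8I", mdt, e_phoff + i * e_phentsize)
        headers.append(ProgramHeader(*fields))
    return headers


def sizes_match(headers, segment_sizes) -> bool:
    """True if every p_filesz equals the size of the matching segment file."""
    return [h.p_filesz for h in headers] == list(segment_sizes)
```

The segment sizes can be collected with os.path.getsize on each ".bXX" file, in index order.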

In any case, here's a short schematic summing up what we know so far:



Also, note that since the ELF header and the program headers are all present in the ".mdt", we can use "readelf" in order to quickly dump the information about program headers in the trustlet:




At this point we have all the information we need in order to create a complete and valid ELF file from the ".mdt" and ".bXX" files; we have the ELF header and the program headers, as well as each of the segments themselves. We just need to write a small script that will create an ELF file using this data.

I've written a small python script which does just that. You can find it here:

https://github.com/laginimaineb/unify_trustlet
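For readers who just want the gist, here's a stripped-down sketch of the idea. This is not the script linked above, just an illustration under a few assumptions: a little-endian ELF32, one segment blob per program header, and each blob placed at its "p_offset" in the output file. The function name is hypothetical:

```python
import struct


def unify(mdt: bytes, segments: list) -> bytes:
    """Rebuild a single ELF image from the .mdt and the .bXX contents."""
    (e_phoff,) = struct.unpack_from("<I", mdt, 0x1C)
    e_phentsize, e_phnum = struct.unpack_from("<HH", mdt, 0x2A)
    if len(segments) != e_phnum:
        raise ValueError("expected one segment blob per program header")

    # Start with the ELF header and the program header table.
    image = bytearray(mdt[: e_phoff + e_phentsize * e_phnum])

    for i, data in enumerate(segments):
        entry = e_phoff + i * e_phentsize
        # p_offset is the 2nd word of the entry, p_filesz the 5th.
        (p_offset,) = struct.unpack_from("<I", mdt, entry + 4)
        (p_filesz,) = struct.unpack_from("<I", mdt, entry + 16)
        if len(data) != p_filesz:
            raise ValueError(f"segment {i} does not match p_filesz")
        end = p_offset + p_filesz
        if len(image) < end:
            image.extend(b"\x00" * (end - len(image)))  # pad any gaps
        image[p_offset:end] = data
    return bytes(image)
```

The result can be written to disk and loaded into a disassembler like any other ELF file.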

Reflections on Trusting Trustlets

 

By now we have a basic understanding of how trustlets are assembled into an executable file, but we still don't know how they are verified. However, since we know the ".bXX" files contain only the segments to be loaded, this means that the verification data must reside in the ".mdt" file.

So it's time for some guesswork - if we were to build a trusted loader, how would we do it?

One very common paradigm would be to use hash-and-sign (relying on a CRHF and a digital signature). Essentially - we calculate the hash of the data to be authenticated and sign it using a private key for which a corresponding public key is known to the loader.

If that were the case, we'd expect to find two things in the ".mdt":
  • A certificate chain
  • A signature blob
Let's start by looking for a certificate chain. There are way too many formats for certificates, but since the ".mdt" file only contains binary data, we can assume it'll probably be a binary format, the most common of which is DER.

There's a quick hack we can use to find DER encoded certificates - they almost always start with an "ASN.1 SEQUENCE" blob, which is encoded as: 0x30 0x82. So let's search for these two bytes in the ".mdt" and save each found blob into a file. Now, we can check if these blobs are well-formed certificates using "openssl":


Yup, we guessed correctly - those are certificates.
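The search itself is easy to script. Here's a small sketch which looks for the 0x30 0x82 marker and carves out each candidate blob. Unlike simply dumping everything from the match to the end of the file, this version uses the two length bytes that follow 0x82 (a big-endian length, per DER) to cut each blob precisely. The function name is hypothetical, and a match is only a candidate until "openssl" agrees that it parses:

```python
def find_der_sequences(blob: bytes) -> list:
    """Return (offset, data) pairs for each 0x30 0x82 candidate in blob."""
    marker = b"\x30\x82"
    found = []
    pos = blob.find(marker)
    while pos != -1:
        if pos + 4 <= len(blob):
            # 0x82 means the length is stored in the next two bytes.
            length = int.from_bytes(blob[pos + 2:pos + 4], "big")
            end = pos + 4 + length
            if end <= len(blob):
                found.append((pos, blob[pos:end]))
                # Continue after this blob, skipping nested SEQUENCEs.
                pos = blob.find(marker, end)
                continue
        pos = blob.find(marker, pos + 1)
    return found
```

Each returned blob can be written to its own file and handed to "openssl" for parsing.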

In fact, the trustlet contains three certificates, one after the other. Just for good measure, we might also want to check that these three certificates are in fact a certificate chain which forms a valid chain of trust. We can do this by dumping the certificates to a single "certificate chain" file and using "openssl" to verify each certificate using this chain:


As for the root of trust of this chain - looking at the root certificate in the chain reveals the same root certificate which is used to verify all other parts of the boot chain in Qualcomm's "Secure Boot" process. There has been some research about this mechanism, which has shown that the validation occurs by comparing the SHA256 of the root certificate to a special value called "OEM_PK_HASH", which is "fused" into the device's QFuses during the production process. Since this value should theoretically not be modifiable after the production of the device, this means that forging such a root certificate would essentially require a second pre-image attack against SHA256.

Now, let's get back to the ".mdt" - we've found the certificate chain, so now it's time to look for a signature. Normally, the private key is used to produce a signature and the public key can be used to recover the signed data. Since we have the public key of the top-most certificate in the chain, we can use it to go over the file and opportunistically try to "recover" each blob.

But how will we know when we've succeeded?

Recall that RSA is a trapdoor permutation family - every blob with the same number of bits as the public modulus N is mapped to another blob of the same size.

However, while the RSA public modulus in our case is 2048 bits long, most hashes are much shorter than that (160 bits for SHA1, 256 bits for SHA256). This means that if we try to "decrypt" a blob using our public key and it happens to end with a lot of "slack" space (for example, zero bytes), there's a very good chance that this is the signature we're looking for (for a completely random permutation, the chance of n consecutive zero bits is 2^-n - extremely small for even a moderate n).

In order to do so, I wrote a small program which loads the public key from the top-most certificate in the chain and tries to "recover" each blob in the ".mdt" (using rsa_public_decrypt with PKCS #1 v1.5 padding). If the "recovered" blob ends with a bunch of zero bytes, the program outputs it. So... Running it on our ".mdt":


We've found a signature! Great.
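Here's a rough sketch of what such a scan could look like. It is not the program used above: instead of calling a library's "public decrypt" routine, it performs the raw modular exponentiation directly and checks for a well-formed PKCS #1 v1.5 signature block (00 01 FF...FF 00, followed by the payload) rather than looking for zero bytes. The function names are hypothetical, and extracting n and e from the certificate is left out:

```python
def pkcs1_v15_unpad(em: bytes):
    """Return the payload of a type-1 PKCS #1 v1.5 block, or None."""
    if len(em) < 11 or em[0:2] != b"\x00\x01":
        return None
    sep = em.find(b"\x00", 2)
    # At least eight 0xFF padding bytes must precede the separator.
    if sep < 10 or any(b != 0xFF for b in em[2:sep]):
        return None
    return em[sep + 1:]


def find_signatures(blob: bytes, n: int, e: int) -> list:
    """Try every modulus-sized window in blob as an RSA signature."""
    k = (n.bit_length() + 7) // 8  # modulus size in bytes
    hits = []
    for off in range(len(blob) - k + 1):
        c = int.from_bytes(blob[off:off + k], "big")
        if c >= n:
            continue  # not a valid input for this modulus
        em = pow(c, e, n).to_bytes(k, "big")
        payload = pkcs1_v15_unpad(em)
        if payload is not None:
            hits.append((off, payload))
    return hits
```

For a 2048-bit modulus, a random window has a negligible chance of decoding to a valid padding block, so any hit is almost certainly the real signature.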

What's more, the recovered data is 256 bits long, which implies that it may be a SHA256 hash... And if there's one SHA256 in the ".mdt", perhaps there are more?



Lucky once again!

As we can see, the SHA256 hashes for each of the ".bXX" files are also stored in the ".mdt", consecutively. We can also make an educated guess that this will be the data (or at least some of the data) that is signed to produce the signature we found earlier.

Note that the ".b01" file's hash is missing - why is that? Remember that the ".b01" file contains all the data in the ".mdt" other than the ELF header and program headers. Since this data also contains the signature above, and the signature is (possibly) produced over the hashes of the block files, this would cause a circular dependency (since changing the block file would change the hash, which would change the signature, which would again change the block file, etc.). So it makes sense that this block's hash wouldn't be present.
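Checking this yourself takes only a few lines. The sketch below hashes each segment blob with SHA256 and looks for the digest inside the ".mdt"; a result of -1 means the hash isn't there (which is what we'd expect for ".b01"). The helper names are made up for illustration:

```python
import hashlib


def locate_segment_hashes(mdt: bytes, segments: list) -> list:
    """For each segment blob, return the offset of its SHA256 in mdt, or -1."""
    return [mdt.find(hashlib.sha256(seg).digest()) for seg in segments]


def stored_consecutively(offsets: list) -> bool:
    """True if the hashes that were found sit back to back (32 bytes apart)."""
    present = [o for o in offsets if o != -1]
    return all(b - a == 32 for a, b in zip(present, present[1:]))
```

Here "segments" is simply the list of the ".bXX" files' contents, in index order.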

By now we've actually decoded all of the data in the ".mdt" file apart from a small structure which resides right after the program headers. However, after looking at it for a while, we can see that it simply contains pointers and lengths of the various parts of the ".mdt" that we've already decoded:


So finally, we've decoded all of the information in the ".mdt"... Phew.


Motorola's High Assurance Boot


Although the ".mdt" file format we've seen above is universal for all OEMs, Motorola decided to add a little twist.

Instead of supplying an RSA signature like the one we saw earlier, they actually leave the signature blob empty (the signature I showed you earlier was from a Nexus 5). Motorola's signature looks like this:


So how is the image verified?

This is done by using a mechanism which Motorola calls HAB ("High Assurance Boot"). This mechanism allows them to verify the ".mdt" file by appending a certificate chain and a signature over the whole ".mdt" to the end of the file, encoded using a proprietary format used by "HAB":


For more information about this mechanism, you can check out this great research by Tal Aloni. In short, the ".mdt" is hashed and signed using the top-most key in the certificate chain, while the root certificate in the chain is verified using a "Super Root Key", which is hard-coded in one of the bootloader's stages.

 

Life of a Trustlet

After the verification process we saw above, the TrustZone kernel loads the trustlet's segments into a secure memory region ("secapp-region") which is inaccessible from the "Normal World" and assigns an ID to it.

Then, the kernel switches into "Secure World" user-mode and executes the trustlet's entry function:



As you can see, the trustlet registers itself with the TrustZone kernel, along with a "handler function". After registering the trustlet, control is returned to the TrustZone kernel, and the loading process finishes.

Now, once the trustlet is loaded, the "Normal World" can send commands to the trustlet by issuing a special SCM call (called "QSEOS_CLIENT_SEND_DATA_COMMAND") containing the loaded trustlet's ID and the request and response buffers. Here's what it looks like:


The TrustZone kernel (TZBSP) receives the SCM call, maps it to QSEOS, which then finds the application with the given ID and calls the handler function which was registered earlier (from "Secure World" user-mode) in order to serve the request.




What's Next?


Now that we have some understanding of what trustlets are and how they are loaded, we can move on to the exploits! In the next blog post we'll find a vulnerability in a very popular trustlet and exploit it in order to execute code within QSEE.


21 comments:

  1. Great post as always!
    I'm waiting for the next post and in the meanwhile I hope you get some time to release the code for the standalone version (without the kernel modification) of the older TrustZone exploit :).

    As I understand from the post, trustlets execute in a separate usermode inside QSEE. Can exploiting a trustlet lead to a possible bootloader unlock? Since it's just a QSEE usermode application, it possibly doesn't have access to privileged operations like blowing a QFuse, right?

    PS: can I know the hex editor you are using? :)

    Replies
    1. I have re-written his original zero-write exploit as a standalone kernel module here: https://github.com/ghassani/qc-tz-es-activated-exploit

    2. The hex-editor looks like 010 editor, probably running on linux.

    3. Hi Madushan,

      First of all, thank you! Glad you enjoyed the post.

      I totally forgot about the "standalone" version of the older TZ exploit. In any case, I just cleaned it up a little bit and put it up on github, here: https://github.com/laginimaineb/standalone_msm8974

      It relies on the previous kernel exploit (https://github.com/laginimaineb/cve-2014-4322/tree/master/Feud/jni) to get kernel code execution, and then dynamically finds the needed kernel symbols and does all the work without needing a kernel module.

      You should definitely check out Ghassan's version as well! I just found out about it, but looking at the code it looks very clean and easy to use.

      As for your second question - you are absolutely correct. QSEE does not have sufficient privileges in order to blow a QFuse, which means we'll need to go further than just exploiting QSEE - you'll see all the details in tomorrow's blog post :)

      Finally, as shuffle2 mentioned below - yes, I am using 010editor on Linux (which I highly recommend! It's a fantastic tool).

      Cheers,
      Gal.

    4. Ghassan -

      Just wanted to say thank you for wrapping the exploit in an LKM! Great work. The code is really clean and well-written. Kudos :)

      Gal.

    5. Great :D Thanks all you guys for all the help and sources. I'm waiting for the next post.

    6. Gal, thank you for that shout out. That means a lot! I love reading your articles, I learn so much from your explanations and examples. Glad that I am able to contribute something to you and your readers. Keep it up, I look forward to the next exploit :)

  2. Just wanted to drop a note and say: you have done some really great work here.

    I actually needed to implement some signing verification tools for Qualcomm's mdt/mbn format very recently. While I have a full set of docs from them it was actually reading through this article that got me most of the information I needed instead :)

    It is funny how reversers often know more about the nitty gritty details of a company's product than they themselves seem to.

    I have also been enjoying the follow up articles on exploitation immensely. Keep up the great work!

    Replies
    1. Thank you Eric, It means a lot to me!

      More posts coming soon :)

  3. Very good! Enjoyed reading it, looking forward to new posts.

    Cheers.

  4. Great post!
    Could this exploit be used on a snapdragon 820? Or is it only limited to the 800 and the 810?

  5. Hi, great post!
    I got lost in the signature recover phase. You wrote:

    "Normally, the private key is used to produce a signature and the public key can be used to recover the signed data"
    Could you please explain how you recover the signed data with the public key? It's the first time I've heard about it.

    Later on you wrote:

    "This means that if we try to "decrypt" a blob using our public key[...]"
    What do you mean by "decrypt"? Again, it's the first time I've heard about using an RSA public key to decrypt data.

    I do not quite understand your process of looking for the signature. What does the trydec function do?
    Couldn't you try to do it by brute force? Hash the blob files with several hash functions and the public key, and then look for the result in the .mdt binary data?

    Maybe the problem is that I am a mathematician (...XD) trying to get your work.

    Thank you again, and I'm waiting for more posts!

    Replies
    1. Thank you!

      As for your question - the main problem here is terminology regarding RSA.

      First, the mathematical definition of RSA is a trapdoor permutation family. Let's say we have the public exponent e, private exponent d and modulus N.

      Now, for each message m in Zn*, applying m' = (m^e) mod N is an onto, one-to-one mapping into Zn*. Recall that we chose e,d such that e*d = 1 (mod phi(N)). This means that applying the inverse permutation gives (m' ^ d) mod N = ((m^e) ^ d) mod N = (m ^ (e*d mod phi (N))) mod N = m (mod N).

      So applying the reverse permutation allowed us to retrieve the original message m.

      Now - people often refer to RSA as an encryption scheme. It isn't (because it's not CPA-secure, as it's completely deterministic). But you *could* think of it as encryption in the sense that after permuting a message with the public exponent, it's hard to retrieve the message without knowing the private exponent.

      In that sense, we can say that the operation (m^e) mod N is public-encryption and (m^d) mod N is private-decryption.

      The inverse is also true (since RSA is symmetric with regards to e,d). So we could say that (m^d) mod N is private-encryption and (m^e) mod N is public-decryption.

      Next comes an important primitive that is often used with RSA - signing. Imagine you have a message and would like to guarantee it was produced *only* by someone who knows the private exponent. You could do this by applying the permutation using the private exponent - that is, for each message m, produce (m^d) mod N. We just called this operation "private-encryption" in the previous paragraph, but when using RSA as a signature scheme, we could call this operation "signing".

      So... how is this useful at all? Well, someone with the public exponent can apply "public-decryption" on the signature m' = (m^d) mod N, and by the commutativity of multiplication: (m' ^ e) mod N = ((m^d) ^ e) mod N = (m ^ (d*e mod phi (N))) mod N = m (mod N). So anyone with the public key can use this to retrieve the message m from the signature. We'll call this operation "verify".

      Finally, if we already have a signature block produced using the private exponent, and we know the signed message has some unique structure, we can scan each block and attempt to perform RSA-verification ("public-decryption") on every block. This will produce some message m' - if it matches the structure we know, it is (other than a negligible probability) our signature block.

      Cheers,
      Gal.

    2. Hi Gal,

      Thank you for your quick response. I know what RSA cryptography involves; part of my daily job is related to crypto issues. The thing was that this was the first time I had read the concept of encryption used to refer to signing, but I agree it is just terminology.

      I know a forum is not the best place to write maths...jajaja but I suppose you wanted to write (the ' was left):

      (m^e) mod N = m' is public-encryption and (m'^d) mod N is private-decryption
      &&
      (m^d) mod N = m' is private-encryption and (m'^e) mod N is public-decryption

      Therefore, what I understand is that when you write "we try to "decrypt" a blob using our public key" what you are looking for is a hash value. Because what has been "encrypted/signed" should be a hash value. That's the way signing process works.

      Then, what is the output of your TryDecrypt function, the hash value of some of the blocks? How did you choose the m message along the .mdt binary to perform the "decryption"?

      Thank you
      Regards

    3. Hi Jota,

      Sorry, just wanted to write a full explanation just in case other people find it useful. Anyway, I agree, writing math in a blog post is pretty hard :)

      As for the actual value that is signed - it's actually a special version of HMAC-SHA256 (w/ a different i_pad and o_pad) over all the block files' data, concatenated. But you can outright ignore that and still find the signature block.

      Here are a couple of facts:
      1. The signature is 2048 bits long, while the HMAC-SHA256 is only 256 bits long.
      2. The signature uses PKCS#1 v1.5 padding

      If we simply use RSA-public-decrypt w/ the appropriate padding on each 2048-bit block, we'll get a 2048 bit result. For each randomly-chosen block, the resulting block's bits will be roughly uniformly distributed (since RSA is a trapdoor permutation). But we know that in the signature blob the first 2048-256 bits will be zero (remember this is after removing the padding). The chance of that happening in a uniformly distributed message is negligibly small: 2^(-1792).

      So all TryDecrypt does is iterate over each block, use "RSA public decrypt" w/ the appropriate padding, and check if the resulting block starts with a bunch of leading zeros.

    4. Hi Gal (my name is Jose; I use yours, so I think it is fair for you to know mine),

      You do not have to apologize, you did not know it and, probably, some readers have learnt a little more about crypto ;)

      Perfect, everything clear now! Maybe, you will find interesting this article: https://www.cs.cornell.edu/courses/cs5430/2015sp/notes/rsa_sign_vs_dec.php

      Looking for vulnerabilities in non-public things is quite exciting, but have you tried to check how good the implementation of public TEEs such as OP-TEE is?

      Will you participate at any security conference to talk about this?

      Thank you,
      Regards

    5. Hi Jose,

      Sorry for the late response! I missed your response.

      I didn't look at public TEEs yet, but I might get around to it (for example, Trusty TEE looks like it could be interesting...)

      Also, I haven't spoken in any conference yet (and have nothing planned up ahead). Mainly because the conferences happen to coincide with the exam period :)

      All the best,
      Gal.

  6. Awesome post!

    I am trying to follow your post and reverse the widevine trustlet (and, if needed, libQSEECom.so for the loading part).
    Is there an easy way to locate the entry function for the trustlet being loaded? Do I have to look at libQSEECom.so?

    Thanks in advance

    Replies
    1. Thank you! I think the easiest way is to disassemble the first function (func_0) and look for the function that returns a function pointer. That function pointer points to the entry function. Alternatively, you can just search for the Widevine commands (such as PRDiag*) and work backwards from there using XREFs.

  7. Cool post!

    "So let's search for these two bytes in the ".mdt" and save each found blob into a file."

    How could you know the length of these blobs?

    Replies
    1. I didn't, but you can save the blob from the match index until the end of the file, and asn1parse will stop at the end of the ASN1 data.
