
End of term, and thank you to the ACM

April 10, 2017

We’ve reached the end of term again, and The Morning Paper will be taking a two week break to recharge my batteries and my paper backlog! We covered a lot of ground over the last few months, and I’ve selected a few highlighted papers/posts at the end of this piece to tide you over until Monday 24th April when The Morning Paper will resume normal service.

I’d like to take this opportunity to thank you all once more for reading! The Morning Paper flies in the face of fashion – I write long-form pieces, and although I try to explain the material as simply as I can, the subject matter invariably makes for dense reading at times. The blog is hosted on a WordPress site using a very basic theme (all the cool kids are on Medium I hear), and the primary distribution mechanism is an email list (how 90’s!). Despite all that, The Morning Paper mailing list passed the 10,000 subscriber mark this last quarter – it’s wonderful to know that there are so many people out there interested in this kind of material. I’d also like to thank all of the researchers whose work I get to cover – you make researching and writing The Morning Paper a joy.

While we’re on the subject of thank yous, I’d also like to say thank you to the team at the ACM who recently worked on a mechanism to provide open access to any paper from the ACM Digital Library that is covered on The Morning Paper.

I always try to select papers that are open access, which often means scrabbling around to try and find a version an author has posted on their personal site. As well as opening up new potential content for the blog (for example, the ACM Computing Surveys), being able to link to ACM DL content should hopefully provide more stable links over time. If you see a link to an ACM DL piece in the blog and you’re not an ACM DL subscriber, please don’t be put off – you should be able to click through and download the pdf. Any difficulties just let me know and I’ll look into it for you.

One last thing before we get to the selections, there are now over 550 paper write-ups on this blog! If you’ve joined recently, that means there is a ton of great research you may have missed out on. Currently the only real way to explore that backlog is browsing through the archives by month. During this Easter break, I’m going to try and get my act together with a tagging scheme so that you can more easily find papers of interest from the backlog.

In TMP publication order, here are a few edited highlights from the first three months of 2017:

(Yes ok, I had a bit of trouble choosing this time around, that was rather a long list and it was difficult even getting it down to just those picks!).

Also, don’t forget we started working through the top 100 awesome deep learning papers list, and you can find the first week of posts from that starting here, and the second week here.

See you in a couple of weeks, Adrian.

SGXIO: Generic trusted I/O path for Intel SGX

April 7, 2017

SGXIO: Generic trusted I/O path for Intel SGX Weiser & Werner, CODASPY ’17

Intel’s SGX provides hardware-secured enclaves for trusted execution of applications in an untrusted environment. Previously we’ve looked at Haven, which uses SGX in the context of cloud infrastructure; SCONE, which shows how to run Docker containers under SGX; and Panoply, which looks at what happens when multiple application services, each in their own enclave, need to communicate with each other (think a Kubernetes pod or a collection of local microservices). SGXIO provides support for generic, trusted I/O paths to protect user input and output between enclaves and I/O devices. (The generic qualifier is important here – currently SGX only works with proprietary trusted paths such as Intel’s Protected Audio Video Path).

A neat twist is that instead of the usual use cases of DRM and untrusted cloud platforms, SGXIO looks at how you might use SGX on your own laptop or desktop. The trusted I/O paths could then be, for example, a trusted screen path and a trusted keyboard path. Running inside the enclave would be some app working with sensitive information (the example in the paper is a banking app), and the untrusted environment is your own computer – giving protection against any malware, keystroke loggers etc. that may have compromised it. It might be that your desktop or laptop already has SGX support – check the growing hardware list here (or compile and run test-sgx.c on your own system). No luck with my 2015 Dell XPS 13 sadly.

SGXIO surpasses traditional use cases in cloud computing and digital rights management, and makes SGX technology usable for protecting user-centric, local applications against kernel-level keyloggers and likewise. It is compatible with unmodified operating systems and works on a modern commodity notebook out of the box.

Threat model

SGX enforces a minimal TCB comprising the processor and enclave code. Even the processor’s physical environment is considered potentially malicious. In the local setting considered by SGXIO we’re trying to protect the local user against a potentially compromised operating system – physical attacks are not included in the threat model.

SGXIO supports user-centric applications like confidential document viewers, login prompts, password safes, secure conferencing and secure online banking. To take the latter as an example, with SGXIO an online bank can not only communicate with the user via TLS but also with the end user via trusted paths between banking app and I/O devices, as depicted in Figure 1 [below]. This means that sensitive information like login credentials, the account balance, or the transaction amount can be protected even if other software running on the user’s computer, including the OS, is infected by malware.

SGXIO Architecture

At the base level of SGXIO we find a trusted hypervisor on top of which a driver host can run one or more secure I/O drivers. The hypervisor isolates all trusted memory. It also binds user devices to drivers and ensures mutual isolation between drivers. In addition to hypervisor isolation, drivers are also executed in enclaves.

We recommend to use seL4 as a hypervisor, as it directly supports isolation of trusted memory, as well as user device binding via its capability system.

The trusted hypervisor, drivers, and a special enclave that helps with hypervisor attestation, called the trusted boot (TB) enclave, together comprise the trusted stack of SGXIO. On top of the hypervisor a virtual machine hosts the (untrusted, commodity) operating system, in which user applications are run. The hypervisor configures the MMU to strictly partition hypervisor and VM memory, preventing the OS from escaping the VM and tampering with the trusted stack.

User apps want to communicate securely with the end user. They open an encrypted communication channel to a secure I/O driver to tunnel through the untrusted OS. The driver in turn requires secure communication with a generic user I/O device, which we term user device. To achieve this, the hypervisor exclusively binds user devices to the corresponding drivers.

(Other, non-protected devices can be assigned directly to the VM).

Secure user apps process all of their sensitive data inside an enclave. Any data leaving this enclave is encrypted.

For example, the user enclave can securely communicate with the driver enclave or a remote server via encrypted channels or store sensitive data offline using SGX sealing.

(Other, non-protected apps can just run directly within the VM as usual).

Creating an encrypted channel (trusted path) between enclaves (e.g., an app enclave and a driver enclave) requires the sharing of an encryption key. The authors achieve this by piggy-backing on the SGX local (intra-platform) attestation mechanism (see section 3.1 in this Intel white paper). One enclave (in this case the user application enclave) can generate a local attestation report designed to be passed to another enclave (in this case the driver enclave) to prove that it is running on the same platform.

The user enclave generates a local attestation report over a random salt, targeted at the driver enclave. However, instead of delivering the actual report to the driver enclave, the user enclave keeps it private and uses the report’s MAC (Message-Authentication Code) as a symmetric key. It then sends the salt and its identity to the driver enclave, which can recompute the MAC to obtain the same key.
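To make the flow concrete, here’s a toy sketch of that key derivation in Python. It is not the SGX SDK API – the local attestation MAC is simulated with an HMAC over a stand-in ‘platform report key’ – but it shows how both enclaves arrive at the same key without it ever being transmitted:

```python
import hmac
import hashlib
import os

# Toy simulation of SGX local attestation: the hardware report key is modelled
# as a per-platform secret that only the CPU (here: this module) can use.
PLATFORM_REPORT_KEY = os.urandom(32)  # stands in for the CPU's report key

def local_attestation_mac(target_enclave: bytes, report_data: bytes) -> bytes:
    """Stand-in for the report MAC: keyed so that (on real hardware) only the
    target enclave could recompute it on the same platform."""
    return hmac.new(PLATFORM_REPORT_KEY, target_enclave + report_data,
                    hashlib.sha256).digest()

# --- user enclave side -------------------------------------------------------
def user_enclave_setup(driver_enclave_id: bytes, user_enclave_id: bytes):
    salt = os.urandom(16)
    # Generate a report targeted at the driver enclave, keep it private, and
    # use its MAC as the symmetric channel key.
    key = local_attestation_mac(driver_enclave_id, user_enclave_id + salt)
    # Only the salt and the user enclave's identity go over the (untrusted) wire.
    return key, (salt, user_enclave_id)

# --- driver enclave side -----------------------------------------------------
def driver_enclave_setup(own_id: bytes, salt: bytes, user_enclave_id: bytes):
    # The driver enclave can recompute the MAC of a report targeted at itself,
    # arriving at the same key without the key ever crossing the channel.
    return local_attestation_mac(own_id, user_enclave_id + salt)

key_a, public_msg = user_enclave_setup(b"driver-enclave", b"user-enclave")
key_b = driver_enclave_setup(b"driver-enclave", *public_msg)
assert key_a == key_b
```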

Establishing trust from the ground up

The previous section described how SGXIO works when everything is up and running. We also need to be able to trust the hypervisor hasn’t been tampered with in the first place, which is achieved via a trusted boot and hypervisor attestation process.

Enclaves can use local attestation to extend trust to any other enclaves in the system… enclaves [also] need confidence that the hypervisor is not compromised and binds user devices correctly to drivers. Effectively, this requires enclaves to invoke hypervisor attestation. SGXIO achieves this with the assistance of the TB enclave.

SGXIO requires a hardware TPM (Trusted Platform Module) for trusted booting. Each boot stage measures the next one in a cryptographic log inside the TPM. Measurements accumulate in a TPM Platform Configuration Register (PCR) whose final value reflects the boot process and is used to prove integrity of the hypervisor.

The TB enclave verifies the PCR value by requesting a TPM quote (cryptographic signature over the PCR value alongside a fresh nonce). Once the TB enclave has attested the hypervisor, any driver enclave can query the TB enclave to determine if the attestation succeeded.

So far so good, but we still need to defend against remote TPM cuckoo attacks:

Here, the attacker compromises the hypervisor image, which yields a wrong PCR value during trusted boot. To avoid being detected by the TB enclave, the attacker generates a valid quote on an attacker-controlled remote TPM and feeds it into the TB enclave, which successfully approves the compromised hypervisor.

To defend against such cuckoo attacks, the TB enclave also needs to identify the TPM it is talking to, by means of the TPM’s Attestation Identity Key (AIK). And how does the TB enclave get the correct value of the AIK to compare against? This part requires external provisioning… and sounds pretty fiddly in practice:

Provisioning of AIKs could be done by system integrators. One has to introduce proper measures to prevent attackers from provisioning arbitrary AIKs. For example, the TB enclave could encode a list of public keys of approved system integrators, which are allowed to provision AIKs.

To prevent an attacker from creating additional enclaves under the attacker’s control and directing legitimate TPM quote or TB enclave approval requests to these enclaves, the hypervisor hides the TPM as well as the TB enclave from the untrusted OS.

Only the legitimate TB enclave is given access to the TPM. Thus, the TB enclave might only succeed in hypervisor attestation if it has been legitimately launched by the hypervisor. Likewise, only legitimate driver enclaves are granted access to the legitimate TB enclave by the hypervisor. A driver enclave might only get approval if it can talk to the legitimate TB enclave, which implies that the driver enclave too has been legitimately launched by the hypervisor.

User verification

After all this work, how does a user know that they’re actually talking with the correct app via a trusted path? The answer relies on presenting a secret piece of information which is shared between the user and the app. For example, once a trusted path has been established to the screen, the app can display the pre-shared secret to the user via the screen driver.

Since the attacker does not know the secret information, he cannot fake this notification.

Once more though, we have to rely on an external provisioning mechanism to get the secret in place to start with:

This approach requires provisioning secret information to a user app, which seals it for later usage. Provisioning could be done once at installation time in a safe environment, e.g., with assistance of the hypervisor, or at any time via SGX’s remote attestation feature.

One last thing

I don’t know what Intel would think about this, but…

Intel’s licensing scheme for production enclaves might be too costly for small businesses or even incompatible with the open source idea. We show how to level up debug enclaves to behave like production enclaves in our model.

All production enclaves need to be licensed by Intel, whereas unlicensed enclaves can be launched in debug mode. Once launched though, the only difference between a debug and a production enclave is the presence of SGX debug instructions.
The core of the idea is to intercept all SGX debug instructions in the hypervisor, so that only the trusted hypervisor itself can debug enclaves. See section 7 in the paper for the fine print…

Detecting ROP with statistical learning of program characteristics

April 6, 2017

Detecting ROP with statistical learning of program characteristics Elsabagh et al., CODASPY ’17

Return-oriented programming (ROP) attacks work by finding short instruction sequences in a process’ executable memory (called gadgets) and chaining them together to achieve some goal of the attacker. For a quick introduction to ROP, see “The geometry of innocent flesh on the bone…,” which we covered back in December of 2015.

Since ROP attacks can bypass Data Execution Protection and Address Space Layout Randomization (ASLR), ROP detection solutions attempt to detect ROP attacks at runtime, using either signature-based or anomaly-based detection. Signature-based defenses look for (pre-defined) patterns in the execution trace of programs. They have very low overhead but can be bypassed by suitably skilled attackers. Anomaly-based detection methods learn what normal looks like, and then detect attacks by looking for deviations from that baseline. Recent ROP anomaly detectors have explored the use of hardware performance counters, but the microarchitectural events such counters measure can obscure the underlying program behaviour, masking the very signals we want to detect.

Today’s paper introduces EigenROP, which investigates the use of microarchitecture-independent program characteristics for the detection of ROP attacks. It significantly improves on the prior state-of-the-art for anomaly-based ROP detection, with a detection accuracy of 81% and only 0.8% false positives.

The key idea of EigenROP is to identify anomalies in program characteristics, due to the execution of ROP gadgets. In this context, it is difficult to precisely define what anomalies are since that depends on the characteristics of both the monitored program and the ROP. However, it is reasonable to assume that some unexpected change occurs in the relationships among the different program characteristics due to the execution of the ROP.

The high-level approach is summarised in Fig. 1 below:

  • EigenROP samples a set of program characteristics every n instructions (we’ll see which characteristics in a moment), and embeds them into a snapshot vector.
  • EigenROP uses a sliding window over snapshots, and embeds the window measurements into a high-dimensional space.
  • The principal components of the measurements are then extracted from which a representative direction is estimated: “…the idea here is that any strong relationships among the measured characteristics will appear as principal components in the high-dimensional space.” The training phase learns the representative direction.
  • During detection, EigenROP performs the same computation of representative direction of the incoming measurements and compares it to the learned baseline. If the distance exceeds some threshold, an alarm is raised.
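To make that pipeline concrete, here is a minimal sketch in Python/scikit-learn (my own paraphrase, not the authors’ code – the window size, distance measure and threshold are placeholders):

```python
import numpy as np
from sklearn.decomposition import PCA

WINDOW = 8        # snapshots per sliding window (placeholder)
THRESHOLD = 0.15  # distance at which to raise an alarm (placeholder)

def direction(window: np.ndarray) -> np.ndarray:
    """Representative direction of a window of snapshot vectors: here, simply
    the first principal component of the measurements."""
    return PCA(n_components=1).fit(window).components_[0]

def train(clean_snapshots: np.ndarray) -> np.ndarray:
    """Learn the baseline direction from clean (ROP-free) execution."""
    return direction(clean_snapshots)

def detect(snapshots: np.ndarray, baseline: np.ndarray) -> bool:
    """Slide a window over incoming snapshots and alarm if any window's
    direction drifts too far from the baseline (1 - |cosine similarity|)."""
    for start in range(len(snapshots) - WINDOW + 1):
        d = direction(snapshots[start:start + WINDOW])
        if 1.0 - abs(np.dot(d, baseline)) > THRESHOLD:
            return True
    return False
```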

Which characteristics have the most predictive power?

The authors evaluated 15 different candidate characteristics, before finally selecting 10 of them for use in EigenROP. In the following table, characteristics are divided into one of three different types: A represents architectural characteristics, I represents microarchitecture-independent characteristics, and M represents micro-architectural characteristics.

Let’s look at the intuition behind a few of these characteristics:

  • Since ROP attacks disturb the normal control flow of execution, they may increase the number of mispredicted branches (MISP_CBR) by the processor branch predictor.
  • ROP attacks may show different usage for ret and call instructions, as well as push and pop since they depend on chaining blocks of instructions that load data from the hijacked program stack to registers, and later return to the stack (INST_RET, INST_CALL, INST_STACK).
  • ROP attacks chain gadgets from arbitrary memory locations, so attacks may exhibit low memory locality when compared to clean execution (MEM_RDIST, MEM_WDIST).
  • ROP attacks typically load data from the hijacked stack to registers using pop instructions with a single operand. The average number of register operands is therefore likely to be lower in a gadget chain (REG_OPS).
  • ROP attacks use the stack for chaining gadgets, and the gadgets are typically spread out across the memory of the program, thus they show abnormal (lower) reuse of the same memory blocks compared to clean executions (MEM_REUSE).

Implementation and evaluation

EigenROP is implemented in just 700 lines of Python using the MICA framework which is itself based on Intel’s Pin tool, and SciKit-Learn. Several detected-in-the-wild ROP attacks were used for the evaluation, as well as attacks generated by the ROPC gadgets finder and compiler for common Linux programs.

Using a sampling interval of 16K instructions, EigenROP successfully detected the OSVDB-ID:87289 Linux Hex Editor ROP exploit. The PHP ROP exploit OSVDB-ID:72644 was detectable with a 32K instruction sampling interval.

Despite the very small ROP length (only ~60 instructions in the case of the Linux Hex Editor ROP attack) when compared to the sampling window size, EigenROP still detected the deviation in the program’s characteristics.

The overall accuracy of EigenROP across the test set was 81% with a false positive rate of 0.8%. State of the art microarchitectural defenses achieved accuracy between 49% and 68%.

Figure 4 shows the difference in accuracy with and without the microarchitecture-independent characteristics. By including microarchitecture-independent characteristics, an increase of 9% to 15% in accuracy was achieved. This indicates that microarchitecture-independent characteristics contribute significantly to the detection performance of EigenROP.

It’s possible to trade-off the overhead of EigenROP (by increasing the sampling interval) with detection accuracy, as the following chart reveals:

Recent advances in ROP attacks have shown how to bypass conventional ROP defenses through evasion and mimicry using call-preceded gadgets, evasion gadgets, and history-flushing gadgets.

  • Call-preceded gadgets fool defenses that depend on branch tracing to look for sequences ending in ret that are not preceded by a call. EigenROP doesn’t depend on branch tracing, and will pick up the signal from the mispredicted return addresses.
  • Evasion gadgets use long gadgets to evade detectors looking for short gadgets! EigenROP does not depend on the gadget chain length for its detection prowess.
  • History-flushing gadgets target defenses that only keep a limited history of execution – such history can be ‘flushed’ by using innocuous gadgets to fill up the history. Flushing of history in the context of EigenROP though is much harder, since it requires chaining enough gadgets exhibiting similar characteristics to benign code across all measured characteristics.

While our work demonstrates that ROP payloads can be detected using simple program characteristics, there are still needed improvements concerning detection accuracy of very short chains and overhead reduction. Future hardware support can help on both fronts by enabling low-cost fine-grained monitoring. Despite that, EigenROP raises the bar for ROP attacks, and can easily run in-tandem with complementary ROP defenses.

The curious case of the PDF converter that likes Mozart

April 5, 2017

The curious case of the PDF converter that likes Mozart: dissecting and mitigating the privacy risk of personal cloud apps Harkous et al., PoPET ’16

This is the paper that preceded “If you can’t beat them, join them” we looked at yesterday, and well worth interrupting our coverage of CODASPY ’17 for. Harkous et al. study third-party apps that work on top of personal cloud services (e.g., Google Drive, Dropbox, Evernote, …). Careful analysis of 100 third-party apps in the Google Drive ecosystem showed that 64% of them request more permissions than they actually need. No surprise there sadly! The really interesting part of the paper for me is where the authors investigate alternate permission models to discover what works best in helping users to make informed privacy choices. Their Far-reaching insights model (which we’ll get to soon!) is a brilliant invention.

…cloud permissions allow 3rd party apps to get access to any file the user has stored in the cloud… Put simply, the scale and quality of data that can be collected is both a privacy nightmare for unaware users and a goldmine for advertisers.

The rest of this review will proceed as follows:

  1. A quick summary of the findings from investigating existing apps and the permissions they request,
  2. An analysis of permission models to see what best helps users to make privacy-informed choices
  3. A brief look at PrivySeal, the privacy-informing app store the authors built for Google Drive
  4. Suggestions from the authors for how cloud providers can improve their offerings to safeguard users’ privacy.

You want access to what?

This study concerned Google Drive apps, but I’ve no reason to believe there wouldn’t be similar findings in other ecosystems. Here are the possible permissions that a Google Drive app can request:

Finding out the actual permissions an app requests isn’t something you can easily automate, so the authors investigated 100 randomly selected apps out of the 420 “works with Google Drive” apps in the Google Chrome Web Store at the time of the study. When we get to PrivySeal, we’ll see that only around 25% of the apps that people install actually come from the official Google Web Store – and those that come from elsewhere tend to have lower privacy standards!

Analyzing the [application permissions], we found out that 64 out of 100 apps request unneeded permissions. In other words, the developers could have requested less invasive permissions with the current API provided by Google. In total, 76 out of the 100 apps requested full access to all the files in the user’s Google Drive. Moreover, the 64 over-privileged apps have actually all requested full access. Accordingly, in our sample, around 84% (64/76) of apps requesting full access are over-privileged.

Far-reaching insights for privacy-informed consent

As we saw yesterday, the current permissions interface of Google Drive looks like this:

In the light of the risk that over-privileged apps pose, we propose three alternatives to the existing permission model before evaluating their efficacy.

The evaluation was done with 210 users, against a baseline of the current permissions model. The efficacy of each model is measured using an acceptance likelihood (AL) metric that measures the percentage of the times a user chooses to actually install an app after being presented with a permissions dialog for it.

The first model is called Delta Permissions. In this model the permission dialog is modified to explicitly point out when an app requests more permissions than it actually needs, based on the hypothesis that users are less likely to install apps requesting unnecessary permissions.

That looks like it should raise a red flag in the user’s mind… And yet, “we found no evidence of any advantage that delta permissions can introduce, which means that telling our experiment’s participants explicitly about unneeded permissions did not help deter them from installing over-privileged apps.” A great reminder of the value of doing actual studies versus just coming up with a design you think is going to work well and shipping it!

The second model is called Immediate Insights, based on the hypothesis that users shown samples of the data that can be extracted from the unneeded permissions are less likely to install apps requesting them. The delta permissions dialog is expanded with a new panel on the right with the question “what do the unneeded permissions say about you?” and insights that can be:

  • an image selected at random from the user’s image files
  • a photo from the user’s image files, which includes GPS location information, placed on a map
  • an excerpt from the beginning of a randomly chosen text file
  • the profile picture and name of a randomly chosen collaborator

The third model is called Far-reaching Insights based on the hypothesis that “when the users are shown the far-reaching information that can be inferred from the unneeded permissions granted to apps, they are less likely to authorize these apps.” The appearance is similar to the immediate insights dialog, but the panel is replaced with one that shows deeper inferences from the data in six different categories:

  • Entities, concepts, and topics extracted using NLP techniques on the user’s textual files.
  • The sentiment of the user towards the entities with the most positive or negative sentiments.
  • The top collaborators a user has, based on the analysed files.
  • The user’s shared interests with collaborators.
  • Faces with context – the faces of the people featuring most frequently in the user’s images, together with the concepts that appear in the same images.
  • Faces on a map – shows a user where selected photos were taken, together with the faces and items in those photos.

The techniques used to generate the inferences are shown in the following figure.

The experiment reveals which techniques are most successful in discouraging users from installing apps which request permissions they don’t need:

Overall, immediate insights are roughly twice as good as the baseline at discouraging the acceptance of such apps, but far-reaching insights are up to twice as good again as the immediate insights.

The personal insights relating to the user himself or herself (image & text from the immediate insights, entities-concepts-topics and sentiments from the far-reaching insights) are all associated with a significantly higher acceptance likelihood than the far-reaching faces-with-context, top-collaborators, and shared-interests insights.

We denote these as relational insights. From our results, we can conclude that relational insights promote greater privacy awareness in users, as such insights are more likely to dissuade them from installing over-privileged apps.

Also interesting here is that the top-collaborators insight, shown to be something that users care about with respect to their privacy, is something that can be inferred even if an app only has metadata permissions, and not full file access. §7.2 of the paper explores in more detail what can be learned from metadata alone.

PrivySeal

We could all wait for Google to implement something like the far-reaching insights in their app store – but that might take some time, if it happens at all (and even then, it would only cover a subset of apps).

… we decided not to wait and chose an alternative approach, which is independent of the company’s plans and is ready for user utilization immediately. We built PrivySeal, a privacy-focused store for Google Drive apps, which is readily available at https://privyseal.epfl.ch.

To generate its far-reaching insights of course, PrivySeal itself requires access to all your files. If such a solution were hosted by the cloud service provider themselves, similar considerations would of course apply – unless a user-key encryption scheme is in use. From the 1440 registered users of PrivySeal at the time the paper was compiled, there were 662 unique apps installed in their Google Drive accounts (many of these installed before the users connected to PrivySeal of course). These came from the Google Chrome Web Store, other Google stores (add-ons and apps marketplace), or outside of Google’s stores altogether. The following summary shows that apps in unofficial stores tend to be in the majority, as well as being much more cavalier with the permissions they request. Improving the Chrome Web Store permissions model may help a bit therefore, but an ideal solution would be independent of the various stores apps can be found in.

How can we do better?

The authors recommend four steps, in addition to PrivySeal, that would help users make informed privacy decisions:

  • Providing finer-grained permission models to help apps request only what they truly need
  • A privacy-preserving intermediate API layer that sits over the top of existing cloud service APIs providing finer grained control, permissions reviews, and transparency dashboards.
  • Transparency dashboards – a post-installation technique to deter developers from abusing user data. Transparency dashboards allow the user to see which files have been downloaded by each 3rd party app and when such operations took place. This either needs to be integrated into the platform itself, or built into a privacy preserving intermediate API that extensions work with.
  • Insights based on used data: “unlike external solutions that can only determine what data can be potentially accessed, the cloud platform can provide users with insights based on the data that developers (vendors) have previously downloaded.”

See the related work in section 9.1 of the paper for references to studies looking at user consent mechanisms and their efficacy in the context of Facebook apps, Android apps, and Chrome extensions.

The privacy-informing user interfaces in this work were designed in accordance with the principles set out in “A design space for effective privacy notices,” which is also well worth checking out. With new laws coming into effect soon that require informed consent, let’s hope we start to see more of this sort of thing in the wild.

If you can’t beat them, join them: a usability approach to interdependent privacy in cloud apps

April 4, 2017

If you can’t beat them, join them: a usability approach to interdependent privacy in cloud apps Harkous & Aberer, CODASPY ’17

I’m quite used to thinking carefully about permissions before installing a Chrome browser extension (they all seem to want permission to see absolutely everything – no thank you!). A similar issue comes up with third-party apps authorised to access documents in cloud storage services such as Google Drive and Dropbox. Many (64%) such third-party apps ask for more permissions than they really need, but Harkous and Aberer’s analysis shows that even if they don’t you can still suffer significant privacy loss. Especially revealing is the analysis that shows how privacy loss propagates when you collaborate with others on documents or folders, and one of your collaborators installs a third-party app which suddenly gains access to some of your own files, without you having any say in the matter.

Based on analyzing a real dataset of 183 Google Drive users and 131 third party apps, we discover that collaborators inflict a privacy loss which is at least 39% higher than what users themselves cause.

Given this state of affairs, the authors consider what practical steps can be taken to make users more aware of potential privacy loss, and perhaps change their decision making processes, when deciding which third-party apps should have access to files. Their elegant solution involves a privacy indicator (extra information shown to the user when deciding to grant a third-party app permissions) which a user study shows significantly increases the chances of a user making privacy-loss minimising decisions. Extrapolating the results of this study to simulations of larger Google Drive networks and an author collaboration network show that the indicator can help reduce privacy loss growth by 40% and 70% respectively.

Privacy loss metrics

With every app authorization decision that users make, they are trusting a new vendor with their data and increasing the potential attack surface… An additional intricacy is that when users grant access to a third-party app, they are not only sharing their own data but also others’ data. This is because cloud storage providers are inherently collaborative platforms where users share and cooperate on common files.

The main concept used in the paper to evaluate privacy loss is the notion of vendor file coverage (VFC). For a single vendor, this is simply the percentage of a user’s files that the vendor has access to, i.e., a number in the range 0..1. For V vendors we simply add up their coverage scores, giving a result in the range 0..V.

The set of vendors of interest for a given user u comprises those that u has explicitly authorised (Self-VFC) together with the set of vendors that collaborators of u have authorised (Collaborators-VFC). This combination is the Aggregate-VFC. Whenever the paper mentions privacy loss, it means as measured by this VFC metric.

This metric choice allows relaying a message that is simple enough for users to grasp, yet powerful enough to capture a significant part of the privacy loss… telling users that a company has 30% of their files is more concrete than a black-box description informing them that the calculated loss is 30%.
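As a concrete reading of the metric, here is a small sketch (my own, with made-up data structures; the paper defines VFC in prose, so treat the handling of vendors that appear in both sets as my interpretation):

```python
def vfc(user_files: set, vendors: dict) -> float:
    """Sum of per-vendor file coverage: for each vendor, the fraction of the
    user's files it can access (0..1); summed over V vendors gives 0..V."""
    if not user_files:
        return 0.0
    return sum(len(files & user_files) / len(user_files)
               for files in vendors.values())

def vfc_breakdown(user_files: set, self_vendors: dict, collab_vendors: dict):
    """Self-VFC, Collaborators-VFC, and Aggregate-VFC for one user.
    Aggregate is computed over the union of the two vendor sets, so a vendor
    authorized both by the user and by a collaborator is counted once."""
    combined: dict = {}
    for vendor_map in (self_vendors, collab_vendors):
        for vendor, files in vendor_map.items():
            combined.setdefault(vendor, set()).update(files)
    return (vfc(user_files, self_vendors),
            vfc(user_files, collab_vendors),
            vfc(user_files, combined))

# e.g. a vendor with full-drive access covers all of the user's files:
files = {"cv.pdf", "budget.xlsx", "notes.txt", "photo.jpg"}
print(vfc_breakdown(files,
                    self_vendors={"pdf-converter": files},         # full access
                    collab_vendors={"music-app": {"notes.txt"}}))  # via a collaborator
# -> (1.0, 0.25, 1.25)
```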

The impact of collaboration

The first part of the study looks at a Google Drive based collaboration network collected during PrivySeal research that we will look at tomorrow. The dataset comprises 3,422 users, who between them had installed 131 distinct Google Drive apps from 99 distinct vendors.

The charts below show the distributions of files, sharing, collaborators, and vendors per user.

We computed the Self-VFC, the Collaborators-VFC, and the Aggregate-VFC for users in the PrivySeal dataset.

The chart below shows the impact of sharing on privacy loss, broken down into populations that share differing percentages of their files. Even at 5% sharing, the median Collaborators-VFC score is 39% higher than the corresponding Self-VFC score – and at 60% sharing this jumps to a 100% increase in VFC score.

  • Collaborators’ app adoption decisions make a significant contribution to a user’s privacy loss.
  • The more collaborators, the worse it gets…

Reducing privacy loss through privacy indicators

Now the question becomes: how can we help users minimise their privacy loss when selecting third-party applications? The standard Google Drive application permissions dialog looks like this:

The authors very carefully design an enhanced permissions dialog incorporating privacy indicators that looks like this:

We call our proposed privacy indicators “History-based Insights” as they allow users to account for the previous decisions taken by them or by their collaborators.

Note that the enhanced dialog shows the percentage of files readily accessible by the vendor. The best strategy for the user to minimise privacy loss when considering a new app (if there are several alternatives available) is to select the vendor that already has access to the largest percentage of the user’s files.

157 study participants were divided into two groups: one group (BL) performed a set of tasks using the standard permissions dialog, and the other group (HB) performed the same tasks but were shown the enhanced history-based insights dialog. The participants were told nothing more than that the study was designed to ‘check how people make decisions when they install third-party apps.’

There are four scenarios:

  • Self-history: tests whether a user is more likely to install an app from a vendor they have installed from before. For the baseline (BL) group with the standard dialog, there was no indication of any privacy considerations in the selection of a subsequent app. They chose based on description, logo, name or the perceived trustworthiness of the URL. In the HB group though, 72% of participants chose the app from the vendor that they previously granted permission to. Quoting one user: “This company has access to all my files, so I would choose them as I don’t want to have 2 companies with full access to my files.”

The new privacy indicator leads users to more frequently choose apps from vendors they have already authorized.

  • Collaborator’s app: tests whether a user is more likely to install the same app that a collaborator has used. The baseline group show a fairly even split between the two choices presented, whereas in the HB group 88% choose the app their collaborator is using. “Thanks to John, they already have access to 70% of my data. Sharing the last 30% isn’t as bad as sharing 100% of my data with [the other vendor].”
  • Collaborator’s vendor: as the previous test, except instead of being offered a choice of an app that is the same as that used by a collaborator, participants are offered the choice of an app from the same vendor as another app used by a collaborator. Most users didn’t really distinguish between the concept of a vendor and the concept of an app, and behaved very similarly to the previous scenario.
  • Multiple collaborators: given two collaborators, one with whom the user shares many files and one with whom they share few, does the user choose to install an app that is in common with the higher-sharing collaborator? In the baseline group just under half chose this app, whereas in the HB group 83% did.

Here’s the summary table for these results:

Overall, we found out that … participants in the HB group were significantly more likely to install the app with less privacy loss than those in the BL group.

The changes in behaviour from this study were extrapolated to two simulated larger networks: one Google Drive network seeded from the PrivySeal Dataset and inflated while retaining input degree distribution; and one network based on author collaborations on papers in the Microsoft Academic graph.

The charts below show the privacy loss under three scenarios:

  • EBL is the experimental baseline, i.e. no bias towards minimising privacy loss
  • EHB is the experimental HB model, where the user takes decisions in accordance with the preferences discovered in the preceding user study
  • FA is a ‘fully aware’ model in which the user always makes the best privacy loss minimisation decision

The privacy loss in the EHB group drops by 41% (inflated network) and 28% (authors network) respectively when compared to the baseline.

One of the major outcomes is that a user’s collaborators can be much more detrimental to her privacy than her own decisions. Consequently, accounting for collaborator’s decisions should be a key component of future privacy indicators in 3rd party cloud apps… Finally, due to their usability and effectiveness, we envision History-based Insights as an important technique within the movement from static privacy indicators towards dynamic privacy assistants that lead users to data-driven privacy decisions.

A study of security vulnerabilities on Docker Hub

April 3, 2017

A study of security vulnerabilities on Docker Hub Shu et al., CODASPY ’17

This is the first of five papers we’ll be looking at this week from the ACM Conference on Data and Application Security and Privacy which took place earlier this month. Today’s choice is a study looking at image vulnerabilities for container images in Docker Hub. (You may recall an earlier study by Banyan which looked at just official images. Jérôme Petazzoni’s response to that is also well worth reading.) It’s important to note the timing of this study – the analysis was done in April of 2016, and one month later Docker announced the Docker Security Scanning service (as you’ll soon see, it was very much needed!). This service provides automated security analysis, validation, and continuous monitoring for binary images hosted on Docker Hub. Even more recently, Docker introduced the Docker Store with Docker Verified Images (community/Hub images are not verified by Docker). So we would hope, if the analysis was repeated today, that the official and verified images would not contain any known vulnerabilities. There are plenty of unverified images out there too though, and this paper is a very good reminder of the need for software supply-chain management (a phrase I first heard via Joshua Corman).

It’s the same scenario as we recently saw with JavaScript dependencies on the web. As a developer, it’s incredibly easy to add a dependency to your project, which tends to lead to dependencies being added with little thought. In fact of course, adding dependencies is actively encouraged versus rewriting functionality yourself from scratch. Furthermore, any one dependency you bring in often has dependencies of its own. We need to add a little more discipline around the inclusion of dependencies (images, libraries, packages etc.) in projects. I recommend something like the following checklist.

Software supply-chain management checklist

  1. ☑️ Is the license of this dependency compatible with my project?
  2. ☑️ Is the dependency actively maintained, and do I know where information about vulnerabilities in the dependency will be published?
  3. ☑️ Is the version of the dependency I propose to include currently free from vulnerabilities? (Including any vulnerabilities in the transitive closure of dependencies).
  4. ☑️ Do I have an active process in place to alert me when a new vulnerability is found in the dependency?
  5. ☑️ Do I know where new versions of the dependency will be announced?
  6. ☑️ Do I have an active process in place to alert me when a new version is released? (The release notes may contain information about important fixes that didn’t get flagged as a vulnerability for some reason).

Such a process works best if you also have an automated way to catch and flag any new dependency being added to a project – via commits or pull requests for example. (I’ve been using Snyk.io to do this on one of my recent projects; other tools are available). Then you can make sure to go through the checklist during code review – and perhaps even go so far as to fail the build if it is not completed. For your own reference, a good place to document the checklist answers (where to go to find vulnerability information, and the processes in place for notification) might be in a dependencies.md file or similar checked into the root of your project.
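As an illustration of that ‘catch and flag’ step, here is a minimal sketch of a CI check that fails the build whenever a dependency manifest changes, so the checklist gets applied during review (the manifest filenames and branch name are just examples, not a prescription):

```python
# Flag dependency-manifest changes so the supply-chain checklist can be applied
# during code review; run this as a CI step or pre-merge check, for example.
import subprocess
import sys

MANIFESTS = ("requirements.txt", "package.json", "Dockerfile", "pom.xml")

def changed_manifests(base: str = "origin/master", head: str = "HEAD"):
    """List changed files between base and head, keeping only manifest files."""
    out = subprocess.run(["git", "diff", "--name-only", f"{base}...{head}"],
                         capture_output=True, text=True, check=True).stdout
    return [path for path in out.splitlines() if path.endswith(MANIFESTS)]

if __name__ == "__main__":
    touched = changed_manifests(*sys.argv[1:3])
    if touched:
        print("Dependency manifests changed – run the supply-chain checklist for:")
        print("\n".join(f"  - {path}" for path in touched))
        sys.exit(1)  # fail the build until the checklist has been completed
    print("No dependency manifest changes detected.")
```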

Anyway, let’s get back to the study of security vulnerabilities in Docker Hub images…

Docker images and the Docker Hub

In case you’ve been living under a rock for the last couple of years:

Docker distributes applications (e.g., Apache, MySQL) in the form of images. Each image contains the target application software as well as its supporting libraries and configuration files… Docker Hub, introduced in 2014, is a cloud registry service for sharing application images. Images are distributed using repositories, which allow versioned image development and maintenance.

Official repositories contain certified images from vendors, community repositories can be created by any user or organisation.

Analysis process

This is pretty much as you would expect: download images and scan them! Of course, there are lots of images so you want to do that in parallel. One difficulty is that although official image repositories are listed, there is no such list of community repositories. Not to be put off, the authors generated 5,000,000 unique strings, used the Docker Hub API to query Docker Hub for each of them, and recorded the results. In this way almost 100,000 unique repositories were discovered, ultimately leading to vulnerability scanning of 356,218 community images and 3,802 official images.

The Clair open source tool from CoreOS was used to do the scanning. Clair matches package metadata against the CVE database and related vulnerability tracking databases. Each found vulnerability is graded ‘low’, ‘medium’, or ‘high’ risk based on the Common Vulnerability Scoring System.

The analysis also examines how vulnerabilities propagate across layers and images.

Findings

The table below shows the number of vulnerabilities found in images. Note that (as of April 2016) the worst offending community images contained almost 1,800 vulnerabilities! Official images were much better, but still contained 392 vulnerabilities in the worst case. Perhaps the most useful number is the median number of vulnerabilities in the :latest version of each image: 153 for community images, and 76 for official images. For official images, the :latest version tends to fare much better than its predecessors (indicating active updating of vulnerable components), whereas for community images there’s not much difference.

If we look at the distribution of vulnerability severities, we see that pretty much all of them are high severity, for both official and community images. What we’re not told is the underlying distribution of vulnerability severities in the CVE database, so this could simply be a reflection of that distribution.

Over 80% of the :latest versions of official images contained at least one high severity vulnerability!

What kind of vulnerabilities did the authors find? Here’s the breakdown for the latest official images (year columns are the year the vulnerability was first reported in the CVE database, not the year of the image – Docker isn’t that old for a start!).

Many Docker Hub repositories are well maintained, whereas others remain unmaintained. Intuitively, an image that has not been updated in a long time is more likely to contain vulnerabilities. Therefore we seek to characterize the age of images at the time of analysis.

Here’s the CDF. It shows that official images tend to be actively updated, whereas the latest versions of community images can be very old. By my reading, about a third of community :latest images are over a year old.

When analysed by layer, we see that child images inherit on average 80 or more vulnerabilities from their parents:

This is an interesting observation, because it suggests that when a child installs new software packages, the maintainer is not applying security updates (e.g., with apt-get upgrade).

Tools

The paper mentions several tools that can assist in scanning container images for vulnerabilities:

To this list we can also add the following (and maybe others I’m not aware of or have forgotten too):

See also Docker’s ‘Benchmark for Security’ recommendations. Be careful out there!

BBR: Congestion-based congestion control

March 31, 2017

BBR: Congestion-based congestion control Cardwell et al., ACM Queue Sep-Oct 2016

With thanks to Hossein Ghodse (@hossg) for recommending today’s paper selection.

This is the story of how members of Google’s make-tcp-fast project developed and deployed a new congestion control algorithm for TCP called BBR (for Bottleneck Bandwidth and Round-trip propagation time), leading to 2-25x throughput improvement over the previous loss-based CUBIC congestion control algorithm. In fact, the improvements would have been even more significant but for the fact that throughput became limited by the deployed TCP receive buffer size. Increasing this buffer size led to a huge 133x relative improvement with BBR (2Gbps), while CUBIC remained at 15Mbps. BBR is also being deployed on YouTube servers, with a small percentage of users being assigned BBR playback.

Playbacks using BBR show significant improvement in all of YouTube’s quality-of-experience metrics, possibly because BBR’s behavior is more consistent and predictable… BBR reduces median RTT by 53 percent on average globally, and by more than 80 percent in the developing world.

TCP congestion and bottlenecks

The Internet isn’t working as well as it should, and many of the problems relate to TCP’s loss-based congestion control, even with the current best-of-breed CUBIC algorithm. This ties back to design decisions taken in the 1980s, when packet loss and congestion were synonymous due to technology limitations. That correspondence no longer holds so directly.

When bottleneck buffers are large, loss-based congestion control keeps them full, causing bufferbloat. When bottleneck buffers are small, loss-based congestion control misinterprets loss as a signal of congestion, leading to low throughput. Fixing these problems requires finding an alternative to loss-based congestion control.

From the perspective of TCP, the performance of an arbitrarily complex path is bound by two constraints: round-trip propagation time (RTprop), and bottleneck bandwidth, BtlBw (the bandwidth at the slowest link in each direction).

Here’s a picture to help make this clearer:

The RTprop time is the minimum time for round-trip propagation if there are no queuing delays and no processing delays at the receiver. The more familiar RTT (round-trip time) is formed of RTprop + these additional sources of noise and delay.

Bandwidth Delay Product (BDP) is the maximum possible amount of data in transit in a network, and is obtained by multiplying the bottleneck bandwidth and round-trip propagation time.
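For example (a quick worked calculation using the 10 Mbps, 40-ms flow that appears in the article’s charts later on):

```python
btl_bw = 10e6 / 8       # bottleneck bandwidth: 10 Mbps, in bytes per second
rt_prop = 0.040         # round-trip propagation time: 40 ms
bdp = btl_bw * rt_prop  # bandwidth-delay product, in bytes
print(f"BDP = {bdp:,.0f} bytes (~{bdp / 1024:.0f} KB)")  # BDP = 50,000 bytes (~49 KB)
```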

BDP is central to understanding network performance. Consider what happens to delivery rate as we gradually increase the amount of data inflight. When the amount of inflight data is less than BDP, then delivery rate increases as we send more data – delivery rate is limited by the application. Once the bandwidth at the bottleneck is saturated though, the delivery rate cannot go up anymore – we’re pushing data through that pipe just as fast as it can go. The buffer will fill up, eventually we’ll start dropping packets, but we still won’t increase delivery rate.

The optimum operating point is right on the BDP threshold (blue dot above), but loss-based congestion control operates at the BDP + Bottleneck Buffer Size point (green dot above).

Now let’s look at what happens to RTT as we increase the amount of data inflight. It can never be better than RTprop, so until we reach BDP, RTT ~= RTprop. Beyond BDP, as buffers start to fill, RTT goes up until buffers are completely full and we start dropping packets.

Once more, the optimum operating point would be right on the BDP threshold. This was proved by Leonard Kleinrock in 1979; unfortunately, at about the same time Jeffrey M. Jaffe proved that it was impossible to create a distributed algorithm that converged to this operating point. Jaffe’s result rests on fundamental measurement ambiguities.

Although it is impossible to disambiguate any single measurement, a connection’s behavior over time tells a clearer story, suggesting the possibility of measurement strategies designed to resolve ambiguity.

Introducing BBR

BBR is a congestion control algorithm based on these two parameters that fundamentally characterise a path: bottleneck bandwidth and round-trip propagation time. It makes continuous estimates of these values, resulting in a distributed congestion control algorithm that reacts to actual congestion, not packet loss or transient queue delay, and converges with high probability to Kleinrock’s optimal operating point.

(BBR is a simple instance of a Max-plus control system, a new approach to control based on nonstandard algebra. This approach allows the adaptation rate [controlled by the max gain] to be independent of the queue growth [controlled by the average gain]. Applied to this problem, it results in a simple, implicit control loop where the adaptation to physical constraint changes is automatically handled by the filters representing those constraints. A conventional control system would require multiple loops connected by a complex state machine to accomplish the same result.)

Since RTT can never be less than RTprop, tracking the minimum RTT provides an unbiased and efficient estimator of the round-trip propagation time. The existing TCP acks provide enough information for us to calculate RTT.

Unlike RTT, nothing in the TCP spec requires implementations to track bottleneck bandwidth, but a good estimate results from tracking delivery rate.

The average delivery rate between a send and an ack is simply the amount of data delivered divided by the time taken. We know that this must be less than the true bottleneck delivery rate, so we can use the highest recorded delivery rate as our running estimate of bandwidth bottleneck.
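Here is a rough Python rendering of those two estimators – a sketch of the idea rather than the article’s pseudocode, with placeholder window lengths:

```python
import time
from collections import deque

class WindowedFilter:
    """Keep (timestamp, value) samples inside a time window and expose the
    window max or min over them."""
    def __init__(self, window_s: float, best):
        self.window_s, self.best, self.samples = window_s, best, deque()

    def update(self, value: float, now: float = None) -> float:
        now = time.monotonic() if now is None else now
        self.samples.append((now, value))
        while now - self.samples[0][0] > self.window_s:
            self.samples.popleft()
        return self.best(v for _, v in self.samples)

# Placeholder window lengths: several seconds for the RTprop min filter,
# a few round trips' worth for the BtlBw max filter.
rtprop_filter = WindowedFilter(window_s=10.0, best=min)  # RTprop estimate
btlbw_filter = WindowedFilter(window_s=1.0, best=max)    # BtlBw estimate

def on_ack(rtt_s: float, delivered_bytes: float, interval_s: float):
    """Per-ack update: rtt_s from the ack timing; delivered_bytes / interval_s
    is the delivery-rate sample for the newly acked data."""
    rtprop = rtprop_filter.update(rtt_s)
    btlbw = btlbw_filter.update(delivered_bytes / interval_s)
    return rtprop, btlbw, rtprop * btlbw  # the last term is the BDP estimate
```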

Putting this all together leads to a core BBR algorithm with two parts: a protocol to follow on receiving an ack, and a protocol to follow when sending. You’ll find the pseudocode for these on pages 28 and 29-30. From my reading, there are a couple of small mistakes in the pseudocode (but I could be mistaken!), so I’ve recreated clean versions below. Please do check against those in the original article if you’re digging deeper…

Here’s the ack protocol:

(app_limited_until is set on the sending side, when the app is not sending enough data to reach BDP). This is what the sending protocol looks like:

The pacing_gain controls how fast packets are sent relative to BtlBw and is key to BBR’s ability to learn. A pacing_gain greater than 1 increases inflight and decreases packet inter-arrival time, moving the connection to the right on the performance charts. A pacing_gain less than 1 has the opposite effect, moving the connection to the left. BBR uses this pacing_gain to implement a simple sequential probing state machine that alternates between testing for higher bandwidths and then testing for lower round-trip times.

The frequency, magnitude, duration and structure of these experiments differ depending on what’s already known (start-up or steady state) and the sending app’s behaviour (intermittent or continuous). Most time is spent in the ProbeBW state probing bandwidth. BBR cycles through a sequence of gains for pacing_gain, using an eight-phase cycle with values 5/4, 3/4, 1, 1, 1, 1, 1, 1. Each phase lasts for the estimated round-trip propagation time.

This design allows the gain cycle first to probe for more bandwidth with a pacing_gain above 1.0, then drain any resulting queue with a pacing_gain an equal distance below 1.0, and then cruise with a short queue using a pacing_gain of 1.0.
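As a tiny illustration of that cycle (my own sketch, not the article’s code), here is how the inter-packet pacing interval changes across the eight ProbeBW phases:

```python
from itertools import cycle

GAIN_CYCLE = [5/4, 3/4, 1, 1, 1, 1, 1, 1]  # ProbeBW phases, one RTprop each

def pacing_intervals(btlbw_bytes_per_s: float, packet_bytes: int):
    """Inter-send spacing per phase: pace at pacing_gain * BtlBw, i.e. wait
    packet_size / (pacing_gain * BtlBw) between packets."""
    for pacing_gain in cycle(GAIN_CYCLE):
        yield packet_bytes / (pacing_gain * btlbw_bytes_per_s)

# e.g. a 10 Mbps bottleneck and 1500-byte packets:
spacing = pacing_intervals(10e6 / 8, 1500)
print([round(next(spacing) * 1000, 2) for _ in range(8)])
# -> [0.96, 1.6, 1.2, 1.2, 1.2, 1.2, 1.2, 1.2]  (milliseconds per packet)
```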

The result is a control loop that looks like this plot below showing the RTT (blue), inflight (green) and delivery rate (red) from 700ms of a 10Mbps, 40-ms flow.

Here’s how BBR compares to CUBIC during the first second of a 10 Mbps, 40-ms flow. (BBR in green, CUBIC in red).

We talked about the BBR benefits in Google’s high-speed WAN network (B4) and in YouTube in the introduction. It also has massive benefits for low bandwidth mobile subscriptions.

More than half of the world’s 7 billion mobile Internet subscriptions connect via 8- to 114-kbps 2.5G systems, which suffer well-documented problems because of loss-based congestion control’s buffer-filling propensities. The bottleneck link for these systems is usually between the SGSN (serving GPRS support node) and mobile device. SGSN software runs on a standard PC platform with ample memory, so there are frequently megabytes of buffer between the Internet and mobile device. Figure 10 [below] compares (emulated) SGSN Internet-to-mobile delay for BBR and CUBIC.