Controlling Privacy

Projects in this group focus on the enforcement of privacy properties in applications that play a central role in the daily life of their users.
The projects will deliver a variety of privacy definitions suitable to capture the notion of privacy in different application domains, along with corresponding enforcement
A common challenge is to resolve the inherent tension between privacy and system functionalities, coming up with privacy notions and enforcement techniques that provide strong guarantees, yet do not hamper performance or usability.
A related challenge is the characterization of attacker models that are strong enough to cover the attack surface available in the various application domains, yet integrate realistic assumptions on the attacker capabilities in order to preserve efficiency and usability.
We will develop innovative techniques to enforce at run-time a wide range of fine-grained security policies on Web applications and mobile devices; devise foundational cryptographic solutions to protect the privacy of user data in a broad spectrum of cloud services, including data sharing applications and computing environments; design a novel secure hardware-assisted platform for privacy-preserving analytics, with the goal of providing users with rigorous privacy guarantees and, at the same time, analysts
with accurate results; develop techniques to enhance the anonymity of users online, preventing tracking and censorship in anonymous communication networks; and design an architecture to protect users’ privacy in the presence of mobile and wearable devices with recording capabilities.
We will develop programming principles and methodologies to support the development of secure and privacy-preserving applications, as well as methods and tools to prove formal privacy guarantees in protocols and systems, making use of cryptography to protect user data.

B1: Privacy Enforcement for Third-Party Software

Bernd Finkbeiner & Deepak Garg

When running software on a device, the user rightfully expects that sensitive information remains private. Unfortunately, this expectation is often violated. Incidents with mobile applications that silently transferred the complete address book (e.g. Twitter or Facebook) and other personal information to their servers, and a study showing that many of the top web sites on the Internet leak personal data in an unintended way.
We see the need for a mechanism that allows the user to enforce precise restrictions on the flow of information in an application, even if these restrictions are not offered as an option in the application, or may in fact run contrary to the objectives of the potentially disingenuous developer of the application. Our approach builds on recent advances in the property oriented approach to IT-security.
Requirements on information-flow, such as non-interference, and the changes these requirements go through over time, such as declassifying certain information, can be characterized using logics for temporal hyperproperties, such as the temporal logic HyperLTL [B01-17]. A uniform algorithmic framework for the verification (model checking) of such properties already exists. In this project, we 1) extended this algorithmic framework to the automatic monitoring and enforcement of temporal hyperproperties at runtime and 2) studied how enforcement on programs containing both first- and third-party code can be reduced to the verification of first-party code only via sandboxing. To achieve this goal, we had to overcome substantial theoretical and practical challenges. First, not every property can be monitored at runtime, let alone enforced. We identified the monitorable and enforcible sublogic of HyperLTL. Second, efficiency is a prime concern, as the enforcement mechanism should not substantially slow down the main application. We investigated a spectrum of monitoring and enforcement algorithms and developed many optimization techniques. With our specification analysis that reduces the algorithmic work by exploiting symmetry, transitivity or reflexivity of the HyperLTL formula. With our trace analysis, we minimized space consumption. Third, in practice, first-party code is often linked with arbitrary, unverifiable code such as that from third-party libraries, but we still want to verify that the entire program has certain properties. To facilitate this, we examined how verification of an entire program can be reduced to the verification of the first-party code only, by sandboxing the third-party code.

B2: Programming Principles and Abstractions for Privacy

Deepak Garg & Christian Hammer

The world of software has changed a lot in the past decade. Today, broad platforms like mobile phones and web browsers allow users to download software from a variety of sources. In many cases, the software wants access to the user’s private information. With such access comes the risk that the software may leak the private information or misuse it in ways that the user does not expect. Currently, a privacy-conscious user can only deny software access to private information, e.g., by using the “incognito mode” in web browsers and by explicitly denying permissions on mobile platforms like Android. However, this can cause applications to malfunction since they are expecting the private information to be available.
Our overall goal in this project was to enable the design of adaptive applications that react to allowed permissions by gracefully adjusting the level of functionality.1 Our technical approach is based on advancing the area of information flow control, a language-based paradigm for enforcing privacy policies. The key idea is to allow data to come with policies that specify how it can be used (i.e., its confidentiality requirements), and application modules to dually specify what policies they require of input data to provide specific functionality. While our focus is on theoretical foundations, we also make progress on practical deployment of this idea in various settings.
Specifically, we 1) examined the foundations of policy specification that balances the privacy needs of data and the access requirement of program modules, 2) advanced the state of the art in fundamental techniques for enforcing information flow policies in higher-order stateful languages and in low-level languages, and 3) worked on developing our overall idea in the context of Android applications. Overall, our work in this project layed the foundations for achieving the goal we set out with. In executing the project, we made several fundamental contributions to information flow analysis. First, we resolved an open problem in information flow analysis by showing that two different styles of specifying policies – at fine-granularity and at coarse-granularity – are actually equally expressive. Second, we develop semantic models of information flow types that support higher-order state; these models enable showing that enforcement techniques are sound. Third, we develop a technique for enforcing information flow policies in low-level languages like C, where even control flow integrity is not a given.

B3: Computationally Sound Reasoning about Privacy Properties

Michael Backes & Véronique Cortier

Proofs of security protocols are known to be error-prone and, due to multiple interleaved protocol runs that are typical for distributed systems, awkward for humans to construct. Vulnerabilities have indeed accompanied the early design of security protocols, from the timing attack by Bleichenbacher [B03-13] on the SSL 3.0 protocol, which two years earlier was shown secure by Wagner and Schneier [B03-28] in a weaker adversarial model, to a flaw in the Microsoft Passport service [B03-23], which allowed an attacker to reset passwords, to several Kerberos 5 flaws found during the verification of the protocol [B03-15].
Soon after the first protocols were developed, researchers began to work on automating security proofs. The predominantly used class of tools for this purpose are symbolic protocol verifiers, e.g., ProVerif [B03-12], AVISPA [B03-5], Scyther [B03-19], Deepsec [B03-17], Tamarin [B03-27], and APTE [B03-16]. Symbolic protocol verifiers abstract cryptographic operations as symbolic terms with corresponding cancellation rules, e.g., the encryption of a message m with the key k is represented as a term enc(k;m) with the simple cancellation rule dec(k;enc(k;m)) = m. The attacker in such a symbolic model is given by a non-deterministic algorithm that is restricted to applying the given set of rules. Establishing security in such models then boils down to showing that no algorithm that pertains to the given set of rules is capable of violating the security property under consideration, e.g., it cannot extract the message m. Grounding the security analysis on symbolic terms and a fixed set of rules reliefs proofs from cryptographic details such as reduction proofs, probabilistic behavior, and error probabilities. This was the key for automation, at least for moderate-sized protocols and for simple trace properties such as authentication.
However, to show the absence of attacks against actual cryptographic realizations, such a symbolic abstraction has to be sound, i.e., replacing the symbolic abstraction with a (provable secure) cryptographic instantiation should not give rise to attacks that were not covered on the symbolic level. We call symbolic models that are accurate in this sense computationally sound. In a similar spirit to symbolic protocol verifiers, tools have been proposed that allow for checking information-flow properties within a given language, e.g., Joana [B03-24] or Cassandra [B03-26]. These tools enable a detailed analysis already on the implementation level using the semantics of the programming language under consideration, while relying on a symbolic model for cryptographic operations in declassification or endorsement policies(typically encryption or signatures). Hence these works would benefit from computational soundness results as well. While these tools allow for declassification and endorsement policies, figuring out whether some piece of data can be declassified requires reasoning about the secrecy of, e.g., the key it was encrypted with, which comes back to the original question, information flow. Our main objective in this project was thus to develop novel methods and tools to enable a computationally sound analysis of privacy properties in modern cryptographic protocols. This encompasses providing support for reasoning about interactive cryptographic primitives that are at the core of many privacyprotection technologies; moreover, it aims to include extended, privacy-critical adversarial capabilities and to ensure the secure composition and integration of these results. Much of the work done focused on the CoSP framework [B03-6] for computational soundness proofs.
Prior to CoSP, computational soundness results were specific to a given, often ad hoc formalism. The reusability of such results was limited. e.g., computational soundness for public-key encryption did not automatically carry over to different formalisms and underlying programming languages. As a consequence, there were many computational soundness results for the same cryptographic primitives but within different formalisms. To remedy this situation, the CoSP framework decouples the embedding of the programming language from the computational soundness proofs, thereby entailing the reusability of computational soundness results to other programming languages. Hence, proving x cryptographic primitives sound for y programming languages / formal calculi only requires x+y proofs instead of x _ y without CoSP. Prior to this project, the applied p-calculus [B03-6] and RCF [B03-7] have been embedded into CoSP. We extend CoSP to encompass binary compiled languages and fragments of Dalvik Bytecode, the language used for the distribution of Android Apps.

B4: Privacy-Preserving Cloud Storage

Michael Backes & Dominique Schröder

The focus of this project was the confidentiality of storage that was outsourced to cloud providers. Companies and end-users store confidential data on third-party services like Amazon AWS and Dropbox. Whereas one may think that encrypting the entries of the database gives already the “best-possible” notion of privacy, this is indeed not the case: A lot of information can be inferred not only by the content of a certain database entry but also on which entry is accessed. As an example, accessing a folder related to medical data may leak that the client is undergoing some medical treatment that he would like to keep confidential.
This becomes especially problematic when outsourcing large parts of our data to the cloud. Observing the patterns used to access outsourced data, e.g., by monitoring the clients’ queries, allows a curious service provider to learn sensitive information, even if data are encrypted. While the cloud architecture provides benefits in terms of convenience and reliability, it effectively poses a threat to users’ privacy. Ideally, we would like to achieve the best of both worlds where the client is relieved from the burden of maintaining its database but at the same time, no information is leaked to the eyes of an external observer.
Before this project, existing solutions for secure data outsourcing fell short of providing adequate privacy and functionality. On the one side of the spectrum, we had classical cryptographic solutions (ORAM) that provided strong security with little usability and without the possibility of sharing data with other clients. On the opposite side, we had solutions developed by the industry that offered great usability but where the privacy guarantees were poorly understood. In particular, a privacy-preserving solution (that hides data as well as access patterns) with the flexibility of selectively granting read/write accesses to peer clients was missing. The objective of this project was to design, analyze, and implement a cryptographic infrastructure that covers a range security and privacy properties while at the same time offering support for standard database queries, the possibility to grant selective permissions on the database, and the possibility to let everyone verify that the database performed the correct computation. The working plan built on an extension of the ORAM model to the multi-client setting, and further extensions to provide for the verifiability of queries, for fine-grained access control and for database queries like range queries and join. Finally, we investigated the minimal cryptographic assumptions needed for constructing ORAM schemes. We managed to realize the foundational extension of ORAM to the multi-client setting. This extension, called anonymous RAM, provides privacy and integrity guarantees, including the privacy of content, access patterns and user’s identity from curious servers. We also investigated the minimal communication complexity of verifiable computation and on the minimal cryptographic assumptions needed to achieve privacy in outsourced databases.

B5: Anonymous and Censorship-Resistant Communication

Christian Rossow

Anonymous communication is essential to protect the privacy of users online. The academic community has developed several anonymous communication networks over the years, which can be classified in semi-decentralized (like Tor, which uses a central directory) and fully decentralized (like I2P) networks. Today, some of them have reached an unprecedented degree of maturity and popularity: Tor is used daily by over four million users worldwide, I2P by thirty thousand, and their user base is constantly increasing.
Despite this success, anonymous communication networks are affected by a fundamental problem, which undermines the functionality they offer: ISP providers, and thus governmental agencies, can determine which traffic enters an anonymous communication network, which enables them to censor undesired communication, as reported in China or Syria. The same techniques have also been used to identify the users in such network, with the goal of monitoring their online behavior (the second German, after the prime minister, to be reported subject of monitoring by NSA is an Erlangen student running a Tor entry node), and they can also be used more generally to perform traffic analysis and effectively deanonymize users. For all these reasons, developing censorship-resistant communication techniques is of paramount importance not only for the freedom of speech but also, more generally, for the privacy of users in the digital society. This project aimed at devising a broad range of techniques to make anonymous communication networks censorship-resistant. First, we explored whether censorship events can be reliably observed using public observations, which allows us to assess the impact of the censoring problem and to react to censorship events. Second, we explored censorship-resistant communication methodologies that solve the challenge of reliably bootstrapping to the network, in particular decoy routing systems. Third, we investigated how the general technique of credit networks can be used to model trust in a censorship setting. Finally, we investigated if emerging new name resolutions protocols such as DNS-over-HTTP and DNS-over-TLS solve censorship threats by hiding domain names from passive adversaries.

B6: Privacy-Friendly Data Analytics

Paul Francis

User data are constantly collected with various organizations for the purpose of aggregate analysis. Here, the analysist is not interested in individual user data, but rather in user data in the aggregate. Nevertheless, the privacy of individual user data is at risk, for two fundamental reasons: the query result may leak too much information, or the data aggregator itself may leak collected data, intentionally (e.g., selling) or unintentionally (being comprised). A further complication comes from the fact that the analyst is often interested in querying the dataset obtained by joining data collected by different organizations, which poses a secure data sharing problem. In other words, the privacy of data analytics is a two-fold problem, which encompasses the sanitisation of query results as well as the secure storage and sharing of user data.
The project consisted of five work packages in order to solve the above mentioned problems and realize the overall aim to design an effective and privacy-preserving architecture for data analytics.

B7: Privacy-Preserving Digital Capture

Peter Druschel & Bernt Schiele

Sophisticated mobile computing, sensing and recording devices like smartphones and Google Glass are carried by their users virtually around the clock, thus blurring the distinction between the online and offline worlds. While these devices enable transformative new applications and services, they also introduce entirely new threats to users’ privacy, because they can capture a complete record of the user’s location, online and offline activities, and social encounters, including an audiovisual record. In this project, we aim to protect users’ privacy in the presence of mobile and wearable devices with recording capabilities. As a basic building block, we will develop privacy-preserving mechanisms to associate a network communication endpoint with robust signatures of the user which can be based e.g. on the visual appearance or the voice of the user. Using such a mechanism, a device can advertise, via the Internet or short-range radio (e.g. Bluetooth), a signature of its owner’s appearance. The challenge lies on the one hand to develop signatures and corresponding mechanisms that allow to robustly identify the device’s owner in a photo or video, yet, without such a recording, are insufficient to identify the owner, or track the owner across repeated encounters. A mechanism with this capability can be used as follows.
A device can advertise to nearby devices its owner’s signature along with the device owner’s privacy preferences. A preference may state, for instance, that the owner wishes to have her face blurred in any photo or video taken by other devices. We will develop an architecture that enables the platform software on mobile devices (Android, iOS, Windows Phone) to enforce these privacy preferences independent of applications. Additionally, based on the same principles, we will research ways to also protect “privacy” of objects.