79 research outputs found

    Building a Collaborative Phone Blacklisting System with Local Differential Privacy

    Full text link
    Spam phone calls have been rapidly growing from nuisance to an increasingly effective scam delivery tool. To counter this increasingly successful attack vector, a number of commercial smartphone apps that promise to block spam phone calls have appeared on app stores, and are now used by hundreds of thousands or even millions of users. However, following a business model similar to some online social network services, these apps often collect call records or other potentially sensitive information from users' phones with little or no formal privacy guarantees. In this paper, we study whether it is possible to build a practical collaborative phone blacklisting system that makes use of local differential privacy (LDP) mechanisms to provide clear privacy guarantees. We analyze the challenges and trade-offs related to using LDP, evaluate our LDP-based system on real-world user-reported call records collected by the FTC, and show that it is possible to learn a phone blacklist using a reasonable overall privacy budget and at the same time preserve users' privacy while maintaining utility for the learned blacklist.Comment: 15 pages, 10 figures, 7 algorithm

    Input Secrecy & Output Privacy: Efficient Secure Computation of Differential Privacy Mechanisms

    Get PDF
    Data is the driving force of modern businesses. For example, customer-generated data is collected by companies to improve their products, discover emerging trends, and provide insights to marketers. However, data might contain personal information which allows to identify a person and violate their privacy. Examples of privacy violations are abundant – such as revealing typical whereabout and habits, financial status, or health information, either directly or indirectly by linking the data to other available data sources. To protect personal data and regulate its collection and processing, the general data protection regulation (GDPR) was adopted by all members of the European Union. Anonymization addresses such regulations and alleviates privacy concerns by altering personal data to hinder identification. Differential privacy (DP), a rigorous privacy notion for anonymization mechanisms, is widely deployed in the industry, e.g., by Google, Apple, and Microsoft. Additionally, cryptographic tools, namely, secure multi-party computation (MPC), protect the data during processing. MPC allows distributed parties to jointly compute a function over their data such that only the function output is revealed but none of the input data. MPC and DP provide orthogonal protection guarantees. MPC provides input secrecy, i.e., MPC protects the inputs of a computation via encrypted processing. DP provides output privacy, i.e., DP anonymizes the output of a computation via randomization. In typical deployments of DP the data is randomized locally, i.e., by each client, and aggregated centrally by a server. MPC allows to apply the randomization centrally as well, i.e., only once, which is optimal for accuracy. Overall, MPC and DP augment each other nicely. However, universal MPC is inefficient – requiring large computation and communication overhead – which makes MPC of DP mechanisms challenging for general real-world deployments. In this thesis, we present efficient MPC protocols for distributed parties to collaboratively compute DP statistics with high accuracy. We support general rank-based statistics, e.g., min, max, median, as well as decomposable aggregate functions, where local evaluations can be efficiently combined to global ones, e.g., for convex optimizations. Furthermore, we detect heavy hitters, i.e., most frequently appearing values, over known as well as unknown data domains. We prove the semi-honest security and differential privacy of our protocols. Also, we theoretically analyse and empirically evaluate their accuracy as well as efficiency. Our protocols provide higher accuracy than comparable solutions based on DP alone. Our protocols are efficient, with running times of seconds to minutes evaluated in real-world WANs between Frankfurt and Ohio (100 ms delay, 100 Mbits/s bandwidth), and have modest hardware requirements compared to related work (mainly, 4 CPU cores at 3.3 GHz and 2 GB RAM per party). Additionally, our protocols can be outsourced, i.e., clients can send encrypted inputs to few servers which run the MPC protocol on their behalf
    • …
    corecore