Security, Privacy, and Transparency Guarantees for Machine Learning Systems
Machine learning (ML) is transforming a wide range of applications, promising to bring immense economic and social benefits. However, it also raises substantial security, privacy, and transparency challenges. ML workloads push companies toward aggressive data collection and loose data access policies, placing troves of sensitive user information at risk if the company is hacked. ML also introduces new attack vectors, such as adversarial example attacks, which can completely nullify models’ accuracy under attack. Finally, ML models make complex data-driven decisions, which are opaque to end users and difficult for programmers to inspect. In this dissertation, we describe three systems we developed. Each system addresses one dimension of these challenges by combining new practical systems techniques with rigorous theory to achieve a guaranteed level of protection and make systems easier to understand. First, we present Sage, a differentially private ML platform that enforces a meaningful protection semantic for the troves of personal information amassed by today’s companies. Second, we describe PixelDP, a defense against adversarial examples that leverages differential privacy theory to provide a guaranteed level of accuracy under attack. Third, we introduce Sunlight, a tool to enhance the transparency of opaque targeting services, using rigorous causal inference theory to explain targeting decisions to end users.
SoK: Differential Privacies
Shortly after it was first introduced in 2006, differential privacy became
the flagship data privacy definition. Since then, numerous variants and
extensions have been proposed to adapt it to different scenarios and attacker
models. In this work, we propose a systematic taxonomy of these variants and
extensions. We list all data privacy definitions based on differential privacy,
and partition them into seven categories, depending on which aspect of the
original definition is modified.
These categories act like dimensions: variants from the same category cannot
be combined, but variants from different categories can be combined to form new
definitions. We also establish a partial ordering of relative strength between
these notions by summarizing existing results. Furthermore, we list which of
these definitions satisfy some desirable properties, like composition,
post-processing, and convexity by either providing a novel proof or collecting
existing ones.
Comment: This is the full version of the SoK paper with the same title,
accepted at PETS (Privacy Enhancing Technologies Symposium) 202
Smart Metering System: Developing New Designs to Improve Privacy and Functionality
This PhD project aims to develop a novel smart metering system that plays a dual role: fulfilling basic functions (metering, billing, management of demand for energy in grids) and protecting households from privacy intrusions whilst allowing them a degree of freedom. The first two chapters of the thesis introduce the research background and give a detailed literature review of state-of-the-art work on protecting smart meter data. Chapter 3 discusses the theoretical foundations of smart meter data analytics, including machine learning, deep learning, and information theory. The rest of the thesis is split into two parts, ‘Privacy’ and ‘Functionality’. In the ‘Privacy’ part, the overall smart metering system and its privacy configurations are presented. A threat/adversary model is developed first. Then a multi-channel smart metering system is designed to reduce the privacy risks posed by the adversary. Each channel of the system is responsible for one functionality and transmits smart meter data at a different granularity. In addition, the privacy boundary of the smart meter data in the proposed system is identified using a data mining algorithm, which yields a three-level privacy boundary. Furthermore, a differentially private, federated learning-based value-added service platform is designed to provide flexible privacy guarantees to consumers and balance the trade-off between privacy loss and service accuracy. In the ‘Functionality’ part, three feeder-level functionalities are evaluated: load forecasting, solar energy separation, and energy disaggregation. These functionalities increase the predictability, visibility, and controllability of the distributed network without using household smart meter data. Finally, the thesis concludes by summarizing the overall system and highlighting the contributions and novelties of this project.
A Brief Introduction to Machine Learning for Engineers
This monograph aims at providing an introduction to key concepts, algorithms,
and theoretical results in machine learning. The treatment concentrates on
probabilistic models for supervised and unsupervised learning problems. It
introduces fundamental concepts and algorithms by building on first principles,
while also exposing the reader to more advanced topics with extensive pointers
to the literature, within a unified notation and mathematical framework. The
material is organized according to clearly defined categories, such as
discriminative and generative models, frequentist and Bayesian approaches,
exact and approximate inference, as well as directed and undirected models.
This monograph is meant as an entry point for researchers with a background in
probability and linear algebra.
Comment: This is an expanded and improved version of the original posting.
Feedback is welcom