Differentially Private Bayesian Programming
We present PrivInfer, an expressive framework for writing and verifying
differentially private Bayesian machine learning algorithms. Programs in
PrivInfer are written in a rich functional probabilistic programming language
with constructs for performing Bayesian inference. Then, differential privacy
of programs is established using a relational refinement type system, in which
refinements on probability types are indexed by a metric on distributions. Our
framework leverages recent developments in Bayesian inference, probabilistic
programming languages, and relational refinement types. We demonstrate the
expressiveness of PrivInfer by verifying privacy for several examples of
private Bayesian inference.
Formal verification of higher-order probabilistic programs
Probabilistic programming provides a convenient lingua franca for writing
succinct and rigorous descriptions of probabilistic models and inference tasks.
Several probabilistic programming languages, including Anglican, Church or
Hakaru, derive their expressiveness from a powerful combination of continuous
distributions, conditioning, and higher-order functions. Although very
important for practical applications, these combined features raise fundamental
challenges for program semantics and verification. Several recent works offer
promising answers to these challenges, but their primary focus is on semantic
issues.
In this paper, we take a step further and develop a set of program logics,
named PPV, for proving properties of programs written in an expressive
probabilistic higher-order language with continuous distributions and operators
for conditioning distributions by real-valued functions. Pleasingly, our
program logics retain the comfortable reasoning style of informal proofs thanks
to carefully selected axiomatizations of key results from probability theory.
The versatility of our logics is illustrated through the formal verification of
several intricate examples from statistics, probabilistic inference, and
machine learning. We further show the expressiveness of our logics by giving
sound embeddings of existing logics. In particular, we do this in a parametric
way by showing how the semantic idea of (unary and relational) ⊤⊤-lifting can
be internalized in our logics. The soundness of PPV follows by interpreting
programs and assertions in quasi-Borel spaces (QBS), a recently proposed
variant of Borel spaces with a good structure for interpreting higher-order
probabilistic programs.
National Telecommunications and Information Administration: Comments from Researchers at Boston University and the University of Chicago
These comments were composed by an interdisciplinary group of legal, computer science, and data science faculty and researchers at Boston University and the University of Chicago. This group collaborates on research projects that grapple with the legal, policy, and ethical implications of the use of algorithms and digital innovation in general, and more specifically regarding the use of online platforms, machine learning algorithms for classification, prediction, and decision making, and generative AI. Specific areas of expertise include the functionality and impact of recommendation systems; the development of Privacy Enhancing Technologies (PETs) and their relationship to privacy and data security laws; legal regulation of platforms under privacy, intellectual property, and antitrust laws; the science of monitoring and measuring the behavior of large deployed systems and networks; and programming languages and the science of rigorously specifying and verifying properties of algorithms and information systems.
Language-Based Differential Privacy with Accuracy Estimations and Sensitivity Analyses
This thesis focuses on the development of programming frameworks to enforce, by construction, desirable properties of software systems. Particularly, we are interested in enforcing differential privacy -- a mathematical notion of data privacy -- while statically reasoning about the accuracy of computations, along with deriving the sensitivity of arbitrary functions to further strengthen the expressiveness of these systems. To this end, we first introduce DPella, a programming framework for differentially-private queries that allows reasoning about the privacy and accuracy of data analyses. DPella provides a novel component that statically tracks the accuracy of different queries. This component leverages taint analysis to infer statistical independence of the different noises that were added to ensure the privacy of the overall computation. As a result, DPella allows analysts to implement privacy-preserving queries and adjust the privacy parameters to meet accuracy targets or vice versa.
In the context of differentially-private systems, the sensitivity of a function determines the amount of noise needed to achieve a desired level of privacy. However, establishing the sensitivity of arbitrary functions is non-trivial. Consequently, systems such as DPella provide a limited set of functions -- whose sensitivity is known -- to apply over sensitive data, thus hindering the expressiveness of the language. To overcome this limitation, we propose a new approach to derive proofs of sensitivity in programming languages with support for polymorphism. Our approach enriches base types with information about the metric relation between values and applies parametricity to derive a proof of a function's sensitivity. These ideas are formalized in a sound calculus and implemented as a Haskell library called Spar, enabling programmers to prove the sensitivity of their functions through type-checking alone.
Overall, this thesis contributes to the development of expressive programming frameworks for data analysis with privacy and accuracy guarantees. The proposed approaches are feasible and effective, as demonstrated through the implementation of DPella and Spar.
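The link between sensitivity, noise, and accuracy described in this abstract can be illustrated with the standard Laplace mechanism. The following is a minimal Python sketch, not DPella's or Spar's API (both are Haskell libraries). All function names here are hypothetical, and the counting query with sensitivity 1 is an assumed example. It relies on the textbook fact that Laplace noise of scale b exceeds t in absolute value with probability exp(-t/b).

```python
import math
import random


def laplace_scale(sensitivity, epsilon):
    """Scale b of the Laplace noise for an epsilon-DP release of a query."""
    if sensitivity < 0 or epsilon <= 0:
        raise ValueError("need sensitivity >= 0 and epsilon > 0")
    return sensitivity / epsilon


def sample_laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    # expovariate takes a rate, so rate 1/scale gives mean `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def error_bound(sensitivity, epsilon, beta):
    """alpha such that |noise| <= alpha with probability at least 1 - beta."""
    return laplace_scale(sensitivity, epsilon) * math.log(1.0 / beta)


def epsilon_for_target(sensitivity, alpha, beta):
    """Smallest epsilon whose error bound at confidence 1 - beta equals alpha."""
    return sensitivity * math.log(1.0 / beta) / alpha


def noisy_count(records, predicate, epsilon, rng):
    """Counting query; sensitivity 1 is assumed (add/remove-one adjacency)."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + sample_laplace(laplace_scale(1.0, epsilon), rng)


if __name__ == "__main__":
    rng = random.Random(0)
    ages = [23, 35, 41, 58, 62, 19]  # made-up illustration data
    print(noisy_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng))
    print(error_bound(1.0, 0.5, 0.05))
```

Here `error_bound` goes from a privacy parameter to an accuracy statement, and `epsilon_for_target` goes the other way, mirroring the "accuracy targets or vice versa" trade-off in the abstract. A larger sensitivity increases the noise scale proportionally, which is why a proof of sensitivity is needed before noise can be calibrated.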
Coupled Relational Symbolic Execution for Differential Privacy
Differential privacy is a de facto standard in data privacy with applications
in the private and public sectors. Most of the techniques that achieve
differential privacy are based on a judicious use of randomness. However,
reasoning about randomized programs is difficult and error prone. For this
reason, several techniques have recently been proposed to support designers in
proving programs differentially private or in finding violations of it. In this
work we propose a technique based on symbolic execution for reasoning about
differential privacy. Symbolic execution is a classic technique used for
testing, counterexample generation, and proving the absence of bugs. Here we use
symbolic execution to support these tasks specifically for differential
privacy. To achieve this goal, we leverage two ideas that have already
proven useful in formal reasoning about differential privacy: relational
reasoning and probabilistic coupling. Our technique integrates these two ideas
and shows how such a combination can be used to both verify differential privacy
and find violations of it.
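To make the relational view concrete, the sketch below compares the output distributions of one mechanism on two adjacent inputs and reports an output that breaks the epsilon bound, if any. It uses brute-force enumeration over a finite output space, not the coupled symbolic execution proposed in the paper. The one-bit randomized response mechanism and all function names are assumptions chosen for illustration.

```python
import math


def randomized_response(bit, p_truth):
    """Exact output distribution: report `bit` truthfully with prob. p_truth."""
    return {bit: p_truth, 1 - bit: 1.0 - p_truth}


def find_violation(dist_a, dist_b, epsilon):
    """Return an output where one probability exceeds e^epsilon times the
    other, or None if the pair of distributions satisfies the bound."""
    bound = math.exp(epsilon)
    for outcome in sorted(set(dist_a) | set(dist_b)):
        pa = dist_a.get(outcome, 0.0)
        pb = dist_b.get(outcome, 0.0)
        # The bound is symmetric, so check both directions of the pair.
        if pa > bound * pb or pb > bound * pa:
            return outcome
    return None


if __name__ == "__main__":
    # The two adjacent inputs are the two possible values of the secret bit.
    on_zero = randomized_response(0, p_truth=0.75)
    on_one = randomized_response(1, p_truth=0.75)
    print(find_violation(on_zero, on_one, epsilon=1.2))
    print(find_violation(on_zero, on_one, epsilon=1.0))
```

With `p_truth=0.75` the probability ratio between the two runs is 3, so the check passes for any epsilon above ln 3 and returns a witness output below it. Enumeration like this only works for small finite output spaces, which is one reason more powerful techniques are needed for realistic programs.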