2,311 research outputs found
Cloud-based Quadratic Optimization with Partially Homomorphic Encryption
The development of large-scale distributed control systems has led to the
outsourcing of costly computations to cloud-computing platforms, as well as to
concerns about privacy of the collected sensitive data. This paper develops a
cloud-based protocol for a quadratic optimization problem involving multiple
parties, each holding information it seeks to maintain private. The protocol is
based on the projected gradient ascent on the Lagrange dual problem and
exploits partially homomorphic encryption and secure multi-party computation
techniques. Using formal cryptographic definitions of indistinguishability, the
protocol is shown to achieve computational privacy, i.e., there is no
computationally efficient algorithm that any involved party can employ to
obtain private information beyond what can be inferred from the party's inputs
and outputs only. In order to reduce the communication complexity of the
proposed protocol, we introduced a variant that achieves this objective at the
expense of weaker privacy guarantees. We discuss in detail the computational
and communication complexity properties of both algorithms theoretically and
also through implementations. We conclude the paper with a discussion on
computational privacy and other notions of privacy such as the non-unique
retrieval of the private information from the protocol outputs
Exploring Privacy Preservation in Outsourced K-Nearest Neighbors with Multiple Data Owners
The k-nearest neighbors (k-NN) algorithm is a popular and effective
classification algorithm. Due to its large storage and computational
requirements, it is suitable for cloud outsourcing. However, k-NN is often run
on sensitive data such as medical records, user images, or personal
information. It is important to protect the privacy of data in an outsourced
k-NN system.
Prior works have all assumed the data owners (who submit data to the
outsourced k-NN system) are a single trusted party. However, we observe that in
many practical scenarios, there may be multiple mutually distrusting data
owners. In this work, we present the first framing and exploration of privacy
preservation in an outsourced k-NN system with multiple data owners. We
consider the various threat models introduced by this modification. We discover
that under a particularly practical threat model that covers numerous
scenarios, there exists a set of adaptive attacks that breach the data privacy
of any exact k-NN system. The vulnerability is a result of the mathematical
properties of k-NN and its output. Thus, we propose a privacy-preserving
alternative system supporting kernel density estimation using a Gaussian
kernel, a classification algorithm from the same family as k-NN. In many
applications, this similar algorithm serves as a good substitute for k-NN. We
additionally investigate solutions for other threat models, often through
extensions on prior single data owner systems
Supporting Regularized Logistic Regression Privately and Efficiently
As one of the most popular statistical and machine learning models, logistic
regression with regularization has found wide adoption in biomedicine, social
sciences, information technology, and so on. These domains often involve data
of human subjects that are contingent upon strict privacy regulations.
Increasing concerns over data privacy make it more and more difficult to
coordinate and conduct large-scale collaborative studies, which typically rely
on cross-institution data sharing and joint analysis. Our work here focuses on
safeguarding regularized logistic regression, a widely-used machine learning
model in various disciplines while at the same time has not been investigated
from a data security and privacy perspective. We consider a common use scenario
of multi-institution collaborative studies, such as in the form of research
consortia or networks as widely seen in genetics, epidemiology, social
sciences, etc. To make our privacy-enhancing solution practical, we demonstrate
a non-conventional and computationally efficient method leveraging distributing
computing and strong cryptography to provide comprehensive protection over
individual-level and summary data. Extensive empirical evaluation on several
studies validated the privacy guarantees, efficiency and scalability of our
proposal. We also discuss the practical implications of our solution for
large-scale studies and applications from various disciplines, including
genetic and biomedical studies, smart grid, network analysis, etc
- …