1,241 research outputs found
Practical Privacy-Preserving Multiparty Linear Programming Based on Problem Transformation
International audienceCryptographic solutions to privacy-preserving multi-party linear programming are slow. This makes them unsuitable for many economically important applications, such as supply chain optimization, whose size exceeds their practically feasible input range. In this paper we present a privacy-preserving transformation that allows secure outsourcing of the linear program computation in an efficient manner. We evaluate security by quantifying the leakage about the input after the transformation and present implementation results. Using this transformation, we can mostly replace the costly cryptographic operations and securely solve problems several orders of magnitude larger
Scather: programming with multi-party computation and MapReduce
We present a prototype of a distributed computational infrastructure, an associated high level programming language, and an underlying formal framework that allow multiple parties to leverage their own cloud-based computational resources (capable of supporting MapReduce [27] operations) in concert with multi-party computation (MPC) to execute statistical analysis algorithms that have privacy-preserving properties. Our architecture allows a data analyst unfamiliar with MPC to: (1) author an analysis algorithm that is agnostic with regard to data privacy policies, (2) to use an automated process to derive algorithm implementation variants that have different privacy and performance properties, and (3) to compile those implementation variants so that they can be deployed on an infrastructures that allows computations to take place locally within each participant’s MapReduce cluster as well as across all the participants’ clusters using an MPC protocol. We describe implementation details of the architecture, discuss and demonstrate how the formal framework enables the exploration of tradeoffs between the efficiency and privacy properties of an analysis algorithm, and present two example applications that illustrate how such an infrastructure can be utilized in practice.This work was supported in part by NSF Grants: #1430145, #1414119, #1347522, and #1012798
A Hybrid Approach to Privacy-Preserving Federated Learning
Federated learning facilitates the collaborative training of models without
the sharing of raw data. However, recent attacks demonstrate that simply
maintaining data locality during training processes does not provide sufficient
privacy guarantees. Rather, we need a federated learning system capable of
preventing inference over both the messages exchanged during training and the
final trained model while ensuring the resulting model also has acceptable
predictive accuracy. Existing federated learning approaches either use secure
multiparty computation (SMC) which is vulnerable to inference or differential
privacy which can lead to low accuracy given a large number of parties with
relatively small amounts of data each. In this paper, we present an alternative
approach that utilizes both differential privacy and SMC to balance these
trade-offs. Combining differential privacy with secure multiparty computation
enables us to reduce the growth of noise injection as the number of parties
increases without sacrificing privacy while maintaining a pre-defined rate of
trust. Our system is therefore a scalable approach that protects against
inference threats and produces models with high accuracy. Additionally, our
system can be used to train a variety of machine learning models, which we
validate with experimental results on 3 different machine learning algorithms.
Our experiments demonstrate that our approach out-performs state of the art
solutions
Privacy preserving distributed optimization using homomorphic encryption
This paper studies how a system operator and a set of agents securely execute
a distributed projected gradient-based algorithm. In particular, each
participant holds a set of problem coefficients and/or states whose values are
private to the data owner. The concerned problem raises two questions: how to
securely compute given functions; and which functions should be computed in the
first place. For the first question, by using the techniques of homomorphic
encryption, we propose novel algorithms which can achieve secure multiparty
computation with perfect correctness. For the second question, we identify a
class of functions which can be securely computed. The correctness and
computational efficiency of the proposed algorithms are verified by two case
studies of power systems, one on a demand response problem and the other on an
optimal power flow problem.Comment: 24 pages, 5 figures, journa
Towards Improved Homomorphic Encryption for Privacy-Preserving Deep Learning
Mención Internacional en el tÃtulo de doctorDeep Learning (DL) has supposed a remarkable transformation for many fields, heralded
by some as a new technological revolution. The advent of large scale models has increased
the demands for data and computing platforms, for which cloud computing has become
the go-to solution. However, the permeability of DL and cloud computing are reduced
in privacy-enforcing areas that deal with sensitive data. These areas imperatively call for
privacy-enhancing technologies that enable responsible, ethical, and privacy-compliant
use of data in potentially hostile environments.
To this end, the cryptography community has addressed these concerns with what
is known as Privacy-Preserving Computation Techniques (PPCTs), a set of tools that
enable privacy-enhancing protocols where cleartext access to information is no longer
tenable. Of these techniques, Homomorphic Encryption (HE) stands out for its ability
to perform operations over encrypted data without compromising data confidentiality or
privacy. However, despite its promise, HE is still a relatively nascent solution with efficiency
and usability limitations. Improving the efficiency of HE has been a longstanding
challenge in the field of cryptography, and with improvements, the complexity of the
techniques has increased, especially for non-experts.
In this thesis, we address the problem of the complexity of HE when applied to DL.
We begin by systematizing existing knowledge in the field through an in-depth analysis
of state-of-the-art for privacy-preserving deep learning, identifying key trends, research
gaps, and issues associated with current approaches. One such identified gap lies in the
necessity for using vectorized algorithms with Packed Homomorphic Encryption (PaHE),
a state-of-the-art technique to reduce the overhead of HE in complex areas. This thesis
comprehensively analyzes existing algorithms and proposes new ones for using DL with
PaHE, presenting a formal analysis and usage guidelines for their implementation.
Parameter selection of HE schemes is another recurring challenge in the literature,
given that it plays a critical role in determining not only the security of the instantiation
but also the precision, performance, and degree of security of the scheme. To address
this challenge, this thesis proposes a novel system combining fuzzy logic with linear
programming tasks to produce secure parametrizations based on high-level user input
arguments without requiring low-level knowledge of the underlying primitives.
Finally, this thesis describes HEFactory, a symbolic execution compiler designed to
streamline the process of producing HE code and integrating it with Python. HEFactory
implements the previous proposals presented in this thesis in an easy-to-use tool. It provides
a unique architecture that layers the challenges associated with HE and produces
simplified operations interpretable by low-level HE libraries. HEFactory significantly reduces
the overall complexity to code DL applications using HE, resulting in an 80% length
reduction from expert-written code while maintaining equivalent accuracy and efficiency.El aprendizaje profundo ha supuesto una notable transformación para muchos campos
que algunos han calificado como una nueva revolución tecnológica. La aparición de modelos
masivos ha aumentado la demanda de datos y plataformas informáticas, para lo cual,
la computación en la nube se ha convertido en la solución a la que recurrir. Sin embargo,
la permeabilidad del aprendizaje profundo y la computación en la nube se reduce en los
ámbitos de la privacidad que manejan con datos sensibles. Estas áreas exigen imperativamente
el uso de tecnologÃas de mejora de la privacidad que permitan un uso responsable,
ético y respetuoso con la privacidad de los datos en entornos potencialmente hostiles.
Con este fin, la comunidad criptográfica ha abordado estas preocupaciones con las
denominadas técnicas de la preservación de la privacidad en el cómputo, un conjunto de
herramientas que permiten protocolos de mejora de la privacidad donde el acceso a la información
en texto claro ya no es sostenible. Entre estas técnicas, el cifrado homomórfico
destaca por su capacidad para realizar operaciones sobre datos cifrados sin comprometer
la confidencialidad o privacidad de la información. Sin embargo, a pesar de lo prometedor
de esta técnica, sigue siendo una solución relativamente incipiente con limitaciones
de eficiencia y usabilidad. La mejora de la eficiencia del cifrado homomórfico en la
criptografÃa ha sido todo un reto, y, con las mejoras, la complejidad de las técnicas ha
aumentado, especialmente para los usuarios no expertos.
En esta tesis, abordamos el problema de la complejidad del cifrado homomórfico
cuando se aplica al aprendizaje profundo. Comenzamos sistematizando el conocimiento
existente en el campo a través de un análisis exhaustivo del estado del arte para el aprendizaje
profundo que preserva la privacidad, identificando las tendencias clave, las lagunas
de investigación y los problemas asociados con los enfoques actuales. Una de las
lagunas identificadas radica en el uso de algoritmos vectorizados con cifrado homomórfico
empaquetado, que es una técnica del estado del arte que reduce el coste del cifrado
homomórfico en áreas complejas. Esta tesis analiza exhaustivamente los algoritmos existentes
y propone nuevos algoritmos para el uso de aprendizaje profundo utilizando cifrado
homomórfico empaquetado, presentando un análisis formal y unas pautas de uso para su
implementación.
La selección de parámetros de los esquemas del cifrado homomórfico es otro reto recurrente
en la literatura, dado que juega un papel crÃtico a la hora de determinar no sólo la
seguridad de la instanciación, sino también la precisión, el rendimiento y el grado de seguridad del esquema. Para abordar este reto, esta tesis propone un sistema innovador que
combina la lógica difusa con tareas de programación lineal para producir parametrizaciones
seguras basadas en argumentos de entrada de alto nivel sin requerir conocimientos
de bajo nivel de las primitivas subyacentes.
Por último, esta tesis propone HEFactory, un compilador de ejecución simbólica diseñado
para agilizar el proceso de producción de código de cifrado homomórfico e integrarlo
con Python. HEFactory es la culminación de las propuestas presentadas en esta
tesis, proporcionando una arquitectura única que estratifica los retos asociados con el
cifrado homomórfico, produciendo operaciones simplificadas que pueden ser interpretadas
por bibliotecas de bajo nivel. Este enfoque permite a HEFactory reducir significativamente
la longitud total del código, lo que supone una reducción del 80% en la
complejidad de programación de aplicaciones de aprendizaje profundo que usan cifrado
homomórfico en comparación con el código escrito por expertos, manteniendo una precisión
equivalente.Programa de Doctorado en Ciencia y TecnologÃa Informática por la Universidad Carlos III de MadridPresidenta: MarÃa Isabel González Vasco.- Secretario: David Arroyo Guardeño.- Vocal: Antonis Michala
Geo-tagging and privacy-preservation in mobile cloud computing
With the emerge of the cloud computing service and the explosive growth of the mobile devices and applications, mobile computing technologies and cloud computing technologies have been drawing significant attentions. Mobile cloud computing, with the synergy between the cloud and mobile technologies, has brought us new opportunities to develop novel and practical systems such as mobile multimedia systems and cloud systems that provide collaborative data-mining services for data from disparate owners (e.g., mobile users). However, it also creates new challenges, e.g., the algorithms deployed in the computationally weak mobile device require higher efficiency, and introduces new problems such as the privacy concern when the private data is shared in the cloud for collaborative data-mining. The main objectives of this dissertation are: 1. to develop practical systems based on the unique features of mobile devices (i.e., all-in-one computing platform and sensors) and the powerful computing capability of the cloud; 2. to propose solutions protecting the data privacy when the data from disparate owners are shared in the cloud for collaborative data-mining. We first propose a mobile geo-tagging system. It is a novel, accurate and efficient image and video based remote target localization and tracking system using the Android smartphone. To cope with the smartphones' computational limitation, we design light-weight image/video processing algorithms to achieve a good balance between estimation accuracy and computational complexity. Our system is first of its kind and we provide first hand real-world experimental results, which demonstrate that our system is feasible and practicable. To address the privacy concern when data from disparate owners are shared in the cloud for collaborative data-mining, we then propose a generic compressive sensing (CS) based secure multiparty computation (MPC) framework for privacy-preserving collaborative data-mining in which data mining is performed in the CS domain. We perform the CS transformation and reconstruction processes with MPC protocols. We modify the original orthogonal matching pursuit algorithm and develop new MPC protocols so that the CS reconstruction process can be implemented using MPC. Our analysis and experimental results show that our generic framework is capable of enabling privacy preserving collaborative data-mining. The proposed framework can be applied to many privacy preserving collaborative data-mining and signal processing applications in the cloud. We identify an application scenario that requires simultaneously performing secure watermark detection and privacy preserving multimedia data storage. We further propose a privacy preserving storage and secure watermark detection framework by adopting our generic framework to address such a requirement. In our secure watermark detection framework, the multimedia data and secret watermark pattern are presented to the cloud for secure watermark detection in a compressive sensing domain to protect the privacy. We also give mathematical and statistical analysis to derive the expected watermark detection performance in the compressive sensing domain, based on the target image, watermark pattern and the size of the compressive sensing matrix (but without the actual CS matrix), which means that the watermark detection performance in the CS domain can be estimated during the watermark embedding process. The correctness of the derived performance has been validated by our experiments. Our theoretical analysis and experimental results show that secure watermark detection in the compressive sensing domain is feasible. By taking advantage of our mobile geo-tagging system and compressive sensing based privacy preserving data-mining framework, we develop a mobile privacy preserving collaborative filtering system. In our system, mobile users can share their personal data with each other in the cloud and get daily activity recommendations based on the data-mining results generated by the cloud, without leaking the privacy and secrecy of the data to other parties. Experimental results demonstrate that the proposed system is effective in enabling efficient mobile privacy preserving collaborative filtering services.Includes bibliographical references (pages 126-133)
Privacy-Preserving Cloud-Assisted Data Analytics
Nowadays industries are collecting a massive and exponentially growing amount of data that can be utilized to extract useful insights for improving various aspects of our life. Data analytics (e.g., via the use of machine learning) has been extensively applied to make important decisions in various real world applications. However, it is challenging for resource-limited clients to analyze their data in an efficient way when its scale is large. Additionally, the data resources are increasingly distributed among different owners. Nonetheless, users\u27 data may contain private information that needs to be protected.
Cloud computing has become more and more popular in both academia and industry communities. By pooling infrastructure and servers together, it can offer virtually unlimited resources easily accessible via the Internet. Various services could be provided by cloud platforms including machine learning and data analytics.
The goal of this dissertation is to develop privacy-preserving cloud-assisted data analytics solutions to address the aforementioned challenges, leveraging the powerful and easy-to-access cloud. In particular, we propose the following systems.
To address the problem of limited computation power at user and the need of privacy protection in data analytics, we consider geometric programming (GP) in data analytics, and design a secure, efficient, and verifiable outsourcing protocol for GP. Our protocol consists of a transform scheme that converts GP to DGP, a transform scheme with computationally indistinguishability, and an efficient scheme to solve the transformed DGP at the cloud side with result verification. Evaluation results show that the proposed secure outsourcing protocol can achieve significant time savings for users.
To address the problem of limited data at individual users, we propose two distributed learning systems such that users can collaboratively train machine learning models without losing privacy. The first one is a differentially private framework to train logistic regression models with distributed data sources. We employ the relevance between input data features and the model output to significantly improve the learning accuracy. Moreover, we adopt an evaluation data set at the cloud side to suppress low-quality data sources and propose a differentially private mechanism to protect user\u27s data quality privacy. Experimental results show that the proposed framework can achieve high utility with low quality data, and strong privacy guarantee.
The second one is an efficient privacy-preserving federated learning system that enables multiple edge users to collaboratively train their models without revealing dataset. To reduce the communication overhead, we select well-aligned and large-enough magnitude gradients for uploading which leads to quick convergence. To minimize the noise added and improve model utility, each user only adds a small amount of noise to his selected gradients, encrypts the noise gradients before uploading, and the cloud server will only get the aggregate gradients that contain enough noise to achieve differential privacy. Evaluation results show that the proposed system can achieve high accuracy, low communication overhead, and strong privacy guarantee.
In future work, we plan to design a privacy-preserving data analytics with fair exchange, which ensures the payment fairness. We will also consider designing distributed learning systems with heterogeneous architectures
- …