Chameleon: A Hybrid Secure Computation Framework for Machine Learning Applications
We present Chameleon, a novel hybrid (mixed-protocol) framework for secure
function evaluation (SFE) which enables two parties to jointly compute a
function without disclosing their private inputs. Chameleon combines the best
aspects of generic SFE protocols with the ones that are based upon additive
secret sharing. In particular, the framework performs linear operations in the
ring Z_{2^l} using additively secret-shared values and nonlinear
operations using Yao's Garbled Circuits or the Goldreich-Micali-Wigderson
protocol. Chameleon departs from the common assumption of additive or linear
secret sharing models where three or more parties need to communicate in the
online phase: the framework allows two parties with private inputs to
communicate in the online phase under the assumption of a third node generating
correlated randomness in an offline phase. Almost all of the heavy
cryptographic operations are precomputed in an offline phase which
substantially reduces the communication overhead. Chameleon is both scalable
and significantly more efficient than the ABY framework (NDSS'15) it is based
on. Our framework supports signed fixed-point numbers. In particular,
Chameleon's vector dot product of signed fixed-point numbers improves the
efficiency of mining and classification of encrypted data for algorithms based
upon heavy matrix multiplications. Our evaluation of Chameleon on a 5 layer
convolutional deep neural network shows 133x and 4.2x faster executions than
Microsoft CryptoNets (ICML'16) and MiniONN (CCS'17), respectively.
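The offline/online split described above can be illustrated with the classic Beaver-triple multiplication on additive shares (a plaintext Python sketch under assumed parameters; the ring modulus, values, and two-party roles are illustrative, not Chameleon's actual implementation):

```python
import secrets

M = 2**32  # modulus of the ring Z_{2^32}; an illustrative choice

def share(x):
    """Split x into two additive shares in Z_M."""
    r = secrets.randbelow(M)
    return r, (x - r) % M

def reconstruct(s0, s1):
    return (s0 + s1) % M

# Offline phase: the third node samples a multiplication triple
# (a, b, c) with c = a*b and gives one share of each to each party.
a, b = secrets.randbelow(M), secrets.randbelow(M)
c = (a * b) % M
a0, a1 = share(a); b0, b1 = share(b); c0, c1 = share(c)

# Online phase: the two parties hold shares of private x and y.
x, y = 7, 12
x0, x1 = share(x); y0, y1 = share(y)

# The parties open only the masked values d = x - a and e = y - b,
# which reveal nothing about x and y since a and b are uniform.
d = reconstruct((x0 - a0) % M, (x1 - a1) % M)
e = reconstruct((y0 - b0) % M, (y1 - b1) % M)

# Local share computation; party 0 additionally adds the public d*e.
z0 = (d * b0 + e * a0 + c0 + d * e) % M
z1 = (d * b1 + e * a1 + c1) % M

assert reconstruct(z0, z1) == (x * y) % M
```

The heavy work (sampling and sharing the triple) happens offline; the online phase costs only two openings and local arithmetic, which is the efficiency pattern the abstract describes.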
ESCAPED: Efficient Secure and Private Dot Product Framework for Kernel-based Machine Learning Algorithms with Applications in Healthcare
To train sophisticated machine learning models one usually needs many
training samples. Especially in healthcare settings these samples can be very
expensive, meaning that one institution alone usually does not have enough on
its own. Merging privacy-sensitive data from different sources is usually
restricted by data security and data protection measures. This can lead to
approaches that reduce data quality by adding noise to the variables (e.g.,
in ε-differential privacy) or omitting certain values (e.g., for
k-anonymity). Other measures based on cryptographic methods can lead to very
time-consuming computations, which is especially problematic for larger
multi-omics data. We address this problem by introducing ESCAPED, which stands
for Efficient SeCure And PrivatE Dot product framework, enabling the
computation of the dot product of vectors from multiple sources on a
third-party, which later trains kernel-based machine learning algorithms, while
neither sacrificing privacy nor adding noise. We evaluated our framework on
drug resistance prediction for HIV-infected people and multi-omics
dimensionality reduction and clustering problems in precision medicine. In
terms of execution time, our framework significantly outperforms the
best-fitting existing approaches without sacrificing the performance of the
algorithm. Even though we only show the benefit for kernel-based algorithms,
our framework can open up new research opportunities for further machine
learning models that require the dot product of vectors from multiple sources.
Comment: AAAI 2021; preprint version of the full paper with supplementary material.
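To illustrate the goal, exact dot products delivered to a third party without added noise, here is a toy zero-sum-masking sketch (this is not ESCAPED's actual protocol; the block partition, mask scheme, and values are illustrative assumptions):

```python
import secrets

M = 2**61 - 1  # prime modulus; illustrative

def zero_sum_masks(n):
    """Pairwise random masks that cancel when all n values are summed."""
    masks = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            r = secrets.randbelow(M)
            masks[i] = (masks[i] + r) % M
            masks[j] = (masks[j] - r) % M
    return masks

# Each source holds one block of the two vectors and computes its
# partial dot product locally (plain integers for clarity).
blocks = [([1, 2], [3, 4]),   # source 0: x-block, y-block
          ([5], [6]),         # source 1
          ([7, 8], [9, 1])]   # source 2
partials = [sum(a * b for a, b in zip(xs, ys)) for xs, ys in blocks]

masks = zero_sum_masks(len(blocks))
reports = [(p + m) % M for p, m in zip(partials, masks)]

# The third party sees only masked values, yet their sum is exact:
# no noise is injected, unlike differential-privacy approaches.
dot = sum(reports) % M
assert dot == sum(partials)  # 1*3 + 2*4 + 5*6 + 7*9 + 8*1 = 112
```

The third party can then assemble such dot products into a kernel matrix and train any kernel-based learner on it without ever seeing the raw feature vectors.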
Outsourced Privacy-Preserving kNN Classifier Model Based on Multi-Key Homomorphic Encryption
Outsourcing the k-Nearest Neighbor (kNN) classifier to the cloud is useful, yet it can lead to serious privacy leakage of the sensitive outsourced data and models. In this paper, we design, implement, and evaluate a new system employing an outsourced privacy-preserving kNN classifier model based on Multi-Key Homomorphic Encryption (kNNCM-MKHE). We first propose a security protocol based on multi-key Brakerski-Gentry-Vaikuntanathan (BGV) encryption for collaborative evaluation of the kNN classifiers provided by multiple model owners. We analyze the operations of kNN and extract its basic operations, such as addition, multiplication, and comparison; the protocol supports computation over data encrypted under different public keys. We further design a scheme that outsources the evaluation to a third-party evaluator who must not have access to the models or the data. In the evaluation process, each model owner encrypts its model and uploads the encrypted model to the evaluator. After receiving the encrypted kNN classifiers and the user's inputs, the evaluator computes the aggregated results: it runs a secure computing protocol to tally the count of each class label and then sends the class labels with their associated counts to the user. Each model owner and the user jointly decrypt the result, and no information is disclosed to the evaluator. The experimental results show that our new system securely allows multiple model owners to delegate the evaluation of kNN classifiers.
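The label-aggregation step the evaluator performs has a simple plaintext analogue (in kNNCM-MKHE these operations run over multi-key BGV ciphertexts; this sketch with hypothetical data only shows the logic being protected):

```python
from collections import Counter
import math

def knn_classify(train, query, k=3):
    """Plaintext analogue of the evaluator's job: find the k nearest
    training points, count their class labels, return label + counts."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    counts = Counter(label for _, label in nearest)
    return counts.most_common(1)[0][0], dict(counts)

# Hypothetical 2-D training data from (say) two model owners.
train = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"), ((0.2, 0.1), "A"),
         ((1.0, 1.0), "B"), ((0.9, 1.1), "B")]
label, counts = knn_classify(train, (0.0, 0.1), k=3)
assert label == "A" and counts == {"A": 3}
```

In the secure version, the distance computation, comparison-based selection, and per-label counting are each realized with the extracted homomorphic operations (addition, multiplication, comparison), so the evaluator never sees the points or labels in the clear.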
Toward practical and private online services
Today's common online services (social networks, media streaming, messaging,
email, etc.) bring convenience. However, these services are susceptible to
privacy leaks. For instance, email snooping by rogue employees, email server
hacks, and accidental disclosures of users' movie ratings are all
sources of private information leakage. This dissertation investigates the
following question: Can we build systems that (a) provide strong privacy
guarantees to the users, (b) are consistent with existing commercial and policy
regimes, and (c) are affordable?
Satisfying all three requirements simultaneously is challenging, as providing
strong privacy guarantees usually necessitates either sacrificing functionality,
incurring high resource costs, or both. Indeed, there are powerful cryptographic
protocols---private information retrieval (PIR), and secure two-party
computation (2PC)---that provide strong guarantees but are orders of magnitude
more expensive than their non-private counterparts. This dissertation takes
these protocols as a starting point and then substantially reduces their costs
by tailoring them using application-specific properties. It presents two
systems, Popcorn and Pretzel, built on this design ethos.
Popcorn is a Netflix-like media delivery system that provably hides, even from
the content distributor (for example, Netflix), which movie a user is watching.
Popcorn tailors PIR protocols to the media domain. It amortizes the server-side
overhead of PIR by batching requests from the large number of concurrent users
retrieving content at any given time; and, it forms large batches without
introducing playback delays by leveraging the properties of media streaming.
Popcorn is consistent with the prevailing commercial regime (copyrights, etc.),
and its per-request dollar cost is 3.87 times that of a non-private system.
The other system described in this dissertation, Pretzel, is an email system
that encrypts emails end-to-end between senders and intended recipients, but
allows the email service provider to perform content-based spam filtering and
targeted advertising. Pretzel refines a 2PC protocol. It reduces the resource
consumption of the protocol by replacing the underlying encryption scheme with a
more efficient one, applying a packing technique to conserve invocations of the
encryption algorithm, and pruning the inputs to the protocol. Pretzel's costs
are estimated at up to 5.4 times those of a legacy non-private implementation
for the email provider, with additional but modest client-side requirements.
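The packing idea, amortizing one expensive encryption over several plaintext values, can be sketched with fixed-width integer slots (an illustrative stand-in; Pretzel's actual packing works inside its additively homomorphic scheme, and the slot width here is an assumption):

```python
SLOT_BITS = 16  # illustrative slot width; must exceed any value's bit length

def pack(values):
    """Pack small non-negative ints into one integer, one per slot."""
    out = 0
    for i, v in enumerate(values):
        assert 0 <= v < (1 << SLOT_BITS), "value overflows its slot"
        out |= v << (i * SLOT_BITS)
    return out

def unpack(packed, n):
    """Recover n slot values from a packed integer."""
    mask = (1 << SLOT_BITS) - 1
    return [(packed >> (i * SLOT_BITS)) & mask for i in range(n)]

vals = [3, 500, 42, 7]
p = pack(vals)
assert unpack(p, 4) == vals
# Encrypting the single integer p instead of four separate values
# cuts the number of (expensive) encryption calls by 4x, which is
# the kind of saving the packing technique above aims at.
```

Slot widths must be chosen so that homomorphic additions cannot overflow a slot into its neighbor, which bounds how many values one ciphertext can carry.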
Popcorn and Pretzel have fundamental connections. For instance, the
cryptographic protocols in both systems securely compute vector-matrix products.
However, we observe that differences in the vector and matrix dimensions lead to
different system designs.
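The shared vector-matrix core has a direct plaintext analogue: a PIR query is essentially a one-hot row vector selecting a database row, while Pretzel-style classification is a dense feature vector scoring a small weight matrix (illustrative data; both systems of course compute this under encryption):

```python
def vec_mat(v, M):
    """Multiply a 1 x n row vector by an n x m matrix, plain Python."""
    return [sum(v[i] * M[i][j] for i in range(len(v)))
            for j in range(len(M[0]))]

# Popcorn/PIR-style: a one-hot query row selects one database record
# without the server learning (in the encrypted version) which one.
db = [[10, 11], [20, 21], [30, 31]]          # 3 records, 2 chunks each
assert vec_mat([0, 1, 0], db) == [20, 21]

# Pretzel-style: a dense feature vector scores a small model matrix
# (e.g., per-class spam/ad weights).
features = [1, 0, 2]
weights = [[0.5, -1.0], [2.0, 0.0], [1.0, 3.0]]
assert vec_mat(features, weights) == [2.5, 5.0]
```

The dimensional contrast is visible even in the toy: PIR multiplies a sparse vector by a huge matrix of large entries, while classification multiplies a dense vector by a small matrix, which is why the two systems end up with different optimizations.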
Ultimately, both systems represent a potentially appealing compromise: sacrifice
some functionality to build in strong privacy properties at affordable costs.
Confidential Boosting with Random Linear Classifiers for Outsourced User-generated Data
User-generated data is crucial to predictive modeling in many applications.
With a web/mobile/wearable interface, a data owner can continuously record data
generated by distributed users and build various predictive models from the
data to improve their operations, services, and revenue. Due to the large size
and evolving nature of user data, data owners may rely on public cloud service
providers (Cloud) for storage and computation scalability. Exposing sensitive
user-generated data and advanced analytic models to Cloud raises privacy
concerns. We present a confidential learning framework, SecureBoost, for data
owners that want to learn predictive models from aggregated user-generated data
but offload the storage and computational burden to Cloud without having to
worry about protecting the sensitive data. SecureBoost allows users to submit
encrypted or randomly masked data to designated Cloud directly. Our framework
utilizes random linear classifiers (RLCs) as the base classifiers in the
boosting framework to dramatically simplify the design of the proposed
confidential boosting protocols, yet still preserve the model quality. A
Cryptographic Service Provider (CSP) is used to assist the Cloud's processing,
reducing the complexity of the protocol constructions. We present two
constructions of SecureBoost: HE+GC and SecSh+GC, using combinations of
homomorphic encryption, garbled circuits, and random masking to achieve both
security and efficiency. For a boosted model, Cloud learns only the RLCs and
the CSP learns only the weights of the RLCs. Finally, the data owner collects
the two parts to get the complete model. We conduct extensive experiments to
understand the quality of the RLC-based boosting and the cost distribution of
the constructions. Our results show that SecureBoost can efficiently learn
high-quality boosting models from protected user-generated data.
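A plaintext sketch of boosting over random linear classifiers conveys why RLCs simplify the protocols: the base learners need no training, only selection among random candidates (hypothetical data and hyperparameters; this is standard AdaBoost in the clear, not the HE+GC or SecSh+GC constructions themselves):

```python
import random, math

random.seed(1)

def rlc(dim):
    """A random linear classifier: fixed random weights, sign output."""
    w = [random.gauss(0, 1) for _ in range(dim)]
    return lambda x: 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1

def boost(X, y, rounds=20, candidates=25):
    n = len(X)
    d = [1.0 / n] * n          # AdaBoost sample weights
    ensemble = []              # (alpha, classifier) pairs
    for _ in range(rounds):
        # No training step: just draw random RLCs and keep the one
        # with the lowest weighted error on the current distribution.
        cands = [rlc(len(X[0])) for _ in range(candidates)]
        def werr(h):
            return sum(di for di, xi, yi in zip(d, X, y) if h(xi) != yi)
        h = min(cands, key=werr)
        err = min(max(werr(h), 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        # Reweight samples toward the ones this round got wrong.
        d = [di * math.exp(-alpha * yi * h(xi))
             for di, xi, yi in zip(d, X, y)]
        s = sum(d); d = [di / s for di in d]
        ensemble.append((alpha, h))
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1

# Toy linearly separable data with labels in {-1, +1}.
X = [[1, 2], [2, 3], [3, 1], [-1, -2], [-2, -1], [-3, -3]]
y = [1, 1, 1, -1, -1, -1]
model = boost(X, y)
assert all(model(xi) == yi for xi, yi in zip(X, y))
```

Because the RLC weights are random rather than learned from the protected data, the secure protocols only need to evaluate and compare weighted errors, mirroring the split in which Cloud sees the RLCs and the CSP sees their weights.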
Efficient Privacy-Aware Imagery Data Analysis
The widespread use of smartphones and camera-coupled Internet of Things (IoT) devices triggers an explosive growth of imagery data. To extract and process the rich contents contained in imagery data, various image analysis techniques have been investigated and applied to a spectrum of application scenarios. In recent years, breakthroughs in deep learning have powered a new revolution for image analysis in terms of effectiveness, at the price of high resource consumption. Given that most smartphones and IoT devices have limited computational capability and battery life, they are not ready to process computationally intensive analytics over the imagery data they collect, especially when deep learning is involved. To resolve this bottleneck of computation, storage, and energy for resource-constrained devices, offloading complex image analysis to public cloud computing platforms has become a promising trend in both academia and industry. However, an outstanding challenge with the public cloud is the protection of sensitive information contained in many imagery data, such as personal identities and financial data. Directly sending imagery data to the public cloud can cause serious privacy concerns and even legal issues.
In this dissertation, I propose a comprehensive privacy-preserving imagery data analysis framework that can be integrated into different application scenarios to assist image analysis for resource-constrained devices with efficiency, accuracy, and privacy protection. I first identify security challenges in the utilization of the public cloud for image analysis. Then, I design and develop a set of novel solutions to address these challenges. These solutions feature strong privacy guarantees, lightweight computation, and low accuracy loss compared with image analysis without privacy protection. To optimize the communication overhead and resource utilization of cloud computing, I investigate edge computing, a promising technique to ameliorate the high communication overhead of cloud-assisted architectures. Furthermore, to boost the performance of my solutions under both cloud and edge deployment, I also provide a set of pluggable enhancement modules that can be applied to meet the requirements of various tasks. By exploiting the features of edge computing and cloud computing, I flexibly incorporate them into a comprehensive framework that provides privacy-preserving image analysis services.
Data Service Outsourcing and Privacy Protection in Mobile Internet
Mobile Internet data have the characteristics of large scale, variety of patterns, and complex association. On the one hand, they require efficient data processing models to support data services; on the other hand, they require certain computing resources to provide data security services. Because of their limited resources, mobile terminals cannot complete large-scale data computation and storage on their own, yet outsourcing to third parties may put user privacy at risk. This monograph focuses on key technologies of data service outsourcing and privacy protection, including existing methods of data analysis and processing, fine-grained data access control through effective user privacy protection mechanisms, and data sharing in the mobile Internet.