Exploring Privacy Preservation in Outsourced K-Nearest Neighbors with Multiple Data Owners
The k-nearest neighbors (k-NN) algorithm is a popular and effective
classification algorithm. Because of its large storage and computational requirements, it is a natural candidate for cloud outsourcing. However, k-NN is often run
on sensitive data such as medical records, user images, or personal
information. It is important to protect the privacy of data in an outsourced
k-NN system.
Prior works have all assumed the data owners (who submit data to the
outsourced k-NN system) are a single trusted party. However, we observe that in
many practical scenarios, there may be multiple mutually distrusting data
owners. In this work, we present the first framing and exploration of privacy
preservation in an outsourced k-NN system with multiple data owners. We
consider the various threat models introduced by this modification. We discover
that under a particularly practical threat model that covers numerous
scenarios, there exists a set of adaptive attacks that breach the data privacy
of any exact k-NN system. The vulnerability is a result of the mathematical
properties of k-NN and its output. Thus, we propose a privacy-preserving
alternative system supporting kernel density estimation using a Gaussian
kernel, a classification algorithm from the same family as k-NN. In many
applications, this similar algorithm serves as a good substitute for k-NN. We
additionally investigate solutions for other threat models, often through
extensions of prior single-data-owner systems.
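
For intuition, the following is a minimal cleartext sketch of the kernel-density-estimation classifier family the abstract proposes as a k-NN substitute (assuming NumPy; the Gaussian bandwidth and toy data are illustrative). It deliberately omits the privacy-preserving machinery that is the paper's actual contribution.

    import numpy as np

    def gaussian_kde_classify(X_train, y_train, x_query, bandwidth=1.0):
        """Assign x_query to the class whose Gaussian KDE scores highest."""
        scores = {}
        for label in np.unique(y_train):
            pts = X_train[y_train == label]
            # Sum of Gaussian kernels centred on each training point of the class.
            sq_dists = np.sum((pts - x_query) ** 2, axis=1)
            scores[label] = np.sum(np.exp(-sq_dists / (2 * bandwidth ** 2)))
        return max(scores, key=scores.get)

    # Toy usage: two well-separated 2-D clusters.
    X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
    y = np.array([0, 0, 1, 1])
    print(gaussian_kde_classify(X, y, np.array([0.3, 0.1])))  # prints 0
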
Should We Learn Probabilistic Models for Model Checking? A New Approach and An Empirical Study
Many automated system analysis techniques (e.g., model checking, model-based
testing) rely on first obtaining a model of the system under analysis. System
modeling is often done manually, which is widely considered a hindrance to adopting model-based system analysis and development techniques. To overcome this
problem, researchers have proposed to automatically "learn" models based on
sample system executions, and have shown that the learned models can sometimes be useful. There are, however, many questions to be answered. For instance, how much should we generalize from the observed samples, and how fast does learning converge? Or, would the analysis result based on the learned model be more
accurate than the estimation we could have obtained by sampling many system
executions within the same amount of time? In this work, we investigate
existing algorithms for learning probabilistic models for model checking,
propose an evolution-based approach for better controlling the degree of
generalization and conduct an empirical study in order to answer the questions.
One of our findings is that the effectiveness of learning may sometimes be
limited.
Comment: 15 pages, plus 2 reference pages; accepted by FASE 2017 in ETAPS.
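
For reference, a standard frequentist way to "learn" a probabilistic model from sample executions is to estimate a discrete-time Markov chain by frequency counting; in the sketch below (plain Python, illustrative traces) a smoothing constant stands in, very crudely, for the degree of generalization that the paper controls with its evolution-based approach.

    from collections import Counter, defaultdict

    def learn_dtmc(traces, smoothing=0.0):
        """Estimate transition probabilities model[s][t] from observed traces."""
        counts = defaultdict(Counter)
        states = set()
        for trace in traces:
            states.update(trace)
            for s, t in zip(trace, trace[1:]):
                counts[s][t] += 1
        model = {}
        for s in states:
            total = sum(counts[s].values()) + smoothing * len(states)
            if total == 0:
                continue  # state never observed leaving; no estimate without smoothing
            model[s] = {t: (counts[s][t] + smoothing) / total for t in states}
        return model

    traces = [["init", "busy", "done"], ["init", "busy", "busy", "done"]]
    # Larger smoothing spreads mass onto unobserved transitions (more generalization).
    print(learn_dtmc(traces, smoothing=0.1)["busy"])
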
Value Iteration for Long-run Average Reward in Markov Decision Processes
Markov decision processes (MDPs) are standard models for probabilistic
systems with non-deterministic behaviours. Long-run average rewards provide a
mathematically elegant formalism for expressing long-term performance. Value
iteration (VI) is one of the simplest and most efficient algorithmic approaches
to MDPs with other properties, such as reachability objectives. Unfortunately,
a naive extension of VI does not work for MDPs with long-run average rewards,
as there is no known stopping criterion. In this work our contributions are
threefold. (1) We refute a conjecture related to stopping criteria for MDPs
with long-run average rewards. (2) We present two practical algorithms for MDPs
with long-run average rewards based on VI. First, we show that a combination of
applying VI locally for each maximal end-component (MEC) and VI for
reachability objectives can provide approximation guarantees. Second, extending
the above approach with a simulation-guided on-demand variant of VI, we present
an anytime algorithm that is able to deal with very large models. (3) Finally,
we present experimental results showing that our methods significantly
outperform the standard approaches on several benchmarks.
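
For concreteness, here is a minimal sketch of plain value iteration for maximal reachability probabilities, the building block that the paper combines with per-MEC computation (toy MDP and tolerance are illustrative; note that the naive small-change stopping test used here is precisely what fails to give guarantees in the long-run average reward setting).

    # mdp[state][action] = list of (successor, probability) pairs.
    mdp = {
        "s0": {"a": [("s0", 0.5), ("goal", 0.5)], "b": [("sink", 1.0)]},
        "goal": {"stay": [("goal", 1.0)]},
        "sink": {"stay": [("sink", 1.0)]},
    }

    def reachability_vi(mdp, target, eps=1e-8):
        """Iterate Bellman updates for the max probability of reaching target."""
        v = {s: 1.0 if s == target else 0.0 for s in mdp}
        while True:
            v_new = {
                s: 1.0 if s == target else max(
                    sum(p * v[t] for t, p in succ) for succ in mdp[s].values()
                )
                for s in mdp
            }
            if max(abs(v_new[s] - v[s]) for s in mdp) < eps:
                return v_new
            v = v_new

    print(reachability_vi(mdp, "goal")["s0"])  # approaches 1.0 via action "a"
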
Autonomous Highway Systems Safety and Security
Automated vehicles are getting closer each day to large-scale deployment. It is expected that self-driving cars will be able to alleviate traffic congestion by safely operating at closer distances than human drivers are capable of, improving overall traffic throughput. In these conditions, passenger safety and security are of utmost importance.
When multiple autonomous cars follow each other on a highway, they will form what is known as a cyber-physical system. In a general setting, there are tools to assess the level of influence a possible attacker can have on such a system, which in turn characterizes its level of safety and security. An attacker might attempt to counter the benefits of automation by causing collisions and/or decreasing highway throughput.
These strings (platoons) of automated vehicles will rely on control algorithms to maintain required distances from other cars and objects around them. The vehicle dynamics themselves and the controllers used will form the cyber-physical system, and its response to an attacker can be assessed in the context of multiple interacting vehicles.
While the vehicle dynamics play a pivotal role in the security of this system, the choice of controller can also be leveraged to enhance its safety. Given knowledge of some attacker capabilities, adversary-aware controllers can be designed to react to the presence of an attacker, adding an extra layer of security.
This work will attempt to address these issues in vehicular platooning. Firstly, a general analysis concerning the capabilities of possible attacks, in terms of control system theory, will be presented. Secondly, mitigation strategies for some of these attacks will be discussed. Finally, the results of an experimental validation of these mitigation strategies and their implications will be shown.
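
To make the control-theoretic setting concrete, here is a toy sketch of a single follower running a constant time-headway spacing policy, with a biased range sensor as one simple attacker model (the gains, headway, and bias are illustrative assumptions, not the paper's models).

    def follower_accel(gap, v_self, v_lead, headway=1.5, standstill=5.0,
                       kp=0.5, kd=0.8, sensor_bias=0.0):
        """PD law tracking a speed-dependent desired gap to the leader."""
        measured_gap = gap + sensor_bias  # an attacker can corrupt the range reading
        desired_gap = standstill + headway * v_self
        return kp * (measured_gap - desired_gap) + kd * (v_lead - v_self)

    # Simulate a follower closing in on a leader cruising at 20 m/s.
    dt, gap, v = 0.1, 40.0, 15.0
    for _ in range(600):
        a = follower_accel(gap, v, 20.0, sensor_bias=0.0)
        v += a * dt
        gap += (20.0 - v) * dt
    print(round(gap, 1))  # settles near standstill + headway * 20 = 35.0 m

Re-running the loop with a positive sensor_bias makes the follower settle at a real gap shorter by exactly that bias, a small instance of the attacker influence such an analysis would quantify.
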
An Analysis of How Many Undiscovered Vulnerabilities Remain in Information Systems
Vulnerability management strategy, from both organizational and public policy
perspectives, hinges on an understanding of the supply of undiscovered
vulnerabilities. If the number of undiscovered vulnerabilities is small enough,
then a reasonable investment strategy would be to focus on finding and removing
the remaining undiscovered vulnerabilities. If the number of undiscovered
vulnerabilities is and will continue to be large, then a better investment
strategy would be to focus on quick patch dissemination and engineering
resilient systems. This paper examines a paradigm, namely that the number of
undiscovered vulnerabilities is manageably small, through the lens of
mathematical concepts from the theory of computing. From this perspective, we
find little support for the paradigm of limited undiscovered vulnerabilities.
We then briefly support the notion that these theory-based conclusions are
relevant to practical computers in use today. We find no reason to believe
undiscovered vulnerabilities are not essentially unlimited in practice and we
examine the possible economic impacts should this be the case. Based on our
analysis, we recommend that vulnerability management strategy adopt an approach favoring quick patch dissemination and engineering resilient systems, while continuing good software engineering practices to reduce (but never eliminate) vulnerabilities in information systems.
Secure Compute-and-Forward in a Bidirectional Relay
We consider the basic bidirectional relaying problem, in which two users in a
wireless network wish to exchange messages through an intermediate relay node.
In the compute-and-forward strategy, the relay computes a function of the two
messages using the naturally-occurring sum of symbols simultaneously
transmitted by user nodes in a Gaussian multiple access (MAC) channel, and the
computed function value is forwarded to the user nodes in an ensuing broadcast
phase. In this paper, we study the problem under an additional security
constraint, which requires that each user's message be kept secure from the
relay. We consider two types of security constraints: perfect secrecy, in which
the MAC channel output seen by the relay is independent of each user's message;
and strong secrecy, which is a form of asymptotic independence. We propose a
coding scheme based on nested lattices, the main feature of which is that given
a pair of nested lattices that satisfy certain "goodness" properties, we can
explicitly specify probability distributions for randomization at the encoders
to achieve the desired security criteria. In particular, our coding scheme
guarantees perfect or strong secrecy even in the absence of channel noise. The
noise in the channel only affects reliability of computation at the relay, and
for Gaussian noise, we derive achievable rates for reliable and secure
computation. We also present an application of our methods to the multi-hop
line network in which a source needs to transmit messages to a destination
through a series of intermediate relays.
Comment: v1 is a much expanded and updated version of arXiv:1204.6350; v2 is a minor revision to fix some notational issues; v3 is a much expanded and updated version of v2, and contains results on both perfect secrecy and strong secrecy; v3 is a revised manuscript submitted to the IEEE Transactions on Information Theory in April 201
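
To see why a sum can be simultaneously decodable and hiding, here is a toy, noise-free, finite-alphabet stand-in (plain Python; the dithers pre-shared between the users are an illustrative simplification). The paper's nested-lattice scheme achieves the analogous guarantees over a noisy Gaussian channel via explicit randomization at the encoders.

    import random

    q = 257                                   # toy alphabet size
    m1, m2 = 42, 99                           # the two users' messages
    d1 = random.randrange(q)                  # uniform dithers, known to both
    d2 = random.randrange(q)                  # users but hidden from the relay

    x1, x2 = (m1 + d1) % q, (m2 + d2) % q     # simultaneous transmissions
    s = (x1 + x2) % q                         # the relay observes only the sum
    # Because d1 (resp. d2) is uniform and independent, s is uniform and
    # statistically independent of m1 (resp. m2): perfect secrecy at the relay.

    # Broadcast phase: the relay sends s; each user strips what it knows.
    assert (s - m1 - d1 - d2) % q == m2       # user 1 recovers m2
    assert (s - m2 - d1 - d2) % q == m1       # user 2 recovers m1
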
A probabilistic design for practical homomorphic majority voting with intrinsic differential privacy
As machine learning (ML) has become pervasive throughout various fields (industry, healthcare, social networks), privacy concerns regarding the data used for its training have gained critical importance. In settings where several parties wish to collaboratively train a common model without jeopardizing their sensitive data, the need for a private training protocol is particularly stringent: the data must be protected against both the model's end-users and the other actors of the training phase. In this context of secure collaborative learning, Differential Privacy (DP) and Fully Homomorphic Encryption (FHE) are two complementary countermeasures of growing interest for thwarting privacy attacks in ML systems. Central to many collaborative training protocols, in the line of PATE, is majority voting aggregation. Thus, in this paper, we design SHIELD, a probabilistic approximate majority voting operator which is faster when homomorphically executed than existing approaches based on exact argmax computation over a histogram of votes. As an additional benefit, the inaccuracy of SHIELD is used as a feature to provably enable DP guarantees. Although SHIELD may have other applications, we focus here on one setting and seamlessly integrate it into the SPEED collaborative training framework of Grivet Sébert et al. (2021) to improve its computational efficiency. After thoroughly describing the FHE implementation of our algorithm and its DP analysis, we present experimental results. To the best of our knowledge, this is the first work in which relaxing the accuracy of an algorithm is constructively used as a degree of freedom to achieve better FHE performance.
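
Since SHIELD targets PATE-style aggregation, a minimal cleartext sketch of noisy majority voting (plain Laplace noise over the vote histogram, illustrative epsilon) shows the operation being replaced; SHIELD instead evaluates an approximate operator cheaply under FHE and derives its DP guarantee from that operator's intrinsic inaccuracy rather than from explicitly added noise.

    import numpy as np

    def noisy_majority_vote(votes, num_classes, epsilon, rng=None):
        """Release the argmax of Laplace-perturbed vote counts (report-noisy-max)."""
        rng = rng or np.random.default_rng()
        hist = np.bincount(votes, minlength=num_classes).astype(float)
        # One changed vote moves two counts by one each, hence the 2/epsilon
        # scale in the usual epsilon-DP accounting for report-noisy-max.
        hist += rng.laplace(scale=2.0 / epsilon, size=num_classes)
        return int(np.argmax(hist))

    teacher_votes = np.array([2, 2, 2, 1, 0, 2, 1, 2])  # 8 teachers, 3 classes
    print(noisy_majority_vote(teacher_votes, num_classes=3, epsilon=1.0))
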