
    Exploring Privacy Preservation in Outsourced K-Nearest Neighbors with Multiple Data Owners

    The k-nearest neighbors (k-NN) algorithm is a popular and effective classification algorithm. Because of its large storage and computational requirements, it is well suited to cloud outsourcing. However, k-NN is often run on sensitive data such as medical records, user images, or personal information, so it is important to protect the privacy of data in an outsourced k-NN system. Prior works have all assumed that the data owners (who submit data to the outsourced k-NN system) are a single trusted party. However, we observe that in many practical scenarios there may be multiple mutually distrusting data owners. In this work, we present the first framing and exploration of privacy preservation in an outsourced k-NN system with multiple data owners. We consider the various threat models introduced by this modification. We discover that under a particularly practical threat model covering numerous scenarios, there exists a set of adaptive attacks that breach the data privacy of any exact k-NN system. The vulnerability is a result of the mathematical properties of k-NN and its output. Thus, we propose a privacy-preserving alternative system supporting kernel density estimation using a Gaussian kernel, a classification algorithm from the same family as k-NN. In many applications, this similar algorithm serves as a good substitute for k-NN. We additionally investigate solutions for other threat models, often through extensions of prior single-data-owner systems.
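
    As a concrete illustration of the substitute classifier, the sketch below scores a query against each class with a Gaussian-kernel density estimate and predicts the highest-scoring class. It is a minimal plaintext sketch; the function and variable names are our own, and none of the paper's outsourcing or privacy machinery is modeled.

```python
import numpy as np

def kde_classify(query, X, y, bandwidth=1.0):
    """Classify `query` by Gaussian-kernel density estimation per class.

    Instead of hard k-NN votes, every training point contributes a weight
    exp(-||query - x||^2 / (2 * bandwidth^2)); the class whose points give
    the largest total weight wins.
    """
    scores = {}
    for label in np.unique(y):
        diffs = X[y == label] - query               # offsets to this class's points
        sq_dists = np.sum(diffs ** 2, axis=1)       # squared Euclidean distances
        scores[label] = np.sum(np.exp(-sq_dists / (2 * bandwidth ** 2)))
    return max(scores, key=scores.get)

# Toy usage: two well-separated 2-D classes.
X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
y = np.array([0, 0, 1, 1])
print(kde_classify(np.array([0.2, 0.1]), X, y))     # expected: 0
```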

    Should We Learn Probabilistic Models for Model Checking? A New Approach and An Empirical Study

    Many automated system analysis techniques (e.g., model checking, model-based testing) rely on first obtaining a model of the system under analysis. System modeling is often done manually, which is widely regarded as a hindrance to adopting model-based system analysis and development techniques. To overcome this problem, researchers have proposed to automatically "learn" models from sample system executions and have shown that the learned models can sometimes be useful. There are, however, many questions to be answered. For instance, how much should we generalize from the observed samples, and how fast does learning converge? Would the analysis result based on the learned model be more accurate than the estimate we could have obtained by sampling many system executions within the same amount of time? In this work, we investigate existing algorithms for learning probabilistic models for model checking, propose an evolution-based approach for better controlling the degree of generalization, and conduct an empirical study to answer these questions. One of our findings is that the effectiveness of learning may sometimes be limited. Comment: 15 pages plus 2 reference pages; accepted by FASE 2017, part of ETAPS.
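
    For intuition about what learning a probabilistic model from sample executions means in its simplest form, the sketch below estimates a discrete-time Markov chain by frequency counting with additive smoothing; the smoothing constant is one crude knob for the degree of generalization discussed above. This is an illustrative baseline with made-up names, not the evolution-based approach proposed in the paper.

```python
from collections import Counter, defaultdict

def learn_dtmc(traces, smoothing=1.0):
    """Estimate a discrete-time Markov chain from observed executions.

    Transition probabilities are relative frequencies with additive
    (Laplace) smoothing; `smoothing` controls how far we generalize
    beyond the observed samples.
    """
    counts = defaultdict(Counter)
    states = set()
    for trace in traces:
        states.update(trace)
        for s, t in zip(trace, trace[1:]):
            counts[s][t] += 1

    dtmc = {}
    for s in states:
        total = sum(counts[s].values()) + smoothing * len(states)
        dtmc[s] = {t: (counts[s][t] + smoothing) / total for t in states}
    return dtmc

# Toy usage: three sampled executions of a small system.
traces = [["init", "busy", "done"], ["init", "done"], ["init", "busy", "busy", "done"]]
model = learn_dtmc(traces)
print(model["init"])   # smoothed transition distribution out of "init"
```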

    Value Iteration for Long-run Average Reward in Markov Decision Processes

    Markov decision processes (MDPs) are standard models for probabilistic systems with non-deterministic behaviours. Long-run average rewards provide a mathematically elegant formalism for expressing long-term performance. Value iteration (VI) is one of the simplest and most efficient algorithmic approaches to MDPs with other objectives, such as reachability. Unfortunately, a naive extension of VI does not work for MDPs with long-run average rewards, as there is no known stopping criterion. In this work our contributions are threefold. (1) We refute a conjecture related to stopping criteria for MDPs with long-run average rewards. (2) We present two practical algorithms for MDPs with long-run average rewards based on VI. First, we show that a combination of applying VI locally to each maximal end-component (MEC) and VI for reachability objectives can provide approximation guarantees. Second, extending this approach with a simulation-guided on-demand variant of VI, we present an anytime algorithm that can handle very large models. (3) Finally, we present experimental results showing that our methods significantly outperform the standard approaches on several benchmarks.
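
    As background for the reachability building block that the MEC-based algorithm combines with, here is a minimal sketch of value iteration for maximal reachability probabilities in an MDP. The MDP encoding and all names are our own; the paper's MEC decomposition, stopping criteria, and long-run average treatment are not shown.

```python
def value_iteration_reach(mdp, targets, iterations=1000, eps=1e-8):
    """Value iteration for maximal reachability probabilities in an MDP.

    `mdp` maps each state to a list of actions; an action is a list of
    (successor, probability) pairs.  Values are updated with the Bellman
    rule: maximum over actions of the expected successor value.
    """
    v = {s: (1.0 if s in targets else 0.0) for s in mdp}
    for _ in range(iterations):
        delta = 0.0
        for s in mdp:
            if s in targets:
                continue
            best = max(sum(p * v[t] for t, p in action) for action in mdp[s])
            delta = max(delta, abs(best - v[s]))
            v[s] = best
        if delta < eps:
            break
    return v

# Toy MDP: from s0, one action reaches "goal" with probability 0.5,
# the other loops in s0 forever.
mdp = {
    "s0": [[("goal", 0.5), ("sink", 0.5)], [("s0", 1.0)]],
    "goal": [[("goal", 1.0)]],
    "sink": [[("sink", 1.0)]],
}
print(value_iteration_reach(mdp, {"goal"})["s0"])   # -> 0.5
```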

    Autonomous Highway Systems Safety and Security

    Automated vehicles are getting closer each day to large-scale deployment. Self-driving cars are expected to alleviate traffic congestion by safely operating at closer following distances than human drivers can maintain, thereby improving overall traffic throughput. In these conditions, passenger safety and security are of utmost importance. When multiple autonomous cars follow each other on a highway, they form what is known as a cyber-physical system. In a general setting, there are tools to assess the level of influence a possible attacker can have on such a system, which in turn characterizes its level of safety and security. An attacker might attempt to counter the benefits of automation by causing collisions and/or decreasing highway throughput. These strings (platoons) of automated vehicles rely on control algorithms to maintain required distances from other cars and objects around them. The vehicle dynamics and the controllers used form the cyber-physical system, and its response to an attacker can be assessed in the context of multiple interacting vehicles. While the vehicle dynamics play a pivotal role in the security of this system, the choice of controller can also be leveraged to enhance its safety. Given knowledge of some attacker capabilities, adversary-aware controllers can be designed to react to the presence of an attacker, adding an extra level of security. This work attempts to address these issues in vehicular platooning. Firstly, a general analysis of the capabilities of possible attacks in terms of control system theory is presented. Secondly, mitigation strategies for some of these attacks are discussed. Finally, the results of an experimental validation of these mitigation strategies and their implications are shown.
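
    To make the platoon-control setting concrete, the sketch below shows a constant-time-headway spacing policy of the kind a follower vehicle in a platoon might run. The gains and parameter values are illustrative assumptions, and this is not the adversary-aware controller studied in the work above.

```python
def follower_acceleration(gap, ego_speed, lead_speed,
                          standstill_gap=5.0, headway=1.5,
                          k_gap=0.45, k_speed=0.25):
    """Constant-time-headway spacing policy for one platoon follower.

    The follower tries to keep gap = standstill_gap + headway * ego_speed;
    the commanded acceleration is a weighted sum of the spacing error and
    the speed difference to the preceding vehicle.
    """
    desired_gap = standstill_gap + headway * ego_speed
    spacing_error = gap - desired_gap          # positive when too far behind
    speed_error = lead_speed - ego_speed       # positive when the leader is faster
    return k_gap * spacing_error + k_speed * speed_error

# Toy step: the follower is 2 m too close and 1 m/s faster than the leader,
# so the commanded acceleration is negative (brake gently).
print(follower_acceleration(gap=33.0, ego_speed=20.0, lead_speed=19.0))
```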

    An Analysis of How Many Undiscovered Vulnerabilities Remain in Information Systems

    Vulnerability management strategy, from both organizational and public policy perspectives, hinges on an understanding of the supply of undiscovered vulnerabilities. If the number of undiscovered vulnerabilities is small enough, then a reasonable investment strategy would be to focus on finding and removing the remaining undiscovered vulnerabilities. If the number of undiscovered vulnerabilities is, and will continue to be, large, then a better investment strategy would be to focus on quick patch dissemination and engineering resilient systems. This paper examines one paradigm, namely that the number of undiscovered vulnerabilities is manageably small, through the lens of mathematical concepts from the theory of computing. From this perspective, we find little support for the paradigm of limited undiscovered vulnerabilities. We then briefly argue that these theory-based conclusions are relevant to practical computers in use today. We find no reason to believe that undiscovered vulnerabilities are anything but essentially unlimited in practice, and we examine the possible economic impacts should this be the case. Based on our analysis, we recommend that vulnerability management strategy adopt an approach favoring quick patch dissemination and engineering of resilient systems, while continuing good software engineering practices to reduce (but never eliminate) vulnerabilities in information systems.

    Secure Compute-and-Forward in a Bidirectional Relay

    We consider the basic bidirectional relaying problem, in which two users in a wireless network wish to exchange messages through an intermediate relay node. In the compute-and-forward strategy, the relay computes a function of the two messages using the naturally occurring sum of symbols simultaneously transmitted by the user nodes over a Gaussian multiple-access channel (MAC), and the computed function value is forwarded to the user nodes in an ensuing broadcast phase. In this paper, we study the problem under an additional security constraint, which requires that each user's message be kept secret from the relay. We consider two types of security constraints: perfect secrecy, in which the MAC channel output seen by the relay is independent of each user's message; and strong secrecy, which is a form of asymptotic independence. We propose a coding scheme based on nested lattices, the main feature of which is that, given a pair of nested lattices satisfying certain "goodness" properties, we can explicitly specify probability distributions for randomization at the encoders that achieve the desired security criteria. In particular, our coding scheme guarantees perfect or strong secrecy even in the absence of channel noise. The noise in the channel only affects the reliability of computation at the relay, and for Gaussian noise we derive achievable rates for reliable and secure computation. We also present an application of our methods to the multi-hop line network, in which a source needs to transmit messages to a destination through a series of intermediate relays. Comment: v1 is a much expanded and updated version of arXiv:1204.6350; v2 is a minor revision to fix some notational issues; v3 is a much expanded and updated version of v2, containing results on both perfect secrecy and strong secrecy; v3 is a revised manuscript submitted to the IEEE Transactions on Information Theory in April 201
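
    The message-level idea behind compute-and-forward in this setting can be seen in the toy sketch below: the relay observes only a modular sum of the two transmissions and broadcasts it, and each user subtracts its own contribution to recover the other message. With uniformly distributed messages, the sum is independent of either individual message (a one-time-pad argument), which mirrors the perfect-secrecy constraint; the nested-lattice codes, channel noise, and encoder randomization of the actual scheme are not modeled, and all names here are our own.

```python
import secrets

P = 2**61 - 1   # toy message alphabet: integers modulo a large prime

def relay_compute(x1, x2):
    """The relay observes only the (noiseless) modular sum of the two transmissions."""
    return (x1 + x2) % P

def user_decode(broadcast_sum, own_message):
    """Each user removes its own message from the broadcast sum."""
    return (broadcast_sum - own_message) % P

# Toy exchange with uniformly random messages.
w1, w2 = secrets.randbelow(P), secrets.randbelow(P)
s = relay_compute(w1, w2)          # compute phase at the relay
assert user_decode(s, w1) == w2    # broadcast phase: user 1 recovers w2
assert user_decode(s, w2) == w1    # and user 2 recovers w1
```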

    A probabilistic design for practical homomorphic majority voting with intrinsic differential privacy

    As machine learning (ML) has become pervasive throughout various fields (industry, healthcare, social networks), privacy concerns regarding the data used for training have gained critical importance. In settings where several parties wish to collaboratively train a common model without jeopardizing their sensitive data, the need for a private training protocol is particularly stringent and requires protecting the data against both the model's end-users and the other actors of the training phase. In this context of secure collaborative learning, Differential Privacy (DP) and Fully Homomorphic Encryption (FHE) are two complementary countermeasures of growing interest for thwarting privacy attacks in ML systems. Central to many collaborative training protocols, in the line of PATE, is majority voting aggregation. Thus, in this paper, we design SHIELD, a probabilistic approximate majority voting operator that is faster when homomorphically executed than existing approaches based on exact argmax computation over a histogram of votes. As an additional benefit, the inaccuracy of SHIELD is used as a feature to provably enable DP guarantees. Although SHIELD may have other applications, we focus here on one setting and seamlessly integrate it into the SPEED collaborative training framework from \cite{grivet2021speed} to improve its computational efficiency. After thoroughly describing the FHE implementation of our algorithm and its DP analysis, we present experimental results. To the best of our knowledge, this is the first work in which relaxing the accuracy of an algorithm is constructively used as a degree of freedom to achieve better FHE performance.
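
    As a plaintext point of reference for the aggregation step, the sketch below performs PATE-style noisy majority voting: it builds a histogram of the parties' votes, perturbs it with Laplace noise, and returns the arg-max, so the added noise is what yields a differential-privacy guarantee. This is a generic report-noisy-max baseline with assumed parameter names; it models neither SHIELD's probabilistic operator nor its homomorphic execution.

```python
import numpy as np

def noisy_majority_vote(votes, num_classes, epsilon=1.0, rng=None):
    """Differentially private majority vote over a histogram of votes.

    Builds the per-class vote histogram, adds Laplace noise with scale
    2/epsilon (the usual report-noisy-max calibration when one vote may
    change), and returns the arg-max of the noisy histogram.
    """
    rng = np.random.default_rng() if rng is None else rng
    histogram = np.bincount(votes, minlength=num_classes).astype(float)
    noisy = histogram + rng.laplace(scale=2.0 / epsilon, size=num_classes)
    return int(np.argmax(noisy))

# Toy usage: 7 parties vote among 3 classes; class 1 holds the plurality.
votes = np.array([1, 1, 1, 1, 0, 2, 1])
print(noisy_majority_vote(votes, num_classes=3, epsilon=2.0))
```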