341 research outputs found

    Reverse-Safe Data Structures for Text Indexing

    We introduce the notion of reverse-safe data structures. These are data structures that prevent the reconstruction of the data they encode (i.e., they cannot be easily reversed). A data structure D is called z-reverse-safe when there exist at least z datasets with the same set of answers as the ones stored by D. The main challenge is to ensure that D stores as many answers to useful queries as possible, is constructed efficiently, and has size close to the size of the original dataset it encodes. Given a text of length n and an integer z, we propose an algorithm which constructs a z-reverse-safe data structure that has size O(n) and answers pattern matching queries of length at most d optimally, where d is maximal for any such z-reverse-safe data structure. The construction algorithm takes O(n^ω log d) time, where ω is the matrix multiplication exponent. We show that, despite the n^ω factor, our engineered implementation takes only a few minutes to finish for million-letter texts. We further show that plugging our method into data analysis applications gives insignificant or no data utility loss. Finally, we show how our technique can be extended to support applications under a realistic adversary model.
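
    The z-reverse-safe definition above can be illustrated with a brute-force toy. This is only a sketch under stated assumptions, not the construction algorithm of the paper: it assumes the "answers" are occurrence counts of every pattern of length at most d, and it enumerates all candidate texts, which is feasible only for very short inputs. The function names are hypothetical.

```python
from collections import Counter
from itertools import product


def substring_counts(text, d):
    """Occurrence counts of every substring of length 1..d (assumed query answers)."""
    counts = Counter()
    for length in range(1, d + 1):
        for i in range(len(text) - length + 1):
            counts[text[i:i + length]] += 1
    return counts


def count_consistent_texts(text, d):
    """Count texts of the same length that yield exactly the same answers.

    Exhaustive enumeration: exponential in len(text), for illustration only.
    Candidates are drawn from the letters of `text`, since the length-1
    counts already fix which letters occur.
    """
    alphabet = sorted(set(text))
    target = substring_counts(text, d)
    return sum(
        1
        for candidate in product(alphabet, repeat=len(text))
        if substring_counts("".join(candidate), d) == target
    )


def is_z_reverse_safe(text, d, z):
    """True when at least z texts are consistent with the answers for length <= d."""
    return count_consistent_texts(text, d) >= z
```

    For the text "abab", six texts share the same answers at d = 1, but only the text itself is consistent at d = 2, so under these assumptions d = 1 would be the largest safe choice for any z between 2 and 6.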

    Optimizing Batch Linear Queries under Exact and Approximate Differential Privacy

    Differential privacy is a promising privacy-preserving paradigm for statistical query processing over sensitive data. It works by injecting random noise into each query result, such that it is provably hard for the adversary to infer the presence or absence of any individual record from the published noisy results. The main objective in differentially private query processing is to maximize the accuracy of the query results, while satisfying the privacy guarantees. Previous work, notably \cite{LHR+10}, has suggested that with an appropriate strategy, processing a batch of correlated queries as a whole achieves considerably higher accuracy than answering them individually. However, to our knowledge there is currently no practical solution to find such a strategy for an arbitrary query batch; existing methods either return strategies of poor quality (often worse than naive methods) or require prohibitively expensive computations for even moderately large domains. Motivated by this, we propose the low-rank mechanism (LRM), the first practical differentially private technique for answering batch linear queries with high accuracy. LRM works for both exact (i.e., ε-) and approximate (i.e., (ε, δ)-) differential privacy definitions. We derive the utility guarantees of LRM, and provide guidance on how to set the privacy parameters given the user's utility expectation. Extensive experiments using real data demonstrate that our proposed method consistently outperforms state-of-the-art query processing solutions under differential privacy, by large margins. Comment: ACM Transactions on Database Systems (ACM TODS). arXiv admin note: text overlap with arXiv:1212.230
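
    The noise injection described above can be sketched with the standard Laplace mechanism, which is the naive per-query baseline and not LRM itself. The sketch assumes the caller supplies the L1 sensitivity of the whole query batch; the function names and parameters are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_answers(true_answers, sensitivity, epsilon, rng=None):
    """Answer a batch of queries under epsilon-differential privacy.

    `sensitivity` is assumed to be the L1 sensitivity of the entire batch,
    i.e. the largest total change in the answers when one record changes.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return [answer + laplace_noise(scale, rng) for answer in true_answers]
```

    A smaller epsilon gives a larger noise scale, so privacy is bought with accuracy; batch strategies such as the one proposed in the abstract aim to reduce the total error for the same privacy budget.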

    Adversarial Analysis of the Differentially-Private Federated Learning in Cyber-Physical Critical Infrastructures

    Differential privacy (DP) is considered to be an effective privacy-preservation method to secure the promising distributed machine learning (ML) paradigm, federated learning (FL), from privacy attacks (e.g., membership inference attacks). Nevertheless, while the DP mechanism greatly alleviates privacy concerns, recent studies have shown that it can be exploited to conduct security attacks (e.g., false data injection attacks). To address such attacks on FL-based applications in critical infrastructures, in this paper, we perform the first systematic study on the DP-exploited poisoning attacks from an adversarial point of view. We demonstrate that the DP method, despite providing a level of privacy guarantee, can effectively open a new poisoning attack vector for the adversary. Our theoretical analysis and empirical evaluation on a smart grid dataset show the FL performance degradation (sub-optimal model generation) scenario due to the differential noise-exploited selective model poisoning attacks. As a countermeasure, we propose a reinforcement learning-based differential privacy level selection (rDP) process. The rDP process utilizes the differential privacy parameters (privacy loss, information leakage probability, etc.) and the losses to intelligently generate an optimal privacy level for the nodes. The evaluation shows the accumulated reward and errors of the proposed technique converge to an optimal privacy policy. Comment: 11 pages, 5 figures, 4 tables. This work has been submitted to IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

    Ensemble Nonlinear Model Predictive Control for Residential Solar Battery Energy Management

    In a dynamic distribution market environment, residential prosumers with solar power generation and battery energy storage devices can flexibly interact with the power grid via power exchange. Providing a schedule of this bidirectional power dispatch can facilitate the operational planning for the grid operator and bring additional benefits to the prosumers with some economic incentives. However, the major obstacle to achieving this win-win situation is the difficulty in 1) predicting the nonlinear behaviors of battery degradation under unknown operating conditions and 2) addressing the highly uncertain generation/load patterns, in a computationally viable way. This paper thus establishes a robust short-term dispatch framework for residential prosumers equipped with rooftop solar photovoltaic panels and household batteries. The objective is to achieve the minimum-cost operation under the dynamic distribution energy market environment with stipulated dispatch rules. A general nonlinear optimization problem is formulated, taking into consideration the operating costs due to electricity trading, battery degradation, and various operating constraints. The optimization problem is solved in real-time using a proposed ensemble nonlinear model predictive control-based economic dispatch strategy, where the uncertainty in the forecast has been addressed adequately albeit with limited local data. The effectiveness of the proposed algorithm has been validated using real-world prosumer datasets.

    k-Nearest Neighbor Classification over Semantically Secure Encrypted Relational Data

    Data mining has wide applications in many areas such as banking, medicine, scientific research and government agencies. Classification is one of the commonly used tasks in data mining applications. For the past decade, due to the rise of various privacy issues, many theoretical and practical solutions to the classification problem have been proposed under different security models. However, with the recent popularity of cloud computing, users now have the opportunity to outsource their data, in encrypted form, as well as the data mining tasks to the cloud. Since the data on the cloud is in encrypted form, existing privacy-preserving classification techniques are not applicable. In this paper, we focus on solving the classification problem over encrypted data. In particular, we propose a secure k-NN classifier over encrypted data in the cloud. The proposed k-NN protocol protects the confidentiality of the data, the user's input query, and data access patterns. To the best of our knowledge, our work is the first to develop a secure k-NN classifier over encrypted data under the semi-honest model. Also, we empirically analyze the efficiency of our solution through various experiments. Comment: 29 pages, 2 figures, 3 tables. arXiv admin note: substantial text overlap with arXiv:1307.482

    PanCast: Listening to Bluetooth Beacons for Epidemic Risk Mitigation

    During the ongoing COVID-19 pandemic, there have been burgeoning efforts to develop and deploy smartphone apps to expedite contact tracing and risk notification. Most of these apps track pairwise encounters between individuals via Bluetooth and then use these tracked encounters to identify and notify those who might have been in proximity of a contagious individual. Unfortunately, these apps have not yet proven sufficiently effective, partly owing to low adoption rates, but also due to the difficult tradeoff between utility and privacy and the fact that, in COVID-19, most individuals do not infect anyone but a few superspreaders infect many in superspreading events. In this paper, we propose PanCast, a privacy-preserving and inclusive system for epidemic risk assessment and notification that scales gracefully with adoption rates, utilizes location and environmental information to increase utility without tracking its users, and can be used to identify superspreading events. To this end, rather than capturing pairwise encounters between smartphones, our system utilizes Bluetooth encounters between beacons placed in strategic locations where superspreading events are most likely to occur and inexpensive, zero-maintenance, small devices that users can attach to their keyring. PanCast allows healthy individuals to use the system in a purely passive "radio" mode, and can assist and benefit from other digital and manual contact tracing systems. Finally, PanCast can be gracefully dismantled at the end of the pandemic, minimizing abuse from any malevolent government or entity.

    Checking global usage of resources handled with local policies

    We present a methodology to reason about resource usage (acquisition, release, revision, and so on) and, in particular, to predict bad usage of resources. Keeping in mind the interplay between local and global information that occurs in application-resource interactions, we model resources as entities with local policies and we study global properties that govern overall interactions. Formally, our model is an extension of π-calculus with primitives to manage resources. To predict possible bad usage of resources, we develop a Control Flow Analysis that computes a static over-approximation of process behaviour.

    Ekiden: A Platform for Confidentiality-Preserving, Trustworthy, and Performant Smart Contract Execution

    Smart contracts are applications that execute on blockchains. Today they manage billions of dollars in value and motivate visionary plans for pervasive blockchain deployment. While smart contracts inherit the availability and other security assurances of blockchains, they are impeded by blockchains' lack of confidentiality and poor performance. We present Ekiden, a system that addresses these critical gaps by combining blockchains with Trusted Execution Environments (TEEs). Ekiden leverages a novel architecture that separates consensus from execution, enabling efficient TEE-backed confidentiality-preserving smart contracts and high scalability. Our prototype (with Tendermint as the consensus layer) achieves example performance of 600x more throughput and 400x less latency at 1000x less cost than the Ethereum mainnet. Another contribution of this paper is that we systematically identify and treat the pitfalls arising from harmonizing TEEs and blockchains. Treated separately, both TEEs and blockchains provide powerful guarantees, but when hybridized they engender new attacks. For example, in naive designs, privacy in TEE-backed contracts can be jeopardized by forgery of blocks, a seemingly unrelated attack vector. We believe the insights learned from Ekiden will prove to be of broad importance in hybridized TEE-blockchain systems.