11,122 research outputs found

    Building Confidential and Efficient Query Services in the Cloud with RASP Data Perturbation

    With the wide deployment of public cloud computing infrastructures, using clouds to host data query services has become an appealing solution because of its scalability and cost savings. However, some data may be so sensitive that the data owner is unwilling to move it to the cloud unless data confidentiality and query privacy are guaranteed. At the same time, a secured query service should still provide efficient query processing and significantly reduce the in-house workload in order to fully realize the benefits of cloud computing. We propose the RASP data perturbation method to provide secure and efficient range query and kNN query services for protected data in the cloud. The RASP data perturbation method combines order preserving encryption, dimensionality expansion, random noise injection, and random projection to provide strong resilience to attacks on the perturbed data and queries. It also preserves multidimensional ranges, which allows existing indexing techniques to be applied to speed up range query processing. The kNN-R algorithm is designed to work with the RASP range query algorithm to process kNN queries. We have carefully analyzed the attacks on data and queries under a precisely defined threat model and realistic security assumptions. Extensive experiments have been conducted to show the advantages of this approach in efficiency and security. Comment: 18 pages, to appear in IEEE TKDE, accepted in December 201

    GPU-based Private Information Retrieval for On-Device Machine Learning Inference

    On-device machine learning (ML) inference can enable the use of private user data on user devices without revealing them to remote servers. However, a pure on-device solution to private ML inference is impractical for many applications that rely on embedding tables that are too large to be stored on-device. In particular, recommendation models typically use multiple embedding tables each on the order of 1-10 GBs of data, making them impractical to store on-device. To overcome this barrier, we propose the use of private information retrieval (PIR) to efficiently and privately retrieve embeddings from servers without sharing any private information. As off-the-shelf PIR algorithms are usually too computationally intensive to directly use for latency-sensitive inference tasks, we 1) propose novel GPU-based acceleration of PIR, and 2) co-design PIR with the downstream ML application to obtain further speedup. Our GPU acceleration strategy improves system throughput by more than $20\times$ over an optimized CPU PIR implementation, and our PIR-ML co-design provides an over $5\times$ additional throughput improvement at fixed model quality. Together, for various on-device ML applications such as recommendation and language modeling, our system on a single V100 GPU can serve up to 100,000 queries per second (a $>100\times$ throughput improvement over a CPU-based baseline) while maintaining model accuracy.

    Private Information Retrieval with Private Noisy Side Information

    Consider Private Information Retrieval (PIR), where a client wants to retrieve one file out of $K$ files that are replicated in $N$ different servers and the client selection must remain private when up to $T$ servers may collude. Additionally, suppose that the client has noisy side information about each of the $K$ files, and the side information about a specific file is obtained by passing this file through one of $D$ possible discrete memoryless test channels, where $D \le K$. While the statistics of the test channels are known by the client and by all the servers, the specific mapping $\boldsymbol{\mathcal{M}}$ between the files and the test channels is unknown to the servers. We study this problem under two different privacy metrics. Under the first privacy metric, the client wants to preserve the privacy of its desired file selection and the mapping $\boldsymbol{\mathcal{M}}$. Under the second privacy metric, the client wants to preserve the privacy of its desired file and the mapping $\boldsymbol{\mathcal{M}}$, but is willing to reveal the index of the test channel that is associated to its desired file. For both of these two privacy metrics, we derive the optimal normalized download cost. Our problem setup generalizes PIR with colluding servers, PIR with private noiseless side information, and PIR with private side information under storage constraints.
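
For orientation, the basic PIR setting described in this abstract can be illustrated with the classic two-server XOR scheme. This is a minimal sketch, not the scheme of the paper: it assumes N = 2 replicated servers that do not collude (T = 1), files of equal length, and no side information, and it does not achieve the optimal download cost the paper derives. All function names are hypothetical and chosen for illustration.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_queries(num_files: int, wanted: int):
    """Build one query per server (assumption: exactly two servers).

    Each query is a 0/1 selection vector over the files. The two vectors
    are identical except at position `wanted`, so each one on its own is
    uniformly random and reveals nothing about `wanted`.
    """
    q1 = [secrets.randbelow(2) for _ in range(num_files)]
    q2 = list(q1)
    q2[wanted] ^= 1  # flip only the bit of the desired file
    return q1, q2


def answer(files, query) -> bytes:
    """Server side: XOR together every file selected by the query."""
    acc = bytes(len(files[0]))  # all-zero block
    for block, bit in zip(files, query):
        if bit:
            acc = xor_bytes(acc, block)
    return acc


def retrieve(files, wanted: int) -> bytes:
    """Client side: query both replicas and combine the two answers."""
    q1, q2 = make_queries(len(files), wanted)
    a1 = answer(files, q1)  # would be sent by server 1
    a2 = answer(files, q2)  # would be sent by server 2
    # Every file selected by both queries cancels; only `wanted` remains.
    return xor_bytes(a1, a2)


if __name__ == "__main__":
    database = [b"aaaa", b"bbbb", b"cccc", b"dddd"]
    print(retrieve(database, 2))
```

In this sketch the client downloads two blocks to recover one, and privacy holds only while the two servers do not compare their queries; handling collusion and side information requires the more involved constructions that the abstract refers to.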

    Design of Theoretical Framework: Global and Local Parameters Requirements for Libraries

    The library is an important part of the modern reading environment. A theoretical framework is indispensable for every library that operates an automated or digital library system. In this original research paper, all parameters have been selected on the basis of global recommendations and local requirements for libraries, organized into six theoretical sections: (i) the theoretical framework of the integrated library system (ILS) cluster, (ii) the theoretical framework of community communication and interaction, (iii) the theoretical framework of the digital media archiving cluster, (iv) the theoretical framework of the content management system (CMS), (v) the theoretical framework of the learning content management system (LCMS), and (vi) the theoretical framework of the federated search system. In the integrated library system cluster, two things matter most: the development of the ILS and open source ILS software. The paper also sets out the requirements for parameter selection, which can be developed in three ways: basic parameter settings, a theoretical framework for housekeeping operations, and a theoretical framework for the information retrieval system. Software selection and parameter selection are also pivotal tasks in the theoretical framework of community communication and interaction. The theoretical framework of the digital media archiving cluster can be developed in three sections: selection of software, selection of standards, and metadata selection for all the libraries. The content management system can be developed in three ways: the workflow of the content management system, software selection in the CMS cluster, and parameter selection in the CMS cluster. The theoretical framework of the learning content management system for libraries is developed in three sections: components of the learning content management system, software selection in the LCMS cluster, and parameter selection in the LCMS cluster. Software selection and parameter selection are also important components of the federated search system theoretical framework for the development of a single-window interface.