11 research outputs found
A parallelized disjunctive query based searchable encryption scheme for big data
Searchable Encryption (SE) allows a client to search over large amounts of encrypted data outsourced to the Cloud. Although this helps maintain the confidentiality of the outsourced data, achieving privacy is a difficult and resource-intensive task. As query expressiveness increases, i.e., when moving from single-keyword SE to multi-keyword SE, there is a notable drop in efficiency. This motivates the use of multi-core architectures, where the search can be delegated across multiple threads and performed in parallel. The proposed scheme is based on probabilistic trapdoors formed using the properties of modular inverses. The use of probabilistic trapdoors helps resist distinguishability attacks. A rigorous security analysis highlights the advantage of having a probabilistic trapdoor. Furthermore, to validate the performance of the proposed scheme, it is implemented and deployed onto British Telecommunications' public cloud offering and tested over a real speech corpus. The implementation is also extended to measure the performance gain from the multi-core architecture, which helps maintain the lightweight property of the scheme.
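To illustrate the general idea of a probabilistic trapdoor built from modular inverses, here is a minimal Python sketch. It is a toy stand-in, not the paper's construction: the prime P, the key k, the hash, and the match test are all illustrative assumptions.

```python
# A toy sketch (not the paper's construction) of a probabilistic trapdoor
# built from modular inverses: fresh randomness r makes every trapdoor for
# the same keyword look different, yet the server can still test stored
# index entries against it.
import hashlib
import secrets

P = (1 << 127) - 1   # a Mersenne prime; a real scheme works in a proper group

def h(keyword: str) -> int:
    return int.from_bytes(hashlib.sha256(keyword.encode()).digest(), "big") % P

def index_entry(keyword: str, k: int) -> int:
    # Stored on the server at setup time.
    return (h(keyword) * k) % P

def trapdoor(keyword: str, k: int):
    r = secrets.randbelow(P - 1) + 1          # fresh randomness per query
    t1 = (h(keyword) * r) % P
    t2 = (r * pow(k, -1, P)) % P              # modular inverse of the key
    return t1, t2

def server_match(entry: int, t1: int, t2: int) -> bool:
    # entry * t2 = H(w) * k * r * k^{-1} = H(w) * r = t1  iff keywords match
    return (entry * t2) % P == t1

k = secrets.randbelow(P - 1) + 1
e = index_entry("privacy", k)
assert server_match(e, *trapdoor("privacy", k))
assert not server_match(e, *trapdoor("cloud", k))
assert trapdoor("privacy", k) != trapdoor("privacy", k)   # probabilistic
```

Because each query is freshly randomized, two trapdoors for the same keyword are indistinguishable to an eavesdropper, which is the property the scheme relies on to resist distinguishability attacks.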
Security, Trust and Privacy in Cyber (STPCyber): Future trends and challenges
Today's world features massively interconnected devices that share information across a variety of platforms: traditional computers (machines), smart IoT devices in smart homes, interconnected vehicles, and, of course, social network apps such as Facebook, LinkedIn, and Twitter. This growth has been skyrocketing, and the trend will continue into the future. On the one hand, life becomes easier with such developments; on the other, we experience more and more cyber threats to our privacy, our security, and our trust in the organizations holding our data. In this special issue, we summarize contributions by authors on advanced topics related to security, trust and privacy across a range of applications and present a selection of the most recent research efforts in these areas.
Private Eyes: Zero-Leakage Iris Searchable Encryption
Biometric databases are being deployed with few cryptographic protections. Because of the nature of biometrics, privacy breaches affect users for their entire life.
This work introduces Private Eyes, the first zero-leakage biometric database. The only leakage of the system is unavoidable: 1) the log of the dataset size and 2) the fact that a query occurred. Private Eyes is built from symmetric searchable encryption. Proximity queries are the required functionality: given a noisy reading of a biometric, the goal is to retrieve all stored records that are close enough according to a distance metric.
Private Eyes combines locality-sensitive hashes, or LSHs (Indyk and Motwani, STOC 1998), with encrypted maps. One searches for the disjunction of the LSHs of a noisy biometric reading. The underlying encrypted map needs to efficiently answer disjunction queries.
We focus on the iris biometric. Iris biometric data requires a large number of LSHs, approximately 1000. The most relevant prior work is in zero-leakage k-nearest-neighbor search (Boldyreva and Tang, PoPETS 2021), but that work is designed for a small number of LSHs.
Our main cryptographic tool is a zero-leakage disjunctive map designed for the setting when most clauses do not match any records. For the iris, on average at most 6% of LSHs match any stored value.
To aid in evaluation, we produce a synthetic iris generation tool to evaluate sizes beyond available iris datasets. This generation tool is a simple generative adversarial network. Accurate statistics are crucial to optimizing the cryptographic primitives so this tool may be of independent interest.
Our scheme is implemented and open-sourced. For the largest tested parameters of 5000 stored irises, search requires 26 rounds of communication and 26 minutes of single-threaded computation.
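To make the LSH-plus-disjunction mechanism concrete, here is a minimal plaintext Python sketch of bit-sampling LSH over binary iris codes, with an ordinary dict standing in for the encrypted map. The parameters are illustrative assumptions, far smaller than the roughly 1000 LSHs the paper reports for real iris data.

```python
# Bit-sampling LSH (Indyk-Motwani) over binary codes, with a plaintext
# dictionary standing in for the encrypted map of the real scheme.
import random
from collections import defaultdict

CODE_BITS = 256      # real iris codes are an order of magnitude longer
NUM_LSH = 32         # the paper reports ~1000 LSHs for the iris
BITS_PER_LSH = 8

rng = random.Random(0)
SAMPLES = [rng.sample(range(CODE_BITS), BITS_PER_LSH) for _ in range(NUM_LSH)]

def lsh_values(code):
    # Each LSH projects the code onto a fixed random set of bit positions;
    # codes that are close in Hamming distance agree on most projections.
    return [(i, tuple(code[j] for j in positions))
            for i, positions in enumerate(SAMPLES)]

index = defaultdict(set)   # (lsh_id, value) -> record ids

def insert(record_id, code):
    for key in lsh_values(code):
        index[key].add(record_id)

def search(noisy_code):
    # Disjunctive query: return every record matching at least one LSH clause.
    hits = set()
    for key in lsh_values(noisy_code):
        hits |= index[key]
    return hits

enrolled = [rng.randint(0, 1) for _ in range(CODE_BITS)]
insert("record-1", enrolled)
noisy = list(enrolled)
for j in rng.sample(range(CODE_BITS), 10):   # ~4% bit-error rate
    noisy[j] ^= 1
print(search(noisy))                          # {'record-1'} with high probability
```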
Practical yet Provably Secure: Complex Database Query Execution over Encrypted Data
Encrypted databases provide security for outsourced data. In this work, novel encryption schemes supporting different database query types are presented, enabling complex database queries over encrypted data. For specific constructions enabling exact-keyword queries, range queries, database joins, and substring queries over encrypted data, we prove security in a formal framework, present a theoretical runtime analysis, and provide an assessment of practical performance characteristics.
A Systematic Review on the Status and Progress of Homomorphic Encryption Technologies
With the emergence of big data and the continued growth of cloud computing applications, serious security and privacy concerns have emerged. Consequently, several researchers and cybersecurity experts have embarked on a quest to extend data encryption to big data systems and cloud computing applications. As most cloud users turn to public cloud services, confidentiality becomes an even more complicated issue. Cloud clients storing their data on a public cloud always seek solutions to the confidentiality problem. Homomorphic encryption emerged as a possible solution, in which a client's data is encrypted on the cloud in a way that allows certain search and manipulation operations without decryption.
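As a concrete illustration of the homomorphic property, the following minimal Python sketch implements textbook Paillier encryption with toy parameters (the primes below are illustrative assumptions; real deployments use vetted libraries and 2048-bit or larger keys). Multiplying two ciphertexts yields an encryption of the sum of the plaintexts, so a server can add encrypted values without ever decrypting them.

```python
# Textbook Paillier with toy parameters, fixing the generator g = n + 1.
import math
import secrets

p, q = 1000003, 1000033          # toy primes, far too small for real use
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)     # Carmichael function of n
mu = pow(lam, -1, n)             # valid because g = n + 1

def encrypt(m: int) -> int:
    r = secrets.randbelow(n - 1) + 1
    while math.gcd(r, n) != 1:   # r must be invertible mod n
        r = secrets.randbelow(n - 1) + 1
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    return ((pow(c, lam, n2) - 1) // n * mu) % n

a, b = encrypt(41), encrypt(1)
assert decrypt((a * b) % n2) == 42   # Enc(x) * Enc(y) decrypts to x + y
```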
In this paper, we present a systematic review of research papers published in the field of homomorphic encryption. The review uses the PRISMA checklist alongside some items of Cochrane's Quality Assessment to evaluate studies retrieved from various sources. It was highly noticeable in the reviewed papers that security in big data and cloud computing has received the most attention. Most papers suggested the use of homomorphic encryption, although the thematic analysis identified other potential concerns. Regarding the quality of the articles, 38% failed to meet three checklist items: an explicit statement of research objectives, procedure recognition, and the sources of funding used in the study. The review also presents a compendium of textual analyses of different homomorphic encryption algorithms, application areas, and areas of future development. Results of the evaluation through PRISMA and the Cochrane tool showed that a majority of research articles discussed the potential use and application of homomorphic encryption as a solution to the growing demands of big data and the absence of security and privacy mechanisms therein. This was evident from 26 of the total 59 articles that met the inclusion criteria. The term "homomorphic encryption" appeared 1802 times in the word cloud derived from the selected articles, which speaks to its potential to ensure security and privacy while preserving the CIA triad in the context of big data and cloud computing.
Multi-Version Search and Cache-Conscious Ranking Optimization
Organizations and companies archive many versions of digital data such as web pages, internal emails, and so on. Such data is critical for internal investigation, regulatory compliance, and electronic discovery. The electronic discovery market that leverages archival data was estimated to reach $9.9 billion globally in 2017, and it is not uncommon for businesses to retain archived collections for 10 to 15 years. How to archive these versioned data is worth studying, and we face many challenges, including: 1) traditional indexes occupy too much space for versioned data, 2) traditional search is too slow on versioned data, and 3) guaranteeing high accuracy while improving efficiency in a new architecture.

In this dissertation, we take advantage of the fast development of information retrieval and tackle the problem by proposing a new multi-version search architecture with a cache-conscious ranking optimization framework. Specifically, we first discuss our new versioned search architecture. Then, we present a cache-conscious online ranking algorithm to improve the online part. Finally, we describe a framework to select the best blocking methods and parameters for our algorithm to achieve the best performance.

First, we present our new multi-version search architecture. We propose an approach that uses cluster-based retrieval to quickly narrow the search scope, guided by version representatives, in Phase 1, and develops a hybrid index structure with adaptive runtime data traversal to speed up Phase 2 search. The hybrid scheme exploits the advantages of forward and inverted indexes, based on term characteristics, to minimize the time spent extracting positional and other feature information during runtime search. We compare several indexing and data traversal options with different time and space tradeoffs and describe evaluation results to demonstrate their effectiveness. The experimental results show that the proposed scheme can be up to about 4x as fast as previous work on solid state drives while retaining good relevance.

Second, we present our 2D blocking algorithm to optimize the online ranking part of the system. Multi-tree ensemble models have proven effective for document ranking. Using a large number of trees can improve accuracy, but it takes time to calculate the ranking scores of matched documents. We investigate data traversal methods for fast score calculation with a large ensemble and propose a 2D blocking scheme for better cache utilization with a simpler code structure than previous work. Experiments with several benchmarks show significant acceleration in score calculation without loss of ranking accuracy.

Lastly, we describe a framework to quickly select the best blocking methods and parameters for our 2D blocking algorithm with the help of a full cache analysis. The 2D blocking method is very helpful for improving online search efficiency. However, different traversal methods and blocking parameter settings can exhibit different cache and cost behavior depending on data and architectural characteristics, and exhaustive search for performance comparison and optimum selection is very time-consuming. We provide an analytic comparison of cache blocking methods on their data access performance as an approximation, and propose a fast guided sampling scheme to select a traversal method and blocking parameters for effective use of the memory hierarchy. Evaluation studies with three datasets show that, within a reasonable amount of time, the proposed scheme can identify a highly competitive solution that significantly accelerates score calculation.

In summary, we have proposed a new multi-version search architecture with cache-conscious ranking optimization for the online search part, and a framework to help quickly select the best blocking methods and parameters, with full cache analysis, for the 2D blocking method. With this new versioned search system we can meet the scalability, efficiency, and accuracy challenges of multi-version search, and we believe this work will be useful to future researchers in this direction.
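The following minimal Python sketch (hypothetical, not the dissertation's implementation; the block sizes and the stump "trees" are illustrative assumptions) shows the core idea of 2D blocking for ensemble scoring: documents and trees are processed in cache-sized blocks, so a block of trees stays resident while a block of documents reuses it.

```python
# 2D blocking for tree-ensemble scoring: the inner loops touch only
# doc_block documents and tree_block trees, two small working sets that
# fit in cache instead of streaming the whole ensemble per document.

def score_blocked(docs, trees, doc_block=512, tree_block=64):
    scores = [0.0] * len(docs)
    for d0 in range(0, len(docs), doc_block):
        d1 = min(d0 + doc_block, len(docs))
        for t0 in range(0, len(trees), tree_block):
            for tree in trees[t0:t0 + tree_block]:
                for i in range(d0, d1):
                    scores[i] += tree(docs[i])
    return scores

# Toy usage: decision stumps as callables over feature vectors.
stumps = [lambda d, j=j: 1.0 if d[j % len(d)] > 0.5 else 0.0 for j in range(256)]
docs = [[0.3, 0.7, 0.9], [0.1, 0.2, 0.4]]
print(score_blocked(docs, stumps))
```

In a low-level implementation the same loop structure lets the blocking parameters be tuned to the cache sizes of the target machine, which is exactly what the selection framework above automates.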
HIR-CP-ABE: Hierarchical Identity Revocable Ciphertext-Policy Attribute-Based Encryption for Secure and Flexible Data Sharing
Ciphertext-Policy Attribute-Based Encryption (CP-ABE) has been proposed to implement the attribute-based access control model. In CP-ABE, data owners encrypt the data with a certain access policy such that only data users whose attributes satisfy the access policy can obtain the corresponding private decryption key from a trusted authority. Therefore, CP-ABE is considered a promising fine-grained access control mechanism for data sharing where no centralized trusted third party exists, for example, in cloud computing, mobile ad hoc networks (MANETs), Peer-to-Peer (P2P) networks, and information-centric networks (ICNs). As promising as it is, user revocation is a cumbersome problem in CP-ABE, impeding its application in practice. To solve this problem, we propose a new scheme named HIR-CP-ABE, which implements hierarchical identity-based user revocation from the perspective of encryption. In particular, revocation is implemented by data owners directly, without any help from a third party. Compared with previous attribute-based revocation solutions, our scheme provides the following desirable properties. First, the trusted authority can be offline after system setup and key distribution, making the scheme applicable in mobile ad hoc networks, P2P networks, and other settings where nodes are unable to connect to the trusted authority after system deployment. Second, a user does not need to update the private key when user revocation occurs; key management overhead is therefore much lower in HIR-CP-ABE for both the users and the trusted authority. Third, the revocation mechanism can revoke a group of users affiliated with the same organization in a batch without affecting any other users. To the best of our knowledge, HIR-CP-ABE is the first CP-ABE scheme to provide affiliation-based revocation functionality for data owners. Through security analysis and performance evaluation, we show that the proposed scheme is secure and efficient in terms of computation, communication, and storage.
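For readers new to the access model, the following toy Python check illustrates only the policy-satisfaction logic of CP-ABE; the policy shape and attribute names are illustrative assumptions, and the real scheme enforces this condition cryptographically with pairings rather than with code run by any party.

```python
# Toy CP-ABE access-policy check: a policy is an attribute string or a
# nested ('AND'|'OR', left, right) tuple; decryption succeeds only when
# the user's attribute set satisfies the policy.

def satisfies(policy, attrs):
    if isinstance(policy, str):
        return policy in attrs
    op, left, right = policy
    if op == "AND":
        return satisfies(left, attrs) and satisfies(right, attrs)
    return satisfies(left, attrs) or satisfies(right, attrs)

policy = ("AND", "doctor", ("OR", "cardiology", "oncology"))
assert satisfies(policy, {"doctor", "cardiology"})      # may decrypt
assert not satisfies(policy, {"nurse", "cardiology"})   # may not decrypt
```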
Secure Computation in Heterogeneous Environments: How to Bring Multiparty Computation Closer to Practice?
Many services that people use daily require computation that depends on the private data of multiple parties. While the utility of the final result of such interactions outweighs the privacy concerns related to output release, the inputs to such computations are much more sensitive and need to be protected. Secure multiparty computation (MPC) considers the question of constructing computation protocols that reveal nothing more about their inputs than what is inherently leaked by the output. Strong theoretical results demonstrate that every functionality can be computed securely; however, these protocols remain unused in practical solutions since they introduce efficiency overhead prohibitive for most applications. Generic multiparty computation techniques address homogeneous setups with respect to the resources available to the participants and the adversarial model. Realistic scenarios, on the other hand, present a wide diversity of heterogeneous environments where different participants have different available resources and different incentives to misbehave and collude. In this thesis we introduce techniques for multiparty computation that focus on heterogeneous settings. We present solutions tailored to different types of asymmetric constraints and improve the efficiency of existing approaches in these scenarios. We tackle the question from three main directions:

New Computational Models for MPC - We explore different computational models that enable us to overcome the inherent inefficiencies of generic MPC solutions that use circuit representations of the evaluated functionality. First, we show how to use random access machines to construct MPC protocols that add only polylogarithmic overhead to the running time of the insecure version of the underlying functionality. This yields MPC constructions with computational complexity sublinear in the size of their inputs, which is very important for computations over large databases. We also consider multivariate polynomials, which yield more succinct representations of the functionalities they implement than circuits, while a large collection of problems are naturally and efficiently expressed as multivariate polynomials. We construct an MPC protocol for multivariate polynomials that improves the communication complexity of corresponding circuit-based solutions and provides the currently most efficient solution for multiparty set intersection in the fully malicious case.

Outsourcing Computation - The goal in this setting is to utilize the resources of a single powerful service provider for the work that computationally weak clients need to perform on their data. We present a new paradigm for constructing verifiable computation (VC) schemes, which enables a computationally limited client to efficiently verify the result of a large computation. Our construction is based on attribute-based encryption and avoids expensive primitives such as fully homomorphic encryption and probabilistically checkable proofs, which underlie existing VC schemes. Additionally, our solution enjoys two new useful properties: public delegation and public verification. We further introduce the model of server-aided computation, where we utilize the computational power of an outsourcing party to assist the execution and improve the efficiency of MPC protocols. For this purpose we define a new adversarial model of non-collusion, which provides room for more efficient constructions that rely almost entirely on symmetric-key operations, while still capturing realistic settings of adversarial behavior. In this model we propose protocols for generic secure computation that offload the work of most of the parties to the computation server. We also construct a specialized server-aided two-party set intersection protocol that achieves better efficiency for the two participants than existing solutions. Outsourcing often concerns only data storage, and while outsourcing the data of a single party is useful, enabling data sharing among different clients of the service is the more interesting and useful setup. This scenario, however, brings new challenges for access control, since the access control rules and data accesses become private data of the clients with respect to the service provider. We propose an approach that offers trade-offs between the privacy provided to the clients and the communication overhead incurred for each data access.

Efficient Private Search in Practice - We consider the question of private search from a different perspective than traditional MPC settings. We start with strict efficiency requirements motivated by the speed of available hardware and what is considered acceptable overhead from a practical point of view. We then adopt relaxed definitions of privacy that still provide meaningful security guarantees while allowing us to meet the efficiency requirements. In this setting we design a security architecture and implement a system for data sharing based on encrypted search, which achieves only 30% overhead compared to non-secure solutions on realistic workloads.
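To make the server-aided model concrete, here is a minimal Python sketch of server-aided set intersection in the spirit described above. It is an illustrative toy under the non-collusion assumption, not the thesis's protocol; the shared PRF key and the example items are assumptions.

```python
# Server-aided set intersection under a non-colluding server: the two
# parties share a PRF key (which the server never sees) and send only PRF
# values of their sets, so the server intersects blindly.
import hashlib
import hmac
import secrets

def prf(key: bytes, item: str) -> bytes:
    return hmac.new(key, item.encode(), hashlib.sha256).digest()

key = secrets.token_bytes(32)    # shared by the two parties, NOT the server
alice = {"dana@example.com", "erin@example.com"}
bob = {"erin@example.com", "frank@example.com"}

tagged_a = {prf(key, x): x for x in alice}   # Alice's message to the server
tagged_b = {prf(key, x) for x in bob}        # Bob's message to the server

matches = set(tagged_a) & tagged_b           # computed by the server

intersection = {tagged_a[t] for t in matches}  # Alice decodes locally
assert intersection == {"erin@example.com"}
```

The heavy lifting is symmetric-key only, which is what makes the server-aided, non-colluding setting so much cheaper than fully generic MPC.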
Towards Practical Privacy-Preserving Protocols
Protecting users' privacy in digital systems becomes more complex and challenging over time, as the amount of stored and exchanged data grows steadily and systems become increasingly complex and connected. Two techniques that approach this issue are Secure Multi-Party Computation (MPC) and Private Information Retrieval (PIR), which aim to enable practical computation while keeping sensitive data private. In this thesis we present results showing how real-world applications can be executed in a privacy-preserving way. This is not only desired by users of such applications; since 2018 it also has a strong legal foundation in the European Union's General Data Protection Regulation (GDPR), which forces companies to protect the privacy of user data by design.
This thesis' contributions are split into three parts and can be summarized as follows:
MPC Tools
Generic MPC requires in-depth background knowledge of a complex research field. To address this, we provide tools that are both efficient and usable, and that serve as a foundation for follow-up work by allowing cryptographers, researchers, and developers to implement, test, and deploy MPC applications. We provide an implementation framework that abstracts from the underlying protocols, optimized building blocks generated with hardware synthesis tools, and support for directly processing Hardware Description Languages (HDLs). Finally, we present an automated compiler for efficient hybrid protocols from ANSI C.
MPC Applications
MPC was long deemed too expensive to be used in practice. We show several real-world use cases that can operate in a privacy-preserving yet practical way when engineered properly and built on top of suitable MPC protocols. Use cases presented in this thesis come from the domain of route computation using BGP on the Internet and at Internet Exchange Points (IXPs); in both cases our protocols protect the sensitive business information that determines routing decisions. Another use case focuses on genomics, which is particularly critical, as a person's genome is tied to them for their entire lifespan and cannot be altered. Our system enables federated genomic databases, where several institutions can privately outsource their genome data and research institutes can query this data in a privacy-preserving manner.
PIR and Applications
Privately retrieving data from a database is a crucial requirement for user privacy and metadata protection, and is enabled, among other techniques, by Private Information Retrieval (PIR). We present improvements to and a generalization of the well-known multi-server PIR scheme of Chor et al., together with an implementation and evaluation thereof. We also design and implement an efficient anonymous messaging system built on top of PIR. Furthermore, we provide a scalable solution for private contact discovery that combines ideas from efficient two-server PIR built from Distributed Point Functions (DPFs) with Private Set Intersection (PSI).
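As a refresher on the classic construction being generalized, here is a minimal Python sketch of two-server PIR in the style of Chor et al. (the toy database and record format are illustrative assumptions): each server sees a uniformly random query vector and learns nothing about the requested index, yet XORing the two answers recovers exactly the requested record.

```python
# Two-server information-theoretic PIR: the client splits its query into
# two random-looking bit vectors that differ only at the desired index.
import secrets

db = [b"rec0", b"rec1", b"rec2", b"rec3"]    # equal-length records

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def answer(database, query_bits):
    # A server XORs together every record its query vector selects.
    acc = bytes(len(database[0]))
    for record, bit in zip(database, query_bits):
        if bit:
            acc = xor(acc, record)
    return acc

i = 2                                         # the index the client wants
q1 = [secrets.randbelow(2) for _ in db]       # uniformly random: leaks nothing
q2 = [b ^ int(j == i) for j, b in enumerate(q1)]  # differs from q1 only at i

assert xor(answer(db, q1), answer(db, q2)) == db[i]
```

A DPF-based two-server scheme compresses exactly these query vectors, which is what makes the contact-discovery solution above scale.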
Sublinear Computation Paradigm
This open access book gives an overview of cutting-edge work on a new paradigm called the "sublinear computation paradigm," which was proposed in the large multiyear academic research project "Foundations of Innovative Algorithms for Big Data," which ran in Japan from October 2014 to March 2020. To handle the unprecedented explosion of big data sets in research, industry, and other areas of society, there is an urgent need to develop novel methods and approaches for big data analysis. To meet this need, innovative changes in algorithm theory for big data are being pursued. For example, polynomial-time algorithms have thus far been regarded as "fast," but if a quadratic-time algorithm is applied to a petabyte-scale or larger big data set, problems arise in terms of computational resources or running time. To deal with this critical computational and algorithmic bottleneck, linear-, sublinear-, and constant-time algorithms are required. The sublinear computation paradigm is proposed here to support innovation in the big data era. A foundation of innovative algorithms has been created by developing computational procedures, data structures, and modelling techniques for big data. The project is organized into three teams that focus on sublinear algorithms, sublinear data structures, and sublinear modelling. The work has produced high-level academic research results of strong computational and algorithmic interest, which are presented in this book. The book consists of five parts: Part I, which consists of a single chapter on the concept of the sublinear computation paradigm; Parts II, III, and IV, which review results on sublinear algorithms, sublinear data structures, and sublinear modelling, respectively; and Part V, which presents application results. The information presented here will inspire researchers who work in the field of modern algorithms.
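As a simple illustration of the sublinear idea (a toy example of ours, not one from the book), the following Python sketch estimates the fraction of elements satisfying a predicate from a fixed number of random samples, so its running time is independent of the dataset size n.

```python
# Constant-time (in n) estimation by sampling: the sample count depends
# only on the desired accuracy, not on the size of the data.
import random

def estimate_fraction(data, predicate, samples=10_000):
    hits = sum(predicate(random.choice(data)) for _ in range(samples))
    return hits / samples    # concentrates near the true fraction

data = list(range(1_000_000))
print(estimate_fraction(data, lambda x: x % 3 == 0))   # ~0.333
```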