23,154 research outputs found

    Private search over big data leveraging distributed file system and parallel processing

    Get PDF
    In this work, we identify the security and privacy problems associated with a certain Big Data application, namely secure keyword-based search over encrypted cloud data and emphasize the actual challenges and technical difficulties in the Big Data setting. More specifically, we provide definitions from which privacy requirements can be derived. In addition, we adapt an existing work on privacy-preserving keyword-based search method to the Big Data setting, in which, not only data is huge but also changing and accumulating very fast. Our proposal is scalable in the sense that it can leverage distributed file systems and parallel programming techniques such as the Hadoop Distributed File System (HDFS) and the MapReduce programming model, to work with very large data sets. We also propose a lazy idf-updating method that can efficiently handle the relevancy scores of the documents in a dynamically changing, large data set. We empirically show the efficiency and accuracy of the method through extensive set of experiments on real data

    Private search over big data leveraging distributed file system and parallel processing

    Get PDF
    As the new technologies recently became widespread, enormous amount of data started to be generated in very high speeds and stored in untrusted servers. The big data concept covers not only the exceptional size of the datasets, but also high data generation rate and large variety of data types. Although the Big Data provides very tempting benefits, the security issues are still an open problem. In this thesis, we identify security and privacy problems associated with a certain big data application, namely secure keyword-based search over encrypted cloud data and emphasize the actual challenges and technical difficulties in the big data setting. More specifically, we provide definitions from which privacy requirements can be derived. In addition, we adapt an existing work on privacy-preserving keyword-based search method, which is one of the fundamental operations that can be performed over encrypted data, to the big data setting, in which, not only data is huge but also changing and accumulating very fast. Therefore, in the big data setting, a secure index that allows search over encrypted data should be constructed and updated very fast in addition to an efficient and effective keyword-based search operation method. Our proposal is scalable in the sense that it can leverage distributed file systems and parallel programming techniques such as the Hadoop Distributed File System (HDFS) and the MapReduce programming model to work with very large datasets. We also propose a lazy idf-updating method that can efficiently handle the relevancy scores of the documents in dynamically changing and large datasets. We empirically show the efficiency and accuracy of the method through extensive set of experiments on real dat

    Secure Data Communication in Autonomous V2X Systems

    Get PDF
    In Vehicle-to-Everything (V2X) communication systems, vehicles as well as infrastructure devices can interact and exchange data with each other. This capability is used to implement intelligent transportation systems applications. Data confidentiality and integrity need to be preserved in unverified and untrusted environments. In this paper, we propose a solution that provides (a) role-based and attribute-based access control to encrypted data and (b) encrypted search over encrypted data. Vehicle Records contain sensitive information about the owners and vehicles in encrypted form with attached access control policies and policy enforcement engine. Our solution supports decentralized and distributed data exchange, which is essential in V2X systems, where a Central Authority is not required to enforce access control policies. Furthermore, we facilitate querying encrypted Vehicle Records through Structured Query Language (SQL) queries. Vehicle Records are stored in a database in untrusted V2X cloud environment that is prone to provide the attackers with a large attack surface. Big datasets, stored in cloud, can be used for data analysis, such as traffic pattern analysis. Our solution protects sensitive vehicle and owner information from curious or malicious information cloud administrators. Support of indexing improves performance of queries that are forwarded to relevant encrypted Vehicle Records, which are stored in the cloud. We measure the performance overhead of our security solution based on self-protecting Vehicle Records with encrypted search capabilities in V2X communication systems and analyze the effect of security over safety

    Optimizing Trees for Static Searchable Encryption

    Get PDF
    Searchable symmetric encryption (SSE) enables data owners to conduct searches over encrypted data stored by an untrusted server, retrieving only those encrypted files that match the search queries. Several recent schemes employ a server-side encrypted index in the form of a search tree where each node stores a bit vector denoting for each keyword whether any file in its subtree contains that keyword. Our work is motivated by the observation that the way data is distributed in such a search tree has a big impact on the cost of searches. For single-keyword queries, it impacts the number of different paths that must be followed to find all the matching files; for multi-keyword queries, the arrangement of the tree also impacts the number of nodes visited during the search on paths that do not lead to any satisfying data elements. We present three algorithms that improve the performance of SSE schemes based on tree indexes and prove that for cases where the search cost is high, the cost of our algorithms converges to the cost of the optimal tree. In our experiments, the resulting search trees outperform the arbitrary search trees used in previous works by a factor of up to tw
    corecore