
    Data Protection: Combining Fragmentation, Encryption, and Dispersion, a final report

    Hardening data protection using multiple methods rather than 'just' encryption is of paramount importance in the face of continuous and powerful attacks that seek to observe, steal, alter, or even destroy private and confidential information. Our purpose is to examine cost-effective data protection by combining fragmentation, encryption, and dispersion over several physical machines. This involves deriving general schemes to protect data everywhere throughout a network of machines where they are processed, transmitted, and stored during their entire life cycle. It is enabled by a range of parallel and distributed architectures using various sets of cores or machines, from general-purpose GPUs to multiple clouds. In this report, we first present a general and conceptual description of what a fragmentation, encryption, and dispersion system (FEDS) should be, including a number of high-level requirements such systems ought to meet. We then focus on two kinds of fragmentation. The first is a selective separation of information into two fragments, a public one and a private one; we describe a family of processes and address not only performance but also memory occupation and the integrity or quality of the restored information, before concluding with an analysis of the level of security provided by our algorithms. We then analyze prior work, first on general dispersion systems that operate bit-wise without regard to data structure, and second on fragmentation of information defined along an object-oriented data structure or a record structure to be stored in a relational database.
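
    The report describes the approach only at a high level; the toy Python sketch below illustrates the general FEDS flow (separate, encrypt the private fragment, disperse). The interleaving rule, the XOR keystream standing in for a real cipher, and the machine names are all invented for illustration and are not the report's algorithms.

        # Illustrative FEDS-style pipeline: separate, encrypt the private part, disperse.
        import secrets

        def separate(data: bytes, k: int = 3):
            """Selective separation: every k-th byte is treated as 'private' (toy rule)."""
            private = bytes(b for i, b in enumerate(data) if i % k == 0)
            public = bytes(b for i, b in enumerate(data) if i % k != 0)
            return public, private

        def toy_encrypt(data: bytes, key: bytes) -> bytes:
            """Keystream XOR used only as a stand-in for a real cipher such as AES."""
            return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

        def disperse(fragments, machines):
            """Round-robin dispersion of fragments over storage nodes."""
            placement = {m: [] for m in machines}
            for i, frag in enumerate(fragments):
                placement[machines[i % len(machines)]].append(frag)
            return placement

        key = secrets.token_bytes(16)
        public, private = separate(b"confidential record 42")
        print(disperse([public, toy_encrypt(private, key)],
                       ["gpu-node", "cloud-a", "cloud-b"]))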

    Data fragmentation for parallel transitive closure strategies

    Addresses the problem of fragmenting a relation to make the parallel computation of the transitive closure efficient, based on the disconnection set approach. To better understand this design problem, the authors focus on transportation networks, which are characterized by loosely interconnected clusters of nodes with a high internal connectivity rate. Three requirements that a fragmentation has to fulfill are formulated, and three different fragmentation strategies are presented, each emphasizing one of these requirements. Test results are presented to show the performance of the various fragmentation strategies.
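
    As a rough illustration of the per-fragment part of such a computation, the Python sketch below partitions a toy road network into two loosely connected clusters and computes the transitive closure of each fragment locally; the disconnection-set bookkeeping that stitches the fragments back together is omitted, and the cluster assignment is given by hand.

        # Local transitive closure per fragment (Warshall-style, over edge sets).
        from collections import defaultdict

        def transitive_closure(edges):
            reach = defaultdict(set)
            nodes = {u for u, v in edges} | {v for u, v in edges}
            for u, v in edges:
                reach[u].add(v)
            for k in nodes:                       # standard Warshall outer loop
                for i in nodes:
                    if k in reach[i]:
                        reach[i] |= reach[k]
            return reach

        # Two loosely interconnected clusters (fragments) of a toy network.
        fragments = {"west": [("a", "b"), ("b", "c")],
                     "east": [("x", "y"), ("y", "z")]}
        local = {name: transitive_closure(e) for name, e in fragments.items()}
        print(local["west"]["a"])                 # nodes reachable from 'a' in its fragment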

    Review on Fragment Allocation by using Clustering Technique in Distributed Database System

    Considerable progress has been made in the last few years in improving the performance of distributed database systems. The development of fragment allocation models in distributed databases is becoming difficult due to the complexity introduced by the huge number of sites and their communication considerations. Under such conditions, simulation of clustering and data allocation is an adequate tool for understanding and evaluating the performance of data allocation in distributed databases. Clustering sites and allocating fragments are key challenges for distributed database performance, and are considered efficient methods that play a major role in reducing the amount of data transferred and accessed during the execution of applications. This paper gives a review of fragment allocation using clustering techniques in distributed database systems.
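
    A minimal sketch of the two-step idea the reviewed models share, clustering sites by communication cost and then allocating each fragment to the cluster that accesses it most, is given below in Python; the cost matrix, threshold, and access counts are invented for the example and do not come from any surveyed model.

        # Toy site clustering followed by access-frequency-based fragment allocation.
        comm_cost = {("s1", "s2"): 2, ("s1", "s3"): 9, ("s2", "s3"): 8}
        access = {"f1": {"s1": 50, "s2": 10, "s3": 5},
                  "f2": {"s1": 1, "s2": 3, "s3": 40}}

        def cluster_sites(sites, cost, threshold=5):
            """Greedy grouping: a site joins a cluster if all links into it are cheap."""
            clusters = []
            for s in sites:
                for c in clusters:
                    if all(cost.get(tuple(sorted((s, t))), threshold + 1) < threshold
                           for t in c):
                        c.add(s)
                        break
                else:
                    clusters.append({s})
            return clusters

        def allocate(fragments, clusters):
            """Place each fragment at the cluster with the highest total access count."""
            return {f: max(clusters, key=lambda c: sum(freq.get(s, 0) for s in c))
                    for f, freq in fragments.items()}

        clusters = cluster_sites(["s1", "s2", "s3"], comm_cost)   # [{'s1', 's2'}, {'s3'}]
        print(allocate(access, clusters))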

    Partout: A Distributed Engine for Efficient RDF Processing

    The increasing interest in Semantic Web technologies has led not only to a rapid growth of semantic data on the Web but also to an increasing number of backend applications, some already holding more than a trillion triples. Confronted with such huge amounts of data and their future growth, existing state-of-the-art systems for storing RDF and processing SPARQL queries are no longer sufficient. In this paper, we introduce Partout, a distributed engine for efficient RDF processing in a cluster of machines. We propose an effective approach for fragmenting RDF data sets based on a query log, allocating the fragments to nodes in a cluster, and finding the optimal configuration. Partout can efficiently handle updates, and its query optimizer produces efficient query execution plans for ad-hoc SPARQL queries. Our experiments show the superiority of our approach over state-of-the-art approaches for partitioning and distributed SPARQL query processing.
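
    The fragmentation idea can be caricatured in a few lines of Python: triples whose predicates co-occur in logged queries end up in the same fragment. Partout's actual method (horizontal fragmentation derived from query-log triple patterns, plus cost-based allocation and configuration search) is considerably richer; the triples, the predicate groups, and the grouping rule below are illustrative only.

        # Query-log-driven fragmentation, heavily simplified.
        triples = [("alice", "worksAt", "acme"),
                   ("acme", "locatedIn", "berlin"),
                   ("alice", "knows", "bob")]
        # Predicates that co-occur in queries from a hypothetical SPARQL query log.
        log_predicate_groups = [{"worksAt", "locatedIn"}, {"knows"}]

        def fragment_by_log(triples, predicate_groups):
            fragments = [[] for _ in predicate_groups]
            rest = []                               # triples no logged query touches
            for t in triples:
                for i, group in enumerate(predicate_groups):
                    if t[1] in group:
                        fragments[i].append(t)
                        break
                else:
                    rest.append(t)
            return fragments, rest

        frags, rest = fragment_by_log(triples, log_predicate_groups)
        print(frags[0])   # worksAt/locatedIn triples land in the same fragment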

    Privacy-preserving Health Data Sharing for Medical Cyber-Physical Systems

    The recent spate of cyber security attacks has compromised end users' data safety and privacy in Medical Cyber-Physical Systems (MCPS). Traditional standard encryption algorithms for data protection are designed from the viewpoint of the system architecture rather than that of the end user. Because such encryption algorithms transfer the protection of the data to the protection of the keys, data safety and privacy are compromised once the key is exposed. In this paper, we propose a secure data storage and sharing method consisting of a selective encryption algorithm combined with fragmentation and dispersion, which protects data safety and privacy even when both the transmission media (e.g. cloud servers) and the keys are compromised. The method is based on a user-centric design that protects the data on a trusted device, such as the end user's smartphone, and lets the end user control access for data sharing. We also evaluate the performance of the algorithm on a smartphone platform to demonstrate its efficiency.
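
    A rough sketch of the user-centric idea, assuming a split rule and a toy cipher that are not taken from the paper: the trusted device keeps a small fragment locally and only encrypted remainders leave it, so a compromised server plus a leaked key still lacks the device-held piece.

        # User-centric storage sketch; the split ratio and XOR 'cipher' are placeholders.
        import secrets

        def split_keep_local(record: bytes, local_ratio: float = 0.2):
            cut = max(1, int(len(record) * local_ratio))
            return record[:cut], record[cut:]          # (kept on device, sent out)

        def toy_encrypt(data: bytes, key: bytes) -> bytes:
            return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

        device_fragment, remote = split_keep_local(b"heart-rate: 72 bpm; bp: 120/80")
        key = secrets.token_bytes(16)
        cloud_a = toy_encrypt(remote[: len(remote) // 2], key)
        cloud_b = toy_encrypt(remote[len(remote) // 2:], key)
        # Sharing = the device releasing device_fragment and the key to an authorised party.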

    Fragment Allocation Configuration in Distributed Database Systems

    In distributed database (DDB) management systems, fragment allocation is one of the most important components that directly affect the performance of the DDB. In this work, we show that declarative programming languages, e.g. logic programming languages, can be used to represent different data fragment allocation techniques. Results indicate that using a declarative programming language significantly simplifies the representation of a fragment allocation algorithm, thus opening the door for further developments and optimizations. The case study under consideration also shows that our approach can be extended to other areas of distributed systems.
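
    The paper's point is that a logic-programming formulation stays close to the allocation rules themselves; the Python sketch below only mimics that style by keeping facts and a single allocation rule as data, with all facts invented for the example.

        # Facts-plus-rule sketch (an imitation of the declarative style, not the paper's code).
        facts = {"site": ["s1", "s2"],
                 "fragment": ["f1", "f2"],
                 "accesses": [("s1", "f1", 40), ("s2", "f1", 5), ("s2", "f2", 30)]}

        def allocate(facts):
            """Rule: allocate(F, S) if S has the highest access frequency for F."""
            best = {}
            for site, frag, freq in facts["accesses"]:
                if freq > best.get(frag, (None, -1))[1]:
                    best[frag] = (site, freq)
            return {frag: site for frag, (site, _) in best.items()}

        print(allocate(facts))   # {'f1': 's1', 'f2': 's2'}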

    The design and implementation of an infrastructure for multimedia digital libraries

    We develop an infrastructure for managing, indexing and serving multimedia content in digital libraries. This infrastructure follows the model of the Web, and is thereby distributed in nature. We discuss the design of the Librarian, the component that manages metadata about the content. The management of metadata has been separated from the media servers that manage the content itself, and the extraction of the metadata is largely independent of the Librarian. We introduce our extensible data model and the daemon paradigm that are the core pieces of this architecture. We evaluate our initial implementation using a relational database, and conclude with a discussion of the lessons we learned in building this system, together with proposals for improving its flexibility, reliability, and performance.
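
    The separation of concerns described above can be sketched in a few lines of Python; the class and function names here are hypothetical and only illustrate the pattern of independent extraction daemons feeding metadata to the Librarian while the media servers keep the content.

        # Librarian manages metadata only; a daemon extracts and reports one feature.
        class Librarian:
            def __init__(self):
                self.catalogue = {}

            def register(self, media_url, metadata):
                self.catalogue.setdefault(media_url, {}).update(metadata)

        def colour_histogram_daemon(media_url, librarian):
            metadata = {"dominant_colour": "blue"}   # stand-in for real feature extraction
            librarian.register(media_url, metadata)

        librarian = Librarian()
        colour_histogram_daemon("http://mediaserver/clip42.mpg", librarian)
        print(librarian.catalogue)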

    Scalable Reliable SD Erlang Design

    This technical report presents the design of Scalable Distributed (SD) Erlang: a set of language-level changes that aims to enable Distributed Erlang to scale for server applications on commodity hardware with up to 100,000 cores. We cover a number of aspects, specifically the anticipated architecture, anticipated failures, scalable data structures, and scalable computation. Two other components that guided the design of SD Erlang are design principles and typical Erlang applications. The design principles summarise the types of modifications we aim to make to achieve Erlang scalability. The Erlang exemplars help us to identify the main Erlang scalability issues and to hypothetically validate the SD Erlang design.

    LSM-based Storage Techniques: A Survey

    Recently, the Log-Structured Merge-tree (LSM-tree) has been widely adopted for use in the storage layer of modern NoSQL systems. Because of this, there have been a large number of research efforts, from both the database community and the operating systems community, that try to improve various aspects of LSM-trees. In this paper, we provide a survey of recent research efforts on LSM-trees so that readers can learn the state of the art in LSM-based storage techniques. We provide a general taxonomy to classify the literature on LSM-trees, survey the efforts in detail, and discuss their strengths and trade-offs. We further survey several representative LSM-based open-source NoSQL systems and discuss some potential future research directions resulting from the survey. (This is a pre-print of an article published in the VLDB Journal; the final authenticated version is available online at https://doi.org/10.1007/s00778-019-00555-)
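
    For readers unfamiliar with the structure being surveyed, a toy LSM-style store is sketched below in Python: writes land in an in-memory memtable that is flushed to immutable sorted runs, and reads check the newest components first. Compaction and merge policies, which much of the surveyed work targets, are deliberately omitted, and the class is not taken from any particular system.

        # Minimal LSM-style key-value store (memtable + immutable sorted runs).
        class TinyLSM:
            def __init__(self, memtable_limit=4):
                self.memtable = {}
                self.runs = []                        # sorted runs, newest first
                self.memtable_limit = memtable_limit

            def put(self, key, value):
                self.memtable[key] = value
                if len(self.memtable) >= self.memtable_limit:
                    self.runs.insert(0, sorted(self.memtable.items()))
                    self.memtable = {}

            def get(self, key):
                if key in self.memtable:
                    return self.memtable[key]
                for run in self.runs:                 # newest component wins
                    for k, v in run:
                        if k == key:
                            return v
                return None

        db = TinyLSM()
        for i in range(10):
            db.put(f"k{i}", i)
        print(db.get("k3"))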

    TritanDB: Time-series Rapid Internet of Things Analytics

    The efficient management of data is an important prerequisite for realising the potential of the Internet of Things (IoT). Given the large volume of structured time-series IoT data, two issues arise: addressing the difficulties of data integration between heterogeneous Things, and improving ingestion and query performance across databases both on resource-constrained Things and in the cloud. In this paper, we examine the structure of public IoT data and discover that the majority exhibit unique flat, wide and numerical characteristics with a mix of evenly and unevenly spaced time-series. We investigate advances in time-series databases for telemetry data and combine these findings with microbenchmarks to determine the best compression techniques and storage data structures, which inform the design of a novel solution optimised for IoT data. A query translation method with low overhead, even on resource-constrained Things, allows us to utilise rich data models like the Resource Description Framework (RDF) for interoperability and data integration on top of the optimised storage. Our solution, TritanDB, shows an order-of-magnitude performance improvement over many state-of-the-art databases, on both Thing and cloud hardware, within IoT scenarios. Finally, we describe how TritanDB supports various analyses of IoT time-series data, such as forecasting.
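
    One compression technique commonly benchmarked for telemetry data is delta-of-delta timestamp encoding, under which evenly spaced series collapse to runs of zeros; the Python sketch below shows the general idea only and is not TritanDB's actual storage format.

        # Delta-of-delta encoding of timestamps (regular intervals compress to zeros).
        def encode(timestamps):
            deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
            dods = [b - a for a, b in zip(deltas, deltas[1:])]
            return timestamps[0], deltas[0], dods

        def decode(first, first_delta, dods):
            ts, delta = [first, first + first_delta], first_delta
            for dod in dods:
                delta += dod
                ts.append(ts[-1] + delta)
            return ts

        evenly_spaced = [1000, 1010, 1020, 1030, 1040]
        first, fd, dods = encode(evenly_spaced)
        print(dods)                                  # [0, 0, 0]
        assert decode(first, fd, dods) == evenly_spaced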