10 research outputs found

    An efficient confidentiality-preserving Proof of Ownership for deduplication

    Data storage in the cloud is becoming widespread. Deduplication is a key mechanism to decrease the operating costs cloud providers face, due to the reduction of replicated data storage. Nonetheless, deduplication must deal with several security threats such as honest-but-curious servers or malicious users who may try to take ownership of files they are not entitled to. Unfortunately, state-of-the-art solutions present weaknesses such as not coping with honest-but-curious servers, deployment problems, or lacking a sound security analysis. In this paper we present a novel Proof of Ownership scheme that uses convergent encryption and requires neither trusted third parties nor complex key management. The experimental evaluation highlights the efficiency and feasibility of our proposal that is proven to be secure under the random oracle model in the bounded leakage setting. (C) 2015 Elsevier Ltd. All rights reserved
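The convergent encryption mentioned in this abstract derives the encryption key from the file content itself, so identical files encrypt to identical ciphertexts and the server can deduplicate them without seeing the plaintext. The following is a minimal sketch of that idea only; it does not implement the paper's Proof of Ownership protocol. All function names are hypothetical, and the XOR keystream is a toy stand-in for a real cipher, not something to use in production.

```python
import hashlib
from typing import Dict, Tuple


def convergent_key(data: bytes) -> bytes:
    # The key is a hash of the content, so equal files yield equal keys.
    return hashlib.sha256(data).digest()


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    # Toy deterministic stream cipher (assumption, for illustration only):
    # XOR the data with a SHAKE-256 keystream derived from the key.
    # Applying it twice with the same key restores the plaintext.
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def dedup_tag(ciphertext: bytes) -> str:
    # The server indexes stored blobs by a hash of the ciphertext.
    return hashlib.sha256(ciphertext).hexdigest()


def upload(store: Dict[str, bytes], data: bytes) -> Tuple[str, bytes]:
    # Client side: derive the key, encrypt, and compute the tag.
    key = convergent_key(data)
    ciphertext = toy_encrypt(data, key)
    tag = dedup_tag(ciphertext)
    # Server side: keep only one copy per tag.
    store.setdefault(tag, ciphertext)
    return tag, key
```

Two users uploading the same file obtain the same tag, so the store holds a single ciphertext, while the server never learns the key.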

    Storage systems for mobile-cloud applications

    Mobile devices have become the major computing platform in today's world. However, some apps on mobile devices still suffer from insufficient computing and energy resources. A key solution is to offload resource-demanding computing tasks from mobile devices to the cloud. This leads to a scenario where computing tasks in the same application run concurrently on both the mobile device and the cloud. This dissertation aims to ensure that the tasks in a mobile app that employs offloading can access and share files concurrently on the mobile device and the cloud in a manner that is efficient, consistent, and location-transparent. Existing distributed file systems and network file systems do not satisfy these requirements. Furthermore, current offloading platforms either do not support efficient file access for offloaded tasks or do not offload tasks with file accesses. The first part of the dissertation addresses this issue by designing and implementing an application-level file system named Overlay File System (OFS). OFS assumes a cloud surrogate is paired with each mobile device for task and storage offloading. To achieve high efficiency, OFS maintains and buffers local copies of data sets on both the surrogate and the mobile device. OFS ensures consistency and guarantees that all the reads get the latest data. To effectively reduce the network traffic and the execution delay, OFS uses a delayed-update mechanism, which combines write-invalidate and write-update policies. To guarantee location transparency, OFS creates a unified view of file data. The research tests OFS on Android OS with a real mobile application and real mobile user traces. Extensive experiments show that OFS can effectively support consistent file accesses from computation tasks, no matter where they run. In addition, OFS can effectively reduce both file access latency and network traffic incurred by file accesses.
While OFS allows offloaded tasks to access the required files in a consistent and transparent manner, file accesses by offloaded tasks can be further improved. Instead of retrieving the required files from its associated mobile device, a surrogate can discover and retrieve identical or similar file(s) from the surrogates belonging to other users to meet its needs. This is based on two observations: 1) multiple users have the same or similar files, e.g., shared files or images/videos of same object; 2) the need for a certain file content in mobile apps can usually be described by context features of the content, e.g., location, objects in an image, etc.; thus, any file with the required context features can be used to satisfy the need. Since files may be retrieved from surrogates, this solution improves latency and saves wireless bandwidth and power on mobile devices. The second part of the dissertation proposes and develops a Context-Aware File Discovery Service (CAFDS) that implements the idea described above. CAFDS uses a self-organizing map and k-means clustering to classify files into file groups based on file contexts. It then uses an enhanced decision tree to locate and retrieve files based on the file contexts defined by apps. To support diverse file discovery demands from various mobile apps, CAFDS allows apps to add new file contexts and to update existing file contexts dynamically, without affecting the discovery process. To evaluate the effectiveness of CAFDS, the research has implemented a prototype on Android and Linux. The performance of CAFDS was tested against Chord, a DHT based lookup scheme, and SPOON, a P2P file sharing system. The experiments show that CAFDS provides lower end-to-end latency for file search than Chord and SPOON, while providing similar scalability to Chord

    On the design of efficient caching systems

    Content distribution is currently the prevalent Internet use case, accounting for the majority of global Internet traffic and growing exponentially. There is general consensus that the most effective method to deal with the large amount of content demand is through the deployment of massively distributed caching infrastructures as the means to localise content delivery traffic. Solutions based on caching have been already widely deployed through Content Delivery Networks. Ubiquitous caching is also a fundamental aspect of the emerging Information-Centric Networking paradigm which aims to rethink the current Internet architecture for long term evolution. Distributed content caching systems are expected to grow substantially in the future, in terms of both footprint and traffic carried and, as such, will become substantially more complex and costly. This thesis addresses the problem of designing scalable and cost-effective distributed caching systems that will be able to efficiently support the expected massive growth of content traffic and makes three distinct contributions. First, it produces an extensive theoretical characterisation of sharding, which is a widely used technique to allocate data items to resources of a distributed system according to a hash function. Based on the findings unveiled by this analysis, two systems are designed contributing to the abovementioned objective. The first is a framework and related algorithms for enabling efficient load-balanced content caching. This solution provides qualitative advantages over previously proposed solutions, such as ease of modelling and availability of knobs to fine-tune performance, as well as quantitative advantages, such as 2x increase in cache hit ratio and 19-33% reduction in load imbalance while maintaining comparable latency to other approaches. The second is the design and implementation of a caching node enabling 20 Gbps speeds based on inexpensive commodity hardware. 
We believe these contributions advance significantly the state of the art in distributed caching systems
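
The sharding technique characterised in this thesis allocates data items to the resources of a distributed system according to a hash function. A minimal sketch of that allocation follows; the function names, the choice of SHA-256, and the modulo mapping are assumptions made for illustration and are not taken from the thesis.

```python
import hashlib
from collections import Counter
from typing import Iterable, List


def shard_for(item_id: str, num_shards: int) -> int:
    # Hash the item identifier and map the digest onto a shard index.
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def load_per_shard(item_ids: Iterable[str], num_shards: int) -> List[int]:
    # Count how many items land on each shard to inspect load imbalance.
    counts = Counter(shard_for(i, num_shards) for i in item_ids)
    return [counts.get(s, 0) for s in range(num_shards)]
```

For example, `load_per_shard((f"item-{i}" for i in range(1000)), 4)` returns four counts that sum to 1000; how evenly they are spread is the kind of load-balancing question the abstract refers to.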

    A tunable proof of ownership scheme for deduplication using Bloom filters

    No full text

    Technologies and Applications for Big Data Value

    This open access book explores cutting-edge solutions and best practices for big data and data-driven AI applications for the data-driven economy. It provides the reader with a basis for understanding how technical issues can be overcome to offer real-world solutions to major industrial areas. The book starts with an introductory chapter that provides an overview of the book by positioning the following chapters in terms of their contributions to technology frameworks which are key elements of the Big Data Value Public-Private Partnership and the upcoming Partnership on AI, Data and Robotics. The remainder of the book is then arranged in two parts. The first part “Technologies and Methods” contains horizontal contributions of technologies and methods that enable data value chains to be applied in any sector. The second part “Processes and Applications” details experience reports and lessons from using big data and data-driven approaches in processes and applications. Its chapters are co-authored with industry experts and cover domains including health, law, finance, retail, manufacturing, mobility, and smart cities. Contributions emanate from the Big Data Value Public-Private Partnership and the Big Data Value Association, which have acted as the European data community's nucleus to bring together businesses with leading researchers to harness the value of data to benefit society, business, science, and industry. The book is of interest to two primary audiences, first, undergraduate and postgraduate students and researchers in various fields, including big data, data science, data engineering, and machine learning and AI. Second, practitioners and industry experts engaged in data-driven systems, software design and deployment projects who are interested in employing these advanced methods to address real-world problems

    Study on open science: The general state of the play in Open Science principles and practices at European life sciences institutes

    Open science is currently a prominent topic at all levels and is one of the priorities of the European Research Area. Components commonly associated with open science are open access, open data, open methodology, open source, open peer review, open science policies and citizen science. Open science may have great potential to connect and influence the practices of researchers, funding institutions and the public. In this paper, we evaluate the level of openness based on public surveys at four European life sciences institutes

    Préserver la vie privée des individus grâce aux Systèmes Personnels de Gestion des Données (Preserving individuals' privacy through Personal Data Management Systems)

    Riding the wave of smart disclosure initiatives and new privacy-protection regulations, the Personal Cloud paradigm is emerging through a myriad of solutions offered to users to let them gather and manage their whole digital life. On the bright side, this opens the way to novel value-added services when crossing multiple sources of data of a given person or crossing the data of multiple people. Yet this paradigm shift towards user empowerment raises fundamental questions with regard to the appropriateness of the functionalities and the data management and protection techniques which are offered by existing solutions to lay users. Our work addresses these questions on three levels. First, we review, compare and analyze personal cloud alternatives in terms of the functionalities they provide and the threat models they target. From this analysis, we derive a general set of functionality and security requirements that any Personal Data Management System (PDMS) should consider. We then identify the challenges of implementing such a PDMS and propose a preliminary design for an extensive and secure PDMS reference architecture satisfying the considered requirements. Second, we focus on personal computations for a specific hardware PDMS instance (i.e., secure token with mass storage of NAND Flash). In this context, we propose a scalable embedded full-text search engine to index large document collections and manage tag-based access control policies. Third, we address the problem of collective computations in a fully-distributed architecture of PDMSs.
We discuss the system and security requirements and propose protocols to enable distributed query processing with strong security guarantees against an attacker controlling many colluding corrupted nodes.