
32 research outputs found

Accelerating content-defined-chunking based data deduplication by exploiting parallelism

Author: Chang Victor
Feng Dan
Jiang Hong
Xia Wen
Zhang Yucheng
Zou Xiangyu
Publication venue: 'Elsevier BV'
Publication date: 29/03/2019

Teesside University's Research Repository

A Survey on Data Deduplication

Author: Shubhanshi Singhal, Naresh Kumar
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/05/2017

Demand for data storage capacity is growing rapidly, and the computing community is increasingly turning to cloud storage to meet it. Data security and cost are major challenges in cloud storage. A duplicate file not only wastes storage but also increases access time, so detecting and removing duplicate data is an essential task. Data deduplication, an efficient approach to data reduction, has gained increasing attention and popularity in large-scale storage systems. It eliminates redundant data at the file or subfile level and identifies duplicate content by its cryptographically secure hash signature. The task is tricky because duplicate files neither share a common key nor contain errors. There are several approaches to identifying and removing redundant data at the file and chunk levels. This paper covers the background and key features of data deduplication, then summarizes and classifies the deduplication process according to its key workflow.
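The hash-based duplicate detection this abstract describes can be illustrated with a minimal sketch. It is not taken from the surveyed paper: the fixed 8-byte chunk size, the choice of SHA-256, and the names `deduplicate`, `restore`, `store`, and `recipe` are all assumptions made for illustration.

```python
import hashlib


def deduplicate(data: bytes, chunk_size: int = 8):
    """Split data into fixed-size chunks and keep each unique chunk once.

    Hypothetical helper: chunk_size and SHA-256 are illustrative choices.
    """
    store = {}   # SHA-256 hex digest -> chunk bytes (each unique chunk once)
    recipe = []  # ordered digests needed to rebuild the original data
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # a repeated chunk is not stored again
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the chunk store and the recipe."""
    return b"".join(store[digest] for digest in recipe)


data = b"ABCDEFGH" * 3 + b"12345678"
store, recipe = deduplicate(data)
assert restore(store, recipe) == data
print(len(recipe), "chunks referenced,", len(store), "stored")
# prints "4 chunks referenced, 2 stored"
```

File-level deduplication is the same idea with the whole file treated as a single chunk. Content-defined chunking, mentioned in the first result above, would instead choose chunk boundaries from the data itself rather than at fixed offsets.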

International Journal on Recent and Innovation Trends in Computing and Communication

Intrusion Detection and Prevention in Cloud, Fog, and Internet of Things

Author: Li Shancang
Puthal Deepak
Qi Lianyong
Yuan Yuan
Zhang Xuyun
Zhou Zhili
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2019

UWE Bristol Research Repository

Block-level De-duplication with Encrypted Data

Author: Melek Önen
Pasquale Puzio
Refik Molva
Sergio Loureiro
Publication venue: RonPub
Publication date: 01/01/2014

Deduplication is a storage-saving technique that has been adopted by many cloud storage providers such as Dropbox. The simple principle of deduplication is that duplicate data uploaded by different users are stored only once. Unfortunately, deduplication is not compatible with encryption. As a scheme that allows deduplication of encrypted data segments, we propose ClouDedup, a secure and efficient storage service which guarantees block-level deduplication and data confidentiality at the same time. ClouDedup strengthens convergent encryption by employing a component that implements an additional encryption operation and an access control mechanism. We also propose to introduce an additional component which is in charge of providing a key management system for data blocks together with the actual deduplication operation. We show that the overhead introduced by these new components is minimal and does not impact the overall storage and computational costs.
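The convergent encryption that ClouDedup builds on can be sketched as follows. This shows only the basic idea (the key is derived from the block's own content, so identical blocks encrypt to identical ciphertexts and can still be deduplicated), not ClouDedup's additional encryption layer, access control, or key management. The function names are hypothetical, and the SHA-256-based keystream is a toy stand-in for a real cipher; it is an assumption for illustration and must not be used to protect real data.

```python
import hashlib


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key || counter. Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def convergent_encrypt(block: bytes):
    """Derive the key from the block itself, then encrypt deterministically."""
    key = hashlib.sha256(block).digest()
    stream = _keystream(key, len(block))
    ciphertext = bytes(a ^ b for a, b in zip(block, stream))
    return key, ciphertext


def convergent_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR with the same keystream recovers the plaintext block."""
    stream = _keystream(key, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))


block = b"the same block uploaded by two users"
key_a, ct_a = convergent_encrypt(block)  # first user
key_b, ct_b = convergent_encrypt(block)  # second user, same content
assert ct_a == ct_b                      # the server can deduplicate
assert convergent_decrypt(key_a, ct_a) == block
```

Because the ciphertext depends only on the content, anyone who can guess a block can confirm that it is stored, which is the kind of weakness an additional encryption layer such as the one the abstract mentions would aim to address.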

RonPub -- Research Online Publishing

Efficient, Dependable Storage of Human Genome Sequencing Data

Author: Cogo Vinicius
Publication date: 01/05/2020

The understanding of the human genome impacts several areas of human life. Data from human genomes is massive because there are millions of samples waiting to be sequenced, and each sequenced human genome may occupy hundreds of gigabytes of storage. Human genomes are critical because they are extremely valuable to research and may provide hints on individuals' health status, identify their donors, or reveal information about donors' relatives. Their size and criticality, plus the amount of data being produced by medical and life-sciences institutions, require systems to scale while being secure, dependable, auditable, and affordable. Current storage infrastructures are too expensive to ignore cost efficiency in storing human genomes, and they lack the proper knowledge and mechanisms to protect the privacy of sample donors. This thesis proposes an efficient storage system for human genomes that medical and life-sciences institutions may trust and afford. It enhances traditional storage ecosystems with privacy-aware, data-reduction, and auditability techniques to enable the efficient, dependable use of multi-tenant infrastructures to store human genomes. Contributions from this thesis include (1) a study on the privacy-sensitivity of human genomes; (2) a method to systematically detect the privacy-sensitive portions of genomes; (3) specialised data reduction algorithms for sequencing data; (4) an independent auditability scheme for secure dispersed storage; and (5) a complete storage pipeline that obtains reasonable privacy protection, security, and dependability guarantees at modest costs (e.g., less than $1/Genome/Year) by integrating the proposed mechanisms with appropriate storage configurations.

Universidade de Lisboa: Repositório.UL

A New Secure Protected De-duplication Structure With Upgraded Reliability

Author: Madhuri S.
Narasamba V.G.L
Sarma K.Yaswanth
Publication venue: Kakinada Institute of Engineering and Technology for Women
Publication date: 08/05/2017

This paper makes the first attempt to formalize the notion of a distributed, reliable deduplication system. We propose new distributed deduplication systems with higher reliability, in which the data chunks are distributed across multiple cloud servers. The security requirements of data confidentiality and tag consistency are also achieved by introducing a deterministic secret sharing scheme in distributed storage systems, instead of using convergent encryption as in previous deduplication systems. Security analysis shows that our deduplication systems are secure in terms of the definitions specified in the proposed security model. As a proof of concept, we implement the proposed systems and show that the incurred overhead is very limited in realistic environments.

International Journal of Science Engineering and Advance Technology (IJSEAT)