10 research outputs found

    A survey and classification of storage deduplication systems

    The automatic elimination of duplicate data in a storage system, commonly known as deduplication, is increasingly accepted as an effective technique to reduce storage costs. It has therefore been applied to different storage types, including archives and backups, primary storage, solid-state disks, and even random access memory. Although the general approach to deduplication is shared by all storage types, each poses specific challenges and leads to different trade-offs and solutions. This diversity is often misunderstood, causing the relevance of new research and development to be underestimated. The first contribution of this paper is a classification of deduplication systems according to six criteria that correspond to key design decisions: granularity, locality, timing, indexing, technique, and scope. This classification identifies and describes the different approaches used for each of them. As a second contribution, we describe which combinations of these design decisions have been proposed and found most useful for the challenges of each storage type. Finally, outstanding research challenges and unexplored design points are identified and discussed. This work is funded by the European Regional Development Fund (ERDF) through the COMPETE Programme (operational programme for competitiveness) and by National Funds through the Fundação para a Ciência e a Tecnologia (FCT; Portuguese Foundation for Science and Technology) within project RED FCOMP-01-0124-FEDER-010156, and by an FCT PhD scholarship, SFRH-BD-71372-2010.
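    The design decisions the survey classifies are easiest to see in a concrete form. The sketch below (illustrative Python; all names are our own, not the paper's) implements one simple point in the design space — fixed-size chunk granularity, a full in-memory fingerprint index, inline timing, and exact-duplicate elimination:

```python
import hashlib

def dedup_store(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks, fingerprint each with
    SHA-256, and keep only one stored copy of each distinct chunk."""
    index = {}   # fingerprint -> stored chunk (the dedup index)
    recipe = []  # ordered fingerprints needed to rebuild the data
    for off in range(0, len(data), chunk_size):
        chunk = data[off:off + chunk_size]
        fp = hashlib.sha256(chunk).hexdigest()
        index.setdefault(fp, chunk)  # store the chunk only if unseen
        recipe.append(fp)
    return index, recipe

def restore(index, recipe):
    """Reassemble the original data from the index and the recipe."""
    return b"".join(index[fp] for fp in recipe)

data = b"A" * 8192 + b"B" * 4096 + b"A" * 8192  # repetitive content
index, recipe = dedup_store(data)
assert restore(index, recipe) == data
print(len(recipe), len(index))  # 5 logical chunks, 2 stored
```

    Varying any one axis — content-defined instead of fixed chunking (granularity), a partial or disk-resident index (indexing), or deferring elimination to a background pass (timing) — yields the different systems the classification covers.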

    Evaluating the usefulness of content addressable storage for high-performance data intensive applications

    Content Addressable Storage (CAS) is a data representation technique that operates by partitioning a given data-set into non-intersecting units called chunks and then employing techniques to efficiently recognize chunks occurring multiple times. This allows CAS to eliminate duplicate instances of such chunks, resulting in reduced storage space compared to conventional representations of data. CAS is an attractive technique for reducing the storage and network bandwidth needs of performance-sensitive, data-intensive applications in a variety of domains. These include enterprise applications, Web-based e-commerce or entertainment services, and highly parallel scientific/engineering applications and simulations, to name a few. In this paper, we conduct an empirical evaluation of the benefits offered by CAS to a variety of real-world data-intensive applications. The savings offered by CAS depend crucially on (i) the nature of the data-set itself and (ii) the chunk size that CAS employs. We investigate the impact of both these factors on disk space savings, savings in network bandwidth, and error resilience of data. We find that a chunk size of 1 KB can provide up to 84% savings in disk space and even higher savings in network bandwidth, while trading off error resilience and incurring 14% CAS-related overheads. Drawing upon lessons learned from our study, we provide insights on (i) the choice of chunk size for effective space savings and (ii) the use of selective data replication to counter the loss of error resilience caused by CAS.
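    The interplay between chunk size, space savings, and per-chunk overhead can be sketched as follows (illustrative Python; the synthetic dataset and the 32-byte-per-chunk fingerprint cost are our own assumptions, not the paper's workloads):

```python
import hashlib

def cas_size(data: bytes, chunk_size: int, hash_len: int = 32) -> int:
    """Deduplicated size: one stored copy of each distinct chunk,
    plus a hash_len-byte fingerprint of metadata per logical chunk."""
    unique = {}  # fingerprint -> size of that distinct chunk
    n_chunks = 0
    for off in range(0, len(data), chunk_size):
        chunk = data[off:off + chunk_size]
        unique[hashlib.sha256(chunk).digest()] = len(chunk)
        n_chunks += 1
    return sum(unique.values()) + n_chunks * hash_len

# Synthetic dataset: a 1 KB unit repeated 64 times plus 4 KB of
# unique bytes, so smaller chunks expose more of the repetition.
unit = bytes(range(256)) * 4               # 1 KB
data = unit * 64 + b"\xff" * 4096          # 64 KB repeated + 4 KB unique

for cs in (1024, 4096, 16384):
    deduped = cas_size(data, cs)
    print(cs, len(data), deduped, f"{1 - deduped / len(data):.0%}")
```

    On this toy input the 1 KB chunks yield the largest savings at the cost of the most metadata entries, mirroring the chunk-size trade-off the paper evaluates (where error resilience, not modeled here, also degrades as more data depends on each shared chunk).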

    Design Tradeoffs in Applying Content Addressable Storage

    This paper analyzes the usage data from a live deployment of an enterprise client management system based on virtual machine (VM) technology. Over a period of seven months, twenty-three volunteers used VM-based computing environments hosted by the system and created over 800 checkpoints of VM state, where each checkpoint included the virtual memory and disk states. Using this data, we study the design tradeoffs in applying content addressable storage (CAS) to such VM-based systems. In particular, we explore the impact on storage requirements and network load of different privacy properties and data granularities in the design of the underlying CAS system. The study clearly demonstrates that relaxing privacy can reduce the resource requirements of the system, and identifies designs that provide reasonable compromises between privacy and resource demands.
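    The privacy/resource trade-off the study measures can be illustrated in miniature: under strict privacy each user deduplicates only against their own chunks, while a relaxed design shares one fingerprint index across all users (illustrative Python; the users and data below are hypothetical, not the deployment's):

```python
import hashlib

def chunks(data: bytes, size: int = 1024):
    """Fixed-size chunking, as in a simple CAS layer."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def stored_bytes(users, shared: bool, size: int = 1024) -> int:
    """Total bytes kept when each user's chunks are deduplicated
    against a private index (strict privacy) or one shared index."""
    if shared:
        indexes = [set()] * len(users)       # one index for everyone
    else:
        indexes = [set() for _ in users]     # one index per user
    total = 0
    for index, data in zip(indexes, users):
        for c in chunks(data, size):
            fp = hashlib.sha256(c).digest()
            if fp not in index:
                index.add(fp)
                total += len(c)
    return total

# Two users whose VM checkpoints share a common base image.
base = bytes(1024) * 16                      # 16 KB identical base state
u1 = base + b"a" * 2048                      # user-specific tail
u2 = base + b"b" * 2048
print(stored_bytes([u1, u2], shared=False))  # private indexes: base stored twice
print(stored_bytes([u1, u2], shared=True))   # shared index: base stored once
```

    Sharing the index stores the common base image once instead of per user, which is the resource saving the paper attributes to relaxed privacy; the cost, not modeled here, is that chunk fingerprints reveal when two users hold identical data.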


    To Carry or To Find? Footloose on the Internet with a zero-pound laptop

    Internet Suspend/Resume (ISR) is a new model of personal computing that cuts the tight binding between personal computing state and personal computing hardware. ISR is implemented by layering a virtual machine (VM) on distributed storage. The VM encapsulates execution and user customization state; distributed storage transports that state across space and time. In this paper, we explore the implications of ISR for an infrastructure-based approach to mobile computing. We report on our experience with three versions of ISR.

    COVID-19 infected ST-Elevation myocardial infarction in India (COSTA INDIA)

    Objective: To determine differences in the presentation, management, and outcomes of COVID-19-infected STEMI patients compared with age- and sex-matched non-infected STEMI patients treated during the same period. Methods: This was a retrospective multicentre observational registry collecting data on COVID-19-positive STEMI patients from selected tertiary-care hospitals across India. For every COVID-19-positive STEMI patient, two age- and sex-matched COVID-19-negative STEMI patients were enrolled as controls. The primary endpoint was a composite of in-hospital mortality, re-infarction, heart failure, and stroke. Results: 410 COVID-19-positive STEMI cases were compared with 799 COVID-19-negative STEMI cases. The composite of death/reinfarction/stroke/heart failure was significantly higher among COVID-19-positive STEMI patients than among COVID-19-negative cases (27.1% vs 20.7%, p = 0.01), though the mortality rate did not differ significantly (8.0% vs 5.8%, p = 0.13). A significantly lower proportion of COVID-19-positive STEMI patients received reperfusion treatment and primary PCI (60.7% vs 71.1%, p < 0.001, and 15.4% vs 23.4%, p = 0.001, respectively). The rate of systematic early PCI (pharmaco-invasive treatment) was also significantly lower in the COVID-19-positive group than in the COVID-19-negative group. There was no difference in the prevalence of high thrombus burden (14.5% vs 12.0%, p = 0.55, among COVID-19-positive and -negative patients, respectively). Conclusions: In this large registry of STEMI patients, we did not find a significant excess in in-hospital mortality among COVID-19 co-infected patients compared with non-infected patients, despite lower rates of primary PCI and reperfusion treatment, though the composite of in-hospital mortality, re-infarction, stroke, and heart failure was higher.