6,720 research outputs found

    Cold Storage Data Archives: More Than Just a Bunch of Tapes

    The abundance of available sensor and derived data from large scientific experiments, such as earth observation programs, radio astronomy sky surveys, and high-energy physics, already exceeds the storage hardware fabricated globally per year. To that end, cold storage data archives are the (often overlooked) spearheads of modern big data analytics in scientific, data-intensive application domains. While high-performance data analytics has received much attention from the research community, the growing number of problems in designing and deploying cold storage archives has received very little attention. In this paper, we take the first step towards bridging this gap in knowledge by presenting an analysis of four real-world cold storage archives from three different application domains. In doing so, we highlight (i) workload characteristics that differentiate these archives from traditional, performance-sensitive data analytics, (ii) design trade-offs involved in building cold storage systems for these archives, and (iii) deployment trade-offs with respect to migration to the public cloud. Based on our analysis, we discuss several other important research challenges that need to be addressed by the data management community.

    Task 51 - Cloud-Optimized Format Study

    The cloud infrastructure provides a number of capabilities that can dramatically improve access and use of Earth Observation data. However, in many cases, data may need to be reorganized and/or reformatted in order to make them tractable for cloud-native analysis and access patterns. The purpose of this study is to examine the pros and cons of different formats for storing data on the cloud. The evaluation focuses both on enabling high-performance data access and usage and on meeting the existing scientific data stewardship needs of EOSDIS.
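    The access-pattern trade-off this study examines can be illustrated with a toy sketch (plain Python, no cloud SDK; the file layout and helper names are hypothetical). A cloud-optimized layout lets a client fetch only the byte range it needs, analogous to an HTTP Range request against an object store, instead of downloading and scanning the whole object.

```python
import os
import struct
import tempfile

# Toy "cloud-optimized" layout: fixed-size records of known width, so any
# record can be located by arithmetic and fetched with a single ranged
# read (seek + read) rather than a full-object scan.
RECORD_SIZE = 8  # one little-endian float64 per record

def write_records(path, values):
    with open(path, "wb") as f:
        for v in values:
            f.write(struct.pack("<d", v))

def read_record(path, index):
    # Local stand-in for an HTTP Range request: jump straight to the bytes.
    with open(path, "rb") as f:
        f.seek(index * RECORD_SIZE)
        (value,) = struct.unpack("<d", f.read(RECORD_SIZE))
    return value

path = os.path.join(tempfile.mkdtemp(), "records.bin")
write_records(path, [0.5, 1.5, 2.5, 3.5])
print(read_record(path, 2))  # reads only 8 bytes of the file
```

    Real cloud-optimized formats (e.g. Cloud-Optimized GeoTIFF or Zarr) add internal indexes and chunking so that such ranged reads work for variable-size, multidimensional data.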

    Critical Analysis of Solutions to Hadoop Small File Problem

    The Hadoop big data platform is designed to process large volumes of data. The small file problem is a performance bottleneck in Hadoop processing: files smaller than the Hadoop block size create significant storage overhead at the NameNode and also waste computational resources by spawning many map tasks. Various solutions, such as merging small files and mapping multiple map threads to the same Java virtual machine instance, have been proposed to solve the small file problem in Hadoop. This survey presents a critical analysis of existing work addressing small file problems in Hadoop and its variant platforms such as Spark. The aim is to understand their effectiveness in reducing storage and computational overhead, and to identify open issues for further research.
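    The merging strategy surveyed above can be sketched outside Hadoop. The following is a minimal, hypothetical stand-in for a SequenceFile-style container (plain Python, in-memory only, assuming files small enough to buffer): many small files are packed into one blob plus an index, so one large object replaces thousands of per-file metadata entries.

```python
import io

def merge_small_files(files):
    """Pack many small files into a single blob and record each member's
    (offset, length) in an index. This mimics the small-file merging
    approach used to relieve NameNode metadata pressure."""
    blob = io.BytesIO()
    index = {}  # filename -> (offset, length)
    for name, data in files.items():
        index[name] = (blob.tell(), len(data))
        blob.write(data)
    return blob.getvalue(), index

def read_member(blob, index, name):
    # Retrieve one original file from the merged blob via the index.
    offset, length = index[name]
    return blob[offset:offset + length]

small = {"a.log": b"alpha", "b.log": b"bravo", "c.log": b"charlie"}
blob, index = merge_small_files(small)
print(read_member(blob, index, "b.log"))  # b'bravo'
```

    Hadoop's own containers (SequenceFile, HAR, CombineFileInputFormat) follow the same idea while also handling splitting, compression, and distributed reads.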

    A Study On Secure Data Storage In Public Clouds

    This paper focuses on the study of various existing cloud storage mechanisms and their related security frameworks for realizing efficient cloud storage in a secured environment. The key driver of the growing popularity of cloud computing is the efficient management of stored data in a highly secure way for remote access. Ensuring the integrity and availability of a user’s data stored in the cloud has always been an important aspect of its quality of service. When storing data in the cloud, many security issues keep cropping up, as clients have no direct physical control over their outsourced data. In addition, vulnerability to external threats increases because the cloud provides storage and access services over wide-area networks. This study will help in identifying different performance measures for the secure availability of data in cloud storage mechanisms.

    Electronic Discovery in the Cloud

    Cloud Computing is poised to offer tremendous benefits to clients, including inexpensive access to seemingly limitless resources that are available instantly, anywhere. To prepare for the shift from computing environments dependent on dedicated hardware to Cloud Computing, the Federal Rules of Discovery should be amended to provide relevant guidelines and exceptions for particular types of shared data. Meanwhile, clients should ensure that service contracts with Cloud providers include safeguards against inadvertent discoveries and mechanisms for complying with the Rules. Without these adaptations, clients will be either reluctant or unprepared to adopt Cloud Computing services, and will forgo their benefits.