
    StackInsights: Cognitive Learning for Hybrid Cloud Readiness

    Hybrid cloud is an integrated cloud computing environment utilizing a mix of public cloud, private cloud, and on-premise traditional IT infrastructures. Workload awareness, defined as a detailed, full-range understanding of each individual workload, is essential in implementing the hybrid cloud. While it is critical to perform an accurate analysis to determine which workloads are appropriate for on-premise deployment and which can be migrated to an off-premise cloud, the assessment is mainly performed by rule- or policy-based approaches. In this paper, we introduce StackInsights, a novel cognitive system to automatically analyze and predict the cloud readiness of workloads for an enterprise. Our system harnesses the critical metrics across the entire stack: 1) infrastructure metrics, 2) data relevance metrics, and 3) application taxonomy, to identify workloads that have characteristics of a) low sensitivity with respect to business security, criticality and compliance, and b) low response time requirements and access patterns. Since the capture of the data relevance metrics involves an intrusive and in-depth scanning of the content of storage objects, a machine learning model is applied to perform the business relevance classification by learning from the meta-level metrics harnessed across the stack. In contrast to traditional methods, StackInsights reduces the total time for hybrid cloud readiness assessment by orders of magnitude.

    Cold Storage Data Archives: More Than Just a Bunch of Tapes

    The abundance of available sensor and derived data from large scientific experiments, such as earth observation programs, radio astronomy sky surveys, and high-energy physics, already exceeds the storage hardware globally fabricated per year. To that end, cold storage data archives are the (often overlooked) spearheads of modern big data analytics in scientific, data-intensive application domains. While high-performance data analytics has received much attention from the research community, the growing number of problems in designing and deploying cold storage archives has received very little attention. In this paper, we take the first step towards bridging this gap in knowledge by presenting an analysis of four real-world cold storage archives from three different application domains. In doing so, we highlight (i) workload characteristics that differentiate these archives from traditional, performance-sensitive data analytics, (ii) design trade-offs involved in building cold storage systems for these archives, and (iii) deployment trade-offs with respect to migration to the public cloud. Based on our analysis, we discuss several other important research challenges that need to be addressed by the data management community.

    Survey and Analysis of Production Distributed Computing Infrastructures

    This report has two objectives. First, we describe a set of the production distributed infrastructures currently available, so that the reader has a basic understanding of them. This includes explaining why each infrastructure was created and made available and how it has succeeded and failed. The set is not complete, but we believe it is representative. Second, we describe the infrastructures in terms of their use, which is a combination of how they were designed to be used and how users have found ways to use them. Applications are often designed and created with specific infrastructures in mind, with both an appreciation of the existing capabilities provided by those infrastructures and an anticipation of their future capabilities. Here, the infrastructures we discuss were often designed and created with specific applications in mind, or at least specific types of applications. The reader should understand how the interplay between the infrastructure providers and the users leads to such usages, which we call usage modalities. These usage modalities are really abstractions that exist between the infrastructures and the applications; they influence the infrastructures by representing the applications, and they influence the applications by representing the infrastructures.