92,137 research outputs found
StackInsights: Cognitive Learning for Hybrid Cloud Readiness
Hybrid cloud is an integrated cloud computing environment utilizing a mix of
public cloud, private cloud, and on-premise traditional IT infrastructures.
Workload awareness, defined as a detailed full range understanding of each
individual workload, is essential in implementing the hybrid cloud. While it is
critical to perform an accurate analysis to determine which workloads are
appropriate for on-premise deployment versus which workloads can be migrated to
a cloud off-premise, the assessment is mainly performed by rule or policy based
approaches. In this paper, we introduce StackInsights, a novel cognitive system
to automatically analyze and predict the cloud readiness of workloads for an
enterprise. Our system harnesses the critical metrics across the entire stack:
1) infrastructure metrics, 2) data relevance metrics, and 3) application
taxonomy, to identify workloads that have characteristics of a) low sensitivity
with respect to business security, criticality and compliance, and b) low
response time requirements and access patterns. Since the capture of the data
relevance metrics involves an intrusive and in-depth scanning of the content of
storage objects, a machine learning model is applied to perform the business
relevance classification by learning from the meta level metrics harnessed
across stack. In contrast to traditional methods, StackInsights significantly
reduces the total time for hybrid cloud readiness assessment by orders of
magnitude
Cold Storage Data Archives: More Than Just a Bunch of Tapes
The abundance of available sensor and derived data from large scientific
experiments, such as earth observation programs, radio astronomy sky surveys,
and high-energy physics already exceeds the storage hardware globally
fabricated per year. To that end, cold storage data archives are the---often
overlooked---spearheads of modern big data analytics in scientific,
data-intensive application domains. While high-performance data analytics has
received much attention from the research community, the growing number of
problems in designing and deploying cold storage archives has only received
very little attention.
In this paper, we take the first step towards bridging this gap in knowledge
by presenting an analysis of four real-world cold storage archives from three
different application domains. In doing so, we highlight (i) workload
characteristics that differentiate these archives from traditional,
performance-sensitive data analytics, (ii) design trade-offs involved in
building cold storage systems for these archives, and (iii) deployment
trade-offs with respect to migration to the public cloud. Based on our
analysis, we discuss several other important research challenges that need to
be addressed by the data management community
Survey and Analysis of Production Distributed Computing Infrastructures
This report has two objectives. First, we describe a set of the production
distributed infrastructures currently available, so that the reader has a basic
understanding of them. This includes explaining why each infrastructure was
created and made available and how it has succeeded and failed. The set is not
complete, but we believe it is representative.
Second, we describe the infrastructures in terms of their use, which is a
combination of how they were designed to be used and how users have found ways
to use them. Applications are often designed and created with specific
infrastructures in mind, with both an appreciation of the existing capabilities
provided by those infrastructures and an anticipation of their future
capabilities. Here, the infrastructures we discuss were often designed and
created with specific applications in mind, or at least specific types of
applications. The reader should understand how the interplay between the
infrastructure providers and the users leads to such usages, which we call
usage modalities. These usage modalities are really abstractions that exist
between the infrastructures and the applications; they influence the
infrastructures by representing the applications, and they influence the ap-
plications by representing the infrastructures
- …