A Comparative Taxonomy and Survey of Public Cloud Infrastructure Vendors
An increasing number of technology enterprises are adopting cloud-native
architectures to offer their web-based products, by moving away from
privately-owned data-centers and relying exclusively on cloud service
providers. As a result, the number of cloud vendors has lately increased,
along with the estimated annual revenue they generate. However, in the
process of selecting a
provider's cloud service over the competition, we observe a lack of universal
common ground in terms of terminology, functionality of services and billing
models. This is an important gap especially under the new reality of the
industry where each cloud provider has moved towards its own service taxonomy,
while the number of specialized services has grown exponentially. This work
discusses cloud services offered by four dominant, in terms of their current
market share, cloud vendors. We provide a taxonomy of their services and
sub-services that designates major service families namely computing, storage,
databases, analytics, data pipelines, machine learning, and networking. The aim
of such clustering is to indicate similarities, common design approaches and
functional differences of the offered services. The outcomes are essential both
for individual researchers and for larger enterprises attempting to identify
the set of cloud services that will fully meet their needs without
compromise. While we acknowledge that this is a dynamic industry,
where new services arise constantly, and old ones experience important updates,
this study paints a solid image of the current offerings and gives prominence
to the directions that cloud service providers are following.
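The taxonomy of service families described above can be modeled as a simple mapping from family to per-vendor service names. The sketch below is illustrative only; the vendor and service entries are well-known examples, not drawn from the survey itself.

```python
# Minimal sketch of a cloud service taxonomy: service families map to
# per-vendor service names. Entries are illustrative examples only.
TAXONOMY = {
    "computing": {"AWS": "EC2", "Azure": "Virtual Machines"},
    "storage": {"AWS": "S3", "Azure": "Blob Storage"},
    "machine learning": {"AWS": "SageMaker", "Azure": "Machine Learning"},
}

def vendors_offering(family):
    """Return the vendors that offer a service in the given family."""
    return sorted(TAXONOMY.get(family, {}))

print(vendors_offering("storage"))  # ['AWS', 'Azure']
```

A structure like this makes the paper's clustering goal concrete: similarities and gaps across vendors become lookups over a shared family vocabulary.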
The ISTI Rapid Response on Exploring Cloud Computing 2018
This report describes eighteen projects that explored how commercial cloud
computing services can be utilized for scientific computation at national
laboratories. These demonstrations ranged from deploying proprietary software
in a cloud environment to leveraging established cloud-based analytics
workflows for processing scientific datasets. By and large, the projects were
successful and collectively they suggest that cloud computing can be a valuable
computational resource for scientific computation at national laboratories.
Resource Management and Scheduling for Big Data Applications in Cloud Computing Environments
This chapter presents the software architectures of big data processing
platforms. It provides in-depth knowledge of the resource management
techniques involved in deploying big data processing systems on cloud
environments. It starts from the very basics and gradually introduces the
core components of resource management, which we have divided into multiple
layers. It covers state-of-the-art practices and research in SLA-based
resource management, with a specific focus on job scheduling mechanisms.
(27 pages, 9 figures.)
On-Demand Virtual Research Environments using Microservices
The computational demands for scientific applications are continuously
increasing. The emergence of cloud computing has enabled on-demand resource
allocation. However, relying solely on infrastructure as a service does not
achieve the degree of flexibility required by the scientific community. Here we
present a microservice-oriented methodology, where scientific applications run
in a distributed orchestration platform as software containers, referred to as
on-demand, virtual research environments. The methodology is vendor agnostic
and we provide an open source implementation that supports the major cloud
providers, offering scalable management of scientific pipelines. We demonstrate
applicability and scalability of our methodology in life science applications,
but the methodology is general and can be applied to other scientific domains.
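The idea of a scientific pipeline composed of orchestrated container steps can be sketched as a small dependency graph executed in topological order. The step names below are hypothetical, and the runner is a plain-Python stand-in for a real orchestration platform.

```python
# Sketch: a pipeline of containerized steps with dependencies, resolved
# into an execution order. Step names are hypothetical examples.
from graphlib import TopologicalSorter

pipeline = {
    "align-reads": {"fetch-data"},      # depends on fetch-data
    "call-variants": {"align-reads"},   # depends on align-reads
    "fetch-data": set(),                # no dependencies
}

def run_order(steps):
    """Return a valid execution order honoring step dependencies."""
    return list(TopologicalSorter(steps).static_order())

print(run_order(pipeline))  # ['fetch-data', 'align-reads', 'call-variants']
```

An orchestration platform would launch each step as a container once its predecessors finish; the dependency resolution shown here is the vendor-agnostic core of that idea.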
Analytics for the Internet of Things: A Survey
The Internet of Things (IoT) envisions a world-wide, interconnected network
of smart physical entities. These physical entities generate a large amount of
data in operation and as the IoT gains momentum in terms of deployment, the
combined scale of those data seems destined to continue to grow. Increasingly,
applications for the IoT involve analytics. Data analytics is the process of
deriving knowledge from data, generating value such as actionable insights.
This article reviews work in the IoT and big data analytics from the
perspective of their utility in creating efficient, effective and innovative
applications and services for a wide spectrum of domains. We review the broad
vision for the IoT as it is shaped in various communities, examine the
application of data analytics across IoT domains, provide a categorisation of
analytic approaches and propose a layered taxonomy from IoT data to analytics.
This taxonomy provides us with insights on the appropriateness of analytical
techniques, which in turn shapes a survey of enabling technology and
infrastructure for IoT analytics. Finally, we look at some tradeoffs for
analytics in the IoT that can shape future research.
Data Management in Industry 4.0: State of the Art and Open Challenges
Information and communication technologies are permeating all aspects of
industrial and manufacturing systems, expediting the generation of large
volumes of industrial data. This article surveys the recent literature on data
management as it applies to networked industrial environments and identifies
several open research challenges for the future. As a first step, we extract
important data properties (volume, variety, traffic, criticality) and identify
the corresponding data enabling technologies of diverse fundamental industrial
use cases, based on practical applications. Secondly, we provide a detailed
outline of recent industrial architectural designs with respect to their data
management philosophy (data presence, data coordination, data computation) and
the extent of their distributiveness. Then, we conduct a holistic survey of the
recent literature from which we derive a taxonomy of the latest advances on
industrial data enabling technologies and data centric services, spanning all
the way from the field level deep in the physical deployments, up to the cloud
and applications level. Finally, motivated by the rich conclusions of this
critical analysis, we identify interesting open challenges for future research.
The concepts presented in this article thematically cover the largest part of
the industrial automation pyramid layers. Our approach is multidisciplinary, as
the selected publications were drawn from two fields: the communications,
networking and computation field as well as the industrial, manufacturing and
automation field. The article can help readers to understand in depth how
data management is currently applied in networked industrial environments,
and to select interesting open research opportunities to pursue.
OCCAM: a flexible, multi-purpose and extendable HPC cluster
The Open Computing Cluster for Advanced data Manipulation (OCCAM) is a
multi-purpose flexible HPC cluster designed and operated by a collaboration
between the University of Torino and the Sezione di Torino of the Istituto
Nazionale di Fisica Nucleare. It is aimed at providing a flexible,
reconfigurable and extendable infrastructure to cater to a wide range of
different scientific computing use cases, including ones from solid-state
chemistry, high-energy physics, computer science, big data analytics,
computational biology, genomics and many others. Furthermore, it will serve as
a platform for R&D activities on computational technologies themselves, with
topics ranging from GPU acceleration to Cloud Computing technologies. A
heterogeneous and reconfigurable system like this poses a number of challenges
related to the frequency at which heterogeneous hardware resources might change
their availability and shareability status, which in turn affect methods and
means to allocate, manage, optimize, bill, monitor VMs, containers, virtual
farms, jobs, interactive bare-metal sessions, etc. This work describes some of
the use cases that prompted the design and construction of the HPC cluster, its
architecture and resource provisioning model, along with a first
characterization of its performance by some synthetic benchmark tools and a few
realistic use-case tests. (Accepted for publication in the Proceedings of
CHEP2016, San Francisco, US.)
IBM Deep Learning Service
Deep learning driven by large neural network models is overtaking traditional
machine learning methods for understanding unstructured and perceptual data
domains such as speech, text, and vision. At the same time, the
"as-a-Service"-based business model on the cloud is fundamentally transforming
the information technology industry. These two trends: deep learning, and
"as-a-service" are colliding to give rise to a new business model for cognitive
application delivery: deep learning as a service in the cloud. In this paper,
we will discuss the details of the software architecture behind IBM's deep
learning as a service (DLaaS). DLaaS provides developers the flexibility to use
popular deep learning libraries such as Caffe, Torch and TensorFlow, in the
cloud in a scalable and resilient manner with minimal effort. The platform uses
a distribution and orchestration layer that facilitates learning from a large
amount of data in a reasonable amount of time across compute nodes. A resource
provisioning layer enables flexible job management on heterogeneous resources,
such as graphics processing units (GPUs) and central processing units (CPUs),
in an infrastructure as a service (IaaS) cloud.
Medical data processing and analysis for remote health and activities monitoring
Recent developments in sensor technology, wearable computing, the Internet of Things (IoT), and wireless communication have given rise to research in ubiquitous healthcare and remote monitoring of human health and activities. Health monitoring systems involve processing and analysis of data retrieved from smartphones, smart watches, smart bracelets, as well as various sensors and wearable devices. Such systems enable continuous monitoring of patients' physiological and health conditions by sensing and transmitting measurements such as heart rate, electrocardiogram, body temperature, respiratory rate, chest sounds, or blood pressure. Pervasive healthcare, as a relevant application domain in this context, aims at revolutionizing the delivery of medical services through a medical assistive environment and facilitates the independent living of patients. In this chapter, we discuss (1) data collection, fusion, ownership and privacy issues; (2) models, technologies and solutions for medical data processing and analysis; (3) big medical data analytics for remote health monitoring; (4) research challenges and opportunities in medical data analytics; and (5) examples of case studies and practical solutions.
Serving deep learning models in a serverless platform
Serverless computing has emerged as a compelling paradigm for the development
and deployment of a wide range of event based cloud applications. At the same
time, cloud providers and enterprise companies are heavily adopting machine
learning and Artificial Intelligence to either differentiate themselves, or
provide their customers with value added services. In this work we evaluate the
suitability of a serverless computing environment for the inferencing of large
neural network models. Our experimental evaluations are executed on the AWS
Lambda environment using the MXNet deep learning framework. Our experimental
results show that while the inferencing latency can be within an acceptable
range, longer delays due to cold starts can skew the latency distribution and
hence risk violating more stringent SLAs.
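The way cold starts skew a latency distribution can be illustrated with a toy simulation. All figures below are invented for illustration; they are not measurements from the paper's AWS Lambda experiments.

```python
import random

# Toy simulation: a small fraction of serverless invocations hit a cold
# start, inflating the tail of the latency distribution.
# All numbers are illustrative, not measured values.
random.seed(0)

WARM_MS, COLD_EXTRA_MS, COLD_RATE = 40, 2000, 0.02

def invoke():
    latency = WARM_MS + random.uniform(0, 10)   # warm-path latency
    if random.random() < COLD_RATE:             # container provisioning
        latency += COLD_EXTRA_MS
    return latency

samples = sorted(invoke() for _ in range(10_000))
p50 = samples[len(samples) // 2]
p99 = samples[int(len(samples) * 0.99)]
print(f"p50={p50:.0f}ms p99={p99:.0f}ms")  # tail dominated by cold starts
```

Even with warm-path latency in an acceptable range, a 2% cold-start rate pushes the 99th percentile past the cold-start cost, which is exactly the SLA risk the paper describes.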