A Comparative Taxonomy and Survey of Public Cloud Infrastructure Vendors
An increasing number of technology enterprises are adopting cloud-native
architectures to offer their web-based products, by moving away from
privately-owned data-centers and relying exclusively on cloud service
providers. As a result, the number of cloud vendors has lately increased,
along with the estimated annual revenue they generate. However, in the
process of selecting a
provider's cloud service over the competition, we observe a lack of universal
common ground in terms of terminology, functionality of services and billing
models. This is an important gap especially under the new reality of the
industry where each cloud provider has moved towards its own service taxonomy,
while the number of specialized services has grown exponentially. This work
discusses cloud services offered by four dominant, in terms of their current
market share, cloud vendors. We provide a taxonomy of their services and
sub-services that designates major service families namely computing, storage,
databases, analytics, data pipelines, machine learning, and networking. The aim
of such clustering is to indicate similarities, common design approaches and
functional differences of the offered services. The outcomes are essential both
for individual researchers and for larger enterprises attempting to identify
the set of cloud services that will fully meet their needs without
compromise. While we acknowledge that this is a dynamic industry,
where new services arise constantly, and old ones experience important updates,
this study paints a solid image of the current offerings and gives prominence
to the directions that cloud service providers are following.
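The taxonomy of service families described above can be modeled as a simple mapping from family to per-vendor service names. The sketch below is illustrative only; the vendor and service entries are well-known examples, not drawn from the survey itself.

```python
# Minimal sketch of a cloud service taxonomy: service families map to
# per-vendor service names. Entries are illustrative examples only.
TAXONOMY = {
    "computing": {"AWS": "EC2", "Azure": "Virtual Machines"},
    "storage": {"AWS": "S3", "Azure": "Blob Storage"},
    "machine learning": {"AWS": "SageMaker", "Azure": "Machine Learning"},
}

def vendors_offering(family):
    """Return the vendors that offer a service in the given family."""
    return sorted(TAXONOMY.get(family, {}))

print(vendors_offering("storage"))  # ['AWS', 'Azure']
```

A structure like this makes the paper's clustering goal concrete: similarities and gaps across vendors become lookups over a shared family vocabulary.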
The ISTI Rapid Response on Exploring Cloud Computing 2018
This report describes eighteen projects that explored how commercial cloud
computing services can be utilized for scientific computation at national
laboratories. These demonstrations ranged from deploying proprietary software
in a cloud environment to leveraging established cloud-based analytics
workflows for processing scientific datasets. By and large, the projects were
successful and collectively they suggest that cloud computing can be a valuable
computational resource for scientific computation at national laboratories.
Resource Management and Scheduling for Big Data Applications in Cloud Computing Environments
This chapter presents the software architectures of big data processing
platforms. It provides in-depth knowledge of the resource management
techniques involved in deploying big data processing systems on cloud
environments. It starts from the very basics and gradually introduces the
core components of resource management, which we have divided into multiple
layers. It covers state-of-the-art practices and research in SLA-based
resource management, with a specific focus on job scheduling mechanisms.
(27 pages, 9 figures.)
On-Demand Virtual Research Environments using Microservices
The computational demands for scientific applications are continuously
increasing. The emergence of cloud computing has enabled on-demand resource
allocation. However, relying solely on infrastructure as a service does not
achieve the degree of flexibility required by the scientific community. Here we
present a microservice-oriented methodology, where scientific applications run
in a distributed orchestration platform as software containers, referred to as
on-demand, virtual research environments. The methodology is vendor agnostic
and we provide an open source implementation that supports the major cloud
providers, offering scalable management of scientific pipelines. We demonstrate
applicability and scalability of our methodology in life science applications,
but the methodology is general and can be applied to other scientific domains.
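The idea of a scientific pipeline composed of orchestrated container steps can be sketched as a small dependency graph executed in topological order. The step names below are hypothetical, and the runner is a plain-Python stand-in for a real orchestration platform.

```python
# Sketch: a pipeline of containerized steps with dependencies, resolved
# into an execution order. Step names are hypothetical examples.
from graphlib import TopologicalSorter

pipeline = {
    "align-reads": {"fetch-data"},      # depends on fetch-data
    "call-variants": {"align-reads"},   # depends on align-reads
    "fetch-data": set(),                # no dependencies
}

def run_order(steps):
    """Return a valid execution order honoring step dependencies."""
    return list(TopologicalSorter(steps).static_order())

print(run_order(pipeline))  # ['fetch-data', 'align-reads', 'call-variants']
```

An orchestration platform would launch each step as a container once its predecessors finish; the dependency resolution shown here is the vendor-agnostic core of that idea.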
Analytics for the Internet of Things: A Survey
The Internet of Things (IoT) envisions a world-wide, interconnected network
of smart physical entities. These physical entities generate a large amount of
data in operation and as the IoT gains momentum in terms of deployment, the
combined scale of those data seems destined to continue to grow. Increasingly,
applications for the IoT involve analytics. Data analytics is the process of
deriving knowledge from data, generating value such as actionable insights.
This article reviews work in the IoT and big data analytics from the
perspective of their utility in creating efficient, effective and innovative
applications and services for a wide spectrum of domains. We review the broad
vision for the IoT as it is shaped in various communities, examine the
application of data analytics across IoT domains, provide a categorisation of
analytic approaches and propose a layered taxonomy from IoT data to analytics.
This taxonomy provides us with insights on the appropriateness of analytical
techniques, which in turn shapes a survey of enabling technology and
infrastructure for IoT analytics. Finally, we look at some tradeoffs for
analytics in the IoT that can shape future research.
Data Management in Industry 4.0: State of the Art and Open Challenges
Information and communication technologies are permeating all aspects of
industrial and manufacturing systems, expediting the generation of large
volumes of industrial data. This article surveys the recent literature on data
management as it applies to networked industrial environments and identifies
several open research challenges for the future. As a first step, we extract
important data properties (volume, variety, traffic, criticality) and identify
the corresponding data enabling technologies of diverse fundamental industrial
use cases, based on practical applications. Secondly, we provide a detailed
outline of recent industrial architectural designs with respect to their data
management philosophy (data presence, data coordination, data computation) and
the extent of their distributiveness. Then, we conduct a holistic survey of the
recent literature from which we derive a taxonomy of the latest advances on
industrial data enabling technologies and data centric services, spanning all
the way from the field level deep in the physical deployments, up to the cloud
and applications level. Finally, motivated by the rich conclusions of this
critical analysis, we identify interesting open challenges for future research.
The concepts presented in this article thematically cover the largest part of
the industrial automation pyramid layers. Our approach is multidisciplinary, as
the selected publications were drawn from two fields: the communications,
networking and computation field as well as the industrial, manufacturing and
automation field. The article can help readers to understand in depth how
data management is currently applied in networked industrial environments,
and to select interesting open research opportunities to pursue.
OCCAM: a flexible, multi-purpose and extendable HPC cluster
The Open Computing Cluster for Advanced data Manipulation (OCCAM) is a
multi-purpose flexible HPC cluster designed and operated by a collaboration
between the University of Torino and the Sezione di Torino of the Istituto
Nazionale di Fisica Nucleare. It is aimed at providing a flexible,
reconfigurable and extendable infrastructure to cater to a wide range of
different scientific computing use cases, including ones from solid-state
chemistry, high-energy physics, computer science, big data analytics,
computational biology, genomics and many others. Furthermore, it will serve as
a platform for R&D activities on computational technologies themselves, with
topics ranging from GPU acceleration to Cloud Computing technologies. A
heterogeneous and reconfigurable system like this poses a number of challenges
related to the frequency at which heterogeneous hardware resources might change
their availability and shareability status, which in turn affect methods and
means to allocate, manage, optimize, bill, monitor VMs, containers, virtual
farms, jobs, interactive bare-metal sessions, etc. This work describes some of
the use cases that prompted the design and construction of the HPC cluster, its
architecture and resource provisioning model, along with a first
characterization of its performance by some synthetic benchmark tools and a few
realistic use-case tests. (Accepted for publication in the Proceedings of
CHEP2016, San Francisco, US.)
IBM Deep Learning Service
Deep learning driven by large neural network models is overtaking traditional
machine learning methods for understanding unstructured and perceptual data
domains such as speech, text, and vision. At the same time, the
"as-a-Service"-based business model on the cloud is fundamentally transforming
the information technology industry. These two trends: deep learning, and
"as-a-service" are colliding to give rise to a new business model for cognitive
application delivery: deep learning as a service in the cloud. In this paper,
we will discuss the details of the software architecture behind IBM's deep
learning as a service (DLaaS). DLaaS provides developers the flexibility to use
popular deep learning libraries such as Caffe, Torch and TensorFlow, in the
cloud in a scalable and resilient manner with minimal effort. The platform uses
a distribution and orchestration layer that facilitates learning from a large
amount of data in a reasonable amount of time across compute nodes. A resource
provisioning layer enables flexible job management on heterogeneous resources,
such as graphics processing units (GPUs) and central processing units (CPUs),
in an infrastructure as a service (IaaS) cloud.
Medical data processing and analysis for remote health and activities monitoring
Recent developments in sensor technology, wearable computing, the Internet of Things (IoT), and wireless communication have given rise to research in ubiquitous healthcare and remote monitoring of human health and activities. Health monitoring systems involve processing and analysis of data retrieved from smartphones, smart watches, smart bracelets, as well as various sensors and wearable devices. Such systems enable continuous monitoring of patients' physiological and health conditions by sensing and transmitting measurements such as heart rate, electrocardiogram, body temperature, respiratory rate, chest sounds, or blood pressure. Pervasive healthcare, as a relevant application domain in this context, aims at revolutionizing the delivery of medical services through a medical assistive environment and facilitates the independent living of patients. In this chapter, we discuss (1) data collection, fusion, ownership and privacy issues; (2) models, technologies and solutions for medical data processing and analysis; (3) big medical data analytics for remote health monitoring; (4) research challenges and opportunities in medical data analytics; and (5) examples of case studies and practical solutions.
Serving deep learning models in a serverless platform
Serverless computing has emerged as a compelling paradigm for the development
and deployment of a wide range of event based cloud applications. At the same
time, cloud providers and enterprise companies are heavily adopting machine
learning and Artificial Intelligence to either differentiate themselves, or
provide their customers with value added services. In this work we evaluate the
suitability of a serverless computing environment for the inferencing of large
neural network models. Our experimental evaluations are executed on the AWS
Lambda environment using the MXNet deep learning framework. Our experimental
results show that while the inferencing latency can be within an acceptable
range, longer delays due to cold starts can skew the latency distribution and
hence risk violating more stringent SLAs.
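The way cold starts skew a latency distribution can be illustrated with a toy simulation. All figures below are invented for illustration; they are not measurements from the paper's AWS Lambda experiments.

```python
import random

# Toy simulation: a small fraction of serverless invocations hit a cold
# start, inflating the tail of the latency distribution.
# All numbers are illustrative, not measured values.
random.seed(0)

WARM_MS, COLD_EXTRA_MS, COLD_RATE = 40, 2000, 0.02

def invoke():
    latency = WARM_MS + random.uniform(0, 10)   # warm-path latency
    if random.random() < COLD_RATE:             # container provisioning
        latency += COLD_EXTRA_MS
    return latency

samples = sorted(invoke() for _ in range(10_000))
p50 = samples[len(samples) // 2]
p99 = samples[int(len(samples) * 0.99)]
print(f"p50={p50:.0f}ms p99={p99:.0f}ms")  # tail dominated by cold starts
```

Even with warm-path latency in an acceptable range, a 2% cold-start rate pushes the 99th percentile past the cold-start cost, which is exactly the SLA risk the paper describes.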