Monetary Cost Optimizations for Hosting Workflow-as-a-Service in IaaS Clouds
Recently, we have witnessed workflows from science and other data-intensive
applications emerging on Infrastructure-as-a-Service (IaaS) clouds, and many
workflow service providers offering workflow as a service (WaaS). The major
concern of WaaS providers is to minimize the monetary cost of executing
workflows in the IaaS cloud. While there have been previous studies on this
concern, most of them assume static task execution times and a static pricing
scheme, and adopt the QoS notion of satisfying a deterministic deadline.
However, the cloud environment is dynamic, with performance dynamics caused by
interference from concurrent executions and price dynamics such as the spot
prices offered by Amazon EC2. Therefore, we argue that WaaS providers should
offer probabilistic performance guarantees for individual workflows on IaaS
clouds. We develop a probabilistic scheduling framework called Dyna to minimize
the monetary cost while offering probabilistic deadline guarantees. The
framework includes an A*-based instance configuration method for handling
performance dynamics, and a hybrid instance configuration refinement for
utilizing spot instances. Experimental results with three real-world scientific
workflow applications on Amazon EC2 demonstrate (1) the accuracy of our
framework in satisfying the probabilistic deadline guarantees required by the
users and (2) the effectiveness of our framework in reducing monetary cost in
comparison with existing approaches.
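To make the notion of a probabilistic deadline guarantee concrete, the following minimal Python sketch (not the Dyna implementation; the function names, distributions, and numbers are illustrative assumptions) checks a candidate instance configuration by Monte Carlo sampling of dynamic task execution times:

    # Illustrative sketch only: estimate whether a configuration meets a
    # probabilistic deadline by sampling dynamic task execution times.
    import random

    def meets_probabilistic_deadline(task_time_samplers, deadline, required_prob,
                                     trials=10_000):
        """task_time_samplers: list of zero-arg callables, each returning one
        sampled execution time for a (sequential) workflow task."""
        hits = 0
        for _ in range(trials):
            makespan = sum(sampler() for sampler in task_time_samplers)
            if makespan <= deadline:
                hits += 1
        return hits / trials >= required_prob

    # Hypothetical example: three tasks whose runtimes vary with cloud
    # performance dynamics; require a 100s deadline with 90% probability.
    samplers = [lambda: random.gauss(30, 5), lambda: random.gauss(25, 8),
                lambda: random.gauss(20, 4)]
    print(meets_probabilistic_deadline(samplers, deadline=100, required_prob=0.9))

A search over candidate instance configurations, such as the A*-based method described above, could use this kind of probabilistic check as an inner feasibility test before comparing configurations by cost.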
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development is full of challenges in view of the variety of
application areas and domains that this technology promises to serve.
Typically, fundamental design decisions in big data systems design include
choosing appropriate storage and computing infrastructures. In this age of
heterogeneous systems that integrate different technologies into an optimized
solution for a specific real-world problem, big data systems are no exception.
As far as the storage aspect of any big data system is concerned, the primary
facet is the storage infrastructure, and NoSQL seems to be the right technology
to fulfill its requirements. However, every big data application has different
data characteristics, and thus the corresponding data fits a different data
model. This paper presents a feature and use case analysis and comparison of
the four main data models, namely document oriented, key value, graph, and wide
column. Moreover, a feature analysis of 80 NoSQL solutions is provided,
elaborating on the criteria and points that a developer must consider while
making a choice. Typically, big data storage needs to communicate with the
execution engine and other processing and visualization technologies to create
a comprehensive solution. This brings the second facet of big data storage, big
data file formats, into the picture. The second half of the paper compares the
advantages, shortcomings, and possible use cases of the available big data file
formats for Hadoop, which is the foundation for most big data computing
technologies. Decentralized storage and blockchain are seen as the next
generation of big data storage, and their challenges and future prospects are
also discussed.
Multiple Workflows Scheduling in Multi-tenant Distributed Systems: A Taxonomy and Future Directions
A workflow is a general notion representing automated processes along with the
flow of data. The automation ensures that the processes are executed in order,
a feature that attracts users from various backgrounds to build workflows.
However, the computational requirements are enormous, and investing in a
dedicated infrastructure for these workflows is not always feasible. To cater
to broader needs, multi-tenant platforms for executing workflows began to be
built. In this paper, we identify the problems and challenges in multiple
workflows scheduling that are inherent to such platforms. We present a detailed
taxonomy of the existing solutions on scheduling and resource provisioning
aspects, followed by a survey of relevant works in this area. We lay out the
open problems and challenges to spur research on multiple workflows scheduling
in multi-tenant distributed systems.
Comment: Several changes have been made based on reviewers' comments after the
first round of review. This is a pre-print of a paper (currently under second
round review) submitted to ACM Computing Surveys.
Synchronized Multi-Load Balancer with Fault Tolerance in Cloud
In this method, the service of one load balancer can be borrowed or shared
among other load balancers when any correction is needed in the estimation of
the load.
Comment: 8 pages, 10 figures
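The abstract gives few details, but a minimal Python sketch of the general idea, with all class and variable names assumed for illustration, could look as follows: an overloaded balancer borrows service from its least-loaded peer when its load estimate proves too low.

    # Illustrative sketch only: redirect overflow from an overloaded balancer
    # to the least-loaded peer ("borrowing" its service).
    class Balancer:
        def __init__(self, name, capacity):
            self.name, self.capacity, self.load = name, capacity, 0

        def assign(self, requests, peers):
            overflow = max(0, self.load + requests - self.capacity)
            self.load += requests - overflow
            if overflow and peers:
                helper = min(peers, key=lambda p: p.load)  # least-loaded peer
                helper.load += overflow
                print(f"{self.name} borrowed service for {overflow} requests "
                      f"from {helper.name}")

    lb1, lb2, lb3 = Balancer("LB1", 100), Balancer("LB2", 100), Balancer("LB3", 100)
    lb1.assign(130, peers=[lb2, lb3])  # 30 requests spill over to a peer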
Early Accurate Results for Advanced Analytics on MapReduce
Approximate results based on samples often provide the only way in which
advanced analytical applications on very massive data sets can satisfy their
time and resource constraints. Unfortunately, methods and tools for the
computation of accurate early results are currently not supported in
MapReduce-oriented systems, although these are intended for "big data".
Therefore, we propose and implement a non-parametric extension of Hadoop which
allows the incremental computation of early results for arbitrary workflows,
along with reliable online estimates of the degree of accuracy achieved so far
in the computation. These estimates are based on a technique called
bootstrapping that has been widely employed in statistics and can be applied to
arbitrary functions and data distributions. In this paper, we describe our
Early Accurate Result Library (EARL) for Hadoop, which was designed to minimize
the changes required to the MapReduce framework. Various tests of EARL on
Hadoop are presented to characterize the frequent situations where EARL can
provide major speed-ups over the current version of Hadoop.
Comment: VLDB201
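As a rough illustration of the bootstrapping idea behind early approximate results (this is not the EARL code; the function names, thresholds, and data are assumptions), the following Python sketch grows a sample, estimates a statistic, and stops once bootstrap resampling suggests the target accuracy has been reached:

    # Illustrative sketch only: bootstrap-based accuracy estimate for an
    # early (sample-based) result.
    import random
    import statistics

    def bootstrap_error(sample, stat=statistics.mean, resamples=200):
        """Standard deviation of the statistic across bootstrap resamples."""
        estimates = [stat(random.choices(sample, k=len(sample)))
                     for _ in range(resamples)]
        return statistics.stdev(estimates)

    def early_result(data_stream, target_rel_error=0.01, batch=1_000):
        sample = []
        for value in data_stream:
            sample.append(value)
            if len(sample) % batch == 0:
                est = statistics.mean(sample)
                if bootstrap_error(sample) / abs(est) <= target_rel_error:
                    return est, len(sample)  # accurate enough: stop early
        return statistics.mean(sample), len(sample)

    # Hypothetical usage: approximate the mean of 1M values from a prefix.
    stream = (random.gauss(50, 10) for _ in range(1_000_000))
    print(early_result(stream))

In EARL this kind of accuracy estimate is computed incrementally as part of the MapReduce workflow rather than over a single in-memory list, but the stopping criterion is the same in spirit.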
AutoTiering: Automatic Data Placement Manager in Multi-Tier All-Flash Datacenter
As of 2017, the capital expenditure of Flash-based Solid State Drives (SSDs)
keeps declining while their storage capacity keeps increasing. As a result, the
"selling point" of traditional spinning Hard Disk Drives (HDDs) as backend
storage - low cost and large capacity - is no longer unique, and they will
eventually be replaced by low-end SSDs, which have large capacity but perform
orders of magnitude better than HDDs. Thus, it is widely believed that
all-flash multi-tier storage systems will be adopted in enterprise datacenters
in the near future. However, existing caching or tiering solutions for SSD-HDD
hybrid storage systems are not suitable for all-flash storage systems. This is
because all-flash storage systems do not have a large speed difference (e.g.,
10x) between tiers. Instead, the different specialties (such as high
performance, high capacity, etc.) of each tier should be taken into
consideration. Motivated by this, we develop an automatic data placement
manager called "AutoTiering" to handle virtual machine disk file (VMDK)
allocation and migration in an all-flash multi-tier datacenter to best utilize
the storage resources, optimize performance, and reduce migration overhead.
AutoTiering is based on an optimization framework whose core technique is to
predict a VM's performance change on different tiers with different specialties
without conducting real migration. As far as we know, AutoTiering is the first
optimization solution designed for all-flash multi-tier datacenters. We
implement AutoTiering on VMware ESXi, and experimental results show that it can
significantly improve I/O performance compared to existing solutions.
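As a hedged illustration of such a placement decision (this is not the AutoTiering algorithm or VMware ESXi code; the tier names, cost constants, and IOPS predictions are assumptions), a candidate tier can be scored by its predicted performance gain minus an estimated migration cost, so no trial migration is needed:

    # Illustrative sketch only: rank candidate tiers for a VMDK by predicted
    # I/O benefit minus migration cost, without performing a real migration.
    def best_tier(vmdk_size_gb, predicted_iops, current_tier, tiers,
                  migration_cost_per_gb=20.0):
        """predicted_iops: dict tier -> predicted IOPS for this VMDK on that
        tier (e.g., from a performance model); tiers: dict tier -> free GB."""
        def score(tier):
            gain = predicted_iops[tier] - predicted_iops[current_tier]
            cost = 0 if tier == current_tier else migration_cost_per_gb * vmdk_size_gb
            return gain - cost
        candidates = [t for t, free in tiers.items()
                      if free >= vmdk_size_gb or t == current_tier]
        return max(candidates, key=score)

    # Hypothetical numbers: a 50 GB VMDK currently on a capacity-oriented tier.
    print(best_tier(50,
                    {"nvme": 90_000, "sas_ssd": 40_000, "qlc_ssd": 15_000},
                    current_tier="qlc_ssd",
                    tiers={"nvme": 200, "sas_ssd": 30, "qlc_ssd": 5_000}))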
Two stage cluster for resource optimization with Apache Mesos
As resource estimation for jobs is difficult, users often overestimate their
requirements. Both commercial clouds and academic campus clusters suffer from
low resource utilization and long wait times because the resource estimates for
jobs, provided by users, are inaccurate. We present an approach to
statistically estimate the actual resource requirements of a job in a Little
cluster before the run in a Big cluster. The initial estimation on the Little
cluster gives us a view of how many resources a job actually requires. This
initial estimate allows us to accurately allocate resources for the pending
jobs in the queue and thereby improve throughput and resource utilization. In
our experiments, we determined resource utilization estimates with an average
accuracy of 90% for memory and 94% for CPU, while making better utilization of
memory by an average of 22% and of CPU by 53%, compared to the default job
submission methods on Apache Aurora and Apache Mesos.
Comment: MTAGS17: 10th Workshop on Many-Task Computing on Clouds, Grids, and
Supercomputers
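A minimal sketch of the two-stage idea, assuming hypothetical profile data and safety margins rather than the authors' Aurora/Mesos integration, derives a right-sized Big-cluster request from the peaks observed during a Little-cluster run:

    # Illustrative sketch only: turn a Little-cluster profiling run into a
    # right-sized resource request for the Big cluster.
    def estimate_big_request(little_profile, safety_margin=0.10):
        """little_profile: list of (cpu_cores, memory_gb) samples observed
        while the job ran on the Little cluster."""
        peak_cpu = max(cpu for cpu, _ in little_profile)
        peak_mem = max(mem for _, mem in little_profile)
        return {"cpu": round(peak_cpu * (1 + safety_margin), 2),
                "mem_gb": round(peak_mem * (1 + safety_margin), 2)}

    # Hypothetical profile: the user asked for 8 cores / 32 GB, but the
    # Little-cluster run shows the job actually peaks at 3 cores / 9 GB.
    profile = [(1.2, 4.0), (2.8, 8.5), (3.0, 9.0), (2.5, 7.0)]
    print(estimate_big_request(profile))  # far below the user's overestimate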
FPGA-based Accelerators of Deep Learning Networks for Learning and Classification: A Review
Due to recent advances in digital technologies and the availability of credible
data, an area of artificial intelligence, deep learning, has emerged and has
demonstrated its ability and effectiveness in solving complex learning problems
not possible before. In particular, convolutional neural networks (CNNs) have
demonstrated their effectiveness in image detection and recognition
applications. However, they require intensive computation and memory bandwidth
that make general-purpose CPUs fail to achieve the desired performance levels.
Consequently, hardware accelerators that use application-specific integrated
circuits (ASICs), field-programmable gate arrays (FPGAs), and graphics
processing units (GPUs) have been employed to improve the throughput of CNNs.
More precisely, FPGAs have recently been adopted for accelerating the
implementation of deep learning networks due to their ability to maximize
parallelism as well as their energy efficiency. In this paper, we review recent
techniques for accelerating deep learning networks on FPGAs. We highlight the
key features employed by the various techniques to improve acceleration
performance. In addition, we provide recommendations for enhancing the
utilization of FPGAs for CNN acceleration. The techniques investigated in this
paper represent the recent trends in FPGA-based accelerators of deep learning
networks. Thus, this review is expected to direct future advances in efficient
hardware accelerators and to be useful for deep learning researchers.
Comment: This article has been accepted for publication in IEEE Access
(December 2018).
Reconfigurable Hardware Accelerators: Opportunities, Trends, and Challenges
With emerging big data applications such as Machine Learning, Speech
Recognition, Artificial Intelligence, and DNA Sequencing in recent years,
computer architecture research communities are facing an explosive growth in
data. To achieve high efficiency for data-intensive computing, studies of
heterogeneous accelerators that focus on the latest applications have become a
hot topic in the computer architecture domain. At present, the implementation
of heterogeneous accelerators mainly relies on heterogeneous computing units
such as Application-Specific Integrated Circuits (ASICs), Graphics Processing
Units (GPUs), and Field Programmable Gate Arrays (FPGAs). Among these typical
heterogeneous architectures, FPGA-based reconfigurable accelerators have two
merits. First, an FPGA contains a large number of reconfigurable circuits,
which satisfy the requirements of high performance and low power consumption
when specific applications are running. Second, reconfigurable architectures
built on FPGAs enable rapid prototyping and feature excellent customizability
and reconfigurability. Nowadays, a batch of acceleration works based on FPGAs
or other reconfigurable architectures is emerging in top-tier computer
architecture conferences. To better review the recent related work on
reconfigurable computing accelerators, this survey takes the latest high-level
research on reconfigurable accelerator architectures and algorithm applications
as its basis. In this survey, we compare hot research issues and domains of
concern, and analyze and illuminate the advantages, disadvantages, and
challenges of reconfigurable accelerators. In the end, we discuss the future
development trends of accelerator architectures, hoping to provide a reference
for computer architecture researchers.