    MLPerf Inference Benchmark

    Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devices to data-center solutions. Fueling the hardware are a dozen or more software frameworks and libraries. The myriad combinations of ML hardware and ML software make assessing ML-system performance in an architecture-neutral, representative, and reproducible manner challenging. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf Inference answers that call. In this paper, we present our benchmarking method for evaluating ML inference systems. Driven by more than 30 organizations as well as more than 200 ML engineers and practitioners, MLPerf prescribes a set of rules and best practices to ensure comparability across systems with wildly differing architectures. The first call for submissions garnered more than 600 reproducible inference-performance measurements from 14 organizations, representing over 30 systems that showcase a wide range of capabilities. The submissions attest to the benchmark's flexibility and adaptability. Comment: ISCA 2020

    The state of SQL-on-Hadoop in the cloud

    Managed Hadoop in the cloud, especially SQL-on-Hadoop, has been gaining attention recently. On Platform-as-a-Service (PaaS), analytical services like Hive and Spark come preconfigured for general-purpose use and ready to run, giving companies a quick entry point and on-demand deployment of ready SQL-like solutions for their big data needs. This study evaluates such cloud services from an end-user perspective, comparing four providers: Microsoft Azure, Amazon Web Services, Google Cloud, and Rackspace. The study focuses on the performance, readiness, scalability, and cost-effectiveness of the different solutions at entry/test-level cluster sizes. Results are based on over 15,000 Hive queries derived from the industry-standard TPC-H benchmark. The study is framed within the ALOJA research project, which features an open source benchmarking and analysis platform that has recently been extended to support SQL-on-Hadoop engines. The ALOJA project aims to lower the total cost of ownership (TCO) of big data deployments and study their performance characteristics for optimization. The study benchmarks the cloud providers across a diverse range of instance types and input data scales from 1GB to 1TB, in order to survey the popular entry-level PaaS SQL-on-Hadoop solutions and thereby establish a common results base upon which subsequent research can be carried out by the project. Initial results already show the main performance trends with respect to hardware and software configuration, pricing, and the similarities and architectural differences of the evaluated PaaS solutions. Whereas some providers focus on decoupling storage and computing resources while offering network-based elastic storage, others choose to keep the local processing model from Hadoop for high performance, at the cost of reduced flexibility. Results also show the importance of application-level tuning, and how keeping hardware and software stacks up to date can influence performance even more than replicating the on-premises model in the cloud. This work is partially supported by the Microsoft Azure for Research program, the European Research Council (ERC) under the EU's Horizon 2020 programme (GA 639595), the Spanish Ministry of Education (TIN2015-65316-P), and the Generalitat de Catalunya (2014-SGR-1051).
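
    As a concrete illustration of the query-timing harness such a survey relies on, here is a minimal Python sketch, assuming a reachable HiveServer2 endpoint and the PyHive client; the host, port, and query text are hypothetical placeholders rather than ALOJA's actual configuration. Averaging several runs per query is the simplest way to smooth out cluster warm-up effects.

```python
# Minimal sketch of a Hive query-timing harness.
# Assumptions: a reachable HiveServer2 endpoint and the PyHive client;
# host, port, and the query below are placeholders, not ALOJA's setup.
import time
from statistics import mean

from pyhive import hive

QUERIES = {
    # A TPC-H-flavored aggregation; real suites derive full query sets.
    "q1_pricing_summary":
        "SELECT l_returnflag, COUNT(*) FROM lineitem GROUP BY l_returnflag",
}

def time_queries(host="hive.example.com", port=10000, runs=3):
    conn = hive.connect(host=host, port=port)
    cursor = conn.cursor()
    results = {}
    for name, sql in QUERIES.items():
        timings = []
        for _ in range(runs):
            start = time.monotonic()
            cursor.execute(sql)
            cursor.fetchall()  # drain rows so the full query cost is measured
            timings.append(time.monotonic() - start)
        results[name] = mean(timings)
    return results
```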

    Enabling virtualization technologies for enhanced cloud computing

    Cloud computing is a ubiquitous technology that offers various services to individual users, small businesses, and large-scale organizations. Data-center owners maintain clusters of thousands of machines and lease out resources like CPU, memory, network bandwidth, and storage to clients. For organizations, cloud computing provides the means to offload server infrastructure and obtain resources on demand, which reduces setup costs as well as maintenance overheads. For individuals, cloud computing offers platforms, resources and services that would otherwise be unavailable to them. At the core of cloud computing are various virtualization technologies and the resulting Virtual Machines (VMs). Virtualization enables cloud providers to host multiple VMs on a single Physical Machine (PM). The hallmark of VMs is the inability of the end-user to distinguish them from actual PMs. VMs give cloud owners essential features such as live migration, the process of moving a VM from one PM to another while the VM is running. Features of the cloud such as fault tolerance, geographical server placement, energy management, resource management, big data processing, and parallel computing depend heavily on virtualization technologies; improvements and breakthroughs in these technologies directly lead to new possibilities in the cloud. This thesis identifies and proposes innovations for such underlying VM technologies and tests their performance on a cluster of 16 machines with real-world benchmarks. Specifically, it addresses the issues of server load prediction, VM consolidation, live migration, and memory sharing. First, a unique VM resource load prediction mechanism based on Chaos Theory is introduced that predicts server workloads with high accuracy. Based on these predictions, VMs are dynamically and autonomously relocated to different PMs in the cluster in an attempt to conserve energy. Experimental evaluations with a prototype on real-world data-center load traces show that up to 80% of the unused PMs can be freed up and repurposed, with Service Level Objective (SLO) violations as low as 3%. Second, issues in live migration of VMs are analyzed, based on which a new distributed approach is presented that allows network-efficient live migration of VMs. The approach amortizes the transfer of memory pages over the life of the VM, thus reducing network traffic during critical live migration. The prototype reduces network usage by up to 45% and lowers the required time by up to 40% for live migration on various real-world loads. Finally, a memory sharing and management approach called ACE-M is demonstrated that enables VMs to share and utilize all the memory available in the cluster remotely. Combined with predictions of network and memory behavior, this approach allows VMs to run applications with memory requirements much higher than what is physically available locally. It is experimentally shown that ACE-M reduces memory performance degradation by about 75% and achieves a 40% lower network response time for memory-intensive VMs. A combination of these innovations to the virtualization technologies can minimize performance degradation of various VM attributes, which will ultimately lead to a better end-user experience.
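
    To make the consolidation step concrete, here is a minimal Python sketch of a prediction-driven consolidation pass, assuming per-VM utilization histories; the moving-average predictor stands in for the thesis's Chaos Theory based predictor, and the watermark and first-fit placement are illustrative simplifications.

```python
# Minimal sketch of prediction-driven VM consolidation.
# The moving average below is a stand-in for the Chaos Theory predictor.

def predict(history, window=5):
    """Predict a VM's load from its recent utilization samples."""
    recent = list(history)[-window:]
    return sum(recent) / len(recent)

class PM:
    """A physical machine; vms maps VM names to utilization histories."""
    def __init__(self, name, capacity=100.0):
        self.name, self.capacity, self.vms = name, capacity, {}

    def predicted_load(self):
        return sum(predict(h) for h in self.vms.values())

def consolidate(pms, low_watermark=20.0):
    """Migrate VMs off lightly loaded PMs so those PMs can be powered down."""
    donors = [p for p in pms if 0 < p.predicted_load() < low_watermark]
    targets = [p for p in pms if p not in donors]
    for donor in donors:
        for vm, hist in list(donor.vms.items()):
            need = predict(hist)
            for target in targets:  # first-fit: any PM with predicted headroom
                if target.predicted_load() + need <= target.capacity:
                    target.vms[vm] = donor.vms.pop(vm)
                    break
    return [p for p in pms if not p.vms]  # candidates for powering down

pms = [PM("pm1"), PM("pm2")]
pms[0].vms = {"vm_a": [55, 60, 58]}
pms[1].vms = {"vm_b": [5, 6, 4]}
freed = consolidate(pms)  # vm_b fits on pm1, so pm2 can be freed
```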

    Hybrid mobile computing for connected autonomous vehicles

    With increasing urbanization and numbers of cars on the road, modern transport systems face many global issues. Autonomous driving and connected vehicles are the most promising technologies for tackling them. The integrated technology, connected autonomous vehicles (CAV), can provide a wide range of safety applications for safer, greener and more efficient intelligent transport systems (ITS). As computing is a critical component of CAV systems, various mobile computing models have been proposed, including mobile local computing, mobile edge computing and mobile cloud computing. However, it is believed that none of these models fits all CAV applications, which have highly diverse quality of service (QoS) requirements such as communication delay, data rate, accuracy, reliability and/or computing latency. In this thesis, we are motivated to propose a hybrid mobile computing model with the objective of overcoming the limitations of individual models and maximizing performance for CAV applications. In the proposed hybrid model, the three basic computing models and/or their combinations are chosen and applied to different CAV applications according to the applications' QoS requirements. Following this idea, we first investigate job offloading and the allocation of computing and communication resources at local hosts and external computing centers, with QoS and resource awareness. Distributed admission control and resource allocation algorithms are proposed, including two baseline non-cooperative algorithms and a matching-theory-based cooperative algorithm. Experiment results demonstrate the feasibility of the hybrid mobile computing model and show large improvements in service quality and capacity over existing individual computing models. The matching algorithm also largely outperforms the baseline non-cooperative algorithms. In addition, two specific use cases of hybrid mobile computing for CAV applications are investigated: object detection with mobile local computing, where only local computing resources are used, and movie recommendation with mobile cloud computing, where remote cloud resources are used. For object detection, we focus on the challenges of detecting vehicles, pedestrians and cyclists in a driving environment and propose three improvements to an existing CNN-based object detector, obtaining a large detection performance improvement on the KITTI benchmark test dataset. For movie recommendation, we propose two recommendation models based on a general framework that integrates machine learning and collaborative filtering. Experiment results on the Netflix movie dataset show that our models are very effective for cold-start item recommendation.
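
    As an illustration of the matching-based cooperative allocation, here is a minimal Python sketch of deferred acceptance between CAV jobs and computing sites (local, edge, cloud); the job names, preference lists, and capacities are invented for the example and do not reproduce the thesis's model.

```python
# Minimal sketch of matching-based job-to-site allocation via deferred
# acceptance; preferences and capacities below are illustrative only.

def deferred_acceptance(job_prefs, site_prefs, site_capacity):
    """Match jobs to computing sites; sites keep only their preferred jobs."""
    matched = {site: [] for site in site_prefs}
    free = list(job_prefs)
    next_choice = {job: 0 for job in job_prefs}
    while free:
        job = free.pop(0)
        if next_choice[job] >= len(job_prefs[job]):
            continue  # job exhausted its preference list; stays unmatched
        site = job_prefs[job][next_choice[job]]
        next_choice[job] += 1
        matched[site].append(job)
        if len(matched[site]) > site_capacity[site]:
            # Evict the site's least-preferred job; it re-enters the pool.
            worst = max(matched[site], key=site_prefs[site].index)
            matched[site].remove(worst)
            free.append(worst)
    return matched

allocation = deferred_acceptance(
    job_prefs={"detection": ["local", "edge"],
               "map_update": ["edge", "cloud"],
               "infotainment": ["cloud", "edge"]},
    site_prefs={"local": ["detection", "map_update", "infotainment"],
                "edge": ["detection", "map_update", "infotainment"],
                "cloud": ["map_update", "infotainment", "detection"]},
    site_capacity={"local": 1, "edge": 1, "cloud": 2},
)
```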

    Understanding and Improving the Performance of Read Operations Across the Storage Stack

    We live in a data-driven era: large amounts of data are generated and collected every day. Storage systems are the backbone of this era, as they store and retrieve data. To cope with increasing data demands (e.g., diversity, scalability), storage systems are experiencing changes across the stack. Like other computer systems, storage systems rely on layering and modularity to allow rapid development. Unfortunately, this can hinder performance clarity and introduce degradations (e.g., tail latency) due to unexpected interactions between components of the stack. In this thesis, we first perform a study to understand the behavior across different layers of the storage stack. We focus on sequential read workloads, a common I/O pattern in distributed file systems (e.g., HDFS, GFS). We analyze the interaction between read workloads, local file systems (i.e., ext4), and storage media (i.e., SSDs). We perform the same experiment over different periods of time (e.g., file lifetime). We uncover 3 slowdowns, all of which occur in the lower layers. When combined, these slowdowns can degrade throughput by 30%. We find that increased parallelism on the local file system mitigates these slowdowns, showing the need for adaptability in storage stacks. Given that performance instabilities can occur at any layer of the stack, it is important that upper-layer systems are able to react. We propose smart hedging, a novel technique to manage high-percentile (tail) latency variations in read operations. Smart hedging considers production challenges such as massive scalability, heterogeneity, and ease of deployment and maintainability. Our technique establishes a dynamic threshold by tracking latencies on the client side. If a read operation exceeds the threshold, a new hedged request is issued, in an exponential back-off manner. We implement our technique in HDFS and evaluate it on 70k servers in 3 datacenters. Our technique reduces average tail latency without generating excessive system load.
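
    A minimal Python sketch of the hedging idea follows, assuming a do_read(replica) callable; the 95th-percentile threshold, window size, and doubling back-off are illustrative choices, not the values used in the HDFS deployment.

```python
# Minimal sketch of client-side smart hedging: track latencies, derive a
# dynamic threshold, and hedge to another replica with exponential back-off.
import threading
import time
from collections import deque

class SmartHedger:
    def __init__(self, percentile=0.95, window=1000):
        self.samples = deque(maxlen=window)  # sliding window of latencies
        self.percentile = percentile

    def threshold(self):
        if len(self.samples) < 10:
            return None  # too little data: never hedge yet
        ordered = sorted(self.samples)
        return ordered[int(self.percentile * (len(ordered) - 1))]

    def read(self, do_read, replicas):
        start = time.monotonic()
        results, done = [], threading.Event()

        def attempt(replica):
            results.append(do_read(replica))
            done.set()

        base = self.threshold()
        for i, replica in enumerate(replicas):
            threading.Thread(target=attempt, args=(replica,), daemon=True).start()
            # Wait before issuing the next hedged request; back off exponentially.
            timeout = None if base is None else base * (2 ** i)
            if done.wait(timeout=timeout):
                break
        done.wait()  # make sure at least one attempt has completed
        self.samples.append(time.monotonic() - start)
        return results[0]
```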

    Back-end reference architecture for smart water meter data gathering service

    The Finnish waterworks industry is on the brink of digitalization. Many waterworks have started to convert their water meters to smart water meters; however, there is not yet a suitable solution for gathering the IoT data from these meters. To answer these arising needs, many pilots and workshops have been conducted, and those pilots have yielded basic ground rules for the use cases. In this study, those ground rules are gathered into a set of requirement categories. The categories are studied and analyzed in order to establish a reference architecture for IoT data-gathering systems suitable for waterworks. Using the requirements and the reference architecture, an information system, Dataservice, was implemented by Vesitieto Oy. The system gathers the IoT data and visualizes it for waterworks employees. The system was deployed in Microsoft's cloud service, though other cloud vendors were examined as well. The system uses a two-part database design: the data required by the system itself, such as users and user groups, is held in an SQL database, while the IoT data is held in a NoSQL database. MongoDB was selected as the NoSQL database because it could be integrated with the cloud provider.
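
    To make the NoSQL half of the two-part database design concrete, here is a minimal Python sketch using PyMongo; the connection string, collection, and field names are illustrative assumptions, not Dataservice's actual schema.

```python
# Minimal sketch of storing and querying smart-meter readings in MongoDB.
# Connection string, collection, and field names are placeholders.
from datetime import datetime, timezone

from pymongo import ASCENDING, MongoClient

client = MongoClient("mongodb://localhost:27017")
readings = client["dataservice"]["meter_readings"]
readings.create_index([("meter_id", ASCENDING), ("ts", ASCENDING)])

def store_reading(meter_id, litres):
    """Append one cumulative meter reading with a UTC timestamp."""
    readings.insert_one({
        "meter_id": meter_id,
        "ts": datetime.now(timezone.utc),
        "litres": litres,
    })

def consumption(meter_id, start, end):
    """Consumption over [start, end): difference of cumulative readings."""
    cursor = readings.find({"meter_id": meter_id,
                            "ts": {"$gte": start, "$lt": end}})
    values = [doc["litres"] for doc in cursor]
    return max(values) - min(values) if values else 0.0
```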

    DeltaFS: Pursuing Zero Update Overhead via Metadata-Enabled Delta Compression for Log-structured File System on Mobile Devices

    Data compression has been widely adopted to relieve mobile devices of intensive write pressure. Delta compression is particularly promising for its high compression efficacy over conventional compression methods. However, it suffers from non-trivial system overheads incurred by delta maintenance and read penalties, which has prevented its adoption on mobile devices. To this end, this paper proposes DeltaFS, a metadata-enabled Delta compression scheme on a log-structured File System for mobile devices, to achieve the utmost compression efficiency at zero hardware cost. DeltaFS exploits the out-of-place updating ability of Log-structured File Systems (LFS) to alleviate write amplification, which is the key bottleneck for delta compression implementations. Specifically, DeltaFS utilizes the inline area in file inodes for delta maintenance at zero hardware cost, and integrates an inline-area management strategy to improve the utilization of the constrained inline area. Moreover, a complementary delta maintenance strategy is incorporated, which selectively maintains delta chunks in the main data area to break through the limitation of the constrained inline area. Experimental results show that DeltaFS substantially reduces write traffic by up to 64.8% and improves I/O performance by up to 37.3%.
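
    As a sketch of the placement decision such a design implies (keep small deltas in the inode's inline area, spill larger ones to the main data area, and fall back to an ordinary out-of-place write when a delta is not worthwhile), here is a minimal Python illustration; the run-based delta encoder and the size constants are stand-ins, not DeltaFS's actual on-disk format.

```python
# Minimal sketch of delta encoding plus the inline-vs-main-area placement
# decision; the encoder and constants are illustrative stand-ins.
INLINE_AREA_BYTES = 3072  # hypothetical free space in the inode's inline area

def encode_delta(old, new):
    """Encode `new` against `old` as (offset, replacement-bytes) runs."""
    delta, i = [], 0
    while i < len(new):
        if i < len(old) and old[i] == new[i]:
            i += 1
            continue
        start = i
        while i < len(new) and (i >= len(old) or old[i] != new[i]):
            i += 1
        delta.append((start, new[start:i]))
    return delta

def place_delta(old_block, new_block, inline_used):
    """Decide where an updated block's delta should live."""
    delta = encode_delta(old_block, new_block)
    size = sum(len(chunk) + 8 for _, chunk in delta)  # 8 bytes bookkeeping/run
    if size >= len(new_block):
        return "rewrite"  # delta not worthwhile: write the block out of place
    if inline_used + size <= INLINE_AREA_BYTES:
        return "inline_area"  # cheap path: keep the delta inside the inode
    return "main_data_area"  # complementary path when the inline area is full

assert place_delta(b"a" * 4096,
                   b"a" * 2048 + b"b" * 8 + b"a" * 2040, 0) == "inline_area"
```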

    A Competition-based Pricing Strategy in Cloud Markets using Regret Minimization Techniques

    Cloud computing, as a fairly new commercial paradigm widely investigated by researchers, already presents a great range of challenges. Pricing is a major problem in the Cloud computing marketplace, as providers compete to attract more customers without knowing each other's pricing policies. To overcome this lack of knowledge, we model their competition as an incomplete-information game. This work proposes a pricing policy based on a regret minimization algorithm and applies it to the considered incomplete-information game. Reflecting the competition-based Cloud marketplace, providers update the distribution over their strategies using the experienced regret. Iteratively applying the algorithm to update the probabilities of strategies causes the regret to be minimized faster. The experimental results show a much larger increase in providers' profits in comparison with other pricing policies. Besides, the efficiency of a variety of regret minimization techniques in a simulated Cloud marketplace is discussed, which has not been observed in the studied literature. Moreover, the return on investment of providers in the considered organizations is studied, with promising results.
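
    For concreteness, here is a minimal Python sketch of regret matching over a discrete set of candidate prices, assuming a profit(price, round) feedback signal; the toy demand curve in the usage line is invented for illustration and is not the paper's market model.

```python
# Minimal sketch of regret matching: play prices in proportion to positive
# cumulative regret, so regret is driven down over repeated rounds.
import random

def regret_matching(prices, profit, rounds=1000):
    """Return a distribution over candidate prices learned by regret matching."""
    cum_regret = [0.0] * len(prices)
    for t in range(rounds):
        positive = [max(r, 0.0) for r in cum_regret]
        total = sum(positive)
        # Play proportionally to positive regret; uniform before any regret exists.
        weights = positive if total > 0 else [1.0] * len(prices)
        chosen = random.choices(range(len(prices)), weights=weights)[0]
        payoffs = [profit(p, t) for p in prices]  # counterfactual payoffs
        for i in range(len(prices)):
            cum_regret[i] += payoffs[i] - payoffs[chosen]
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total == 0:
        return [1.0 / len(prices)] * len(prices)
    return [r / total for r in positive]

# Toy market: demand falls linearly with price, so profit peaks at an
# interior price; regret matching concentrates on that price.
dist = regret_matching([5, 10, 15, 20],
                       profit=lambda p, t: p * max(0.0, 30 - 1.5 * p))
```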