Auto-tuning Distributed Stream Processing Systems using Reinforcement Learning

Author: Cuadrado Felix
Vaquero Luis M.
Publication date: 14/09/2018

Fine tuning distributed systems is considered to be a craftsmanship, relying on intuition and experience. This becomes even more challenging when the systems need to react in near real time, as streaming engines have to do to maintain pre-agreed service quality metrics. In this article, we present an automated approach that builds on a combination of supervised and reinforcement learning methods to recommend the most appropriate lever configurations based on previous load. With this, streaming engines can be automatically tuned without requiring a human to determine the right way and proper time to deploy them. This opens the door to new configurations that are not being applied today since the complexity of managing these systems has surpassed the abilities of human experts. We show how reinforcement learning systems can find substantially better configurations in less time than their human counterparts and adapt to changing workloads

arXiv.org e-Print Archive

Explore Bristol Research

Versatile stochastic dot product circuits based on nonvolatile memories for high performance neurocomputing and neurooptimization.

Author: Mahmoodi MR
Prezioso M
Strukov DB
Publication venue: eScholarship, University of California
Publication date: 01/11/2019

The key operation in stochastic neural networks, which have become the state-of-the-art approach for solving problems in machine learning, information theory, and statistics, is a stochastic dot-product. While there have been many demonstrations of dot-product circuits and, separately, of stochastic neurons, the efficient hardware implementation combining both functionalities is still missing. Here we report compact, fast, energy-efficient, and scalable stochastic dot-product circuits based on either passively integrated metal-oxide memristors or embedded floating-gate memories. The circuit's high performance is due to mixed-signal implementation, while the efficient stochastic operation is achieved by utilizing circuit's noise, intrinsic and/or extrinsic to the memory cell array. The dynamic scaling of weights, enabled by analog memory devices, allows for efficient realization of different annealing approaches to improve functionality. The proposed approach is experimentally verified for two representative applications, namely by implementing neural network for solving a four-node graph-partitioning problem, and a Boltzmann machine with 10-input and 8-hidden neurons

eScholarship - University of California

Comparative Analysis of Decision Tree Algorithms for Data Warehouse Fragmentation

Author: Hernández Giner Alor
López Chau Asdrúbal
Rodríguez Mazahua Lisbeth
Rodríguez Mazahua Nidia
Publication venue: 'Fundacion Universitaria Ceipa'
Publication date: 01/12/2020

One of the main problems faced by Data Warehouse designers is fragmentation. Several studies have proposed data mining-based horizontal fragmentation methods. However, no horizontal fragmentation technique exists that uses a decision tree. This paper presents an analysis of different decision tree algorithms to select the best one to implement the fragmentation method. The analysis was performed with version 3.9.4 of Weka, considering four evaluation metrics (Precision, ROC Area, Recall and F-measure) for different selected data sets using the Star Schema Benchmark. The results showed that the two best algorithms were J48 and Random Forest in most cases; nevertheless, J48 was selected because it is more efficient in building the model.
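As a rough illustration of three of the evaluation metrics named in the abstract above (Precision, Recall and F-measure), the sketch below computes them from true and predicted labels for one class. It is not taken from the paper or from Weka; the function name `precision_recall_f1` and the toy labels are assumptions made for this example only.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F-measure for one class (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

ROC Area, the fourth metric mentioned, needs ranked scores rather than hard labels and is left out of this sketch.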

Revistas CEIPA (CEIPA Business School)

Qd-tree: Learning Data Layouts for Big Data Analytics

Author: Agrawal Sanjay
Bruno Nicolas
Espeholt Lasse
Idreos Stratos
Liang Eric
Marcus Ryan
Moritz Philipp
Sun Liwen
Theo
Zilio Daniel C
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 22/04/2020

Corporations today collect data at an unprecedented and accelerating scale, making the need to run queries on large datasets increasingly important. Technologies such as columnar block-based data organization and compression have become standard practice in most commercial database systems. However, the problem of best assigning records to data blocks on storage is still open. For example, today's systems usually partition data by arrival time into row groups, or range/hash partition the data based on selected fields. For a given workload, however, such techniques are unable to optimize for the important metric of the number of blocks accessed by a query. This metric directly relates to the I/O cost, and therefore performance, of most analytical queries. Further, they are unable to exploit additional available storage to drive this metric down further. In this paper, we propose a new framework called a query-data routing tree, or qd-tree, to address this problem, and propose two algorithms for their construction based on greedy and deep reinforcement learning techniques. Experiments over benchmark and real workloads show that a qd-tree can provide physical speedups of more than an order of magnitude compared to current blocking schemes, and can reach within 2X of the lower bound for data skipping based on selectivity, while providing complete semantic descriptions of created blocks. Comment: ACM SIGMOD 202
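The metric highlighted in the abstract above, the number of blocks a query must access, can be made concrete with a small sketch. This is not the qd-tree algorithm; it only compares two simple layouts (arrival order versus sorted order) under the assumption that each block stores min/max metadata for a single integer column. The names `Block`, `build_blocks` and `blocks_accessed`, and the toy data, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    lo: int  # minimum value stored in the block
    hi: int  # maximum value stored in the block


def build_blocks(values, block_size):
    """Chunk values in the given order into blocks, recording min/max per block."""
    blocks = []
    for i in range(0, len(values), block_size):
        chunk = values[i:i + block_size]
        blocks.append(Block(min(chunk), max(chunk)))
    return blocks


def blocks_accessed(blocks, q_lo, q_hi):
    """Count blocks whose [lo, hi] range overlaps the query range [q_lo, q_hi]."""
    return sum(1 for b in blocks if b.lo <= q_hi and b.hi >= q_lo)


# Toy column values in arrival order, invented for illustration only.
arrival = [5, 1, 9, 3, 7, 2, 8, 4, 6, 0]

by_arrival = build_blocks(arrival, block_size=5)
by_value = build_blocks(sorted(arrival), block_size=5)

# Range query: 0 <= value <= 2
print(blocks_accessed(by_arrival, 0, 2))  # both blocks overlap the range
print(blocks_accessed(by_value, 0, 2))    # only the first block overlaps
```

In this toy case the sorted layout lets the query skip one of the two blocks, which is the kind of saving a workload-aware layout aims for.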

arXiv.org e-Print Archive

Crossref

Image-Based Modeling of Bridges and Its Applications to Evaluating Resiliency of Transportation Networks

Author: Cetiner Barbaros
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Modern urban areas are heavily dependent on transportation networks to sustain their economic life. Hence, when vital components of a regional network are disrupted, economic losses are inevitable. As evidenced by the 1989 Loma Prieta and 1994 Northridge earthquakes, the seismic damage experienced by bridges alone results in extensive traffic delays and rerouting, not only hindering emergency response but also causing indirect economic losses that far surpass the direct cost of damage to infrastructure. Nevertheless, in many areas of the U.S., transportation networks lack the resilience required to sustain the potential demands of natural hazards. Traditional hazard assessment methods, in theory, provide the tools required for predicting the vulnerabilities associated with natural hazards. Nonetheless, due to their abstractions of the complex infrastructure and the coupled regional behavior, they often fall short of that expectation. This study proposes a semi-automated image-based model generation framework for producing structure-specific models and fragility functions of bridges. The framework effectively fuses geometric and semantic information extracted from Google Street View images with centerline curve geometry, surface topology, and various relevant metadata to construct extremely accurate geometric representations of bridges. Then, using class statistics available in the literature for bridge structural properties, the framework generates structural models. Both the performance of the geometry extraction procedure and the structural modeling method proposed here are validated by comparison against the structural model of a real-life bridge developed based on as-built drawings. In principle, these models can be utilized to assess physical damage for any type of hazard, but in this study, the focus is limited to seismic applications.
Thus, to relate damage to the seismic demands from ground shaking, bridge-specific fragility functions are developed for 100 bridge structures in the immediate surroundings of the Ports of Los Angeles and Long Beach. Using these fragility curves, the physical damage resulting from a magnitude 7.3 scenario earthquake on the Palos Verdes fault is predicted. Subsequently, the effects of the bridge infrastructure damage on the transportation patterns in the Los Angeles metropolitan area are investigated in terms of various resilience metrics.

eScholarship - University of California