58 research outputs found
Optimizing the Replication of Multi-Quality Web Applications Using ACO and WoLF
This thesis presents the adaptation of Ant Colony Optimization to a new NP-hard problem involving the replication of multi-quality database-driven web applications (DAs) by a large application service provider (ASP). The ASP must assign DA replicas to its network of heterogeneous servers so that user demand is satisfied and replica update loads are minimized. AntDA, the algorithm proposed for solving this problem, is novel in several respects: ants traverse a bipartite graph in both directions as they construct solutions, pheromone is used for traversing from one side of the bipartite graph to the other and back again, heuristic edge values change as ants construct solutions, and ants may sometimes produce infeasible solutions. Experiments show that AntDA outperforms several other solution methods, but that there is room for improvement in the convergence rates of the ants. Therefore, to achieve faster convergence and better solution values for larger problems, AntDA was combined with the variable-step policy hill-climbing algorithm called Win or Learn Fast (WoLF). In experiments, adding this learning algorithm to AntDA yielded faster convergence while still outperforming other solution methods.
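The abstract does not say how WoLF is wired into AntDA, so the following is only a minimal sketch of the generic WoLF idea for a single state: take a small step when the current policy is doing better than its long-run average ("winning") and a larger step otherwise ("learn fast"). The function name `wolf_step` and the parameters `delta_win` and `delta_lose` are hypothetical names chosen for illustration.

```python
from typing import List


def wolf_step(policy: List[float], avg_policy: List[float], q: List[float],
              delta_win: float, delta_lose: float) -> List[float]:
    """One WoLF policy hill-climbing step for a single state.

    policy      current action probabilities
    avg_policy  running average of past policies
    q           current action-value estimates
    Assumes at least two actions and delta_win < delta_lose.
    """
    expected = sum(p * v for p, v in zip(policy, q))
    expected_avg = sum(p * v for p, v in zip(avg_policy, q))
    # Winning: the current policy beats the average policy, so step cautiously.
    delta = delta_win if expected > expected_avg else delta_lose

    n = len(q)
    best = max(range(n), key=q.__getitem__)
    new_policy = list(policy)
    for action in range(n):
        if action != best:
            # Never take more probability than the action currently holds.
            step = min(new_policy[action], delta / (n - 1))
            new_policy[action] -= step
            new_policy[best] += step
    return new_policy
```

Because probability mass is only moved between actions, the result remains a valid distribution; the only thing that changes between the winning and losing cases is the step size.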
Query optimizers based on machine learning techniques
Integrated master's dissertation in Informatics Engineering. Query optimizers are considered one of the most relevant and sophisticated components
in a database management system. However, despite currently producing nearly optimal
results, optimizers rely on statistical estimates and heuristics to reduce the search space
of alternative execution plans for a single query. As a result, for more complex queries,
errors may grow exponentially, often translating into sub-optimal plans resulting in less
than ideal performance. Recent advances in machine learning techniques have opened
new opportunities for many of the existing problems related to system optimization.
This document proposes a solution built on top of PostgreSQL that learns to select
the most efficient set of optimizer strategy settings for a particular query. Instead of
depending entirely on the optimizer’s estimates to compare different plans under different
configurations, it relies on a greedy selection algorithm that supports several types of
predictive modeling techniques, from more traditional modeling techniques to a deep
learning approach.
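The abstract gives no detail on the greedy algorithm itself, so the following is only a sketch under stated assumptions: a learned model is represented by a hypothetical `predict(query, config)` callable returning an estimated latency, and the search flips one planner switch per round, keeping the flip that lowers the prediction most. The three setting names are real PostgreSQL planner options, but choosing this particular subset is an assumption made for illustration.

```python
from typing import Callable, Dict

Config = Dict[str, bool]

# Real PostgreSQL planner switches; this subset is an illustrative choice.
SETTINGS = ("enable_hashjoin", "enable_mergejoin", "enable_nestloop")


def greedy_select(query: str, predict: Callable[[str, Config], float]) -> Config:
    """Greedily pick the setting combination with the lowest predicted latency.

    `predict` stands in for any trained model (hypothetical interface).
    """
    config = {name: True for name in SETTINGS}  # start from the defaults
    best = predict(query, config)
    while True:
        # Evaluate every single-setting flip of the current configuration.
        trials = []
        for name in SETTINGS:
            trial = {**config, name: not config[name]}
            trials.append((predict(query, trial), trial))
        cost, trial = min(trials, key=lambda item: item[0])
        if cost >= best:
            return config  # no flip improves the prediction
        config, best = trial, cost
```

The loop terminates because the predicted cost strictly decreases each round and there are finitely many configurations. A chosen configuration could then be applied per query with `SET enable_nestloop = off` and similar statements before execution.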
The system is evaluated experimentally with the standard TPC-H and Join Ordering Benchmark workloads to measure the cost and benefits of adding machine learning
capabilities to traditional query optimizers.
This work is financed by National Funds through the Portuguese funding agency, FCT
- Fundação para a Ciência e a Tecnologia, within project UIDB/50014/2020
Online failure prediction in air traffic control systems
This thesis introduces a novel approach to online failure prediction for mission-critical distributed systems that is distinctively black-box, non-intrusive and online. The approach combines Complex Event Processing (CEP) and Hidden Markov Models (HMM) to analyze symptoms of failures, which may appear as anomalous conditions of performance metrics identified for this purpose. The thesis presents an architecture named CASPER, based on CEP and HMM, that relies only on information sniffed from the communication network of a mission-critical system to predict anomalies that can lead to software failures. An instance of CASPER has been implemented, trained and tuned to monitor a real Air Traffic Control (ATC) system developed by Selex ES, a Finmeccanica Company. An extensive experimental evaluation of CASPER is presented. The results show (i) a very low percentage of false positives under both normal and stress conditions, and (ii) a failure prediction time high enough to allow the system to apply appropriate recovery procedures.
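How CASPER configures its HMM is not described in the abstract. As a generic illustration of how an HMM can turn a stream of observed symptoms into a belief about a hidden system condition, here is a minimal forward-filtering sketch. The state and symbol meanings in the comments are assumptions for the example, not taken from the thesis.

```python
from typing import List, Sequence


def _normalize(values: List[float]) -> List[float]:
    # Assumes the observation sequence has non-zero probability under the model.
    total = sum(values)
    return [v / total for v in values]


def filter_state(observations: Sequence[int], start: List[float],
                 trans: List[List[float]], emit: List[List[float]]) -> List[float]:
    """Return P(hidden state | all observations so far) via the forward algorithm.

    start[s]     probability of starting in state s
    trans[r][s]  probability of moving from state r to state s
    emit[s][o]   probability of observing symbol o while in state s
    """
    n = len(start)
    belief = _normalize([start[s] * emit[s][observations[0]] for s in range(n)])
    for obs in observations[1:]:
        # Predict one step ahead, then weight by how well each state explains obs.
        belief = _normalize([
            sum(belief[r] * trans[r][s] for r in range(n)) * emit[s][obs]
            for s in range(n)
        ])
    return belief
```

With state 0 read as "healthy" and state 1 as "degrading" (hypothetical labels), a monitor could raise an alert once the second entry of the returned belief crosses a chosen threshold. Normalizing at every step avoids numerical underflow on long observation streams.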
Flexible and intelligent network programming for cloud networks
As modern online services evolve rapidly and involve larger amounts of data and computation than ever, the demand for cloud networks keeps growing, which also brings new challenges to network programming.
Network programming is a complicated and crucial task for building high-performance cloud networks. Current network programming mainly presents two shortcomings: (1) it is inflexible as adding new data-plane features usually takes several years; (2) it is unintelligent as it heavily depends on human-designed heuristic algorithms to solve production-scale problems.
To overcome these shortcomings, this dissertation realizes flexible and intelligent network programming by leveraging the recent development of new technologies both in hardware and software. Specifically, it presents four systems with new performance features that cannot be achieved by conventional network programming:
(i) Harmonia: A new replicated storage architecture that provides near-linear scalability without sacrificing consistency. By exploiting the programming flexibility of new-generation programmable switches, Harmonia checks read-write conflicts in network for guaranteeing consistency, and enables any replica to serve reads for objects with no pending writes for near-linear scalability.
(ii) RackSched: A microsecond-scale scheduler for rack-scale computers. It proposes a two-layer scheduling framework that integrates the inter-server scheduler in the top-of-rack (ToR) switch with intra-server schedulers on each server. The in-network inter-server scheduler is programmed to realize power-of-k-choices, ensure request affinity, and track server loads accurately and efficiently.
(iii) NetVRM: A network management system that supports dynamic register memory sharing in the network. It orchestrates the register memory allocation between multiple concurrent network applications to optimize the multiplexing benefits. This goal is achieved with three major features: a virtual register memory abstraction, a dynamic memory allocation algorithm, and a domain-specific programming language extension.
(iv) NeuroPlan: Automated and efficient network planning with deep reinforcement learning (RL). It leverages a two-stage hybrid approach that first uses deep RL to prune a large and complex search space and then uses an Integer Linear Programming (ILP) solver to find the final solution. Such an automated approach avoids human efforts to design heuristic algorithms manually and reduces network plan cost efficiently.
We have performed theoretical analysis, built testbeds, and evaluated these systems with prototype experiments and simulations under realistic setups from production networks.
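The power-of-k-choices policy mentioned for RackSched can be illustrated with a small host-side simulation. This is only a sketch of the general policy, not of the switch data-plane implementation: each request samples k servers at random and goes to the least loaded of them. The function names are hypothetical, and requests never complete in this toy model, so loads only grow.

```python
import random
from typing import List


def pick_server(loads: List[int], k: int, rng: random.Random) -> int:
    """Sample k distinct servers uniformly and return the least loaded one."""
    candidates = rng.sample(range(len(loads)), k)
    return min(candidates, key=lambda server: loads[server])


def dispatch(num_requests: int, num_servers: int, k: int, seed: int = 0) -> List[int]:
    """Assign requests one by one and return the final per-server load."""
    rng = random.Random(seed)
    loads = [0] * num_servers
    for _ in range(num_requests):
        loads[pick_server(loads, k, rng)] += 1
    return loads
```

Setting `k=1` reduces to purely random assignment, while `k` equal to the number of servers always picks a globally least-loaded server; small values such as `k=2` sit in between while inspecting only a few load counters per request.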
Adaptive Asynchronous Control and Consistency in Distributed Data Exploration Systems
Advances in machine learning and streaming systems provide a backbone to transform vast arrays of raw data into valuable information. Leveraging distributed execution, analysis engines can process this information effectively within an iterative data exploration workflow to solve problems at unprecedented rates. However, with increased input dimensionality, a desire to simultaneously share and isolate information, as well as overlapping and dependent tasks, this process is becoming increasingly difficult to maintain. User interaction derails exploratory progress due to manual oversight on lower-level tasks such as tuning parameters, adjusting filters, and monitoring queries. We identify human-in-the-loop management of data generation and distributed analysis as an inhibiting problem that precludes efficient online, iterative data exploration and causes delays in knowledge discovery and decision making. The flexible and scalable systems implementing the exploration workflow require semi-autonomous methods integrated as architectural support to reduce human involvement. We thus argue that an abstraction layer providing adaptive asynchronous control and consistency management over a series of individual tasks coordinated to achieve a global objective can significantly improve data exploration effectiveness and efficiency. This thesis introduces methodologies that autonomously coordinate distributed execution at a lower level in order to synchronize multiple efforts as part of a common goal. We demonstrate the impact on data exploration through serverless simulation ensemble management and multi-model machine learning, showing improved performance and reduced resource utilization that enable a more productive semi-autonomous exploration workflow. We focus on the specific domains of molecular dynamics and personalized healthcare; however, the contributions are applicable to a wide variety of domains.
End-to-end deep reinforcement learning in computer systems
The growing complexity of data processing systems has long led systems designers to imagine systems (e.g. databases, schedulers) which can self-configure and adapt based on environmental cues. In this context, reinforcement learning (RL) methods have since their inception appealed to systems developers. They promise to acquire complex decision policies from raw feedback signals. Despite their conceptual popularity, RL methods are scarcely found in real-world data processing systems. Recently, RL has seen explosive growth in interest due to high profile successes when utilising large neural networks (deep reinforcement learning). Newly emerging machine learning frameworks and powerful hardware accelerators have given rise to a plethora of new potential applications.
In this dissertation, I first argue that in order to design and execute deep RL algorithms efficiently, novel software abstractions are required which can accommodate the distinct computational patterns of communication-intensive and fast-evolving algorithms. I propose an architecture which decouples logical algorithm construction from local and distributed execution semantics. I further present RLgraph, my proof-of-concept implementation of this architecture. In RLgraph, algorithm developers can explore novel designs by constructing a high-level data flow graph through combination of logical components. This dataflow graph is independent of specific backend frameworks or notions of execution, and is only later mapped to execution semantics via a staged build process. RLgraph enables high-performing algorithm implementations while maintaining flexibility for rapid prototyping.
Second, I investigate reasons for the scarcity of RL applications in systems themselves. I argue that progress in applied RL is hindered by a lack of tools for task model design which bridge the gap between systems and algorithms, and also by missing shared standards for evaluation of model capabilities. I introduce Wield, a first-of-its-kind tool for incremental model design in applied RL. Wield provides a small set of primitives which decouple systems interfaces and deployment-specific configuration from representation. Core to Wield is a novel instructive experiment protocol called progressive randomisation which helps practitioners to incrementally evaluate different dimensions of non-determinism. I demonstrate how Wield and progressive randomisation can be used to reproduce and assess prior work, and to guide implementation of novel RL applications
Technologies and Applications for Big Data Value
This open access book explores cutting-edge solutions and best practices for big data and data-driven AI applications for the data-driven economy. It provides the reader with a basis for understanding how technical issues can be overcome to offer real-world solutions to major industrial areas. The book starts with an introductory chapter that provides an overview of the book by positioning the following chapters in terms of their contributions to technology frameworks which are key elements of the Big Data Value Public-Private Partnership and the upcoming Partnership on AI, Data and Robotics. The remainder of the book is then arranged in two parts. The first part, “Technologies and Methods”, contains horizontal contributions of technologies and methods that enable data value chains to be applied in any sector. The second part, “Processes and Applications”, details experience reports and lessons from using big data and data-driven approaches in processes and applications. Its chapters are co-authored with industry experts and cover domains including health, law, finance, retail, manufacturing, mobility, and smart cities. Contributions emanate from the Big Data Value Public-Private Partnership and the Big Data Value Association, which have acted as the European data community's nucleus to bring together businesses with leading researchers to harness the value of data to benefit society, business, science, and industry. The book is of interest to two primary audiences: first, undergraduate and postgraduate students and researchers in various fields, including big data, data science, data engineering, and machine learning and AI; second, practitioners and industry experts engaged in data-driven systems, software design and deployment projects who are interested in employing these advanced methods to address real-world problems.
- …