3 research outputs found

    GetLB: Efficient Load Balancing for Scheduling Electronic Financial Transactions

    This paper presents the ideas behind the development of a load-balancing framework called GetLB. In the context of electronic funds transfer (EFT), GetLB offers a new way to organize the interactions between the switch and the processing machines. This organization allows the switch to combine up-to-date information and run a dynamic scheduling algorithm instead of applying Round-Robin across the processing machines. GetLB's scheduling algorithm divides transactions into different types, matching their CPU, memory, and disk requirements against data from the processing machines to deliver efficient load balancing. A prototype was implemented with RMI, and tests showed that the framework is viable for transaction processing in both homogeneous and heterogeneous environments. In addition, the evaluation showed the advantages of adopting the GetLB algorithm over the traditional Round-Robin approach.
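
    The abstract describes the scheduling idea only in prose. As a rough illustration, the following Python sketch shows a load-aware, per-type dispatch of the kind GetLB is said to perform, next to the Round-Robin baseline it is compared against. The transaction types, demand figures, and cost function here are hypothetical placeholders, not taken from the paper (whose actual prototype was built with RMI).

    from dataclasses import dataclass
    from itertools import cycle

    # Hypothetical resource demand per transaction type (fractions of one
    # machine's CPU, memory, and disk capacity); the real GetLB types and
    # weights are not given in the abstract.
    DEMANDS = {
        "debit":  (0.10, 0.05, 0.02),
        "credit": (0.15, 0.08, 0.05),
        "query":  (0.02, 0.10, 0.20),
    }

    @dataclass
    class Machine:
        name: str
        cpu: float = 0.0   # current CPU load, 0..1
        mem: float = 0.0   # current memory load, 0..1
        disk: float = 0.0  # current disk load, 0..1

    def getlb_pick(machines, tx_type):
        """Pick the machine whose load plus the transaction's demand is lowest.

        A stand-in for GetLB's dynamic scheduling heuristic: the switch combines
        up-to-date load data with per-type resource needs instead of rotating
        blindly over the machines.
        """
        c, m, d = DEMANDS[tx_type]
        return min(machines, key=lambda mc: (mc.cpu + c) + (mc.mem + m) + (mc.disk + d))

    def round_robin(machines):
        """The Round-Robin baseline the paper compares against: ignores load."""
        return cycle(machines)

    if __name__ == "__main__":
        cluster = [Machine("node-a"), Machine("node-b", cpu=0.7), Machine("node-c")]
        chosen = getlb_pick(cluster, "credit")
        print(f"GetLB routes the transaction to {chosen.name}")  # skips loaded node-b

    The point of the contrast is that the cost-based pick avoids the machine that is already loaded, which a blind Round-Robin rotation cannot do.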

    Runtime Prediction for Scale-Out Data Analytics

    Many analytics applications generate mixed workloads, i.e., workloads composed of analytical tasks with different processing characteristics, including data pre-processing, SQL, and iterative machine learning algorithms. Examples of such mixed workloads can be found in web data analysis, social media analysis, and graph analytics, where they are executed repetitively on large input datasets (e.g., "Find the average user time spent on the top 10 most popular web pages on the UK domain web graph."). Scale-out processing engines satisfy the needs of these applications by distributing the data and the processing task efficiently among multiple workers that are first reserved and then used to execute the task in parallel on a cluster of machines. Finding a resource allocation that can complete the workload execution within a given time constraint, and optimizing cluster resource allocation among multiple analytical workloads, both motivate the need to estimate the runtime of a workload before its actual execution.

    Predicting the runtime of analytical workloads is a challenging problem because runtime depends on a large number of factors that are hard to model prior to execution. These factors can be summarized as the workload characteristics (i.e., data statistics and processing costs), the execution configuration (i.e., deployment, resource allocation, and software settings), and the cost model that captures the interplay among all of the above parameters. While conventional cost models proposed in the context of query optimization can assess the relative order among alternative SQL query plans, they are not designed to estimate absolute runtime. Additionally, conventional models are ill-equipped to estimate the runtime of iterative analytics that are executed repetitively until convergence, and that of user-defined data pre-processing operators, which are not "owned" by the underlying data management system.

    This thesis demonstrates that the runtime of data analytics can be predicted accurately by breaking the analytical tasks into multiple processing phases, collecting key input features during a reference execution on a sample of the dataset, and then using the features to build per-phase cost models. We develop prediction models for three categories of data analytics produced by social media applications: iterative machine learning, data pre-processing, and reporting SQL. The prediction framework for iterative analytics, PREDIcT, addresses the challenging problem of estimating the number of iterations and the per-iteration runtime for a class of iterative machine learning algorithms that are run repetitively until convergence. The hybrid prediction models we develop for data pre-processing tasks and for reporting SQL combine the benefits of analytical modeling with those of machine-learning-based models. Through a training methodology and a pruning algorithm, we reduce the cost of running training queries to a minimum while maintaining a good level of accuracy for the models.
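
    As a minimal sketch of the per-phase approach the abstract outlines: run the workload on samples of the dataset, record per-phase runtimes, fit one small cost model per phase, and sum the per-phase predictions for the full dataset, scaling the iterative phase by an estimated iteration count (estimating that count is PREDIcT's job in the thesis; it is a constant here). The phase names, linear models, and all numbers below are illustrative assumptions, not the thesis's actual features or models.

    import numpy as np

    # Hypothetical reference measurements: per-phase runtimes (seconds) observed
    # while executing the workload on data samples of increasing size (rows).
    # The thesis collects richer input features; sample size alone is an
    # assumption made to keep the sketch small.
    SAMPLE_ROWS = np.array([1e5, 2e5, 4e5, 8e5])
    PHASE_RUNTIMES = {
        "pre-processing": np.array([1.2, 2.3, 4.6, 9.1]),
        "sql":            np.array([0.8, 1.5, 3.1, 6.0]),
        "ml-iteration":   np.array([0.5, 1.0, 2.1, 4.2]),
    }

    def fit_phase_models(rows, runtimes_by_phase):
        """Fit one linear cost model (slope, intercept) per phase by least squares."""
        return {phase: np.polyfit(rows, t, deg=1) for phase, t in runtimes_by_phase.items()}

    def predict_runtime(models, full_rows, n_iterations):
        """Sum per-phase predictions for the full input; the iterative phase is
        multiplied by the estimated iteration count."""
        total = 0.0
        for phase, (slope, intercept) in models.items():
            phase_time = slope * full_rows + intercept
            total += phase_time * (n_iterations if phase == "ml-iteration" else 1)
        return total

    if __name__ == "__main__":
        models = fit_phase_models(SAMPLE_ROWS, PHASE_RUNTIMES)
        est = predict_runtime(models, full_rows=1e7, n_iterations=25)
        print(f"estimated end-to-end runtime: {est:.0f} s")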

    Workload Management for Big Data Analytics

    No full text