3 research outputs found

    GetLB: Efficient Load Balancing for Scheduling Electronic Financial Transactions

    This paper presents the ideas behind the development of a load-balancing framework called GetLB. In the context of electronic funds transfer (EFT), GetLB offers a new way to organize the interactions between the switch and the processing machines. This organization allows the switch to combine up-to-date information and run a dynamic scheduling algorithm instead of applying Round-Robin across the processing machines. GetLB's scheduling algorithm divides transactions into different types, matching their CPU, memory, and disk requirements against data from the processing machines to deliver efficient load balancing. A prototype was implemented with RMI, and tests showed that the framework is viable for transaction processing in both homogeneous and heterogeneous environments. In addition, the evaluation showed the advantages of adopting the GetLB algorithm over the traditional Round-Robin approach.
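
    The abstract describes the scheduling idea only in prose. As a rough illustration, the following Python sketch shows a load-aware, per-type dispatch of the kind GetLB is said to perform, next to the Round-Robin baseline it is compared against. The transaction types, demand figures, and cost function here are hypothetical placeholders, not taken from the paper (whose actual prototype was built with RMI).

    from dataclasses import dataclass
    from itertools import cycle

    # Hypothetical resource demand per transaction type (fractions of one
    # machine's CPU, memory, and disk capacity); the real GetLB types and
    # weights are not given in the abstract.
    DEMANDS = {
        "debit":  (0.10, 0.05, 0.02),
        "credit": (0.15, 0.08, 0.05),
        "query":  (0.02, 0.10, 0.20),
    }

    @dataclass
    class Machine:
        name: str
        cpu: float = 0.0   # current CPU load, 0..1
        mem: float = 0.0   # current memory load, 0..1
        disk: float = 0.0  # current disk load, 0..1

    def getlb_pick(machines, tx_type):
        """Pick the machine whose load plus the transaction's demand is lowest.

        A stand-in for GetLB's dynamic scheduling heuristic: the switch combines
        up-to-date load data with per-type resource needs instead of rotating
        blindly over the machines.
        """
        c, m, d = DEMANDS[tx_type]
        return min(machines, key=lambda mc: (mc.cpu + c) + (mc.mem + m) + (mc.disk + d))

    def round_robin(machines):
        """The Round-Robin baseline the paper compares against: ignores load."""
        return cycle(machines)

    if __name__ == "__main__":
        cluster = [Machine("node-a"), Machine("node-b", cpu=0.7), Machine("node-c")]
        chosen = getlb_pick(cluster, "credit")
        print(f"GetLB routes the transaction to {chosen.name}")  # skips loaded node-b

    The point of the contrast is that the cost-based pick avoids the machine that is already loaded, which a blind Round-Robin rotation cannot do.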

    Runtime Prediction for Scale-Out Data Analytics

    Many analytics applications generate mixed workloads, i.e., workloads composed of analytical tasks with different processing characteristics, including data pre-processing, SQL, and iterative machine learning algorithms. Examples of such mixed workloads can be found in web data analysis, social media analysis, and graph analytics, where they are executed repetitively on large input datasets (e.g., "Find the average user time spent on the top 10 most popular web pages on the UK domain web graph."). Scale-out processing engines satisfy the needs of these applications by distributing the data and the processing task efficiently among multiple workers that are first reserved and then used to execute the task in parallel on a cluster of machines. Finding a resource allocation that can complete the workload execution within a given time constraint, and optimizing cluster resource allocation among multiple analytical workloads, both motivate the need to estimate the runtime of a workload before its actual execution.

    Predicting the runtime of analytical workloads is a challenging problem because runtime depends on a large number of factors that are hard to model prior to execution. These factors can be summarized as the workload characteristics (i.e., data statistics and processing costs), the execution configuration (i.e., deployment, resource allocation, and software settings), and the cost model that captures the interplay among all of the above parameters. While conventional cost models proposed in the context of query optimization can assess the relative order among alternative SQL query plans, they are not designed to estimate absolute runtime. Additionally, conventional models are ill-equipped to estimate the runtime of iterative analytics that are executed repetitively until convergence, and that of user-defined data pre-processing operators, which are not "owned" by the underlying data management system.

    This thesis demonstrates that the runtime of data analytics can be predicted accurately by breaking the analytical tasks into multiple processing phases, collecting key input features during a reference execution on a sample of the dataset, and then using the features to build per-phase cost models. We develop prediction models for three categories of data analytics produced by social media applications: iterative machine learning, data pre-processing, and reporting SQL. The prediction framework for iterative analytics, PREDIcT, addresses the challenging problem of estimating the number of iterations and the per-iteration runtime for a class of iterative machine learning algorithms that are run repetitively until convergence. The hybrid prediction models we develop for data pre-processing tasks and for reporting SQL combine the benefits of analytical modeling with those of machine-learning-based models. Through a training methodology and a pruning algorithm, we reduce the cost of running training queries to a minimum while maintaining a good level of accuracy for the models.
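
    As a minimal sketch of the per-phase approach the abstract outlines: run the workload on samples of the dataset, record per-phase runtimes, fit one small cost model per phase, and sum the per-phase predictions for the full dataset, scaling the iterative phase by an estimated iteration count (estimating that count is PREDIcT's job in the thesis; it is a constant here). The phase names, linear models, and all numbers below are illustrative assumptions, not the thesis's actual features or models.

    import numpy as np

    # Hypothetical reference measurements: per-phase runtimes (seconds) observed
    # while executing the workload on data samples of increasing size (rows).
    # The thesis collects richer input features; sample size alone is an
    # assumption made to keep the sketch small.
    SAMPLE_ROWS = np.array([1e5, 2e5, 4e5, 8e5])
    PHASE_RUNTIMES = {
        "pre-processing": np.array([1.2, 2.3, 4.6, 9.1]),
        "sql":            np.array([0.8, 1.5, 3.1, 6.0]),
        "ml-iteration":   np.array([0.5, 1.0, 2.1, 4.2]),
    }

    def fit_phase_models(rows, runtimes_by_phase):
        """Fit one linear cost model (slope, intercept) per phase by least squares."""
        return {phase: np.polyfit(rows, t, deg=1) for phase, t in runtimes_by_phase.items()}

    def predict_runtime(models, full_rows, n_iterations):
        """Sum per-phase predictions for the full input; the iterative phase is
        multiplied by the estimated iteration count."""
        total = 0.0
        for phase, (slope, intercept) in models.items():
            phase_time = slope * full_rows + intercept
            total += phase_time * (n_iterations if phase == "ml-iteration" else 1)
        return total

    if __name__ == "__main__":
        models = fit_phase_models(SAMPLE_ROWS, PHASE_RUNTIMES)
        est = predict_runtime(models, full_rows=1e7, n_iterations=25)
        print(f"estimated end-to-end runtime: {est:.0f} s")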

    Workload Management for Big Data Analytics

    No full text