    A framework for evaluating the impact of communication on performance in large-scale distributed urban simulations

    A primary motivation for employing distributed simulation is to enable the execution of large-scale simulation workloads that cannot be handled by the resources of a single stand-alone computing node. To make execution possible, the workload is distributed among multiple computing nodes connected to one another via a communication network. The execution of a distributed simulation involves alternating phases of computation and communication to coordinate the co-operating nodes and ensure correctness of the resulting simulation outputs. Reliably estimating the execution performance of a distributed simulation can be difficult due to non-deterministic execution paths involved in alternating computation and communication operations. However, performance estimates are useful as a guide for the simulation time that can be expected when using a given set of computing resources. Performance estimates can support decisions to commit time and resources to running distributed simulations, especially where significant amounts of funds or computing resources are necessary. Various performance estimation approaches are employed in the distributed computing literature, including the influential Bulk Synchronous Parallel (BSP) and LogP models. Different approaches make various assumptions that render them more suitable for some applications than for others. Actual performance depends on characteristics inherent to each distributed simulation application. An important aspect of these individual characteristics is the dynamic relationship between the communication and computation phases of the distributed simulation application. This work develops a framework for estimating the performance of distributed simulation applications, focusing mainly on aspects relevant to the dynamic relationship between communication and computation during distributed simulation execution. The framework proposes a meta-simulation approach based on the Multi-Agent Simulation (MAS) paradigm. 
Using the approach proposed by the framework, meta-simulations can be developed to investigate the performance of specific distributed simulation applications. The approach makes it possible to compare various what-if scenarios, which is useful for assessing the effects of parameters and strategies such as the number of computing nodes, the communication strategy, and the workload-distribution strategy. The proposed meta-simulation approach can also aid a search for optimal parameters and strategies for specific distributed simulation applications. The framework is demonstrated by implementing a meta-simulation based on case studies from the Urban Simulation domain.

    Speculative Approximations for Terascale Analytics

    Model calibration is a major challenge faced by the plethora of statistical analytics packages that are increasingly used in Big Data applications. Identifying the optimal model parameters is a time-consuming process that has to be executed from scratch for every dataset/model combination even by experienced data scientists. We argue that the incapacity to evaluate multiple parameter configurations simultaneously and the lack of support to quickly identify sub-optimal configurations are the principal causes. In this paper, we develop two database-inspired techniques for efficient model calibration. Speculative parameter testing applies advanced parallel multi-query processing methods to evaluate several configurations concurrently. The number of configurations is determined adaptively at runtime, while the configurations themselves are extracted from a distribution that is continuously learned following a Bayesian process. Online aggregation is applied to identify sub-optimal configurations early in the processing by incrementally sampling the training dataset and estimating the objective function corresponding to each configuration. We design concurrent online aggregation estimators and define halting conditions to stop the execution accurately and in a timely manner. We apply the proposed techniques to distributed gradient descent optimization -- batch and incremental -- for support vector machines and logistic regression models. We implement the resulting solutions in GLADE PF-OLA -- a state-of-the-art Big Data analytics system -- and evaluate their performance over terascale-size synthetic and real datasets. The results confirm that as many as 32 configurations can be evaluated concurrently almost as fast as one, while sub-optimal configurations are detected accurately in as little as a 1/20th fraction of the time.
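The idea of pruning sub-optimal configurations from incremental samples can be illustrated with a minimal sketch. It is not the GLADE PF-OLA implementation or the paper's estimators: the function name `prune_configs`, the batch size, and the normal-approximation confidence interval used as the halting condition are all assumptions made for illustration.

```python
import math
import random

def prune_configs(data, configs, loss, batch=200, z=1.96, seed=0):
    """Scan `data` in random order, sharing each batch across all live configs.

    A config is dropped once the lower end of its loss interval lies above
    the smallest upper end among the live configs (hypothetical halting rule).
    Returns the surviving configs and the number of examples scanned.
    """
    rng = random.Random(seed)
    order = list(data)
    rng.shuffle(order)
    # Per config: [count, sum of losses, sum of squared losses].
    stats = {c: [0, 0.0, 0.0] for c in configs}
    live = set(configs)
    seen = 0
    for start in range(0, len(order), batch):
        chunk = order[start:start + batch]
        seen += len(chunk)
        # One pass over the chunk updates every live configuration.
        for c in live:
            s = stats[c]
            for x in chunk:
                v = loss(c, x)
                s[0] += 1
                s[1] += v
                s[2] += v * v
        # Normal-approximation interval for the mean loss of each config.
        bounds = {}
        for c in live:
            n, total, sq = stats[c]
            mean = total / n
            var = max(sq / n - mean * mean, 0.0)
            half = z * math.sqrt(var / n)
            bounds[c] = (mean - half, mean + half)
        best_upper = min(hi for _, hi in bounds.values())
        live = {c for c in live if bounds[c][0] <= best_upper}
        if len(live) == 1:
            break  # halt early: a single configuration remains
    return live, seen
```

Here `configs` must be hashable (for example, tuples of hyperparameters) and `loss(config, example)` returns the per-example objective value. Sharing each sampled batch across all live configurations mirrors, in a very simplified way, the concurrent evaluation described in the abstract.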

    Distributed top-k aggregation queries at large

    Top-k query processing is a fundamental building block for efficient ranking in a large number of applications. Efficiency is a central issue, especially for distributed settings, when the data is spread across different nodes in a network. This paper introduces novel optimization methods for top-k aggregation queries in such distributed environments. The optimizations can be applied to all algorithms that fall into the frameworks of the prior TPUT and KLEE methods. The optimizations address three degrees of freedom: 1) hierarchically grouping input lists into top-k operator trees and optimizing the tree structure, 2) computing data-adaptive scan depths for different input sources, and 3) data-adaptive sampling of a small subset of input sources in scenarios with hundreds or thousands of query-relevant network nodes. All optimizations are based on a statistical cost model that utilizes local synopses, e.g., in the form of histograms, efficiently computed convolutions, and estimators based on order statistics. The paper presents comprehensive experiments, with three different real-life datasets and using the ns-2 network simulator for a packet-level simulation of a large Internet-style network.

    H-WorD: Supporting job scheduling in Hadoop with workload-driven data redistribution

    The final publication is available at http://link.springer.com/chapter/10.1007/978-3-319-44039-2_21
    Today’s distributed data processing systems typically follow a query shipping approach and exploit data locality to reduce network traffic. In such systems the distribution of data over the cluster resources plays a significant role, and when skewed, it can harm the performance of executing applications. In this paper, we address the challenges of automatically adapting the distribution of data in a cluster to the workload imposed by the input applications. We propose a generic algorithm, named H-WorD, which, based on the estimated workload over resources, suggests alternative execution scenarios of tasks, and hence identifies required transfers of input data a priori, so that data is brought close to the execution in time. We exemplify our algorithm in the context of MapReduce jobs in a Hadoop ecosystem. Finally, we evaluate our approach and demonstrate the performance gains of automatic data redistribution. Peer reviewed. Postprint (author's final draft).

    Lower bound cost estimation for logic programs

    It is generally recognized that information about the runtime cost of computations can be useful for a variety of applications, including program transformation, granularity control during parallel execution, and query optimization in deductive databases. Most of the work to date on compile-time cost estimation of logic programs has focused on the estimation of upper bounds on costs. However, in many applications, such as parallel implementations on distributed-memory machines, one would prefer to work with lower bounds instead. The problem with estimating lower bounds is that in general, it is necessary to account for the possibility of failure of head unification, leading to a trivial lower bound of 0. In this paper, we show how, given type and mode information about procedures in a logic program, it is possible to (semi-automatically) derive nontrivial lower bounds on their computational costs. We also discuss the cost analysis for the special and frequent case of divide-and-conquer programs and show how, as a pragmatic short-term solution, it may be possible to obtain useful results simply by identifying and treating divide-and-conquer programs specially.

    Managing Uncertainty: A Case for Probabilistic Grid Scheduling

    Grid technology is evolving into a global, service-orientated architecture, a universal platform for delivering future high-demand computational services. Strong adoption of the Grid and the utility computing concept is leading to an increasing number of Grid installations running a wide range of applications of different size and complexity. In this paper we address the problem of delivering deadline/economy-based scheduling in a heterogeneous application environment using statistical properties of historical job executions and their associated meta-data. This approach is motivated by a study of the six-month computational load generated by Grid applications in a multi-purpose Grid cluster serving a community of twenty e-Science projects. The observed job statistics, resource utilisation and user behaviour are discussed in the context of management approaches and models most suitable for supporting a probabilistic and autonomous scheduling architecture.

    Early Accurate Results for Advanced Analytics on MapReduce

    Approximate results based on samples often provide the only way in which advanced analytical applications on very massive data sets can satisfy their time and resource constraints. Unfortunately, methods and tools for the computation of accurate early results are currently not supported in MapReduce-oriented systems although these are intended for 'big data'. Therefore, we proposed and implemented a non-parametric extension of Hadoop which allows the incremental computation of early results for arbitrary work-flows, along with reliable on-line estimates of the degree of accuracy achieved so far in the computation. These estimates are based on a technique called bootstrapping that has been widely employed in statistics and can be applied to arbitrary functions and data distributions. In this paper, we describe our Early Accurate Result Library (EARL) for Hadoop that was designed to minimize the changes required to the MapReduce framework. Various tests of EARL of Hadoop are presented to characterize the frequent situations where EARL can provide major speed-ups over the current version of Hadoop. Comment: VLDB201
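The bootstrapping idea behind such early results can be shown with a small, single-machine sketch. This is not EARL's API or its Hadoop integration: the function names `bootstrap_error` and `early_result`, the sample-doubling schedule, and the relative-error stopping rule are assumptions chosen for illustration.

```python
import random
import statistics

def bootstrap_error(sample, estimator, n_resamples=200, seed=0):
    """Return (estimate, relative standard error) of `estimator` on `sample`."""
    rng = random.Random(seed)
    estimate = estimator(sample)
    replicates = []
    for _ in range(n_resamples):
        # Resample with replacement, same size as the original sample.
        resample = rng.choices(sample, k=len(sample))
        replicates.append(estimator(resample))
    std_err = statistics.stdev(replicates)
    rel_err = std_err / abs(estimate) if estimate != 0 else float("inf")
    return estimate, rel_err

def early_result(data, estimator, target_rel_err=0.01, start=100, seed=0):
    """Grow a random sample until the bootstrap error meets the target.

    Returns (estimate, relative error, sample size used). Doubling the
    sample size each round is a hypothetical schedule.
    """
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    size = min(start, len(shuffled))
    while True:
        estimate, rel_err = bootstrap_error(shuffled[:size], estimator, seed=seed)
        if rel_err <= target_rel_err or size == len(shuffled):
            return estimate, rel_err, size
        size = min(2 * size, len(shuffled))
```

Because the bootstrap only needs the estimator as a black-box function, the same loop works for a mean, a median, or any other statistic, which is the property the abstract highlights for arbitrary functions and data distributions.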