    Evolutionary n-level hypergraph partitioning with adaptive coarsening

    Hypergraph partitioning is an NP-hard problem that occurs in many computer science applications where it is necessary to decompose large problems into a number of smaller, computationally tractable sub-problems. Current techniques use a multilevel approach wherein an initial partitioning is performed after compressing the hypergraph to a predetermined level. This level is typically chosen to produce very coarse hypergraphs in which heuristic algorithms are fast and effective. This article presents a novel memetic algorithm which remains effective on larger initial hypergraphs. This enables the exploitation of information that can be lost during coarsening and results in improved final solution quality. We use this algorithm to present an empirical analysis of the space of possible initial hypergraphs in terms of its searchability at different levels of coarsening. We find that the best results arise at coarsening levels unique to each hypergraph. Based on this, we introduce an adaptive scheme that stops coarsening when the rate of information loss in a hypergraph becomes non-linear and show that this produces further improvements. The results show that we have identified a valuable role for evolutionary algorithms within the current state-of-the-art hypergraph partitioning framework.
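To make the terms in this abstract concrete, here is a minimal sketch of the common cut-net objective and of a single contraction step of the kind used during coarsening. It is not the article's memetic algorithm; the function names `cut_nets` and `contract` and the toy hypergraph are assumptions introduced purely for illustration.

```python
def cut_nets(hyperedges, part):
    """Count hyperedges whose vertices lie in more than one block."""
    return sum(1 for e in hyperedges if len({part[v] for v in e}) > 1)


def contract(hyperedges, u, v):
    """Merge vertex v into vertex u (one coarsening step).

    Hyperedges that collapse to a single vertex are dropped, which is one
    way structural information can disappear during coarsening.
    """
    coarse = []
    for e in hyperedges:
        merged = frozenset(u if x == v else x for x in e)
        if len(merged) > 1:
            coarse.append(merged)
    return coarse


# Hypothetical toy hypergraph with six vertices and four hyperedges.
hyperedges = [frozenset({0, 1, 2}), frozenset({2, 3}),
              frozenset({3, 4, 5}), frozenset({0, 1})]
part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

cut = cut_nets(hyperedges, part)      # only {2, 3} spans both blocks
coarse = contract(hyperedges, 0, 1)   # {0, 1} collapses and is dropped
```

In this toy case the contraction removes one of the four hyperedges, a small instance of the information loss that motivates stopping coarsening early.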

    The Simulation Model Partitioning Problem: an Adaptive Solution Based on Self-Clustering (Extended Version)

    This paper is about partitioning in parallel and distributed simulation, that is, decomposing the simulation model into a number of components and properly allocating them to the execution units. An adaptive solution based on self-clustering, which considers both communication reduction and computational load balancing, is proposed. The implementation of the proposed mechanism is tested using a simulation model that is challenging in terms of both structure and dynamicity. Various configurations of the simulation model and the execution environment have been considered. The obtained performance results are analyzed using a reference cost model. The results demonstrate that the proposed approach is promising and that it can reduce the simulation execution time in both parallel and distributed architectures.

    Beyond Reuse Distance Analysis: Dynamic Analysis for Characterization of Data Locality Potential

    Emerging computer architectures will feature drastically decreased flops/byte (ratio of peak processing rate to memory bandwidth) as highlighted by recent studies on Exascale architectural trends. Further, flops are getting cheaper while the energy cost of data movement is increasingly dominant. The understanding and characterization of data locality properties of computations is critical in order to guide efforts to enhance data locality. Reuse distance analysis of memory address traces is a valuable tool to perform data locality characterization of programs. A single reuse distance analysis can be used to estimate the number of cache misses in a fully associative LRU cache of any size, thereby providing estimates on the minimum bandwidth requirements at different levels of the memory hierarchy to avoid being bandwidth bound. However, such an analysis only holds for the particular execution order that produced the trace. It cannot estimate potential improvement in data locality through dependence preserving transformations that change the execution schedule of the operations in the computation. In this article, we develop a novel dynamic analysis approach to characterize the inherent locality properties of a computation and thereby assess the potential for data locality enhancement via dependence preserving transformations. The execution trace of a code is analyzed to extract a computational directed acyclic graph (CDAG) of the data dependences. The CDAG is then partitioned into convex subsets, and the convex partitioning is used to reorder the operations in the execution trace to enhance data locality. The approach enables us to go beyond reuse distance analysis of a single specific order of execution of the operations of a computation in characterization of its data locality properties. 
    It can serve a valuable role in identifying promising code regions for manual transformation, as well as assessing the effectiveness of compiler transformations for data locality enhancement. We demonstrate the effectiveness of the approach using a number of benchmarks, including case studies where the potential shown by the analysis is exploited to achieve lower data movement costs and better performance. Comment: Transactions on Architecture and Code Optimization (2014)
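The claim that one reuse distance analysis yields miss counts for a fully associative LRU cache of any size can be illustrated with a small sketch. This is the textbook stack-distance computation in its naive quadratic form, not the CDAG-based analysis the article develops; the function names and the sample trace are assumptions made for illustration.

```python
import math


def reuse_distances(trace):
    """Return the LRU stack distance of every access in the trace.

    The distance is the number of distinct addresses touched since the
    previous access to the same address (math.inf for a first touch).
    """
    stack = []       # most recently used address sits at the end
    distances = []
    for addr in trace:
        if addr in stack:
            pos = stack.index(addr)
            distances.append(len(stack) - 1 - pos)
            stack.pop(pos)
        else:
            distances.append(math.inf)
        stack.append(addr)
    return distances


def lru_misses(distances, cache_size):
    """An access misses when its reuse distance is >= the cache size."""
    return sum(1 for d in distances if d >= cache_size)


# Hypothetical address trace; the distances are computed once and then
# reused to estimate misses for several cache sizes.
trace = ["a", "b", "c", "a", "b", "d", "a"]
dist = reuse_distances(trace)
misses_by_size = {size: lru_misses(dist, size) for size in (1, 2, 3, 4)}
```

The distances depend on the order of accesses in `trace`, which is exactly the limitation the abstract points out: a reordered but dependence-preserving schedule would produce a different profile.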

    Adaptive Partitioning for Large-Scale Dynamic Graphs

    In recent years, large-scale graph processing has gained increasing attention, with most recent systems placing particular emphasis on latency. One possible technique to improve runtime performance in a distributed graph processing system is to reduce network communication. The most notable way to achieve this goal is to partition the graph by minimizing the number of edges that connect vertices assigned to different machines, while keeping the load balanced. However, real-world graphs are highly dynamic, with vertices and edges being constantly added and removed. Carefully updating the partitioning of the graph to reflect these changes is necessary to avoid the introduction of an extensive number of cut edges, which would gradually worsen computation performance. In this paper we show that performance degradation in dynamic graph processing systems can be avoided by continuously adapting the graph partitions as the graph changes. We present a novel, highly scalable adaptive partitioning strategy, and show a number of refinements that make it work under the constraints of a large-scale distributed system. The partitioning strategy is based on iterative vertex migrations, relying only on local information. We have implemented the technique in a graph processing system, and we show through three real-world scenarios how adapting graph partitioning reduces execution time by over 50% when compared to commonly used hash partitioning.
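The idea of lowering the edge cut through iterative vertex migrations that use only neighbour information can be sketched as follows. This is a simplified, single-process greedy heuristic, not the paper's distributed strategy; the modulo-based hash partitioning, the `capacity` limit, and the toy graph are all assumptions made for illustration.

```python
def edge_cut(edges, part):
    """Number of edges whose endpoints sit in different partitions."""
    return sum(1 for u, v in edges if part[u] != part[v])


def migrate_once(adj, part, k, capacity):
    """One sweep: move each vertex to the partition holding most of its
    neighbours, provided that partition still has room. Returns the
    number of vertices moved."""
    sizes = [0] * k
    for p in part.values():
        sizes[p] += 1
    moved = 0
    for v in sorted(adj):
        counts = [0] * k
        for n in adj[v]:
            counts[part[n]] += 1
        # Ties are broken in favour of staying in the current partition.
        best = max(range(k), key=lambda p: (counts[p], p == part[v]))
        if counts[best] > counts[part[v]] and sizes[best] < capacity:
            sizes[part[v]] -= 1
            sizes[best] += 1
            part[v] = best
            moved += 1
    return moved


# Hypothetical graph: two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = {v: set() for v in range(6)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

k = 2
part = {v: v % k for v in adj}        # stand-in for hash partitioning
cut_before = edge_cut(edges, part)

while migrate_once(adj, part, k, capacity=4) > 0:
    pass                              # repeat until no vertex wants to move
cut_after = edge_cut(edges, part)
```

The capacity limit is a crude stand-in for load balancing: without some slack above the perfectly balanced size no vertex could ever move, and without any limit all vertices could drift into one partition.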

    Seeing Shapes in Clouds: On the Performance-Cost trade-off for Heterogeneous Infrastructure-as-a-Service

    In the near future, FPGAs will be available by the hour; however, this new Infrastructure as a Service (IaaS) usage mode presents both an opportunity and a challenge. The opportunity is that programmers can potentially trade resources for performance on a much larger scale, and for much shorter periods of time, than before. The challenge is in finding and traversing the trade-off for heterogeneous IaaS that guarantees increased resources result in the greatest possible increased performance. Such a trade-off is Pareto optimal. The Pareto optimal trade-off for clusters of heterogeneous resources can be found by solving multiple multi-objective optimisation problems, resulting in an optimal allocation of tasks to the available platforms. Solving these optimisation problems can be done using simple heuristic approaches or formal Mixed Integer Linear Programming (MILP) techniques. When pricing 128 financial options using a Monte Carlo algorithm upon a heterogeneous cluster of multicore CPU, GPU and FPGA platforms, the MILP approach produces a trade-off that is up to 110% faster than a heuristic approach, and over 50% cheaper. These results suggest that high quality performance-resource trade-offs of heterogeneous IaaS are best realised through a formal optimisation approach. Comment: Presented at Second International Workshop on FPGAs for Software Programmers (FSP 2015) (arXiv:1508.06320)
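As a small illustration of what a Pareto optimal trade-off between execution time and cost means, the sketch below filters a set of candidate allocations down to the non-dominated ones. It does not reproduce the paper's MILP or heuristic formulations; the function name `pareto_front` and the (time, cost) figures are hypothetical.

```python
def pareto_front(points):
    """Keep the (time, cost) pairs that no other pair dominates.

    A pair q dominates p when q is no worse in both objectives and
    differs from p, so it is strictly better in at least one.
    """
    front = []
    for p in points:
        dominated = any(
            q != p and q[0] <= p[0] and q[1] <= p[1] for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(front)


# Hypothetical candidate allocations as (execution time, rental cost).
candidates = [(10, 5), (8, 7), (12, 4), (9, 9), (8, 6)]
front = pareto_front(candidates)
```

Each point on the resulting front can only be made faster by paying more, which is the sense in which the trade-off is optimal.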

    A statistical method for estimating activity uncertainty parameters to improve project forecasting

    Just like any physical system, projects have entropy that must be managed by spending energy. The entropy is the project’s tendency to move to a state of disorder (schedule delays, cost overruns), and the energy process is an inherent part of any project management methodology. In order to manage the inherent uncertainty of these projects, accurate estimates (for durations, costs, resources, …) are crucial to make informed decisions. Without these estimates, managers have to fall back on their own intuition and experience, which are undoubtedly crucial for making decisions, but are often subject to biases and hard to quantify. This paper builds further on two published calibration methods that aim to extract data from real projects and calibrate them to better estimate the parameters for the probability distributions of activity durations. Both methods rely on the lognormal distribution model to estimate uncertainty in activity durations and perform a sequence of statistical hypothesis tests that take the possible presence of two human biases into account. Based on these two existing methods, a new so-called statistical partitioning heuristic is presented that integrates the best elements of the two methods to further improve the accuracy of estimating the distribution of activity duration uncertainty. A computational experiment has been carried out on an empirical database of 83 projects. The experiment shows that the new statistical partitioning method performs at least as well as, and often better than, the two existing calibration methods. The improvement will allow a better quantification of the activity duration uncertainty, which will eventually lead to a better prediction of the project schedule and more realistic expectations about the project outcomes. Consequently, the project manager will be able to better cope with the inherent uncertainty (entropy) of projects with a minimum managerial effort (energy).
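A minimal sketch of fitting a lognormal model to activity durations is given below. It covers only the basic parameter estimate, with none of the hypothesis tests, bias corrections, or partitioning steps described in the abstract. Working with the ratio of actual to planned duration is an assumption of this sketch, and the function name and sample data are hypothetical.

```python
import math
import statistics


def fit_lognormal(ratios):
    """Estimate lognormal parameters from actual/planned duration ratios.

    If the ratios are lognormally distributed, their logarithms are
    normally distributed, so mu and sigma are the mean and the sample
    standard deviation of the log-ratios.
    """
    logs = [math.log(r) for r in ratios]
    return statistics.mean(logs), statistics.stdev(logs)


# Hypothetical (planned, actual) activity durations in days.
activities = [(10, 12), (5, 5), (8, 11), (20, 19), (4, 6)]
ratios = [actual / planned for planned, actual in activities]
mu, sigma = fit_lognormal(ratios)

# The median of a lognormal distribution is exp(mu); a value above 1
# suggests that activities tend to run longer than planned.
median_ratio = math.exp(mu)
```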

    An optimization framework for solving capacitated multi-level lot-sizing problems with backlogging

    This paper proposes two new mixed integer programming models for capacitated multi-level lot-sizing problems with backlogging, whose linear programming relaxations provide good lower bounds on the optimal solution value. We show that both of these strong formulations yield the same lower bounds. In addition to these theoretical results, we propose a new, effective optimization framework that achieves high quality solutions in reasonable computational time. Computational results show that the proposed optimization framework is superior to other well-known approaches on several important performance dimensions.