49,690 research outputs found

    GoFFish: A Sub-Graph Centric Framework for Large-Scale Graph Analytics

    Full text link
    Large scale graph processing is a major research area for Big Data exploration. Vertex centric programming models like Pregel are gaining traction due to their simple abstraction that allows for scalable execution on distributed systems naturally. However, there are limitations to this approach which cause vertex centric algorithms to under-perform due to poor compute to communication overhead ratio and slow convergence of iterative superstep. In this paper we introduce GoFFish a scalable sub-graph centric framework co-designed with a distributed persistent graph storage for large scale graph analytics on commodity clusters. We introduce a sub-graph centric programming abstraction that combines the scalability of a vertex centric approach with the flexibility of shared memory sub-graph computation. We map Connected Components, SSSP and PageRank algorithms to this model to illustrate its flexibility. Further, we empirically analyze GoFFish using several real world graphs and demonstrate its significant performance improvement, orders of magnitude in some cases, compared to Apache Giraph, the leading open source vertex centric implementation.Comment: Under review by a conference, 201

    Making extreme computations possible with virtual machines

    Full text link
    State-of-the-art algorithms generate scattering amplitudes for high-energy physics at leading order for high-multiplicity processes as compiled code (in Fortran, C or C++). For complicated processes the size of these libraries can become tremendous (many GiB). We show that amplitudes can be translated to byte-code instructions, which even reduce the size by one order of magnitude. The byte-code is interpreted by a Virtual Machine with runtimes comparable to compiled code and a better scaling with additional legs. We study the properties of this algorithm, as an extension of the Optimizing Matrix Element Generator (O'Mega). The bytecode matrix elements are available as alternative input for the event generator WHIZARD. The bytecode interpreter can be implemented very compactly, which will help with a future implementation on massively parallel GPUs.Comment: 5 pages, 2 figures. arXiv admin note: substantial text overlap with arXiv:1411.383

    A Survey on Load Balancing Algorithms for VM Placement in Cloud Computing

    Get PDF
    The emergence of cloud computing based on virtualization technologies brings huge opportunities to host virtual resource at low cost without the need of owning any infrastructure. Virtualization technologies enable users to acquire, configure and be charged on pay-per-use basis. However, Cloud data centers mostly comprise heterogeneous commodity servers hosting multiple virtual machines (VMs) with potential various specifications and fluctuating resource usages, which may cause imbalanced resource utilization within servers that may lead to performance degradation and service level agreements (SLAs) violations. To achieve efficient scheduling, these challenges should be addressed and solved by using load balancing strategies, which have been proved to be NP-hard problem. From multiple perspectives, this work identifies the challenges and analyzes existing algorithms for allocating VMs to PMs in infrastructure Clouds, especially focuses on load balancing. A detailed classification targeting load balancing algorithms for VM placement in cloud data centers is investigated and the surveyed algorithms are classified according to the classification. The goal of this paper is to provide a comprehensive and comparative understanding of existing literature and aid researchers by providing an insight for potential future enhancements.Comment: 22 Pages, 4 Figures, 4 Tables, in pres

    Simulation sample sizes for Monte Carlo partial EVPI calculations

    Get PDF
    Partial expected value of perfect information (EVPI) quantifies the value of removing uncertainty about unknown parameters in a decision model. EVPIs can be computed via Monte Carlo methods. An outer loop samples values of the parameters of interest, and an inner loop samples the remaining parameters from their conditional distribution. This nested Monte Carlo approach can result in biased estimates if small numbers of inner samples are used and can require a large number of model runs for accurate partial EVPI estimates. We present a simple algorithm to estimate the EVPI bias and confidence interval width for a specified number of inner and outer samples. The algorithm uses a relatively small number of model runs (we suggest approximately 600), is quick to compute, and can help determine how many outer and inner iterations are needed for a desired level of accuracy. We test our algorithm using three case studies. (C) 2010 Elsevier B.V. All rights reserved
    • …
    corecore