GoFFish: A Sub-Graph Centric Framework for Large-Scale Graph Analytics
Large scale graph processing is a major research area for Big Data
exploration. Vertex centric programming models like Pregel are gaining traction
due to their simple abstraction, which naturally allows scalable execution on
distributed systems. However, this approach has limitations that cause vertex
centric algorithms to under-perform, due to a poor compute-to-communication
ratio and slow convergence over iterative supersteps. In
this paper we introduce GoFFish, a scalable sub-graph centric framework
co-designed with a distributed persistent graph storage for large scale graph
analytics on commodity clusters. We introduce a sub-graph centric programming
abstraction that combines the scalability of a vertex centric approach with the
flexibility of shared memory sub-graph computation. We map Connected
Components, SSSP and PageRank algorithms to this model to illustrate its
flexibility. Further, we empirically analyze GoFFish using several real world
graphs and demonstrate its significant performance improvement, orders of
magnitude in some cases, compared to Apache Giraph, the leading open source
vertex centric implementation.
Comment: Under review by a conference, 201
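The sub-graph centric abstraction the abstract describes can be illustrated with a minimal sketch for Connected Components. The API below (`subgraph_cc`, `DSU`) is hypothetical, not GoFFish's actual interface: each partition resolves its components in one shared-memory pass, and supersteps then only reconcile labels across partition-cut edges, which is why fewer supersteps are needed than in a purely vertex centric model.

```python
# Sub-graph centric Connected Components, as a minimal sketch.
# Hypothetical API; GoFFish's actual interface and storage layer differ.

class DSU:
    """Union-find over one partition's vertices."""
    def __init__(self, items):
        self.parent = {x: x for x in items}

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[max(ra, rb)] = min(ra, rb)

def subgraph_cc(partitions, cut_edges):
    """partitions: list of (vertices, local_edges); cut_edges cross partitions.
    Returns (vertex -> component label, number of supersteps used)."""
    # Superstep 0: each sub-graph computes components in one local pass.
    dsus = [DSU(vs) for vs, _ in partitions]
    for dsu, (_, es) in zip(dsus, partitions):
        for u, v in es:
            dsu.union(u, v)
    label = {v: dsus[i].find(v)
             for i, (vs, _) in enumerate(partitions) for v in vs}
    # Later supersteps: exchange minimum labels over cut edges until stable.
    supersteps, changed = 0, True
    while changed:
        supersteps, changed = supersteps + 1, False
        for u, v in cut_edges:                    # "message exchange"
            m = min(label[u], label[v])
            for w in (u, v):
                if label[w] != m:
                    label[w], changed = m, True
        for i, (vs, _) in enumerate(partitions):  # local propagation
            best = {}
            for w in vs:
                best[dsus[i].find(w)] = min(
                    best.get(dsus[i].find(w), label[w]), label[w])
            for w in vs:
                if label[w] != best[dsus[i].find(w)]:
                    label[w], changed = best[dsus[i].find(w)], True
    return label, supersteps

parts = [({0, 1, 2}, [(0, 1), (1, 2)]), ({3, 4}, [(3, 4)])]
labels, steps = subgraph_cc(parts, cut_edges=[(2, 3)])
print(labels)  # all five vertices end up in component 0
```

Because each partition collapses to its component roots locally, supersteps scale with the diameter of the sub-graph quotient graph rather than the full graph's diameter.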
Making extreme computations possible with virtual machines
State-of-the-art algorithms generate scattering amplitudes for high-energy
physics at leading order for high-multiplicity processes as compiled code (in
Fortran, C or C++). For complicated processes the size of these libraries can
become tremendous (many GiB). We show that amplitudes can be translated to
byte-code instructions, which even reduce the size by one order of magnitude.
The byte-code is interpreted by a Virtual Machine with runtimes comparable to
compiled code and a better scaling with additional legs. We study the
properties of this algorithm, as an extension of the Optimizing Matrix Element
Generator (O'Mega). The bytecode matrix elements are available as alternative
input for the event generator WHIZARD. The bytecode interpreter can be
implemented very compactly, which will help with a future implementation on
massively parallel GPUs.
Comment: 5 pages, 2 figures. arXiv admin note: substantial text overlap with
arXiv:1411.383
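The general idea of trading large compiled libraries for compact byte-code plus an interpreter can be sketched with a toy stack machine. This is purely illustrative: O'Mega and WHIZARD use their own instruction set, and the opcodes below are invented for the example.

```python
# Toy stack-based virtual machine: an arithmetic expression is encoded as
# compact (opcode, argument) instructions instead of compiled code.
# Illustrative only; not the O'Mega/WHIZARD byte-code format.

PUSH, ADD, MUL = 0, 1, 2

def run(bytecode, consts):
    """Interpret (op, arg) pairs against a small operand stack."""
    stack = []
    for op, arg in bytecode:
        if op == PUSH:
            stack.append(consts[arg])   # load a constant by table index
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack.pop()

# (2 + 3) * 4 encoded as byte-code over a constants table
consts = [2.0, 3.0, 4.0]
program = [(PUSH, 0), (PUSH, 1), (ADD, None), (PUSH, 2), (MUL, None)]
print(run(program, consts))  # 20.0
```

The dispatch loop is the whole interpreter, which is why such a design stays compact and maps naturally onto GPU threads each walking the same instruction stream.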
A Survey on Load Balancing Algorithms for VM Placement in Cloud Computing
The emergence of cloud computing based on virtualization technologies brings
huge opportunities to host virtual resources at low cost without the need to
own any infrastructure. Virtualization technologies enable users to acquire
and configure resources and to be charged on a pay-per-use basis. However,
Cloud data centers mostly comprise heterogeneous commodity servers hosting
multiple virtual machines (VMs) with potentially varying specifications and
fluctuating resource usage, which may cause imbalanced resource utilization
within servers and lead to performance degradation and service level
agreement (SLA) violations. To achieve efficient scheduling, these
challenges should be addressed with load balancing strategies; the
underlying placement problem has been proved to be NP-hard. From multiple
perspectives, this work identifies the challenges and analyzes existing
algorithms for allocating VMs to PMs in infrastructure Clouds, with a
particular focus on load balancing. A detailed classification of load
balancing algorithms for VM placement in cloud data centers is proposed,
and the surveyed algorithms are organized according to it. The goal of this
paper is to provide a comprehensive and comparative understanding of the
existing literature and to aid researchers by providing insights for
potential future enhancements.
Comment: 22 Pages, 4 Figures, 4 Tables, in press
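As a concrete instance of the class of heuristics such surveys cover, a minimal load-aware placement sketch assigns each VM to the least-utilized physical machine (PM) that can still hold it. This is a generic greedy baseline, not any specific surveyed algorithm, and it flattens capacity to a single scalar dimension.

```python
# Minimal load-aware VM placement sketch: assign each VM to the PM with the
# lowest current utilization that has enough spare capacity. Illustrative
# baseline only; real schedulers weigh CPU, memory, I/O, and SLAs jointly.

def place_vms(pm_capacities, vm_demands):
    """Return a mapping vm_index -> pm_index, or None if some VM cannot fit."""
    used = [0.0] * len(pm_capacities)
    placement = {}
    for i, demand in enumerate(vm_demands):
        best = None
        for j, cap in enumerate(pm_capacities):
            if used[j] + demand <= cap:  # feasibility check
                if best is None or used[j] / cap < used[best] / pm_capacities[best]:
                    best = j             # prefer the least-utilized PM
        if best is None:
            return None                  # no PM can host this VM
        used[best] += demand
        placement[i] = best
    return placement

print(place_vms([10, 10], [4, 4, 4]))  # {0: 0, 1: 1, 2: 0}
```

Even this one-dimensional version shows why the problem is hard: the greedy choice balances load locally but gives no optimality guarantee, and the exact problem is a bin-packing variant, hence NP-hard.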
Simulation sample sizes for Monte Carlo partial EVPI calculations
Partial expected value of perfect information (EVPI) quantifies the value of removing uncertainty about unknown parameters in a decision model. EVPIs can be computed via Monte Carlo methods. An outer loop samples values of the parameters of interest, and an inner loop samples the remaining parameters from their conditional distribution. This nested Monte Carlo approach can result in biased estimates if small numbers of inner samples are used and can require a large number of model runs for accurate partial EVPI estimates. We present a simple algorithm to estimate the EVPI bias and confidence interval width for a specified number of inner and outer samples. The algorithm uses a relatively small number of model runs (we suggest approximately 600), is quick to compute, and can help determine how many outer and inner iterations are needed for a desired level of accuracy. We test our algorithm using three case studies.
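The nested (outer/inner) Monte Carlo structure the abstract describes can be sketched on a toy two-option decision model. Everything here (`net_benefit`, the Gaussian priors, the sample sizes) is an invented example, not the paper's model or its bias-estimation algorithm.

```python
# Nested Monte Carlo estimate of partial EVPI for a toy decision model.
# theta is the parameter of interest (outer loop); phi is the remaining
# uncertain parameter (inner loop). Toy example for illustration only.

import random

def net_benefit(decision, theta, phi):
    """Toy net benefit: option 1 pays off when theta + phi is large."""
    return theta + phi if decision == 1 else 1.0

def partial_evpi(n_outer, n_inner, seed=0):
    rng = random.Random(seed)
    # Baseline: best expected net benefit under full (current) uncertainty.
    samples = [(rng.gauss(0, 1), rng.gauss(0, 1))
               for _ in range(n_outer * n_inner)]
    baseline = max(
        sum(net_benefit(d, t, p) for t, p in samples) / len(samples)
        for d in (0, 1)
    )
    # Outer loop samples theta; inner loop averages over phi given theta.
    total = 0.0
    for _ in range(n_outer):
        theta = rng.gauss(0, 1)
        inner = [rng.gauss(0, 1) for _ in range(n_inner)]
        total += max(
            sum(net_benefit(d, theta, phi) for phi in inner) / n_inner
            for d in (0, 1)
        )
    return total / n_outer - baseline

print(round(partial_evpi(n_outer=2000, n_inner=50), 3))
```

The inner-sample bias the paper addresses is visible here: with few inner samples, the noisy inner mean inflates the outer `max`, so the estimate drifts upward, which is exactly why choosing `n_inner` and `n_outer` carefully matters.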