Enhanced molecular dynamics performance with a programmable graphics processor
Design considerations for molecular dynamics algorithms capable of taking
advantage of the computational power of a graphics processing unit (GPU) are
described. Accommodating the constraints of scalable streaming-multiprocessor
hardware necessitates a reformulation of the underlying algorithm. Performance
measurements demonstrate the considerable benefit and cost-effectiveness of
such an approach, which produces a factor of 2.5 speed improvement over
previous work for the case of the soft-sphere potential.
Comment: 20 pages (v2: minor additions and changes; v3: corrected typos
Closing the Gap for Pseudo-Polynomial Strip Packing
Two-dimensional packing problems are a fundamental class of optimization problems, and Strip Packing is one of the most natural and famous among them. Indeed, it can be defined in just one sentence: given a set of rectangular axis-parallel items and a strip with bounded width and infinite height, the objective is to find a packing of the items into the strip that minimizes the packing height. We speak of pseudo-polynomial Strip Packing if we consider algorithms with pseudo-polynomial running time with respect to the width of the strip. It is known that there is no pseudo-polynomial time algorithm for Strip Packing with a ratio better than 5/4 unless P = NP. The best algorithm so far has a ratio of 4/3 + epsilon. In this paper, we close the gap between the inapproximability result and currently known algorithms by presenting an algorithm with approximation ratio 5/4 + epsilon. The algorithm relies on a new structural result, which is the main accomplishment of this paper. It states that each optimal solution can be transformed, with bounded loss in the objective, such that it has one of a polynomial number of different forms, thus making the problem tractable by standard techniques, i.e., dynamic programming. To show the conceptual strength of the approach, we extend our result to other problems, e.g., Strip Packing with 90-degree rotations and Contiguous Moldable Task Scheduling, and present algorithms with approximation ratio 5/4 + epsilon for these problems as well.
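To make the problem statement concrete, the following is a minimal sketch of a classical shelf heuristic, Next-Fit Decreasing Height (NFDH). It is not the 5/4 + epsilon algorithm of the paper, only a simple baseline that produces a feasible packing. The function name `nfdh` and the `(width, height)` item representation are assumptions made for illustration, and all dimensions are assumed to be positive.

```python
def nfdh(items, strip_width):
    """Pack (width, height) items into a strip with the NFDH shelf heuristic.

    Returns (total_height, placements), where placements maps an item index
    to the (x, y) position of the item's lower-left corner.
    """
    # Process items from tallest to shortest (stable for equal heights).
    order = sorted(range(len(items)), key=lambda i: -items[i][1])
    placements = {}
    shelf_y = 0   # y coordinate of the bottom of the current shelf
    shelf_h = 0   # height of the current shelf (set by its first item)
    x = 0         # next free x position on the current shelf
    for i in order:
        w, h = items[i]
        if w > strip_width:
            raise ValueError("item %d is wider than the strip" % i)
        if x + w > strip_width:
            # The item does not fit: close this shelf and open a new one.
            shelf_y += shelf_h
            shelf_h = 0
            x = 0
        shelf_h = max(shelf_h, h)
        placements[i] = (x, shelf_y)
        x += w
    return shelf_y + shelf_h, placements
```

Because items are sorted by height, the first item on each shelf fixes that shelf's height, so items on one shelf never overlap items on the next.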
Alternative methods for representing the inverse of linear programming basis matrices
Methods for representing the inverse of Linear Programming (LP) basis matrices are closely related to techniques for solving a system of sparse unsymmetric linear equations by direct methods. It is now well accepted that for these problems the static process of reordering the matrix in the lower block triangular (LBT) form constitutes the initial step. We introduce a combined static and dynamic factorisation of a basis matrix and derive its inverse, which we call the partial elimination form of the inverse (PEFI). This factorisation takes advantage of the LBT structure and produces a sparser representation of the inverse than the elimination form of the inverse (EFI). In doing so, we make use of the original columns (of the constraint matrix) which are in the basis. To represent the factored inverse it is, however, necessary to introduce special data structures which are used in the forward and the backward transformations (the two major algorithmic steps) of the simplex method. These correspond to solving a system of equations and solving a system of equations with the transposed matrix, respectively. In this paper we compare the nonzero build-up of PEFI with that of EFI. We have also investigated alternative methods for updating the basis inverse in the PEFI representation. The results of our experimental investigation are presented in this paper.
Efficiency of linked cell algorithms
The linked cell list algorithm is an essential part of molecular simulation
software, both molecular dynamics and Monte Carlo. Though it scales linearly
with the number of particles, there has been a constant interest in increasing
its efficiency, because a large part of CPU time is spent to identify the
interacting particles. Several recent publications proposed improvements to the
algorithm and investigated their efficiency by applying them to particular
setups. In this publication we develop a general method to evaluate the
efficiency of these algorithms, which is mostly independent of the parameters
of the simulation, and test it for a number of linked cell list algorithms. We
also propose a combination of linked cell reordering and interaction sorting
that shows a good efficiency for a broad range of simulation setups.
Comment: Submitted to Computer Physics Communications on 22 December 2009,
still awaiting a referee report
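For readers unfamiliar with the technique, the following is a minimal sketch of a basic linked cell neighbour search, not one of the optimised variants the paper evaluates. It assumes a cubic, non-periodic box with all coordinates in [0, box); the function names are hypothetical. A brute-force search is included only to check the result.

```python
import itertools
import math


def build_cells(positions, box, cutoff):
    """Bin particle indices into cubic cells whose edge is at least cutoff."""
    n_cells = max(1, int(box // cutoff))   # number of cells per dimension
    size = box / n_cells                   # cell edge length, >= cutoff
    cells = {}
    for i, p in enumerate(positions):
        key = tuple(min(int(c / size), n_cells - 1) for c in p)
        cells.setdefault(key, []).append(i)
    return cells


def neighbour_pairs(positions, box, cutoff):
    """Return all index pairs (i, j) with i < j that are closer than cutoff."""
    cells = build_cells(positions, box, cutoff)
    pairs = set()
    for key, members in cells.items():
        # Only the cell itself and its 26 adjacent cells can hold neighbours.
        for offset in itertools.product((-1, 0, 1), repeat=3):
            other = tuple(k + o for k, o in zip(key, offset))
            for i in members:
                for j in cells.get(other, ()):
                    if i < j and math.dist(positions[i], positions[j]) < cutoff:
                        pairs.add((i, j))
    return pairs


def brute_force_pairs(positions, cutoff):
    """Reference O(N^2) search used only to validate the cell-based result."""
    return {
        (i, j)
        for i, j in itertools.combinations(range(len(positions)), 2)
        if math.dist(positions[i], positions[j]) < cutoff
    }
```

Since each cell edge is at least as long as the cutoff, two interacting particles always lie in the same or in adjacent cells, which is what lets the search skip all other cells.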
On-the-fly memory compression for multibody algorithms.
Memory and bandwidth demands challenge developers of particle-based codes that have to scale on new architectures, as the growth of concurrency outperforms improvements in memory access facilities, as the memory per core tends to stagnate, and as communication networks cannot increase bandwidth arbitrarily. We propose to analyse each particle of such a code to find out whether a hierarchical data representation storing data with reduced precision caps the memory demands without exceeding given error bounds. For admissible candidates, we perform this compression and thus reduce the pressure on the memory subsystem, lower the total memory footprint and reduce the data to be exchanged via MPI. Notably, our analysis and transformation change the data compression dynamically, i.e. the choice of data format follows the solution characteristics, and they do not require us to alter the core simulation code.
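The general idea of error-bounded, reduced-precision storage can be illustrated with a small sketch. This is not the authors' scheme: it assumes each value is stored as an offset from a hypothetical reference value (for example a cell centre) and picks the smallest IEEE format whose round-trip error stays within a tolerance. The names `compress` and `decompress` are illustrative only.

```python
import struct

# Reduced-precision candidates, smallest first: IEEE half and single precision.
REDUCED_FORMATS = ("e", "f")


def decompress(code, packed, reference):
    """Rebuild the value from its packed offset and the reference value."""
    return reference + struct.unpack("<" + code, packed)[0]


def compress(value, reference, tol):
    """Pack value - reference in the smallest format that stays within tol.

    Returns (format_code, packed_bytes). Falls back to double precision
    when neither reduced format satisfies the error bound.
    """
    offset = value - reference
    for code in REDUCED_FORMATS:
        try:
            packed = struct.pack("<" + code, offset)
        except OverflowError:
            continue  # offset is outside the range of this format
        if abs(decompress(code, packed, reference) - value) <= tol:
            return code, packed
    return "d", struct.pack("<d", offset)
```

Storing offsets rather than absolute values is the design choice that matters here: offsets are small in magnitude, so a short format loses less absolute accuracy on them.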
Synthesis and Optimization of Reversible Circuits - A Survey
Reversible logic circuits have been historically motivated by theoretical
research in low-power electronics as well as practical improvement of
bit-manipulation transforms in cryptography and computer graphics. Recently,
reversible circuits have attracted interest as components of quantum
algorithms, as well as in photonic and nano-computing technologies where some
switching devices offer no signal gain. Research in generating reversible logic
distinguishes between circuit synthesis, post-synthesis optimization, and
technology mapping. In this survey, we review algorithmic paradigms ---
search-based, cycle-based, transformation-based, and BDD-based --- as well as
specific algorithms for reversible synthesis, both exact and heuristic. We
conclude the survey by outlining key open challenges in synthesis of reversible
and quantum logic, as well as most common misconceptions.
Comment: 34 pages, 15 figures, 2 tables
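As a small illustration of what a reversible circuit is, the sketch below models a cascade of multiple-control Toffoli gates and checks that it computes a bijection on bit vectors. It is not one of the synthesis algorithms surveyed; the gate encoding `(controls, target)` and the function names are assumptions made for this example, and the target is assumed not to appear among its own controls.

```python
from itertools import product


def apply_gate(bits, gate):
    """Apply one multiple-control Toffoli gate to a tuple of bits.

    gate is (controls, target): the target bit is flipped when every
    control bit is 1. An empty controls tuple gives a NOT gate, one
    control gives a CNOT, and two controls give the Toffoli gate.
    """
    controls, target = gate
    out = list(bits)
    if all(bits[c] for c in controls):
        out[target] ^= 1
    return tuple(out)


def run(circuit, bits):
    """Apply the gates of a circuit in order."""
    for gate in circuit:
        bits = apply_gate(bits, gate)
    return bits


def truth_table(circuit, n):
    """Map every n-bit input to the circuit's output."""
    return {bits: run(circuit, bits) for bits in product((0, 1), repeat=n)}


def is_reversible(table):
    """A function on bit vectors is reversible iff no two inputs collide."""
    return len(set(table.values())) == len(table)
```

Each such gate is its own inverse, so running the same gates in reverse order undoes the circuit.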
A Parallel Adaptive P3M code with Hierarchical Particle Reordering
We discuss the design and implementation of HYDRA_OMP, a parallel
implementation of the Smoothed Particle Hydrodynamics-Adaptive P3M (SPH-AP3M)
code HYDRA. The code is designed primarily for conducting cosmological
hydrodynamic simulations and is written in Fortran77+OpenMP. A number of
optimizations for RISC processors and SMP-NUMA architectures have been
implemented, the most important optimization being hierarchical reordering of
particles within chaining cells, which greatly improves data locality thereby
removing the cache misses typically associated with linked lists. Parallel
scaling is good, with a minimum parallel scaling of 73% achieved on 32 nodes
for a variety of modern SMP architectures. We give performance data in terms of
the number of particle updates per second, which is a more useful performance
metric than raw MFlops. A basic version of the code will be made available to
the community in the near future.
Comment: 34 pages, 12 figures, accepted for publication in Computer Physics
Communications
Ultimate Intelligence Part I: Physical Completeness and Objectivity of Induction
We propose that Solomonoff induction is complete in the physical sense via
several strong physical arguments. We also argue that Solomonoff induction is
fully applicable to quantum mechanics. We show how to choose an objective
reference machine for universal induction by defining a physical message
complexity and physical message probability, and argue that this choice
dissolves some well-known objections to universal induction. We also introduce
many more variants of physical message complexity based on energy and action,
and discuss the ramifications of our proposals.
Comment: Under review at AGI-2015 conference. An early draft was submitted to
ALT-2014. This paper is now being split into two papers, one philosophical,
and one more technical. We intend that all installments of the paper series
will be on the arXiv.