Parallelized Rigid Body Dynamics
Physics engines are collections of API-like software designed for video games, movies and scientific simulations. While physics engines come in many shapes and designs, all of them can benefit from an increase in speed via parallelization. Despite this need for increased speed, it is uncommon to encounter a parallelized physics engine today. Many engines are long-standing projects, and changing them to support parallelization is too costly to be practical. Parallelization needs to be considered from the design stages through completion to ensure adequate implementation. In this project we develop a realistic approach to simulating physics in a parallel environment. Combining several techniques, we establish a practical approach that significantly reduces the run-time of a standard physics engine.
Simulation of a Hard-Spherocylinder Liquid Crystal with the pe
The pe physics engine is validated through the simulation of a liquid crystal
model system consisting of hard spherocylinders. For this purpose we evaluate
several characteristic parameters of this system, namely the nematic order
parameter, the pressure, and the Frank elastic constants. We compare these to
the values reported in literature and find a very good agreement, which
demonstrates that the pe physics engine can accurately treat such densely
packed particle systems. Simultaneously we are able to examine the influence of
finite size effects, especially on the evaluation of the Frank elastic
constants, as we are far less restricted in system size than earlier
simulations.
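As an illustration of the nematic order parameter named in this abstract, the following minimal sketch averages the second Legendre polynomial of the angle between each particle axis and the director. It assumes the director is already known; the function name `nematic_order` and the sample vectors are hypothetical and are not taken from the pe engine or the paper.

```python
import math


def nematic_order(axes, director):
    """Return S = <(3 cos^2(theta) - 1) / 2> for the given particle axes.

    Assumption: the director is supplied by the caller. In practice it is
    often obtained from the ordering tensor instead.
    """
    norm = math.sqrt(sum(d * d for d in director))
    n = [d / norm for d in director]
    total = 0.0
    for u in axes:
        length = math.sqrt(sum(x * x for x in u))
        # Cosine of the angle between the particle axis and the director.
        cos_theta = sum(a * b for a, b in zip(u, n)) / length
        total += 1.5 * cos_theta * cos_theta - 0.5
    return total / len(axes)


if __name__ == "__main__":
    aligned = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
    print(nematic_order(aligned, (0.0, 0.0, 1.0)))
```

Because the cosine is squared, antiparallel axes count as aligned, which matches the head-tail symmetry of spherocylinders.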
Cluster counting: The Hoshen-Kopelman algorithm vs. spanning tree approaches
Two basic approaches to the cluster counting task in the percolation and
related models are discussed. The Hoshen-Kopelman multiple labeling technique
for cluster statistics is redescribed. Modifications for random and aperiodic
lattices are sketched, and some parallelised versions of the algorithm
are mentioned. The graph-theoretical basis for the spanning tree approaches is
given by describing the "breadth-first search" and "depth-first search"
procedures. Examples are given for extracting the elastic and geometric
"backbone" of a percolation cluster. An implementation of the "pebble game"
algorithm using a depth-first search method is also described.
Comment: LaTeX, uses ijmpc1.sty (included), 18 pages, 3 figures, submitted to
Intern. J. of Modern Physics
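The multiple labeling idea behind the Hoshen-Kopelman technique can be sketched as follows. This is a minimal illustration on a square lattice with nearest-neighbour connectivity, written with a union-find table; the function names `find` and `label_clusters` and the sample grid are hypothetical, and the code is not the implementation described in the paper.

```python
def find(parent, x):
    """Return the root label of x, shortening the path along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def label_clusters(grid):
    """Label occupied sites of a 2D grid and return (labels, cluster sizes)."""
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # index 0 is reserved for "empty site"
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            if not up and not left:
                # No occupied neighbour seen so far: open a new label.
                parent.append(len(parent))
                labels[r][c] = len(parent) - 1
            elif up and left:
                # Two labelled neighbours: merge their label classes.
                root_up, root_left = find(parent, up), find(parent, left)
                root = min(root_up, root_left)
                parent[max(root_up, root_left)] = root
                labels[r][c] = root
            else:
                labels[r][c] = find(parent, up or left)
    # Second pass: replace every provisional label by its root and count sizes.
    sizes = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(parent, labels[r][c])
                labels[r][c] = root
                sizes[root] = sizes.get(root, 0) + 1
    return labels, sizes


if __name__ == "__main__":
    sample = [[1, 0, 1],
              [1, 1, 1],
              [0, 0, 0],
              [1, 0, 0]]
    _, cluster_sizes = label_clusters(sample)
    print(sorted(cluster_sizes.values()))
```

The first pass only looks at the upper and left neighbours, so the lattice can be scanned row by row without holding a search stack, which is the practical contrast with the breadth-first and depth-first approaches.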
Optimal Reconfiguration of Formation Flying Spacecraft--a Decentralized Approach
This paper introduces a hierarchical, decentralized,
and parallelizable method for dealing with optimization
problems with many agents. It is theoretically based on a hierarchical
optimization theorem that establishes the equivalence
of two forms of the problem, and this idea is implemented using
DMOC (Discrete Mechanics and Optimal Control). The result
is a method that is scalable to certain optimization problems
for large numbers of agents, whereas the usual “monolithic”
approach can only deal with systems with a rather small
number of degrees of freedom. The method is illustrated with
the example of deployment of spacecraft, motivated by the
Darwin (ESA) and Terrestrial Planet Finder (NASA) missions.
An Efficient Solution Method for Multibody Systems with Loops Using Multiple Processors
This paper describes a multibody dynamics algorithm formulated for parallel implementation on multiprocessor computing platforms using the divide-and-conquer approach. The system of interest is a general topology of rigid and elastic articulated bodies with or without loops. The algorithm divides the multibody system into a number of smaller sets of bodies in chain or tree structures, called "branches", at convenient joints called "connection points", and uses an Order-N (O(N)) approach to formulate the dynamics of each branch in terms of the unknown spatial connection forces. The equations of motion for the branches, leaving the connection forces as unknowns, are implemented on separate processors in parallel for computational efficiency, and the equations for all the unknown connection forces are synthesized and solved on one or several processors. The performance of two implementations of this divide-and-conquer algorithm on multiple processors is compared with that of an existing method implemented on a single processor.
Acceleration of Coarse Grain Molecular Dynamics on GPU Architectures
Coarse grain (CG) molecular models have been proposed to simulate complex systems with lower computational overheads and longer timescales with respect to atomistic level models. However, their acceleration on parallel architectures such as Graphic Processing Units (GPU) presents original challenges that must be carefully evaluated. The objective of this work is to characterize the impact of CG model features on parallel simulation performance. To achieve this, we implemented a GPU-accelerated version of a CG molecular dynamics simulator, to which we applied specific optimizations for CG models, such as dedicated data structures to handle different bead type interactions, obtaining a maximum speed-up of 14 on the NVIDIA GTX480 GPU with Fermi architecture. We provide a complete characterization and evaluation of algorithmic and simulated system features of CG models impacting the achievable speed-up and accuracy of results, using three different GPU architectures as case studies.
Parallelization of a Six Degree of Freedom Entry Vehicle Trajectory Simulation Using OpenMP and OpenACC
The art and science of writing parallelized software, using methods such as Open Multi-Processing (OpenMP) and Open Accelerators (OpenACC), is dominated by computer scientists. Engineers and non-computer scientists looking to apply these techniques to their project applications face a steep learning curve, especially when looking to adapt their original single-threaded software to run multi-threaded on graphics processing units (GPUs). Significant changes in mindset must occur, such as how to manage memory, how to organize instructions, and how to use if statements (also known as branching). The purpose of this work is twofold: 1) to demonstrate the applicability of parallelized coding methodologies, OpenMP and OpenACC, to tasks outside of the typical large scale matrix mathematics; and 2) to discuss, from an engineer's perspective, the lessons learned from parallelizing software using these computer science techniques. This work applies OpenMP, on both multi-core central processing units (CPUs) and Intel Xeon Phi 7210, and OpenACC on GPUs. These parallelization techniques are used to tackle the simulation of thousands of entry vehicle trajectories through the integration of six degree of freedom (DoF) equations of motion (EoM). The forces and moments acting on the entry vehicle, and used by the EoM, are estimated using multiple models of varying levels of complexity. Several benchmark comparisons are made on the execution of six DoF trajectory simulation: single thread Intel Xeon E5-2670 CPU, multi-thread CPU using OpenMP, multi-thread Xeon Phi 7210 using OpenMP, and multi-thread NVIDIA Tesla K40 GPU using OpenACC. These benchmarks are run on the Pleiades Supercomputer Cluster at the National Aeronautics and Space Administration (NASA) Ames Research Center (ARC), and a Xeon Phi 7210 node at NASA Langley Research Center (LaRC).