    Parallelized Rigid Body Dynamics

    Physics engines are collections of API-like software designed for video games, movies and scientific simulations. While physics engines come in many shapes and designs, all engines can benefit from an increase in speed via parallelization. However, despite this need for increased speed, it is uncommon to encounter a parallelized physics engine today. Many engines are long-standing projects, and changing them to support parallelization is too costly to consider as a practical matter. Parallelization needs to be considered from the design stages through completion to ensure adequate implementation. In this project we develop a realistic approach to simulate physics in a parallel environment. Utilizing many techniques, we establish a practical approach to significantly reduce the run-time on a standard physics engine.

    Simulation of a Hard-Spherocylinder Liquid Crystal with the pe

    The pe physics engine is validated through the simulation of a liquid crystal model system consisting of hard spherocylinders. For this purpose we evaluate several characteristic parameters of this system, namely the nematic order parameter, the pressure, and the Frank elastic constants. We compare these to the values reported in the literature and find very good agreement, which demonstrates that the pe physics engine can accurately treat such densely packed particle systems. Simultaneously we are able to examine the influence of finite size effects, especially on the evaluation of the Frank elastic constants, as we are far less restricted in system size than earlier simulations.
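
As a rough illustration of the first quantity named above, the nematic order parameter can be written as the average of the second Legendre polynomial, S = <(3 cos^2(theta) - 1) / 2>, where theta is the angle between a particle axis and the director. The sketch below assumes the director is already known; the abstract does not say how the authors or pe compute it, and in practice the director is often obtained from the eigenvectors of an ordering tensor instead. The function name `nematic_order` is hypothetical.

```python
import math

def nematic_order(orientations, director):
    """Average of P2(cos theta) between each particle axis and the director.

    Assumption: the director is supplied by the caller. Vectors need not be
    normalized; they are normalized here before taking the dot product.
    """
    d_norm = math.sqrt(sum(c * c for c in director))
    total = 0.0
    for u in orientations:
        u_norm = math.sqrt(sum(c * c for c in u))
        cos_t = sum(a * b for a, b in zip(u, director)) / (u_norm * d_norm)
        total += 1.5 * cos_t * cos_t - 0.5
    return total / len(orientations)

# Axes parallel or antiparallel to the director count as fully ordered,
# because a spherocylinder has no head or tail.
aligned = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
s_aligned = nematic_order(aligned, (0.0, 0.0, 1.0))
```

Because cos(theta) enters squared, the sign of each axis vector does not matter, which matches the head-tail symmetry of the particles.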

    Cluster counting: The Hoshen-Kopelman algorithm vs. spanning tree approaches

    Two basic approaches to the cluster counting task in percolation and related models are discussed. The Hoshen-Kopelman multiple labeling technique for cluster statistics is redescribed. Modifications for random and aperiodic lattices are sketched, and some parallelised versions of the algorithm are mentioned. The graph-theoretical basis for the spanning tree approaches is given by describing the "breadth-first search" and "depth-first search" procedures. Examples are given for extracting the elastic and geometric "backbone" of a percolation cluster. An implementation of the "pebble game" algorithm using a depth-first search method is also described. Comment: LaTeX, uses ijmpc1.sty (included), 18 pages, 3 figures, submitted to Intern. J. of Modern Physics
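
The multiple labeling idea mentioned above can be sketched in a few lines. This is a minimal version for a 2D square lattice with 4-neighbour connectivity and open boundaries, using a union-find table to merge labels; it is not the paper's implementation, and the function name `hoshen_kopelman` and the sample grid are assumptions made for illustration.

```python
def hoshen_kopelman(grid):
    """Label clusters of occupied (truthy) sites; return (labels, sizes)."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    parent = [0]  # parent[k] is the union-find parent of label k; 0 means "empty"

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        # Merge two labels, keeping the smaller root as the representative.
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)
        return min(ra, rb)

    labels = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if not grid[i][j]:
                continue
            up = labels[i - 1][j] if i > 0 else 0
            left = labels[i][j - 1] if j > 0 else 0
            if not up and not left:
                parent.append(len(parent))      # start a new cluster label
                labels[i][j] = len(parent) - 1
            elif up and left:
                labels[i][j] = union(up, left)  # site joins two labelled neighbours
            else:
                labels[i][j] = find(up or left)

    # Second pass: replace provisional labels by their roots and count sizes.
    sizes = {}
    for i in range(rows):
        for j in range(cols):
            if labels[i][j]:
                root = find(labels[i][j])
                labels[i][j] = root
                sizes[root] = sizes.get(root, 0) + 1
    return labels, sizes

grid = [
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 1],
    [1, 0, 0, 0],
]
labels, sizes = hoshen_kopelman(grid)
```

The first pass only looks at the upper and left neighbours, so the lattice can be processed row by row; the second pass is needed because two provisional labels may later turn out to belong to the same cluster.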

    Optimal Reconfiguration of Formation Flying Spacecraft--a Decentralized Approach

    This paper introduces a hierarchical, decentralized, and parallelizable method for dealing with optimization problems with many agents. It is theoretically based on a hierarchical optimization theorem that establishes the equivalence of two forms of the problem, and this idea is implemented using DMOC (Discrete Mechanics and Optimal Control). The result is a method that is scalable to certain optimization problems for large numbers of agents, whereas the usual “monolithic” approach can only deal with systems with a rather small number of degrees of freedom. The method is illustrated with the example of deployment of spacecraft, motivated by the Darwin (ESA) and Terrestrial Planet Finder (NASA) missions.

    An Efficient Solution Method for Multibody Systems with Loops Using Multiple Processors

    This paper describes a multibody dynamics algorithm formulated for parallel implementation on multiprocessor computing platforms using the divide-and-conquer approach. The system of interest is a general topology of rigid and elastic articulated bodies with or without loops. The algorithm divides the multibody system into a number of smaller sets of bodies in chain or tree structures, called "branches", at convenient joints called "connection points", and uses an Order-N (O(N)) approach to formulate the dynamics of each branch in terms of the unknown spatial connection forces. The equations of motion for the branches, leaving the connection forces as unknowns, are implemented in separate processors in parallel for computational efficiency, and the equations for all the unknown connection forces are synthesized and solved in one or several processors. The performance of two implementations of this divide-and-conquer algorithm on multiple processors is compared with an existing method implemented on a single processor.

    Acceleration of Coarse Grain Molecular Dynamics on GPU Architectures

    Coarse grain (CG) molecular models have been proposed to simulate complex systems with lower computational overheads and longer timescales with respect to atomistic level models. However, their acceleration on parallel architectures such as Graphic Processing Units (GPU) presents original challenges that must be carefully evaluated. The objective of this work is to characterize the impact of CG model features on parallel simulation performance. To achieve this, we implemented a GPU-accelerated version of a CG molecular dynamics simulator, to which we applied specific optimizations for CG models, such as dedicated data structures to handle different bead type interactions, obtaining a maximum speed-up of 14 on the NVIDIA GTX480 GPU with Fermi architecture. We provide a complete characterization and evaluation of algorithmic and simulated system features of CG models impacting the achievable speed-up and accuracy of results, using three different GPU architectures as case studies.

    Parallelization of a Six Degree of Freedom Entry Vehicle Trajectory Simulation Using OpenMP and OpenACC

    The art and science of writing parallelized software, using methods such as Open Multi-Processing (OpenMP) and Open Accelerators (OpenACC), is dominated by computer scientists. Engineers and non-computer scientists looking to apply these techniques to their project applications face a steep learning curve, especially when looking to adapt their original single-threaded software to run multi-threaded on graphics processing units (GPUs). There are significant changes in mindset that must occur, such as how to manage memory, the organization of instructions, and the use of if statements (also known as branching). The purpose of this work is twofold: 1) to demonstrate the applicability of parallelized coding methodologies, OpenMP and OpenACC, to tasks outside of the typical large scale matrix mathematics; and 2) to discuss, from an engineer's perspective, the lessons learned from parallelizing software using these computer science techniques. This work applies OpenMP on both multi-core central processing units (CPUs) and the Intel Xeon Phi 7210, and OpenACC on GPUs. These parallelization techniques are used to tackle the simulation of thousands of entry vehicle trajectories through the integration of six degree of freedom (DoF) equations of motion (EoM). The forces and moments acting on the entry vehicle, and used by the EoM, are estimated using multiple models of varying levels of complexity. Several benchmark comparisons are made on the execution of the six DoF trajectory simulation: single thread Intel Xeon E5-2670 CPU, multi-thread CPU using OpenMP, multi-thread Xeon Phi 7210 using OpenMP, and multi-thread NVIDIA Tesla K40 GPU using OpenACC. These benchmarks are run on the Pleiades Supercomputer Cluster at the National Aeronautics and Space Administration (NASA) Ames Research Center (ARC), and a Xeon Phi 7210 node at NASA Langley Research Center (LaRC).