
305 research outputs found

Realizing multioperations and multiprefixes in Thick Control Flow processors

Author: Forsell Martti
Leppänen Ville
Roivainen Jussi
Träff Jesper Larsson
Publication date: 01/04/2023

Molecular Dynamics Simulations of DNA-Functionalized Nanoparticle Building Blocks on GPUs

Author: Fochtman Tyler Landon
Publication venue: ScholarWorks@UARK
Publication date: 01/05/2017

This thesis discusses massively parallel molecular dynamics simulations of nBLOCKs using graphics processing units. nBLOCKs are nanoscale building blocks composed of gold nanoparticles functionalized with single-stranded DNA molecules. To explore greater simulation time scales we implement our nBLOCK computational model as an extension to the coarse-grained molecular simulator oxDNA. oxDNA is parameterized to match the thermodynamics of DNA strand hybridization as well as the mechanics of single-stranded and double-stranded DNA. In addition to an in-depth review of our implementation details we also provide results of the model validation and performance tests. These validation and performance tests comprise over a hundred separate simulations, ranging in length from one thousand to ten million time steps and in size from 16 to 27832 particles. Together these tests show the ability of our implementation to handle the full range of basic nBLOCK topologies in a diverse set of conditions. A selection of the utilities developed during the course of this thesis is also discussed. We provide descriptions of the scripting utilities which support nBLOCK assembly generation, simulation, and analysis.

ScholarWorks@UARK

UARK (University of Arkansas)

Interactive message debugger for parallel message passing programs using Lam-Mpi

Author: Basu Hoimonti
Publication venue: Digital Scholarship@UNLV
Publication date: 01/01/2005

Many complex and computation-intensive problems can be solved efficiently using parallel programs on a network of processors. One of the most widely used software platforms for such cluster computing is LAM-MPI. To help develop robust parallel programs using LAM-MPI we need efficient debugging tools. The challenges in debugging parallel programs are unique and different from those of sequential programs. Unfortunately, available parallel debuggers do not address these challenges adequately. This thesis introduces IDLI, a parallel message debugger for LAM-MPI, designed on the concepts of multi-level debugging. IDLI provides a new paradigm for distributed debugging while avoiding many of the pitfalls of present tools of its genre. Through its powerful yet customizable query mechanism, adequate data abstraction, granularity, user-friendly interface, and a fast novel technique to simultaneously replay and sequentially debug one or more processes from a distributed application, IDLI provides an effective environment for debugging parallel LAM-MPI programs.

University of Nevada, Las Vegas Repository

The particle track reconstruction based on deep learning neural networks

Author: Baranov Dmitriy
Goncharov Pavel
Mitsyn Sergey
Ososkov Gennady
Publication venue: EDP Sciences
Publication date: 07/12/2018

One of the most important problems of data processing in high energy and nuclear physics is event reconstruction. Its main part is the track reconstruction procedure, which consists in looking for all tracks that elementary particles leave when they pass through a detector among a huge number of points, so-called hits, produced when flying particles fire detector coordinate planes. Unfortunately, tracking is seriously impeded by the well-known shortcoming of multiwired, strip in GEM detectors: the appearance in them of many fake hits caused by extra spurious crossings of fired strips. Since the number of those fakes is several orders of magnitude greater than the number of true hits, one faces the serious difficulty of unravelling possible track candidates from true hits while ignoring fakes. Building on our previous two-stage approach, in which hits are preprocessed using a directed K-d tree search and then passed to a deep neural classifier, we introduce here two new tracking algorithms. Both algorithms combine those two stages in one while using different types of deep neural nets. We show that neither proposed deep network requires a special preprocessing stage, and that both are more accurate, faster, and easier to parallelize. Preliminary results of our new approaches for simulated events are presented.

Comment: 8 pages, 3 figures, CHEP 2018, the 23rd International Conference on Computing in High Energy and Nuclear Physics, Sofia, Bulgaria on July 9-13, 2018. arXiv admin note: text overlap with arXiv:1811.0600

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

Task Activity Vectors: A Novel Metric for Temperature-Aware and Energy-Efficient Scheduling

Author: Merkel Andreas
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2010

This thesis introduces the abstraction of the task activity vector to characterize applications by the processor resources they utilize. Based on activity vectors, the thesis introduces scheduling policies for improving the temperature distribution on the processor chip and for increasing energy efficiency by reducing the contention for shared resources of multicore and multithreaded processors.

KITopen

Doctor of Philosophy in Computer Science

Author: Kopta Daniel
Publication venue: University of Utah
Publication date: 01/01/2016

Ray tracing is becoming more widely adopted in offline rendering systems due to its natural support for high-quality lighting. Since quality is also a concern in most real-time systems, we believe ray tracing would be a welcome change in the real-time world, but it is avoided due to insufficient performance. Since power consumption is one of the primary factors limiting the increase of processor performance, it must be addressed as a foremost concern in any future ray tracing system designs. This will require cooperating advances in both algorithms and architecture. In this dissertation I study ray tracing system designs from a data movement perspective, targeting the various memory resources that are the primary consumers of power on a modern processor. The result is high-performance, low-energy ray tracing architectures.

The University of Utah: J. Willard Marriott Digital Library

System software for the finite element machine

Author: Crockett T. W.
Knott J. D.

The Finite Element Machine is an experimental parallel computer developed at Langley Research Center to investigate the application of concurrent processing to structural engineering analysis. This report describes system-level software which has been developed to facilitate use of the machine by applications researchers. The overall software design is outlined, and several important parallel processing issues are discussed in detail, including processor management, communication, synchronization, and input/output. Based on experience using the system, the hardware architecture and software design are critiqued, and areas for further work are suggested.

NASA Technical Reports Server

Parallel programming environment for OpenMP

Author: Insung Park
Michael J Voss
Rudolf Eigenmann
Seon Wook Kim
Publication date: 11/04/2020

We present our effort to provide a comprehensive parallel programming environment for the OpenMP parallel directive language. This environment includes a parallel programming methodology for the OpenMP programming model and a set of tools (Ursa Minor and InterPol) that support this methodology. Our toolset provides automated and interactive assistance to parallel programmers in the time-consuming tasks of the proposed methodology. The features provided by our tools include performance and program structure visualization, interactive optimization, support for performance modeling, and performance advising for finding and correcting performance problems. The presented evaluation demonstrates that our environment offers significant support in general parallel tuning efforts and that the toolset facilitates many common tasks in OpenMP parallel programming in an efficient manner.

CiteSeerX