326 research outputs found

    Status of Vectorized Monte Carlo for Particle Transport Analysis

    Full text link
    The conventional particle transport Monte Carlo algorithm is ill suited for modern vector supercomputers because the random nature of the particle transport process in the history-based algorithm inhibits construction of vectors. An alternative, event-based algorithm is suitable for vectorization and has been used recently to achieve impressive gains in performance on vector supercomputers. This review describes the event-based algorithm and several variations of it. Implementations of this algorithm for applications in particle transport are described, and their relative merits are discussed. The implementation of Monte Carlo methods on multiple vector parallel processors is considered, as is the potential of massively parallel processors for Monte Carlo particle transport simulations.
    Peer Reviewed: http://deepblue.lib.umich.edu/bitstream/2027.42/67177/2/10.1177_109434208700100203.pd
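    To make the history-based versus event-based contrast concrete, here is a minimal NumPy sketch of an event-based particle bank (a toy 1-D slab problem of my own, not code from the review): particles waiting on the same kind of event are gathered into a vector, and that event is applied to the whole bank at once, which is what maps well onto vector hardware.

```python
# Toy event-based Monte Carlo for a 1-D slab, mono-energetic, isotropic
# scattering (illustrative only).  Each event is applied to the whole bank
# of surviving particles as a single vectorized operation.
import numpy as np

rng = np.random.default_rng(0)

SIGMA_T, SIGMA_A = 1.0, 0.3          # total and absorption cross sections
SLAB = 5.0                            # slab thickness in mean free paths
N = 100_000                           # particle bank size

x = np.zeros(N)                       # positions
mu = rng.uniform(-1.0, 1.0, N)        # direction cosines
alive = np.ones(N, dtype=bool)

leaked = absorbed = 0
while alive.any():
    idx = np.flatnonzero(alive)                       # current event bank
    # Event 1: free flight -- one vectorized distance sample for the bank
    d = -np.log(rng.random(idx.size)) / SIGMA_T
    x[idx] += mu[idx] * d
    # Event 2: boundary crossing handled for the whole bank at once
    out = (x[idx] < 0.0) | (x[idx] > SLAB)
    leaked += out.sum()
    alive[idx[out]] = False
    idx = idx[~out]
    # Event 3: collision -- absorption terminates, scattering redraws direction
    absorb = rng.random(idx.size) < SIGMA_A / SIGMA_T
    absorbed += absorb.sum()
    alive[idx[absorb]] = False
    mu[idx[~absorb]] = rng.uniform(-1.0, 1.0, (~absorb).sum())

print(f"leaked {leaked}, absorbed {absorbed}")
```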

    Dynamic Systolization for Developing Multiprocessor Supercomputers

    Get PDF
    A dynamic network approach is introduced for developing reconfigurable systolic arrays or wavefront processors. This allows one to design very powerful and flexible processors to be used in a general-purpose, reconfigurable, and fault-tolerant multiprocessor computer system. The concepts of macro-dataflow and multitasking can be integrated to handle variable-resolution granularities in computationally intensive algorithms. A multiprocessor architecture, Remps, is proposed based on these design methodologies. The Remps architecture is generalized from the Cedar, HEP, Cray X-MP, Trac, NYU Ultracomputer, S-1, Pumps, Chip, and SAM projects. Our goal is to provide a multiprocessor research model for developing design methodologies, multiprocessing and multitasking supports, dynamic systolic/wavefront array processors, interconnection networks, reconfiguration techniques, and performance analysis tools. These system design and operational techniques should be useful to those who are developing or evaluating multiprocessor supercomputers
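    As a rough illustration of the systolic/wavefront principle this proposal builds on (a software emulation of my own, not part of Remps), the sketch below steps an output-stationary systolic array for matrix multiplication: operands enter skewed from the west and north and meet at each processing element along a diagonal wavefront.

```python
import numpy as np

def systolic_matmul(A, B):
    """Emulate an n-by-n output-stationary systolic array computing C = A @ B.

    PE (i, j) keeps a running sum C[i, j]; rows of A enter from the west and
    columns of B from the north, each skewed by one time step per row/column,
    so operand pair k reaches PE (i, j) at time t = i + j + k (a wavefront).
    """
    n = A.shape[0]
    C = np.zeros((n, n))
    # The last operand pair reaches PE (n-1, n-1) at t = 3n - 3.
    for t in range(3 * n - 2):
        for i in range(n):
            for j in range(n):
                k = t - i - j          # which operand pair arrives at PE (i, j) now
                if 0 <= k < n:
                    C[i, j] += A[i, k] * B[k, j]
    return C

A = np.arange(9.0).reshape(3, 3)
B = np.eye(3) + 1.0
assert np.allclose(systolic_matmul(A, B), A @ B)
```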

    The 30th Anniversary of the Supercomputing Conference: Bringing the Future Closer - Supercomputing History and the Immortality of Now

    Get PDF
    A panel of experts reflects on the past 30 years of the Supercomputing (SC) conference, its leading role for the professional community, and some exciting future challenges

    Development of a Navier-Stokes algorithm for parallel-processing supercomputers

    Get PDF
    An explicit flow solver, applicable to the hierarchy of model equations ranging from Euler to full Navier-Stokes, is combined with several techniques designed to reduce computational expense. The computational domain consists of local grid refinements embedded in a global coarse mesh, where the locations of these refinements are defined by the physics of the flow. Flow characteristics are also used to determine which set of model equations is appropriate for solution in each region, thereby reducing not only the number of grid points at which the solution must be obtained, but also the computational effort required to get that solution. Acceleration to steady-state is achieved by applying multigrid on each of the subgrids, regardless of the particular model equations being solved. Since each of these components is explicit, advantage can readily be taken of the vector- and parallel-processing capabilities of machines such as the Cray X-MP and Cray-2
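    For readers unfamiliar with the multigrid acceleration applied on each subgrid, here is a hedged, self-contained sketch on a 1-D Poisson model problem (weighted-Jacobi smoothing, full-weighting restriction, linear prolongation). It is not the paper's flow solver, only an illustration of the acceleration idea that is applied there regardless of which model equations are being solved.

```python
# Toy multigrid V-cycle for -u'' = f on [0, 1] with zero Dirichlet boundaries.
import numpy as np

def smooth(u, f, h, sweeps=3, w=2.0 / 3.0):
    """Weighted-Jacobi sweeps for the standard 3-point discretization."""
    for _ in range(sweeps):
        u[1:-1] = (1 - w) * u[1:-1] + w * 0.5 * (u[:-2] + u[2:] + h * h * f[1:-1])
    return u

def v_cycle(u, f, h):
    u = smooth(u, f, h)
    if u.size <= 3:                       # coarsest grid: smoothing suffices
        return u
    r = np.zeros_like(u)                  # residual r = f - A u
    r[1:-1] = f[1:-1] - (2 * u[1:-1] - u[:-2] - u[2:]) / (h * h)
    rc = np.zeros((u.size + 1) // 2)      # full-weighting restriction
    rc[1:-1] = 0.25 * r[1:-2:2] + 0.5 * r[2:-1:2] + 0.25 * r[3::2]
    ec = v_cycle(np.zeros_like(rc), rc, 2 * h)
    e = np.zeros_like(u)                  # linear prolongation of the correction
    e[::2] = ec
    e[1::2] = 0.5 * (ec[:-1] + ec[1:])
    return smooth(u + e, f, h)

n = 129                                   # 2^7 + 1 points so coarsening nests exactly
x = np.linspace(0.0, 1.0, n)
f = np.pi**2 * np.sin(np.pi * x)          # exact solution is sin(pi x)
u = np.zeros(n)
for _ in range(10):
    u = v_cycle(u, f, x[1] - x[0])
print("max error:", np.abs(u - np.sin(np.pi * x)).max())
```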

    Avalanche: A communication and memory architecture for scalable parallel computing

    Get PDF
    As the gap between processor and memory speeds widens, system designers will inevitably incorporate increasingly deep memory hierarchies to maintain the balance between processor and memory system performance. At the same time, most communication subsystems are permitted access only to main memory and not a processor's top-level cache. As memory latencies increase, this lack of integration between the memory and communication systems will seriously impede interprocessor communication performance and limit effective scalability. In the Avalanche project we are redesigning the memory architecture of a commercial RISC multiprocessor, the HP PA-RISC, to include a new multi-level context-sensitive cache that is tightly coupled to the communication fabric. The primary goal of Avalanche's integrated cache and communication controller is attacking end-to-end communication latency in all of its forms. This includes cache misses induced by excessive invalidations and reloading of shared data by write-invalidate coherence protocols, and cache misses induced by depositing incoming message data in main memory and faulting it into the cache. An execution-driven simulation study of Avalanche's architecture indicates that it can reduce cache stalls and overall execution times
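    A back-of-the-envelope model, with illustrative numbers of my own rather than Avalanche's measurements, of why depositing incoming message data in main memory hurts end-to-end latency: every cache line of the message the receiver touches becomes a miss, whereas cache-integrated delivery turns those touches into hits, and the gap grows as memory latency grows.

```python
# Illustrative cost model (parameters are assumptions, not Avalanche results):
# receiving a message of `lines` cache lines under two delivery policies.
def receive_cost(lines, miss_cycles, hit_cycles, deliver_to_cache):
    per_line = hit_cycles if deliver_to_cache else miss_cycles
    return lines * per_line

for miss in (30, 100, 300):              # widening processor/memory speed gap
    mem = receive_cost(lines=64, miss_cycles=miss, hit_cycles=2, deliver_to_cache=False)
    cache = receive_cost(lines=64, miss_cycles=miss, hit_cycles=2, deliver_to_cache=True)
    print(f"miss={miss:3d} cycles: main-memory delivery {mem:6d}, "
          f"cache delivery {cache:4d}, ratio {mem / cache:.0f}x")
```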

    Earth and environmental science in the 1980's: Part 1: Environmental data systems, supercomputer facilities and networks

    Get PDF
    Overview descriptions of on-line environmental data systems, supercomputer facilities, and networks are presented. Each description addresses the concepts of content, capability, and user access relevant to the point of view of potential utilization by the Earth and environmental science community. The information on similar systems or facilities is presented in parallel fashion to encourage and facilitate intercomparison. In addition, summary sheets are given for each description, and a summary table precedes each section

    Vector coprocessor sharing techniques for multicores: performance and energy gains

    Get PDF
    Vector Processors (VPs) created the breakthroughs needed for the emergence of computational science many years ago. All commercial computing architectures on the market today contain some form of vector or SIMD processing. Many high-performance and embedded applications, often dealing with streams of data, cannot efficiently utilize dedicated vector processors for various reasons: a limited percentage of sustained vector code due to substantial flow control; inherently small parallelism or the frequent involvement of operating system tasks; varying vector length across applications or within a single application; and data dependencies within short sequences of instructions, a problem further exacerbated without loop unrolling or other compiler optimization techniques. Additionally, existing rigid SIMD architectures cannot efficiently tolerate dynamic application environments with many cores that may require the runtime adjustment of assigned vector resources in order to operate at desired energy/performance levels. To simultaneously alleviate these drawbacks of rigid lane-based VP architectures, while also releasing on-chip real estate for other important design choices, the first part of this research proposes three architectural contexts for the implementation of a shared vector coprocessor in multicore processors. Sharing an expensive resource among multiple cores increases the efficiency of the functional units and the overall system throughput. The second part of the dissertation concerns the evaluation and characterization of the three proposed shared vector architectures from the performance and power perspectives on an FPGA (Field-Programmable Gate Array) prototype. The third part of this work introduces performance and power estimation models based on observations deduced from the experimental results. The results show the opportunity to adaptively adjust the number of vector lanes assigned to individual cores or processing threads in order to minimize various energy-performance metrics on modern vector-capable multicore processors that run applications with dynamic workloads. Therefore, the fourth part of this research focuses on the development of a fine-to-coarse grain power management technique and a relevant adaptive hardware/software infrastructure which dynamically adjusts the assigned VP resources (number of vector lanes) in order to minimize the energy consumption for applications with dynamic workloads. In order to remove the inherent limitations imposed by FPGA technologies, the fifth part of this work consists of implementing an ASIC (Application Specific Integrated Circuit) version of the shared VP towards precise performance-energy studies involving high-performance vector processing in multicore environments
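    To make the lane-adjustment idea concrete, here is a hypothetical sketch (the Amdahl-style cost model and all parameters are assumptions of mine, not the dissertation's models or results): a controller picks the lane count that minimizes an energy-delay product for the measured vectorizable fraction of the running workload.

```python
# Hypothetical lane-count selection under a toy cost model (illustrative only).
def edp(lanes, vector_fraction, work=1.0, static_power_per_lane=0.1,
        dynamic_energy=1.0):
    """Toy model: the vector part speeds up with lane count (Amdahl-style),
    while every enabled lane pays static power for the whole runtime."""
    runtime = work * ((1 - vector_fraction) + vector_fraction / lanes)
    energy = dynamic_energy * work + static_power_per_lane * lanes * runtime
    return energy * runtime              # energy-delay product

def best_lane_count(vector_fraction, available=(1, 2, 4, 8, 16)):
    return min(available, key=lambda n: edp(n, vector_fraction))

for vf in (0.30, 0.60, 0.90, 0.99):
    print(f"vector fraction {vf:.2f} -> assign {best_lane_count(vf)} lanes")
```

    Under this toy model, lightly vectorizable workloads are assigned few lanes (static power dominates) and highly vectorizable ones are assigned many, which is the qualitative behaviour the adaptive infrastructure described above targets.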

    Computer vision algorithms on reconfigurable logic arrays

    Full text link
    • …