Performance of MPI on the CRAY T3E-512
The CRAY T3E-512 is currently the most powerful machine available at RUS/hww. Although it provides support for shared memory, the natural programming model for the machine is message passing. Since RUS has decided to support primarily the MPI standard, we have found it useful to test the performance of MPI on the machine for several standard message-passing constructs.
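A minimal sketch of such a point-to-point test, assuming a plain MPI ping-pong between ranks 0 and 1; the message size and repetition count are illustrative, not the values used at RUS/hww:

    /* Illustrative MPI ping-pong: round-trip timing for one message size.
     * Run with at least two ranks, e.g. mpirun -np 2 ./pingpong */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        const int nbytes = 1 << 20;   /* 1 MiB message, illustrative */
        const int reps   = 100;       /* timing repetitions, illustrative */
        char *buf = calloc(nbytes, 1);
        int rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double t0 = MPI_Wtime();
        for (int i = 0; i < reps; i++) {
            if (rank == 0) {
                MPI_Send(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0) {
            double one_way = (t1 - t0) / (2.0 * reps);   /* half the round trip */
            printf("avg one-way latency: %g s, bandwidth: %g MB/s\n",
                   one_way, nbytes / one_way / 1e6);
        }

        free(buf);
        MPI_Finalize();
        return 0;
    }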
Systolic and Hyper-Systolic Algorithms for the Gravitational N-Body Problem, with an Application to Brownian Motion
A systolic algorithm rhythmically computes and passes data through a network of processors. We investigate the performance of systolic algorithms for implementing the gravitational N-body problem on distributed-memory computers. Systolic algorithms minimize memory requirements by distributing the particles between processors. We show that the performance of systolic routines can be greatly enhanced by the use of non-blocking communication, which allows particle coordinates to be communicated at the same time that force calculations are being carried out. Hyper-systolic algorithms reduce the communication complexity at the expense of increased memory demands. As an example of an application requiring large N, we use the systolic algorithm to carry out direct-summation simulations, with 10^6 particles, of the Brownian motion of the supermassive black hole at the center of the Milky Way galaxy. We predict a 3D random velocity of 0.4 km/s for the black hole.
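A minimal sketch of the systolic ring with non-blocking communication described above, not the authors' code: each rank owns a block of particle coordinates, a copy of one block travels around the ring, and MPI_Isend/MPI_Irecv let the next block arrive while forces against the current block are accumulated. Unit masses, G = 1, and the Plummer softening are illustrative choices.

    /* Illustrative systolic-ring force computation with non-blocking MPI. */
    #include <mpi.h>
    #include <math.h>
    #include <stdlib.h>
    #include <string.h>

    /* Naive O(nloc^2) pairwise acceleration with Plummer softening. */
    static void accumulate_forces(const double *own, const double *other,
                                  double *acc, int nloc)
    {
        const double eps2 = 1e-8;                 /* softening, illustrative */
        for (int i = 0; i < nloc; i++)
            for (int j = 0; j < nloc; j++) {
                double dx = other[3*j]   - own[3*i];
                double dy = other[3*j+1] - own[3*i+1];
                double dz = other[3*j+2] - own[3*i+2];
                double r2 = dx*dx + dy*dy + dz*dz + eps2;
                double inv_r3 = 1.0 / (r2 * sqrt(r2));
                acc[3*i]   += dx * inv_r3;
                acc[3*i+1] += dy * inv_r3;
                acc[3*i+2] += dz * inv_r3;
            }
    }

    /* One full systolic sweep: after `size` steps every rank has summed the
     * force from all N = size * nloc particles on its own particles. */
    static void systolic_forces(const double *own, double *acc, int nloc, MPI_Comm comm)
    {
        int rank, size;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);
        int left  = (rank - 1 + size) % size;
        int right = (rank + 1) % size;

        double *cur  = malloc(3 * nloc * sizeof *cur);
        double *next = malloc(3 * nloc * sizeof *next);
        memcpy(cur, own, 3 * nloc * sizeof *cur);

        for (int step = 0; step < size; step++) {
            MPI_Request req[2];
            if (step < size - 1) {            /* post the next shift early ... */
                MPI_Irecv(next, 3 * nloc, MPI_DOUBLE, left,  0, comm, &req[0]);
                MPI_Isend(cur,  3 * nloc, MPI_DOUBLE, right, 0, comm, &req[1]);
            }
            /* ... and overlap it with the force calculation (reading an
             * active send buffer is permitted since MPI-3). */
            accumulate_forces(own, cur, acc, nloc);
            if (step < size - 1) {
                MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
                double *tmp = cur; cur = next; next = tmp;
            }
        }
        free(cur);
        free(next);
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int nloc = 1024;                /* particles per rank, illustrative */
        double *pos = malloc(3 * nloc * sizeof *pos);
        double *acc = calloc(3 * nloc, sizeof *acc);
        srand(rank + 1);
        for (int i = 0; i < 3 * nloc; i++)
            pos[i] = (double)rand() / RAND_MAX;   /* toy initial conditions */

        systolic_forces(pos, acc, nloc, MPI_COMM_WORLD);

        free(pos);
        free(acc);
        MPI_Finalize();
        return 0;
    }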
Performance Evaluation of Supercomputers using HPCC and IMB Benchmarks
The HPC Challenge (HPCC) benchmark suite and the Intel MPI Benchmark (IMB) are used to compare and evaluate the combined performance of the processor, memory subsystem and interconnect fabric of five leading supercomputers: SGI Altix BX2, Cray X1, Cray Opteron Cluster, Dell Xeon cluster, and NEC SX-8. These five systems use five different networks (SGI NUMALINK4, Cray network, Myrinet, InfiniBand, and NEC IXS). The complete set of HPCC benchmarks is run on each of these systems. Additionally, we present Intel MPI Benchmarks (IMB) results to study the performance of 11 MPI communication functions on these systems.
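IMB sweeps many message sizes and repetition counts for each MPI function; a stripped-down sketch of that timing pattern for MPI_Allreduce, with sizes and repetition counts that are illustrative rather than IMB's defaults:

    /* Minimal IMB-style timing loop for MPI_Allreduce (illustrative, not IMB code). */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        for (int n = 1; n <= (1 << 20); n *= 4) {    /* message sizes in doubles */
            double *in  = malloc(n * sizeof *in);
            double *out = malloc(n * sizeof *out);
            for (int i = 0; i < n; i++) in[i] = 1.0;

            const int reps = 50;                      /* repetitions, illustrative */
            MPI_Barrier(MPI_COMM_WORLD);
            double t0 = MPI_Wtime();
            for (int r = 0; r < reps; r++)
                MPI_Allreduce(in, out, n, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
            double t = (MPI_Wtime() - t0) / reps;

            double tmax;                              /* report the slowest rank */
            MPI_Reduce(&t, &tmax, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
            if (rank == 0)
                printf("%10zu bytes  %12.3f us\n", n * sizeof(double), tmax * 1e6);

            free(in);
            free(out);
        }
        MPI_Finalize();
        return 0;
    }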
Scalability of Incompressible Flow Computations on Multi-GPU Clusters Using Dual-Level and Tri-Level Parallelism
High performance computing using graphics processing units (GPUs) is gaining popularity in the scientific computing field, with many large compute clusters being augmented with multiple GPUs in each node. We investigate hybrid tri-level (MPI-OpenMP-CUDA) parallel implementations to explore the efficiency and scalability of incompressible flow computations on GPU clusters with up to 128 GPUs. This work details some of the unique issues faced when merging fine-grain parallelism on the GPU using CUDA with coarse-grain parallelism using OpenMP for intra-node and MPI for inter-node communication. Comparisons between the tri-level MPI-OpenMP-CUDA and dual-level MPI-CUDA implementations are shown for large, computationally demanding computational fluid dynamics (CFD) simulations. Our results demonstrate that the tri-level parallel implementation does not provide a significant performance advantage over the dual-level implementation; however, further research is needed to confirm this conclusion for clusters with a higher GPU-per-node density or for software that can exploit OpenMP's fine-grain parallelism more effectively.
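A structural sketch of the tri-level decomposition, assuming one MPI rank per node and one OpenMP thread per GPU; launch_on_gpu() is a hypothetical placeholder standing in for cudaSetDevice() plus a CUDA kernel launch on the thread's sub-domain, and the halo exchange is omitted for brevity:

    /* Structural sketch of tri-level MPI-OpenMP-CUDA parallelism (not the paper's code). */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    /* Placeholder: in a real tri-level code this would call cudaSetDevice(dev)
     * and launch a CUDA kernel that updates sub-domain `part`. */
    static void launch_on_gpu(int dev, int part)
    {
        printf("GPU %d updating sub-domain %d\n", dev, part);
    }

    int main(int argc, char **argv)
    {
        /* Tell MPI that threads exist; only the master thread makes MPI calls
         * here, so MPI_THREAD_FUNNELED is sufficient. */
        int provided, rank, size;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const int gpus_per_node = 2;          /* illustrative GPU density */
        for (int step = 0; step < 100; step++) {
            /* Coarse grain inside the node: each OpenMP thread drives one GPU. */
            #pragma omp parallel num_threads(gpus_per_node)
            {
                int tid = omp_get_thread_num();
                launch_on_gpu(tid, rank * gpus_per_node + tid);
            }
            /* Inter-node level: exchange sub-domain boundary values via MPI
             * (placeholder; a real solver would post halo sends/receives here). */
            MPI_Barrier(MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }

The dual-level MPI-CUDA variant drops the OpenMP layer and instead runs one MPI rank per GPU, which is one way to read the paper's finding that the extra level does not by itself buy performance.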
Proceedings of the 12th International Conference on Kinanthropology
The Proceedings of the 12th Conference of Sport and Quality of Life 2019 gather the submissions of the conference participants. Every submission received a positive evaluation from reviewers in the corresponding field. The conference is divided into sections: Analysis of human movement; Sport training, nutrition and regeneration; Sport and social sciences; Active ageing and sarcopenia; Strength and conditioning training; and a section for PhD students.