Search CORE

2,352 research outputs found

Hierarchical Parallelisation of Functional Renormalisation Group Calculations -- hp-fRG

Author: Rohe Daniel
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016
Field of study

The functional renormalisation group (fRG) has evolved into a versatile tool in condensed matter theory for studying important aspects of correlated electron systems. Practical applications of the method often involve a high numerical effort, motivating the question in how far High Performance Computing (HPC) can leverage the approach. In this work we report on a multi-level parallelisation of the underlying computational machinery and show that this can speed up the code by several orders of magnitude. This in turn can extend the applicability of the method to otherwise inaccessible cases. We exploit three levels of parallelisation: Distributed computing by means of Message Passing (MPI), shared-memory computing using OpenMP, and vectorisation by means of SIMD units (single-instruction-multiple-data). Results are provided for two distinct High Performance Computing (HPC) platforms, namely the IBM-based BlueGene/Q system JUQUEEN and an Intel Sandy-Bridge-based development cluster. We discuss how certain issues and obstacles were overcome in the course of adapting the code. Most importantly, we conclude that this vast improvement can actually be accomplished by introducing only moderate changes to the code, such that this strategy may serve as a guideline for other researcher to likewise improve the efficiency of their codes

arXiv.org e-Print Archive

Juelich Shared Electronic Resources

Automatic Adaption of the Sampling Frequency for Detailed Performance Analysis

Author: Knüpfer Andreas
Wagner Michael
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/07/2017
Field of study

One of the most urgent challenges in event based performance analysis is the enormous amount of collected data. Combining event tracing and periodic sampling has been a successful approach to allow a detailed event-based recording of MPI communication and a coarse recording of the remaining application with periodic sampling. In this paper, we present a novel approach to automatically adapt the sampling frequency during runtime to the given amount of buffer space, releasing users to find an appropriate sampling frequency themselves. This way, the entire measurement can be kept within a single memory buffer, which avoids disruptive intermediate memory buffer flushes, excessive data volumes, and measurement delays due to slow file system interaction. We describe our approach to sort and store samples based on their order of occurrence in an hierarchical array based on powers of two. Furthermore, we evaluate the feasibility as well as the overhead of the approach with the prototype implementation OTFX based on the Open Trace Format 2, a state-of-the-art Open Source event trace library used by the performance analysis tools Vampir, Scalasca, and Tau.This work is supported by the Spanish Ministry of Economy and Competitiveness under contract TIN2015-65316-P.Peer ReviewedPostprint (author's final draft

Crossref

UPCommons. Portal del coneixement obert de la UPC

The Parallelism Motifs of Genomic Data Analysis

Author: Awan Muaaz
Azad Ariful
Brock Benjamin
Buluc Aydin
Egan Rob
Ekanayake Saliya
Ellis Marquita
Georganas Evangelos
Guidi Giulia
Hofmeyr Steven
Oliker Leonid
Selvitopi Oguz
Teodoropol Cristina
Yelick Katherine
Publication venue: 'The Royal Society'
Publication date: 20/01/2020
Field of study

Genomic data sets are growing dramatically as the cost of sequencing continues to decline and small sequencing devices become available. Enormous community databases store and share this data with the research community, but some of these genomic data analysis problems require large scale computational platforms to meet both the memory and computational requirements. These applications differ from scientific simulations that dominate the workload on high end parallel systems today and place different requirements on programming support, software libraries, and parallel architectural design. For example, they involve irregular communication patterns such as asynchronous updates to shared data structures. We consider several problems in high performance genomics analysis, including alignment, profiling, clustering, and assembly for both single genomes and metagenomes. We identify some of the common computational patterns or motifs that help inform parallelization strategies and compare our motifs to some of the established lists, arguing that at least two key patterns, sorting and hashing, are missing

arXiv.org e-Print Archive

eScholarship - University of California

EGI user forum 2011 : book of abstracts

Author
Publication venue
Publication date: 01/01/2011
Field of study

Hochschulschriftenserver - Universität Frankfurt am Main

Parallel-FST: A feature selection library for multicore clusters

Author: Beceiro Bieito
González-Domínguez Jorge
Touriño Juan
Publication venue: 'Elsevier BV'
Publication date: 01/11/2022
Field of study

Financiado para publicación en acceso aberto: Universidade da Coruña/CISUG[Abstract]: Feature selection is a subfield of machine learning focused on reducing the dimensionality of datasets by performing a computationally intensive process. This work presents Parallel-FST, a publicly available parallel library for feature selection that includes seven methods which follow a hybrid MPI/multithreaded approach to reduce their runtime when executed on high performance computing systems. Performance tests were carried out on a 256-core cluster, where Parallel-FST obtained speedups of up to 229x for representative datasets and it was able to analyze a 512 GB dataset, which was not previously possible with a sequential counterpart library due to memory constraints.This research was supported by the Ministry of Science and Innovation of Spain (PID2019-104184RB-I00/AEI/10.13039/ 501100011033), by the Ministry of Universities of Spain under grant FPU20/00997, and by Xunta de Galicia and FEDER funds of the EU (CITIC, Centro de Investigación de Galicia accreditation 2019-2022, ref. ED431G 2019/01; Consolidation Program of Competitive Reference Groups, ED431C 2021/30).Xunta de Galicia; ED431G 2019/01Xunta de Galicia; ED431C 2021/3

Repositorio da Universidade da Coruña