Development of modularity in the neural activity of children's brains

Author: Chen Man
Deem Michael W.
Publication venue: 'IOP Publishing'
Publication date: 28/01/2014
Field of study

We study how modularity of the human brain changes as children develop into adults. Theory suggests that modularity can enhance the response function of a networked system subject to changing external stimuli. Thus, greater cognitive performance might be achieved for more modular neural activity, and modularity would be expected to increase as children develop. The value of modularity calculated from fMRI data is observed to increase during childhood development and peak in young adulthood. Head motion is deconvolved from the fMRI data, and it is shown that the dependence of modularity on age is independent of the magnitude of head motion. A model is presented to illustrate how modularity can provide greater cognitive performance at short times, i.e., task switching. A fitness function is extracted from the model. Quasispecies theory is used to predict how the average modularity evolves with age, illustrating the increase of modularity during development from children to adults that arises from selection for rapid cognitive function in young adults. Experiments exploring the effect of modularity on cognitive performance are suggested. Modularity may be a potential biomarker for injury, rehabilitation, or disease. Comment: 29 pages, 11 figures
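As a rough illustration of the quantity discussed in this abstract, the sketch below computes a modularity score for a small network. It assumes the standard Newman definition, Q = (1/2m) Σ_ij [A_ij − k_i k_j / 2m] δ(c_i, c_j); the abstract does not state which definition or which module-detection method the authors applied to the fMRI data, so the function name `newman_modularity` and the toy graph are hypothetical.

```python
def newman_modularity(adj, labels):
    """Newman modularity Q of an undirected graph for a given partition.

    adj    : symmetric adjacency matrix as a list of lists (weights allowed)
    labels : module label of each node, in the same order as the rows of adj
    """
    n = len(adj)
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)  # twice the total edge weight for a symmetric matrix
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # observed weight minus the weight expected under random wiring
                q += adj[i][j] - degrees[i] * degrees[j] / two_m
    return q / two_m


# Toy network (an assumption for illustration): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0.0] * 6 for _ in range(6)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1.0

q_two_modules = newman_modularity(adj, [0, 0, 0, 1, 1, 1])
q_one_module = newman_modularity(adj, [0, 0, 0, 0, 0, 0])
```

Splitting the toy graph into its two triangles gives a higher score than treating it as a single module, which is the sense in which a network is called "more modular".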

arXiv.org e-Print Archive

Elsevier - Publisher Connector

DSpace at Rice University

Empowering parallel computing with field programmable gate arrays

Author: D'Hollander Erik
Publication venue: 'IOS Press'
Publication date: 01/01/2020
Field of study

After more than 30 years, reconfigurable computing has grown from a concept to a mature field of science and technology. The cornerstone of this evolution is the field programmable gate array, a building block enabling the configuration of a custom hardware architecture. The departure from static von Neumann-like architectures opens the way to eliminate the instruction overhead and to optimize the execution speed and power consumption. FPGAs now live in a growing ecosystem of development tools, enabling software programmers to map algorithms directly onto hardware. Applications abound in many directions, including data centers, IoT, AI, image processing and space exploration. The increasing success of FPGAs is largely due to an improved toolchain with solid high-level synthesis support as well as a better integration with processor and memory systems. On the other hand, long compile times and complex design exploration remain areas for improvement. In this paper we address the evolution of FPGAs towards advanced multi-functional accelerators, discuss different programming models and their HLS language implementations, as well as high-performance tuning of FPGAs integrated into a heterogeneous platform. We pinpoint fallacies and pitfalls, and identify opportunities for language enhancements and architectural refinements.

Ghent University Academic Bibliography

Parallelizing the QUDA Library for Multi-GPU Calculations in Lattice Quantum Chromodynamics

Author: Babich Ronald
Clark Michael A.
Joó Bálint
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010
Field of study

Graphics Processing Units (GPUs) are having a transformational effect on numerical lattice quantum chromodynamics (LQCD) calculations of importance in nuclear and particle physics. The QUDA library provides a package of mixed precision sparse matrix linear solvers for LQCD applications, supporting single GPUs based on NVIDIA's Compute Unified Device Architecture (CUDA). This library, interfaced to the QDP++/Chroma framework for LQCD calculations, is currently in production use on the "9g" cluster at the Jefferson Laboratory, enabling unprecedented price/performance for a range of problems in LQCD. Nevertheless, memory constraints on current GPU devices limit the problem sizes that can be tackled. In this contribution we describe the parallelization of the QUDA library onto multiple GPUs using MPI, including strategies for the overlapping of communication and computation. We report on both weak and strong scaling for up to 32 GPUs interconnected by InfiniBand, on which we sustain in excess of 4 Tflops. Comment: 11 pages, 7 figures, to appear in the Proceedings of Supercomputing 2010 (submitted April 12, 2010)
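For readers unfamiliar with the two scaling regimes named in this abstract, the sketch below shows how parallel efficiency is conventionally computed in each case. The function names and the timings are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def strong_scaling_efficiency(t_base, n_base, t_n, n):
    """Efficiency when the total problem size is fixed.

    Ideally the run time shrinks in proportion to n_base / n, so the
    efficiency is the ideal time divided by the measured time.
    """
    return (t_base * n_base) / (t_n * n)


def weak_scaling_efficiency(t_base, t_n):
    """Efficiency when the problem size per device is fixed.

    Ideally the run time stays constant as devices are added.
    """
    return t_base / t_n


# Hypothetical timings in seconds (assumptions, not measurements).
strong = strong_scaling_efficiency(t_base=100.0, n_base=1, t_n=4.0, n=32)
weak = weak_scaling_efficiency(t_base=10.0, t_n=12.5)
```

With these made-up numbers, a 32-device run that takes 4 s against a 100 s single-device baseline has a strong-scaling efficiency of about 0.78, and a weak-scaling run that slows from 10 s to 12.5 s has an efficiency of 0.8.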

arXiv.org e-Print Archive

CiteSeerX

Acceleration of Coarse Grain Molecular Dynamics on GPU Architectures

Author: Anderson
Bauer
Berendsen
Brown
Colberg
Dullweber
Friedrichs
Ganesan
Gay
Harvey
Högberg
Liu
MacCallum
Mourtisen
Müller
Nguyen
Orsi
Orsi
Phillips
Plimpton
Rapaport
Schmid
Stone
Sunarso
van Meel
Wang
Wohlert
Zhmurov
Publication venue: 'John Wiley & Sons Limited'
Publication date: 01/01/2013
Field of study

Coarse grain (CG) molecular models have been proposed to simulate complex systems with lower computational overheads and longer timescales with respect to atomistic level models. However, their acceleration on parallel architectures such as Graphics Processing Units (GPUs) presents original challenges that must be carefully evaluated. The objective of this work is to characterize the impact of CG model features on parallel simulation performance. To achieve this, we implemented a GPU-accelerated version of a CG molecular dynamics simulator, to which we applied specific optimizations for CG models, such as dedicated data structures to handle different bead type interactions, obtaining a maximum speed-up of 14 on the NVIDIA GTX480 GPU with Fermi architecture. We provide a complete characterization and evaluation of algorithmic and simulated system features of CG models impacting the achievable speed-up and accuracy of results, using three different GPU architectures as case studies.

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

PORTO Publications Open Repository TOrino

Large-scale linear regression: Development of high-performance routines

Author: Bientinesi Paolo
Fabregat-Traver Diego
Frank Alvaro
Publication venue
Publication date: 01/01/2015
Field of study

In statistics, series of ordinary least squares problems (OLS) are used to study the linear correlation among sets of variables of interest; in many studies, the number of such variables is at least in the millions, and the corresponding datasets occupy terabytes of disk space. As the availability of large-scale datasets increases regularly, so does the challenge in dealing with them. Indeed, traditional solvers, which rely on the use of "black-box" routines optimized for one single OLS, are highly inefficient and fail to provide a viable solution for big-data analyses. As a case study, in this paper we consider a linear regression consisting of two-dimensional grids of related OLS problems that arise in the context of genome-wide association analyses, and give a careful walkthrough for the development of ols-grid, a high-performance routine for shared-memory architectures; analogous steps are relevant for tailoring OLS solvers to other applications. In particular, we first illustrate the design of efficient algorithms that exploit the structure of the OLS problems and eliminate redundant computations; then, we show how to effectively deal with datasets that do not fit in main memory; finally, we discuss how to cast the computation in terms of efficient kernels and how to achieve scalability. Importantly, each design decision along the way is justified by simple performance models. ols-grid enables the solution of $10^{11}$ correlated OLS problems operating on terabytes of data in a matter of hours.
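The idea of eliminating redundant computations across related OLS problems can be illustrated with a deliberately simple case. The sketch below is not the ols-grid algorithm; it assumes a toy setting in which many response vectors share one regressor, so that the quantities depending only on the regressor are computed once and reused. The function name `fit_many_responses` and the data are hypothetical.

```python
def fit_many_responses(x, responses):
    """Fit y = intercept + slope * x for every y in `responses`.

    All problems share the regressor x, so its mean, centered values, and
    sum of squares are computed once instead of once per problem.
    """
    n = len(x)
    x_mean = sum(x) / n
    x_centered = [xi - x_mean for xi in x]       # shared across problems
    sxx = sum(xc * xc for xc in x_centered)      # shared across problems

    fits = []
    for y in responses:
        y_mean = sum(y) / n
        # Because x_centered sums to zero, y does not need to be centered.
        sxy = sum(xc * yi for xc, yi in zip(x_centered, y))
        slope = sxy / sxx
        fits.append((y_mean - slope * x_mean, slope))
    return fits


# Toy data (assumptions for illustration): two responses on the same regressor.
x = [0.0, 1.0, 2.0, 3.0]
responses = [
    [1.0, 3.0, 5.0, 7.0],   # lies exactly on 1 + 2x
    [4.0, 3.0, 2.0, 1.0],   # lies exactly on 4 - x
]
fits = fit_many_responses(x, responses)
```

A black-box solver called once per response would recompute the regressor statistics every time; hoisting them out of the loop is the smallest instance of the structure-exploiting approach the abstract describes.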

arXiv.org e-Print Archive

Publikationsserver der RWTH Aachen University

Structural dynamics branch research and accomplishments for fiscal year 1987

Author
Publication venue
Publication date
Field of study

This publication contains a collection of fiscal year 1987 research highlights from the Structural Dynamics Branch at NASA Lewis Research Center. Highlights from the branch's four major work areas, Aeroelasticity, Vibration Control, Dynamic Systems, and Computational Structural Methods, are included in the report as well as a complete listing of the FY87 branch publications.

NASA Technical Reports Server

Quantum Monte Carlo for large chemical systems: Implementing efficient strategies for petascale platforms and beyond

Author: Caffarel Michel
Jalby William
Oseret Emmanuel
Scemama Anthony
Publication venue: 'Wiley'
Publication date: 01/10/2012
Field of study

Various strategies to efficiently implement QMC simulations for large chemical systems are presented. These include: (i) the introduction of an efficient algorithm to calculate the computationally expensive Slater matrices, a novel scheme based on the highly localized character of atomic Gaussian basis functions (not the molecular orbitals, as is usually done); (ii) the possibility of keeping the memory footprint minimal; (iii) the important enhancement of single-core performance when efficient optimization tools are employed; and (iv) the definition of a universal, dynamic, fault-tolerant, and load-balanced computational framework adapted to all kinds of computational platforms (massively parallel machines, clusters, or distributed grids). These strategies have been implemented in the QMC=Chem code developed at Toulouse and illustrated with numerical applications on small peptides of increasing sizes (158, 434, 1056 and 1731 electrons). Using 10k-80k computing cores of the Curie machine (GENCI-TGCC-CEA, France), QMC=Chem has been shown to be capable of running at the petascale level, thus demonstrating that for this machine a large part of the peak performance can be achieved. Implementation of large-scale QMC simulations for future exascale platforms with a comparable level of efficiency is expected to be feasible.

arXiv.org e-Print Archive

HAL-INSA Toulouse

HAL UVSQ

A review of High Performance Computing foundations for scientists

Author: Cramer C. J.
Dongarra J.
Dror R. O.
Goldberg D.
Hager G.
Haoqiang J.
Hennessy J. L.
Marx D.
Moore G.
Ibáñez Pablo E.
García-Risueño Pablo
Wilkinson B.
Woo D. H.
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 23/05/2012
Field of study

The increase of existing computational capabilities has made simulation emerge as a third discipline of Science, lying midway between experimental and purely theoretical branches [1, 2]. Simulation enables the evaluation of quantities which otherwise would not be accessible, helps to improve experiments and provides new insights on systems which are analysed [3-6]. Knowing the fundamentals of computation can be very useful for scientists, for it can help them to improve the performance of their theoretical models and simulations. This review includes some technical essentials that can be useful to this end, and it is devised as a complement for researchers whose education is focused on scientific issues and not on technological aspects. In this document we attempt to discuss the fundamentals of High Performance Computing (HPC) [7] in a way which is easy to understand without much previous background. We sketch the way standard computers and supercomputers work, and discuss distributed computing as well as essential aspects to take into account when running scientific calculations on computers. Comment: 33 pages

arXiv.org e-Print Archive

Crossref