Search CORE

1,281 research outputs found

Automatic CFD input decomposition for massively parallel systems

Author: Salminen Esa
Publication venue: Aalto-yliopisto
Publication date: 01/01/1995

MEMO No CFD/THERMO-6-95. Date: 17 December 1995. Abstract: This memorandum describes a computer program that divides CFD grids into smaller blocks and updates the boundary condition information and the flow solver control data accordingly. The program is designed to be used in conjunction with the FINFLO flow solver. Main result: Computer program divp3d

Aaltodoc Publication Archive

Recommended from our members

Algorithm Based Fault Tolerance in Massively Parallel Systems

Author: Lerner Mark D.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/1988

A complex computer system consists of billions of transistors, miles of wires, and many interactions with an unpredictable environment. Correct results must be produced despite faults that dynamically occur in some of these components. Many techniques have been developed for fault-tolerant computation. General-purpose methods are independent of the application, yet incur an overhead cost which may be unacceptable for massively parallel systems. Algorithm-specific methods, which can operate at lower cost, are a developing alternative [1, 72]. This paper first reviews the general-purpose approach and then focuses on the algorithm-specific method, with an eye toward massively parallel processors. Algorithm-based fault tolerance has the attraction of low overhead; furthermore, it addresses both the detection and the correction problems. The principle is to build low-cost checking and correcting mechanisms based exclusively on the redundancies inherent in the system
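
As a rough illustration of the algorithm-specific idea in this abstract (checking and correcting from redundancy built into the computation), the sketch below uses the well-known checksum scheme for matrix multiplication. It is not taken from the paper itself; all function names are hypothetical, and it assumes integer matrices and at most one corrupted data element.

```python
def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def with_column_checksum(a):
    """Append one extra row holding the sum of each column of `a`."""
    return a + [[sum(col) for col in zip(*a)]]

def with_row_checksum(b):
    """Append to each row of `b` one extra entry holding that row's sum."""
    return [row + [sum(row)] for row in b]

def locate_and_correct(c_full):
    """Detect and repair a single faulty data element of a checksummed product.

    `c_full` is the product of a column-checksummed and a row-checksummed
    matrix, so its last column holds row sums and its last row column sums.
    Returns the (row, col) that was repaired, or None if all checks pass.
    Assumption of this sketch: the fault is not in a checksum entry.
    """
    n, m = len(c_full) - 1, len(c_full[0]) - 1
    bad_rows = [i for i in range(n) if sum(c_full[i][:m]) != c_full[i][m]]
    bad_cols = [j for j in range(m)
                if sum(c_full[i][j] for i in range(n)) != c_full[n][j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) != 1 or len(bad_cols) != 1:
        raise ValueError("fault pattern not correctable by this sketch")
    i, j = bad_rows[0], bad_cols[0]
    # Rebuild the element from the row checksum and the other row entries.
    c_full[i][j] = c_full[i][m] - sum(c_full[i][k] for k in range(m) if k != j)
    return (i, j)
```

The intersection of the one failing row check and the one failing column check pinpoints the corrupted element, which is why the same redundancy serves both detection and correction.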

Columbia University Academic Commons

Using Naming Strategies to Make Massively Parallel Systems Work

Publication venue: 'Hindawi Limited'
Publication date: 01/01/1994

Crossref

Shared Transaction Markov Chains for Fluid Analysis of Massively Parallel Systems

Authors: Bradley JT; Hayden R
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2009

Published version

Crossref

Spiral - Imperial College Digital Repository

Software-based fault-tolerant routing algorithm in multidimensional networks

Authors: Alzeidi N.; Fathy M.; Khonsari A.; Ould-Khaoua M.; Rezazad M.; Safaei F.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

Massively parallel computing systems are being built with hundreds or thousands of components such as nodes, links, memories, and connectors. The failure of a component in such systems will not only reduce the computational power but also alter the network's topology. The software-based fault-tolerant routing algorithm is a popular routing scheme for achieving fault tolerance in networks. This algorithm was initially proposed only for two-dimensional networks (Suh et al., 2000). Since higher-dimensional networks have been widely employed in many contemporary massively parallel systems, this paper proposes an approach to extend this routing scheme to these indispensable higher-dimensional networks. Deadlock and livelock freedom and the performance of the presented algorithm have been investigated for networks with different dimensionality and various fault regions. Furthermore, performance results have been presented through simulation experiments

Crossref

Enlighten

Graph analytics on modern massively parallel systems

Author: Vella Flavio
Publication date: 13/02/2017

Graphs provide a very flexible abstraction for understanding and modeling complex systems in many fields such as physics, biology, neuroscience, engineering, and social science. Only in the last two decades, with the advent of the Big Data era, have supercomputers equipped with accelerators (i.e., Graphics Processing Units, GPUs), advanced networking, and highly parallel file systems been used to analyze graph properties such as reachability, diameter, connected components, centrality, and clustering coefficient. Today, graphs of interest may be composed of millions, sometimes billions, of nodes and edges and exhibit a highly irregular structure. As a consequence, the design of efficient and scalable graph algorithms is an extraordinary challenge due to irregular communication and memory access patterns, high synchronization costs, and lack of data locality. In the present dissertation, we start off with a brief and gentle introduction to graph analytics and massively parallel systems. In particular, we present the intersection between graph analytics and parallel architectures in the current state of the art and discuss the challenges encountered when solving such problems on large-scale graphs on these architectures (Chapter 1). In Chapter 2, some preliminary definitions and graph-theoretical notions are provided together with a description of the synthetic graphs used in the literature to model real-world networks. In Chapters 3-5, we present and tackle three different relevant problems in graph analysis: reachability (Chapter 3), Betweenness Centrality (Chapter 4), and clustering coefficient (Chapter 5).
In detail, Chapter 3 tackles reachability problems by providing two scalable algorithms and implementations which efficiently solve st-connectivity problems on very large-scale graphs. Chapter 4 considers the problem of identifying the most relevant nodes in a network, which plays a crucial role in several applications, including transportation and communication networks, social network analysis, and biological networks. In particular, we focus on a well-known centrality metric, namely Betweenness Centrality (BC), and present two different distributed algorithms for the BC computation on unweighted and weighted graphs. For unweighted graphs, we present a new communication-efficient algorithm based on the combination of bi-dimensional (2D) decomposition and multi-level parallelism. Furthermore, new algorithms which exploit the underlying graph topology to reduce the time and space usage of betweenness centrality computations are described as well. Concerning weighted graphs, we provide a scalable algorithm based on an algebraic formulation of the problem. Finally, through comprehensive experimental results on synthetic and real-world large-scale graphs, we show that the proposed techniques are effective in practice and achieve significant speedups against state-of-the-art solutions. Chapter 5 considers the clustering coefficient problem. Similarly to Betweenness Centrality, it is a fundamental tool in network analysis, as it specifically measures how nodes tend to cluster together in a network. In the chapter, we first extend caching techniques to Remote Memory Access (RMA) operations on distributed-memory systems. The caching layer is mainly designed to avoid inter-node communications in order to achieve similar benefits for irregular applications as communication-avoiding algorithms. We also show how cached RMA is able to improve the performance of a new distributed asynchronous algorithm for the computation of local clustering coefficients.
Finally, Chapter 6 contains a brief summary of the key contributions described in the dissertation and presents potential future directions of the work
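
For readers unfamiliar with the local clustering coefficient mentioned in this abstract, the following is a minimal sequential sketch using the standard definition for undirected graphs: the number of edges among a node's neighbours divided by the number of possible neighbour pairs. It is not the distributed, RMA-based algorithm of the dissertation; the function name and the adjacency-set representation are assumptions made for illustration.

```python
from itertools import combinations

def local_clustering(adj):
    """Local clustering coefficient of every node of an undirected graph.

    `adj` maps each node to the set of its neighbours (no self-loops).
    Nodes with fewer than two neighbours get a coefficient of 0.0.
    """
    coeffs = {}
    for node, nbrs in adj.items():
        degree = len(nbrs)
        if degree < 2:
            coeffs[node] = 0.0
            continue
        # Count neighbour pairs that are themselves connected (closed triangles).
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        coeffs[node] = 2.0 * links / (degree * (degree - 1))
    return coeffs

# A triangle 0-1-2 with an extra node 3 attached to node 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
coefficients = local_clustering(graph)
```

Each node only needs its neighbours' adjacency sets, which on a distributed-memory machine may live on other nodes; that access pattern is the kind of irregular remote read the abstract's caching layer targets.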

Archivio della ricerca- Università di Roma La Sapienza

Integrated AlGaAs source of highly indistinguishable and energy-time entangled photons

Authors: Autebert Claire; Bruno Natalia; Carbonell Carmen Gomez; Ducci Sara; Favero Ivan; Lemaître Aristide; Leo Giuseppe; Martin Anthony; Zbinden Hugo
Publication date: 20/07/2015

The generation of nonclassical states of light in miniature chips is a crucial step towards practical implementations of future quantum technologies. Semiconductor materials are ideal to achieve extremely compact and massively parallel systems and several platforms are currently under development. In this context, spontaneous parametric down conversion in AlGaAs devices combines the advantages of room temperature operation, possibility of electrical injection and emission in the telecom band. Here we report on a chip-based AlGaAs source, producing indistinguishable and energy-time entangled photons with a brightness of $7.2\times10^6$ pairs/s and a signal-to-noise ratio of $141\pm12$. Indistinguishability between the photons is demonstrated via a Hong-Ou-Mandel experiment with a visibility of $89\pm3\%$, while energy-time entanglement is tested via a Franson interferometer leading to a value for the Bell parameter $S=2.70\pm0.10$

arXiv.org e-Print Archive

Crossref

Archive ouverte UNIGE

Achieving Efficient Strong Scaling with PETSc using Hybrid MPI/OpenMP Optimisation

Authors: G. Goumas; G. Schubert; G. Wellein; M. Butler; M.D. Piggott; N. Bell; P. Balaji; S. Williams
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

The increasing number of processing elements and decreasing memory-to-core ratio in modern high-performance platforms make efficient strong scaling a key requirement for numerical algorithms. In order to achieve efficient scalability on massively parallel systems, scientific software must evolve across the entire stack to exploit the multiple levels of parallelism exposed in modern architectures. In this paper we demonstrate the use of hybrid MPI/OpenMP parallelisation to optimise parallel sparse matrix-vector multiplication in PETSc, a widely used scientific library for the scalable solution of partial differential equations. Using large matrices generated by Fluidity, an open source CFD application code which uses PETSc as its linear solver engine, we evaluate the effect of explicit communication overlap using task-based parallelism and show how to further improve performance by explicitly load balancing threads within MPI processes. We demonstrate a significant speedup over the pure-MPI mode and efficient strong scaling of sparse matrix-vector multiplication on Fujitsu PRIMEHPC FX10 and Cray XE6 systems
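
To make the two ideas in this abstract concrete, here is a small sequential sketch of sparse matrix-vector multiplication in compressed sparse row (CSR) form, together with one plausible way to balance rows across threads by nonzero count rather than by row count. It is not PETSc's implementation or API; the function names and the splitting heuristic are assumptions chosen for illustration.

```python
from bisect import bisect_left

def csr_matvec(indptr, indices, data, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    Row r owns the entries data[indptr[r]:indptr[r + 1]], whose column
    positions are given by the matching slice of `indices`.
    """
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y

def split_rows_by_nnz(indptr, parts):
    """Split rows into `parts` contiguous chunks with roughly equal nonzeros.

    Returns a list of (first_row, end_row) half-open ranges. Cutting on the
    cumulative nonzero counts in `indptr` gives each worker similar work
    even when a few rows are much denser than the rest.
    """
    n_rows = len(indptr) - 1
    total = indptr[-1]
    bounds = [0]
    for p in range(1, parts):
        target = total * p / parts
        # First row boundary whose cumulative nonzero count reaches the target.
        bounds.append(bisect_left(indptr, target, bounds[-1], n_rows))
    bounds.append(n_rows)
    return list(zip(bounds[:-1], bounds[1:]))
```

For a matrix whose first row holds four of its seven nonzeros, splitting by row count would give the two workers five and two nonzeros, whereas `split_rows_by_nnz` assigns four and three.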

arXiv.org e-Print Archive

CiteSeerX

Crossref

Spiral - Imperial College Digital Repository