A Case Study in Coordination Programming: Performance Evaluation of S-Net vs Intel's Concurrent Collections

Author: Gijsbers Bert
Grelck Clemens
Shafarenko Alex
Tveretina Olga
Zaichenkov Pavel
Publication date: 01/01/2014

We present a programming methodology and runtime performance case study comparing the declarative data flow coordination language S-Net with Intel's Concurrent Collections (CnC). As a coordination language, S-Net achieves a near-complete separation of concerns between sequential software components implemented in a separate algorithmic language and their parallel orchestration in an asynchronous data flow streaming network. We investigate the merits of S-Net and CnC with the help of a relevant and non-trivial linear algebra problem: tiled Cholesky decomposition. We describe two alternative S-Net implementations of tiled Cholesky factorization and compare them with two CnC implementations, one with explicit performance tuning and one without, that have previously been used to illustrate Intel CnC. Our experiments on a 48-core machine demonstrate that S-Net manages to outperform CnC on this problem. Comment: 9 pages, 8 figures, 1 table, accepted for PLC 2014 worksho
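For readers unfamiliar with the benchmark problem named in this abstract, the following is a minimal sequential sketch of tiled Cholesky factorization. It is not the S-Net or CnC code from the paper. The function name `tiled_cholesky`, the list-of-rows matrix layout, and the requirement that the matrix size be a multiple of the tile size are all assumptions made for illustration.

```python
import math


def tiled_cholesky(a, tile):
    """Return lower-triangular L (a list of rows) such that L * L^T == a.

    Hypothetical helper: `a` is assumed symmetric positive definite and its
    size is assumed to be a multiple of `tile`.
    """
    n = len(a)
    if n % tile != 0:
        raise ValueError("matrix size must be a multiple of the tile size")
    w = [[float(x) for x in row] for row in a]  # working copy, input untouched
    tiles = [range(s, s + tile) for s in range(0, n, tile)]

    for k, tk in enumerate(tiles):
        k0 = tk.start
        # Step 1: factor the diagonal tile in place (unblocked Cholesky).
        for i in tk:
            for j in range(k0, i + 1):
                s = w[i][j] - sum(w[i][p] * w[j][p] for p in range(k0, j))
                w[i][j] = math.sqrt(s) if i == j else s / w[j][j]
        # Step 2: triangular solve for every tile below the diagonal tile.
        for ti in tiles[k + 1:]:
            for r in ti:
                for c in tk:
                    s = w[r][c] - sum(w[r][p] * w[c][p] for p in range(k0, c))
                    w[r][c] = s / w[c][c]
        # Step 3: update the trailing tiles on or below the block diagonal.
        for i, ti in enumerate(tiles[k + 1:], start=k + 1):
            for tj in tiles[k + 1:i + 1]:
                for r in ti:
                    for c in tj:
                        w[r][c] -= sum(w[r][p] * w[c][p] for p in tk)

    # Keep only the lower triangle; entries above the diagonal are scratch.
    return [[w[r][c] if c <= r else 0.0 for c in range(n)] for r in range(n)]
```

Within one iteration of the outer loop, the per-tile solves in step 2 are independent of each other, as are the per-tile updates in step 3. Independent per-tile tasks of this kind are what a coordination layer could schedule in parallel, whereas this sketch runs them one after another.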

arXiv.org e-Print Archive

Crossref

Ghent University Academic Bibliography

International Migration, Integration and Social Cohesion online publications

Algorithms for Large-scale Whole Genome Association Analysis

Author: Aulchenko Yurii
Bientinesi Paolo
Fabregat Diego
Peise Elmar
Publication date: 01/01/2013

In order to associate complex traits with genetic polymorphisms, genome-wide association studies process huge datasets involving tens of thousands of individuals genotyped for millions of polymorphisms. When handling these datasets, which exceed the main memory of contemporary computers, one faces two distinct challenges: 1) millions of polymorphisms come at the cost of hundreds of gigabytes of genotype data, which can only be kept in secondary storage; 2) the relatedness of the test population is represented by a covariance matrix, which, for large populations, can only fit in the combined main memory of a distributed architecture. In this paper, we present solutions for both challenges: the genotype data is streamed from and to secondary storage using a double buffering technique, while the covariance matrix is kept across the main memory of a distributed memory system. We show that these methods sustain high performance and allow the analysis of enormous datase
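The double buffering idea mentioned in this abstract can be sketched as follows. This is a generic illustration using Python threads, not the authors' implementation, and it covers only the read direction. The name `process_stream`, the `block_size` parameter, and the `process` callback are hypothetical.

```python
import io
import queue
import threading


def process_stream(stream, block_size, process):
    """Apply `process` to successive blocks of a binary stream.

    Exactly two buffers exist: while the caller processes one block, a
    background thread fills the other from `stream`.
    """
    free = queue.Queue()    # buffers ready to be filled
    filled = queue.Queue()  # (buffer, length) pairs ready to be processed
    for _ in range(2):
        free.put(bytearray(block_size))

    def reader():
        while True:
            buf = free.get()          # wait until a buffer is available
            n = stream.readinto(buf)  # load the next block
            if not n:                 # 0 bytes read means end of stream
                filled.put(None)
                return
            filled.put((buf, n))

    worker = threading.Thread(target=reader, daemon=True)
    worker.start()

    results = []
    while True:
        item = filled.get()
        if item is None:
            break
        buf, n = item
        # `process` must not keep the view: the buffer is reused afterwards.
        results.append(process(memoryview(buf)[:n]))
        free.put(buf)                 # hand the buffer back to the reader
    worker.join()
    return results


# Usage with an in-memory stream standing in for secondary storage.
block_sums = process_stream(io.BytesIO(bytes(range(10))), 4, sum)
```

Using two queues of pre-allocated buffers, rather than one bounded queue of freshly read blocks, keeps the number of blocks in memory at exactly two, which is what matters when a single block is large.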

arXiv.org e-Print Archive

Crossref

Publikationsserver der RWTH Aachen University

Distributed Bayesian Probabilistic Matrix Factorization

Author: Aa Tom Vander
Chakroun Imen
Haber Tom
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/05/2017

Matrix factorization is a common machine learning technique for recommender systems. Despite its high prediction accuracy, the Bayesian Probabilistic Matrix Factorization algorithm (BPMF) has not been widely used on large-scale data because of its high computational cost. In this paper we propose a distributed high-performance parallel implementation of BPMF on shared memory and distributed architectures. We show that, by using efficient load balancing with work stealing on a single node and asynchronous communication in the distributed version, we beat state-of-the-art implementations

arXiv.org e-Print Archive

Crossref

High Performance Solutions for Big-data GWAS

Author: Bientinesi Paolo
Fabregat-Traver Diego
Peise Elmar
Publication date: 01/01/2014

In order to associate complex traits with genetic polymorphisms, genome-wide association studies process huge datasets involving tens of thousands of individuals genotyped for millions of polymorphisms. When handling these datasets, which exceed the main memory of contemporary computers, one faces two distinct challenges: 1) millions of polymorphisms and thousands of phenotypes come at the cost of hundreds of gigabytes of data, which can only be kept in secondary storage; 2) the relatedness of the test population is represented by a relationship matrix, which, for large populations, can only fit in the combined main memory of a distributed architecture. In this paper, by using distributed resources such as Cloud or clusters, we address both challenges: the genotype and phenotype data is streamed from secondary storage using a double buffering technique, while the relationship matrix is kept across the main memory of a distributed memory system. With the help of these solutions, we develop separate algorithms for studies involving only one or a multitude of traits. We show that these algorithms sustain high performance and allow the analysis of enormous datasets. Comment: Submitted to Parallel Computing. arXiv admin note: substantial text overlap with arXiv:1304.227

arXiv.org e-Print Archive

Publikationsserver der RWTH Aachen University

Directed Transmission Method, A Fully Asynchronous approach to Solve Sparse Linear Systems in Parallel

Author: Wei Fei
Yang Huazhong
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2008

In this paper, we propose a new distributed algorithm, called the Directed Transmission Method (DTM). DTM is a fully asynchronous, continuous-time iterative algorithm for solving SPD sparse linear systems. As an architecture-aware algorithm, DTM can run freely on all kinds of heterogeneous parallel computers. We prove that DTM converges by making use of the final-value theorem of the Laplace transform. Numerical experiments show that DTM is stable and efficient. Comment: v1: poster presented in SPAA'08; v2: full paper; v3: rename EVS to GNBT; v4: reuse EVS. More info, see my web page at http://weifei00.googlepages.co

arXiv.org e-Print Archive

Crossref

Sympiler: Transforming Sparse Matrix Codes by Decoupling Symbolic Analysis

Author: Cheshmi Kazem
Dehnavi Maryam Mehri
Kamil Shoaib
Strout Michelle Mills
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 18/05/2017

Sympiler is a domain-specific code generator that optimizes sparse matrix computations by decoupling the symbolic analysis phase from the numerical manipulation stage in sparse codes. The computation patterns in sparse numerical methods are guided by the input sparsity structure and the sparse algorithm itself. In many real-world simulations, the sparsity pattern changes little or not at all. Sympiler takes advantage of these properties to symbolically analyze sparse codes at compile time and to apply inspector-guided transformations that enable low-level transformations of sparse codes. As a result, the Sympiler-generated code outperforms highly optimized matrix factorization codes from commonly used specialized libraries, obtaining average speedups over Eigen and CHOLMOD of 3.8X and 1.5X respectively. Comment: 12 page

arXiv.org e-Print Archive

Crossref

A Mobile Computing Architecture for Numerical Simulation

Author: Dumont Cyril
Mourlin Fabrice
Publication date: 01/01/2007

Parallelization of numerical code is common in the domain of numerical simulation. Defining a numerical context means configuring resources such as memory, processor load and the communication graph, all subject to one evolving factor: resource availability. One feature is often missing: adaptability. Because resource availability is not predictable, adaptability is essential. Without calling into question the existing implementations of these codes, we create an adaptive way of using them. Because execution has to be driven by the availability of the main resources, the components of a numerical computation have to react when their context changes. This paper proposes a new mobile computing architecture based on mobile agents and JavaSpace. At the end of this paper, we apply our architecture to several case studies and obtain our first results

arXiv.org e-Print Archive