11,084 research outputs found

Algorithmic patterns for $\mathcal{H}$-matrices on many-core processors

Author: Zaspel Peter
Publication date: 01/01/2017

In this work, we consider the reformulation of hierarchical ($\mathcal{H}$) matrix algorithms for many-core processors with a model implementation on graphics processing units (GPUs). $\mathcal{H}$ matrices approximate specific dense matrices, e.g., from discretized integral equations or kernel ridge regression, leading to log-linear time complexity in dense matrix-vector products. The parallelization of $\mathcal{H}$ matrix operations on many-core processors is difficult due to the complex nature of the underlying algorithms. While previous algorithmic advances for many-core hardware focused on accelerating existing $\mathcal{H}$ matrix CPU implementations by many-core processors, we here aim at relying entirely on that processor type. As our main contribution, we introduce the parallel algorithmic patterns needed to map the full $\mathcal{H}$ matrix construction and the fast matrix-vector product to many-core hardware. Here, crucial ingredients are space-filling curves, parallel tree traversal and batching of linear algebra operations. The resulting model GPU implementation hmglib is, to the best of the authors' knowledge, the first entirely GPU-based Open Source $\mathcal{H}$ matrix library of this kind. We conclude this work with an in-depth performance analysis and a comparative performance study against a standard $\mathcal{H}$ matrix library, highlighting profound speedups of our many-core parallel approach.
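One of the ingredients named in the abstract above is the space-filling curve. As a rough illustration only, the sketch below orders 2D points along a Morton (Z-order) curve by interleaving the bits of their grid coordinates. The abstract does not say which curve hmglib uses, so the choice of the Morton curve, the function names `part1by1`, `morton2d` and `morton_order`, and the assumption that points lie in the unit square are all hypothetical and not taken from the library.

```python
def part1by1(x: int) -> int:
    """Spread the lower 16 bits of x so a zero bit follows each original bit."""
    x &= 0x0000FFFF
    x = (x | (x << 8)) & 0x00FF00FF
    x = (x | (x << 4)) & 0x0F0F0F0F
    x = (x | (x << 2)) & 0x33333333
    x = (x | (x << 1)) & 0x55555555
    return x


def morton2d(ix: int, iy: int) -> int:
    """Interleave the bits of two grid indices into one Morton code."""
    return part1by1(ix) | (part1by1(iy) << 1)


def morton_order(points, bits: int = 10):
    """Return point indices sorted along the Z-order curve.

    Assumption: every point (x, y) lies in the unit square [0, 1) x [0, 1),
    and bits <= 16 so the grid indices fit the bit-spreading masks.
    """
    scale = 1 << bits

    def key(i):
        x, y = points[i]
        ix = min(int(x * scale), scale - 1)  # clamp to the last grid cell
        iy = min(int(y * scale), scale - 1)
        return morton2d(ix, iy)

    return sorted(range(len(points)), key=key)
```

Sorting by such a key places points that are close in space near each other in memory, which is why orderings of this kind are commonly used to build spatial trees with data-parallel sort primitives.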

arXiv.org e-Print Archive

edoc

Enhanced LFR-toolbox for MATLAB and LFT-based gain scheduling

Author: Hecker Simon
Magni Jean-Francois
Varga Andras
Publication date: 01/06/2004

We describe recent developments and enhancements of the LFR-Toolbox for MATLAB for building LFT-based uncertainty models and for LFT-based gain scheduling. A major development is the new LFT-object definition supporting a large class of uncertainty descriptions: continuous- and discrete-time uncertain models, regular and singular parametric expressions, and more general uncertainty blocks (nonlinear, time-varying, etc.). Associating names with uncertainty blocks has greatly improved both the reusability of generated LFT-models and the user-friendliness of manipulating LFR-descriptions. Significant enhancements of computational efficiency and numerical accuracy have been achieved by employing efficient and numerically robust Fortran implementations of order reduction tools via mex-function interfaces. The new enhancements, in conjunction with improved symbolic preprocessing, generally lead to a faster generation of LFT-models with significantly lower orders. Scheduled gains can be viewed as LFT-objects. Two techniques for designing such gains are presented. Analysis tools are also considered.

Institute of Transport Research:Publications

The Role of Representations in Executive Function: Investigating a Developmental Link between Flexibility and Abstraction.

Author: Kharitonova Maria
Munakata Yuko
Publication venue: eScholarship, University of California
Publication date: 01/01/2011

Young children often perseverate, engaging in previously correct, but no longer appropriate behaviors. One account posits that such perseveration results from the use of stimulus-specific representations of a situation, which are distinct from abstract, generalizable representations that support flexible behavior. Previous findings supported this account, demonstrating that only children who flexibly switch between rules could generalize their behavior to novel stimuli. However, this link between flexibility and generalization might reflect general cognitive abilities, or depend upon similarities across the measures or their temporal order. The current work examined these issues by testing the specificity and generality of this link. In two experiments with 3-year-old children, flexibility was measured in terms of switching between rules in a card-sorting task, while abstraction was measured in terms of selecting which stimulus did not belong in an odd-one-out task. The link between flexibility and abstraction was general across (1) abstraction dimensions similar to or different from those in the card-sorting task and (2) abstraction tasks that preceded or followed the switching task. Good performance on abstraction and flexibility measures did not extend to all cognitive tasks, including an IQ measure, and dissociated from children's ability to gaze at the correct stimulus in the odd-one-out task, suggesting that the link between flexibility and abstraction is specific to such measures, rather than reflecting general abilities that affect all tasks. We interpret these results in terms of the role that developing prefrontal cortical regions play in processes such as working memory, which can support both flexibility and abstraction

Directory of Open Access Journals

PubMed Central

Frontiers - Publisher Connector

eScholarship - University of California

Task-based adaptive multiresolution for time-space multi-scale reaction-diffusion systems on multi-core architectures

Author: Descombes Stéphane
Duarte Max
Dumont Thierry
Guillet Thomas
Louvet Violaine
Massot Marc
Publication venue: 'Cellule MathDoc/CEDRAM'
Publication date: 14/10/2016

A new solver featuring time-space adaptation and error control has been recently introduced to tackle the numerical solution of stiff reaction-diffusion systems. Based on operator splitting, finite volume adaptive multiresolution and high order time integrators with specific stability properties for each operator, this strategy yields high computational efficiency for large multidimensional computations on standard architectures such as powerful workstations. However, the data structure of the original implementation, based on trees of pointers, provides limited opportunities for efficiency enhancements, while posing serious challenges in terms of parallel programming and load balancing. The present contribution proposes a new implementation of the whole set of numerical methods including Radau5 and ROCK4, relying on a fully different data structure together with the use of a specific library, TBB, for shared-memory, task-based parallelism with work-stealing. The performance of our implementation is assessed in a series of test-cases of increasing difficulty in two and three dimensions on multi-core and many-core architectures, demonstrating high scalability

arXiv.org e-Print Archive

HAL-CentraleSupelec

HAL-UJM

The SMAI journal of computational mathematics

Numérisation de Documents Anciens Mathématiques

Hal-Diderot

HAL-Polytechnique

HAL-Rennes 1

A scalable H-matrix approach for the solution of boundary integral equations on multi-GPU clusters

Author: Harbrecht Helmut
Zaspel Peter
Publication date: 01/06/2018

In this work, we consider the solution of boundary integral equations by means of a scalable hierarchical matrix approach on clusters equipped with graphics hardware, i.e. graphics processing units (GPUs). To this end, we extend our existing single-GPU hierarchical matrix library hmglib such that it is able to scale on many GPUs and such that it can be coupled to arbitrary application codes. Using a model GPU implementation of a boundary element method (BEM) solver, we are able to achieve more than 67 percent relative parallel speed-up going from 128 to 1024 GPUs for a model geometry test case with 1.5 million unknowns and a real-world geometry test case with almost 1.2 million unknowns. On 1024 GPUs of the cluster Titan, it takes less than 6 minutes to solve the 1.5 million unknowns problem, with 5.7 minutes for the setup phase and 20 seconds for the iterative solver. To the best of the authors' knowledge, we here discuss the first fully GPU-based distributed-memory parallel hierarchical matrix Open Source library using the traditional H-matrix format and adaptive cross approximation with an application to BEM problems

arXiv.org e-Print Archive

edoc

A permanent formula for the Jones polynomial

Author: Loebl Martin
Moffatt Iain
Publication venue: 'Elsevier BV'
Publication date: 15/08/2010

The permanent of a square matrix is defined in a way similar to the determinant, but without using signs. The exact computation of the permanent is hard, but there are Monte-Carlo algorithms that can estimate general permanents. Given a planar diagram of a link L with $n$ crossings, we define a 7n by 7n matrix whose permanent equals the Jones polynomial of L. This result, together with recent work of Freedman, Kitaev, Larson and Wang, provides a Monte-Carlo algorithm for any decision problem belonging to the class BQP, i.e. such that it can be computed with bounded error in polynomial time using quantum resources. Comment: To appear in Advances in Applied Mathematics
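To make the first sentence of the abstract above concrete, the following minimal sketch computes both the permanent and the determinant directly from their permutation-sum definitions, so the only difference is the sign factor. It is a brute-force illustration with factorial running time and is unrelated to the Monte-Carlo estimators or the 7n by 7n construction the abstract mentions. The function names are chosen here for illustration.

```python
from itertools import permutations
from math import prod


def permanent(a):
    """Sum over all permutations s of prod_i a[i][s[i]], with no signs."""
    n = len(a)
    return sum(
        prod(a[i][s[i]] for i in range(n))
        for s in permutations(range(n))
    )


def determinant(a):
    """Same sum as the permanent, but each term carries the permutation's sign."""
    n = len(a)
    total = 0
    for s in permutations(range(n)):
        # The parity of the inversion count gives the sign of the permutation.
        inversions = sum(
            1 for i in range(n) for j in range(i + 1, n) if s[i] > s[j]
        )
        sign = -1 if inversions % 2 else 1
        total += sign * prod(a[i][s[i]] for i in range(n))
    return total
```

For the matrix `[[1, 2], [3, 4]]` the permanent is 1·4 + 2·3 = 10, while the determinant is 1·4 − 2·3 = −2.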

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector