Matrix-free multigrid block-preconditioners for higher order Discontinuous Galerkin discretisations
Efficient and suitably preconditioned iterative solvers for elliptic partial
differential equations (PDEs) of the convection-diffusion type are used in all
fields of science and engineering. To achieve optimal performance, solvers have
to exhibit high arithmetic intensity and need to exploit every form of
parallelism available in modern manycore CPUs. The computationally most
expensive components of the solver are the repeated applications of the linear
operator and the preconditioner. For discretisations based on higher-order
Discontinuous Galerkin methods, sum-factorisation results in a dramatic
reduction of the computational complexity of the operator application while, at
the same time, the matrix-free implementation can run at a significant fraction
of the theoretical peak floating point performance. Multigrid methods for high
order methods often rely on block-smoothers to reduce high-frequency error
components within one grid cell. Traditionally, this requires the assembly and
expensive dense matrix solve in each grid cell, which counteracts any
improvements achieved in the fast matrix-free operator application. To overcome
this issue, we present a new matrix-free implementation of block-smoothers.
Inverting the block matrices iteratively avoids storage and factorisation of
the matrix and makes it possible to harness the full power of the CPU. We
implemented a hybrid multigrid algorithm with matrix-free block-smoothers in
the high order DG space combined with a low order coarse grid correction using
algebraic multigrid where only low order components are explicitly assembled.
The effectiveness of this approach is demonstrated by solving a set of
representative elliptic PDEs of increasing complexity, including a
convection-dominated problem and the stationary SPE10 benchmark.
Comment: 28 pages, 10 figures, 10 tables; accepted for publication in Journal
of Computational Physics
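To illustrate the idea of inverting a block iteratively instead of assembling and factorising it, here is a minimal sketch in plain Python. It is not the implementation from the paper: the operator `apply_block` (a 1D stencil with 2 on the diagonal and -1 next to it), the helper `cg_solve`, and the block size of 8 are assumptions chosen only to keep the example small. The point it shows is that the solver only ever calls the operator on a vector and never stores a matrix.

```python
def apply_block(v):
    """Apply a symmetric positive definite stencil (2 on the diagonal,
    -1 on the off-diagonals) to v without storing any matrix.
    Hypothetical stand-in for a matrix-free block operator."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * v[i] - left - right)
    return out


def dot(a, b):
    """Euclidean inner product of two equally long lists."""
    return sum(x * y for x, y in zip(a, b))


def cg_solve(apply_op, b, tol=1e-10, max_iter=200):
    """Conjugate gradient solve of apply_op(x) = b, starting from x = 0.
    Only operator applications are needed, so nothing is assembled."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for x = 0
    p = list(r)          # first search direction
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr <= tol * tol:
            break
        ap = apply_op(p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


b = [1.0] * 8
x = cg_solve(apply_block, b)
residual = max(abs(bi - yi) for bi, yi in zip(b, apply_block(x)))
```

In a block smoother, a solve of this kind would be run once per grid cell, with `apply_op` replaced by the cell-local operator. A fixed, small iteration count or a loose tolerance is often enough for smoothing purposes, although the right choice depends on the problem.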
A Parallel Direct Method for Finite Element Electromagnetic Computations Based on Domain Decomposition
High-performance parallel computing and direct (factorization-based) solution methods have been the two main trends in electromagnetic computations in recent years. When the time-harmonic (frequency-domain) Maxwell's equations are directly discretized with the Finite Element Method (FEM) or other Partial Differential Equation (PDE) methods, the resulting linear system of equations is sparse and indefinite, and thus harder to factorize efficiently, serially or in parallel, than the dense linear systems produced by alternative methods such as integral equation solutions. State-of-the-art sparse direct solvers such as MUMPS and PARDISO do not scale favorably, have low parallel efficiency, and have a high memory footprint. This work introduces a new class of sparse direct solvers based on the domain decomposition method, termed Direct Domain Decomposition Method (D3M), which is reliable, memory efficient, and offers very good parallel scalability for arbitrary 3D FEM problems.
Unlike recent trends in approximate/low-rank solvers, this method focuses on 'numerically exact' solution methods, as they are more reliable for complex 'real-life' models. The proposed method leverages physical insights at every stage of the development through a new symmetric domain decomposition method (DDM) with one set of Lagrange multipliers. A special regularization scheme applied at the interfaces introduces either artificial loss or gain to each domain to eliminate non-physical internal resonances. A block-wise recursive algorithm based on the Takahashi relationship is proposed for the efficient computation of the discrete Dirichlet-to-Neumann (DtN) map, which reduces the volumetric problem from all domains to an auxiliary surface problem defined on the domain interfaces only. Numerical results show up to 50% run-time savings in the DtN map computation using the proposed block-wise recursive algorithm compared to alternative approaches. The auxiliary unknowns on the domain interfaces form a considerably (approximately an order of magnitude) smaller block-wise sparse matrix, which is efficiently factorized using a customized block LDLT factorization with restricted pivoting to ensure stability.
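The reduction of a domain's volumetric unknowns to its interface can be pictured as a Schur complement, S = A_bb - A_bi A_ii^{-1} A_ib, where i denotes interior and b denotes interface unknowns. The sketch below computes S with a plain dense solve for a three-node toy matrix. It is not the block-wise recursive Takahashi-based algorithm described in the abstract; the helper names `solve` and `schur_complement` and the toy matrix are assumptions made for illustration only.

```python
def solve(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting.
    Intended for small dense systems only."""
    n = len(a)
    m = [row[:] + [rhs[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def schur_complement(a, interior, boundary):
    """Return S = A_bb - A_bi A_ii^{-1} A_ib for the given index sets."""
    a_ii = [[a[i][j] for j in interior] for i in interior]
    s = [[a[p][q] for q in boundary] for p in boundary]
    for col, q in enumerate(boundary):
        # y = A_ii^{-1} times the column of A_ib that belongs to q
        y = solve(a_ii, [a[i][q] for i in interior])
        for row, p in enumerate(boundary):
            s[row][col] -= sum(a[p][i] * y[k] for k, i in enumerate(interior))
    return s


# Toy chain of three nodes: node 1 is interior, nodes 0 and 2 are interface.
a = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
s = schur_complement(a, interior=[1], boundary=[0, 2])
```

Because each domain's reduction touches only that domain's own matrix, the calls for different domains are independent of one another.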
The parallelization of the proposed D3M is based on a directed acyclic graph (DAG). Recent advances in parallel dense direct solvers have shifted toward parallel implementations that rely on DAG scheduling to achieve highly efficient asynchronous parallel execution. However, adapting such schemes to sparse matrices is harder and often impractical. In D3M, the computation of each domain's discrete DtN map is 'embarrassingly parallel', whereas the customized block LDLT is suitable for block directed acyclic graph (B-DAG) task scheduling, similar to that used in dense matrix parallel direct solvers. In this approach, computations are represented as a sequence of small tasks that operate on domains of the DDM or on dense matrix blocks of the reduced matrix. These tasks can be statically scheduled for parallel execution using their DAG dependencies and weights that depend on estimates of computation and communication costs.
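A rough picture of dependency-driven static scheduling can be given with Python's standard `graphlib` module. The task names and the dependency graph below are hypothetical, and the cost weights mentioned in the abstract are left out: the sketch only groups tasks into "waves" whose members have no unmet dependencies and could therefore run concurrently.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each key maps to the set of tasks it depends on.
# The per-domain DtN tasks have no dependencies; the factorization tasks
# on the reduced matrix depend on earlier results.
deps = {
    "dtn_0": set(),
    "dtn_1": set(),
    "dtn_2": set(),
    "factor_00": {"dtn_0", "dtn_1"},
    "update_10": {"factor_00", "dtn_2"},
    "factor_11": {"update_10"},
}


def schedule_in_waves(graph):
    """Group tasks into successive waves; every task in a wave has all of
    its dependencies satisfied by earlier waves."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = schedule_in_waves(deps)
```

A scheduler that also accounts for estimated computation and communication costs would additionally order or assign the tasks inside each wave, which this sketch does not attempt.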
Comparisons with state-of-the-art exact direct solvers on electrically large problems suggest up to 20% better parallel efficiency, 30% to 3X less memory use, and slightly faster runtime, while maintaining the same accuracy.
ASIP Design and Prototyping for Wireless Communication Applications
Perceptual Conformity in Facial Emotion Processing
The ability to recognize and respond quickly to visual signals of threat is critical for survival. Threatening faces are hypothesized to capture visual attention more rapidly than nonthreatening faces. This experiment tested the perceptual conformity hypothesis, which predicts that attention differences elicited by threatening vs. nonthreatening faces depend on whether the inner facial features follow the curvature of the outer facial surround.
In a pre-experimental study, 38 participants rated the affect of stimuli with and without a facial surround. These ratings determined the stimuli for an experimental flankers task, which was completed by 35 different participants. Flanker displays included compatible and incompatible trials, in which flanker stimuli, if responded to, would or would not have the same response as the centrally-located targets.
The flankers experiment examined a) whether emotionally neutral surround-present and surround-absent stimuli, containing conforming and nonconforming inner lines, generated the flanker-effect asymmetries that have been reported for angry vs. happy faces; and b) whether incompatible flankers with nonconforming inner lines would generate more response interference than those with conforming inner lines, in both surround conditions.
No flanker-effect asymmetry or difference in response interference was obtained for either surround condition. For surround-present trials, reaction times were significantly faster to targets with conforming inner lines than to those with nonconforming inner lines, and to compatible as opposed to incompatible trials. For surround-absent trials, participants responded faster to compatible trials, and there were no reaction time differences between targets with conforming and nonconforming inner lines.
The results are not consistent with the perceptual conformity hypothesis. One potential reason is that perceptual conformity may not account for the reported attention distribution differences to threatening vs. nonthreatening faces. Some other perceptual feature may explain previously documented flanker-effect asymmetries, or facial affect may override perceptual contributions to these asymmetries. Such interpretations are clouded, however, by the inconclusive and potentially confounded extant literature and the scant evidence for the flanker-effect asymmetry based on facial threat. Assuming the validity of the reported attention differences, future research is needed to elucidate the attributes that consistently elicit such differences for targets that convey specific categories of emotion.