An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling

Author: Ghysels Pieter
Li Xiaoye S.
Napov Artem
Rouet Francois-Henry
Williams Samuel
Publication date: 25/02/2015

We present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups of up to 7-fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared-memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK (STRUctured Matrices PACKage), which also has a distributed-memory component for dense rank-structured matrices.
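The idea of compressing a low-rank off-diagonal block by randomized sampling can be illustrated with a minimal sketch. This is not STRUMPACK's code and does not use interpolative decompositions as the abstract describes; it substitutes a plain QR-based randomized range finder, and the function name, the test kernel, and the oversampling parameter are all assumptions made for illustration.

```python
import numpy as np

def randomized_low_rank(block, rank, oversample=10, seed=0):
    """Return (q, b) with block ~= q @ b, built from a random sketch of the range.

    Hypothetical helper: q has rank + oversample orthonormal columns.
    """
    rng = np.random.default_rng(seed)
    n_cols = block.shape[1]
    # Sample the column space of the block with a Gaussian test matrix.
    omega = rng.standard_normal((n_cols, rank + oversample))
    sample = block @ omega
    # Orthonormal basis for the sampled range (reduced QR).
    q, _ = np.linalg.qr(sample)
    # Project the block onto that basis.
    b = q.T @ block
    return q, b

# Assumed test case: a smooth kernel evaluated on two well-separated point
# sets, which mimics an off-diagonal block with rapidly decaying singular values.
x = np.linspace(0.0, 1.0, 200)
y = np.linspace(2.0, 3.0, 200)
off_diag = 1.0 / (1.0 + np.abs(x[:, None] - y[None, :]))

q, b = randomized_low_rank(off_diag, rank=8)
rel_err = np.linalg.norm(off_diag - q @ b) / np.linalg.norm(off_diag)
```

The sketch only touches the block through matrix products, which is the property that makes sampling-based compression attractive when the block is available as an operator rather than as stored entries.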

arXiv.org e-Print Archive

eScholarship - University of California

DI-fusion

A recursive-faulting model of distributed damage in confined brittle materials

Author: Conti S.
Ortiz M.
Pandolfi A.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2006

We develop a model of distributed damage in brittle materials deforming in triaxial compression based on the explicit construction of special microstructures obtained by recursive faulting. The model aims to predict the effective or macroscopic behavior of the material from its elastic and fracture properties, and to predict the microstructures underlying the microscopic behavior. The model accounts for the elasticity of the matrix, fault nucleation, and the cohesive and frictional behavior of the faults. We analyze the resulting quasistatic boundary value problem and determine the relaxation of the potential energy, which describes the macroscopic material behavior averaged over all possible fine-scale structures. Finally, we present numerical calculations of the dynamic multi-axial compression experiments on sintered aluminum nitride of Chen and Ravichandran [1994. Dynamic compressive behavior of ceramics under lateral confinement. J. Phys. IV 4, 177–182; 1996a. Static and dynamic compressive behavior of aluminum nitride under moderate confinement. J. Am. Soc. Ceramics 79(3), 579–584; 1996b. An experimental technique for imposing dynamic multiaxial compression with mechanical confinement. Exp. Mech. 36(2), 155–158; 2000. Failure mode transition in ceramics under dynamic multiaxial compression. Int. J. Fracture 101, 141–159]. The model correctly predicts the general trends in the observed damage patterns and the brittle-to-ductile transition that results under increasing confinement.

Archivio istituzionale della ricerca - Politecnico di Milano

Caltech Authors

Semi-optimal Practicable Algorithmic Cooling

Author: L. J. Schulman
Tal Mor
Y. Elias
Yossi Weinstein
Yuval Elias
Publication venue: 'American Physical Society (APS)'
Publication date: 26/10/2011

Algorithmic Cooling (AC) of spins applies entropy manipulation algorithms in open spin-systems in order to cool spins far beyond Shannon's entropy bound. AC of nuclear spins was demonstrated experimentally, and may contribute to nuclear magnetic resonance (NMR) spectroscopy. Several cooling algorithms were suggested in recent years, including practicable algorithmic cooling (PAC) and exhaustive AC. Practicable algorithms have simple implementations, yet their level of cooling is far from optimal; exhaustive algorithms, on the other hand, cool much better, and some even reach (asymptotically) an optimal level of cooling, but they are not practicable. We introduce here semi-optimal practicable AC (SOPAC), wherein few cycles (typically 2-6) are performed at each recursive level. Two classes of SOPAC algorithms are proposed and analyzed. Both attain cooling levels significantly better than PAC, and are much more efficient than the exhaustive algorithms. The new algorithms are shown to bridge the gap between PAC and exhaustive AC. In addition, we calculated the number of spins required by SOPAC in order to purify qubits for quantum computation. As few as 12 and 7 spins are required (in an ideal scenario) to yield a mildly pure spin (60% polarized) from initial polarizations of 1% and 10%, respectively. In the latter case, about five more spins are sufficient to produce a highly pure spin (99.99% polarized), which could be relevant for fault-tolerant quantum computing. Comment: 13 pages, 5 figures.
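The abstract does not spell out the individual cooling steps, so the following is only a sketch of one standard building block from the algorithmic cooling literature, not of SOPAC itself: a three-spin compression step, modeled here as a majority vote over three independent spins that each have polarization bias eps. The enumeration is compared with the closed form (3·eps − eps³)/2; the function names are hypothetical.

```python
from itertools import product

def majority_bias(eps):
    """Bias of the majority vote of three independent spins, each with bias eps."""
    p_up = (1.0 + eps) / 2.0  # probability of the favoured state (encoded as 0)
    p_major = 0.0
    # Enumerate all eight configurations of three spins.
    for bits in product((0, 1), repeat=3):
        prob = 1.0
        for bit in bits:
            prob *= p_up if bit == 0 else (1.0 - p_up)
        # Keep configurations in which the favoured state wins the vote.
        if bits.count(0) >= 2:
            p_major += prob
    return 2.0 * p_major - 1.0

def closed_form(eps):
    """Closed-form bias after one compression step: (3*eps - eps**3) / 2."""
    return (3.0 * eps - eps ** 3) / 2.0

eps0 = 0.01                      # 1% initial polarization, as in the abstract
boosted = majority_bias(eps0)    # roughly 1.5 times eps0 for small eps0
```

For small biases each such step multiplies the polarization by about 3/2, which gives a feel for why several spins and several recursive levels are needed to move from 1% to a usefully polarized spin.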

arXiv.org e-Print Archive

Real-time filtering and detection of dynamics for compression of HDTV

Author: Bauer Peter
Sauer Ken D.

The preprocessing of video sequences for data compression is discussed. The end goal is a compression system for HDTV capable of transmitting perceptually lossless sequences at under one bit per pixel. Two subtopics were emphasized to prepare the video signal for more efficient coding: (1) nonlinear filtering to remove noise and shape the signal spectrum to take advantage of insensitivities of human viewers; and (2) segmentation of each frame into temporally dynamic/static regions for conditional frame replenishment. The latter technique operates best under the assumption that the sequence can be modelled as a superposition of active foreground and static background. The considerations were restricted to monochrome data, since the system was expected to use the standard luminance/chrominance decomposition, which concentrates most of the bandwidth requirements in the luminance. Similar methods may be applied to the two chrominance signals.
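Conditional frame replenishment, as mentioned in the abstract above, can be sketched in a few lines. The abstract gives no details of the segmentation method, so everything here is an assumption for illustration: frames are small lists of luminance values, blocks are classified as dynamic by thresholding the mean absolute frame difference, and only dynamic blocks are refreshed from the current frame.

```python
def replenish(previous, current, block=2, threshold=10):
    """Update only blocks whose mean absolute change exceeds the threshold.

    Hypothetical helper: returns the reconstructed frame and the list of
    top-left corners of the blocks classified as dynamic.
    """
    rows, cols = len(current), len(current[0])
    output = [row[:] for row in previous]  # start from the previous frame
    dynamic = []
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            cells = [(r, c)
                     for r in range(r0, min(r0 + block, rows))
                     for c in range(c0, min(c0 + block, cols))]
            change = sum(abs(current[r][c] - previous[r][c])
                         for r, c in cells) / len(cells)
            if change > threshold:
                # Dynamic block: transmit (copy) the new pixel values.
                dynamic.append((r0, c0))
                for r, c in cells:
                    output[r][c] = current[r][c]
    return output, dynamic

prev = [[100] * 4 for _ in range(4)]
curr = [row[:] for row in prev]
curr[0][0] = 180  # large change: moving foreground in the top-left block
curr[3][3] = 102  # small change: noise in the bottom-right block
recon, dynamic_blocks = replenish(prev, curr)
```

In this toy example only the top-left block is refreshed, while the small fluctuation in the bottom-right block is treated as static background and left at its previous value.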