303 research outputs found

    Generalized scans and tridiagonal systems

    Abstract: Motivated by the analysis of known parallel techniques for the solution of linear tridiagonal systems, we introduce generalized scans, a class of recursively defined, length-preserving, sequence-to-sequence transformations that generalize the well-known prefix computations (scans). Generalized scan functions are described in terms of three algorithmic phases: the reduction phase, which saves data for the third (expansion) phase and prepares data for the second phase, which is a recursive invocation of the same function on one fewer variable. Both the reduction and expansion phases operate on a bounded number of variables, a key feature for their parallelization. Generalized scans enjoy a property, called here protoassociativity, that gives rise to ordinary associativity when generalized scans are specialized to ordinary scans. We show that the solution of positive-definite block tridiagonal linear systems can be cast as a generalized scan, thereby shedding light on the underlying structure enabling known parallelization schemes for this problem. We also describe a variety of parallel algorithms, including some that are well known for tridiagonal systems and some that are much better suited to distributed computation.
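
    The three phases named in this abstract can be illustrated on the ordinary prefix scan, the special case the abstract says generalized scans reduce to. The sketch below is not the paper's definition of a generalized scan: the function name `scan_three_phase` and the choice of folding the first two elements are assumptions made only for illustration.

```python
from operator import add
from typing import Callable, List, TypeVar

T = TypeVar("T")


def scan_three_phase(xs: List[T], op: Callable[[T, T], T]) -> List[T]:
    """Inclusive prefix scan written as reduction, recursion, expansion.

    Hypothetical illustration of the ordinary-scan special case only.
    """
    if len(xs) <= 1:
        return list(xs)

    # Reduction: touch a bounded number of variables (the first two).
    # Save xs[0] for the expansion phase and fold it into xs[1],
    # producing a sequence with one fewer variable.
    saved = xs[0]
    reduced = [op(xs[0], xs[1])] + list(xs[2:])

    # Recursion: invoke the same function on the shorter sequence.
    partial = scan_three_phase(reduced, op)

    # Expansion: put the saved value back in front, restoring the length.
    return [saved] + partial


print(scan_three_phase([1, 2, 3, 4], add))  # [1, 3, 6, 10]
```

    The sketch is sequential and only requires `op` to be associative, not commutative, so string concatenation works as well as addition. It does not attempt the parallel schedules or the block tridiagonal case discussed in the paper.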

    Algebraic, Block and Multiplicative Preconditioners based on Fast Tridiagonal Solves on GPUs

    This thesis contributes to the fields of sparse linear algebra, graph applications, and preconditioners for Krylov iterative solvers of sparse linear equation systems by providing a (block) tridiagonal solver library, a generalized sparse matrix-vector implementation, a linear forest extraction, and a multiplicative preconditioner based on tridiagonal solves. The tridiagonal library, which supports (scaled) partial pivoting, outperforms cuSPARSE's tridiagonal solver by a factor of five while completely utilizing the available GPU memory bandwidth. For the performance-optimized solution of multiple right-hand sides, the explicit factorization of the tridiagonal matrix can be computed. The extraction of a weighted linear forest (union of disjoint paths) from a general graph is used to build algebraic (block) tridiagonal preconditioners and deploys the generalized sparse matrix-vector implementation of this thesis for preconditioner construction. During linear forest extraction, a new parallel bidirectional scan pattern, which can operate on doubly linked list structures, identifies the path ID and the position of a vertex. The algebraic preconditioner construction is also used to build more advanced preconditioners, which contain multiple tridiagonal factors, based on generalized ILU factorizations. Additionally, other preconditioners based on tridiagonal factors are presented and evaluated in comparison to ILU and ILU incomplete sparse approximate inverse preconditioners (ILU-ISAI) for the solution of large sparse linear equation systems from the Sparse Matrix Collection. For all presented problems of this thesis, an efficient parallel algorithm and its CUDA implementation for single GPU systems are provided.

    Surface density-of-states on semi-infinite topological photonic and acoustic crystals

    Iterative Green's function, based on cyclic reduction of block tridiagonal matrices, has been the ideal algorithm, through tight-binding models, to compute the surface density-of-states of semi-infinite topological electronic materials. In this paper, we apply this method to photonic and acoustic crystals, using finite-element discretizations and a generalized eigenvalue formulation, to calculate the local density-of-states on a single surface of semi-infinite lattices. The three-dimensional (3D) examples of gapless helicoidal surface states in Weyl and Dirac crystals are shown and the computational cost, convergence and accuracy are analyzed. Comment: 7 pages, 4 figures

    Novel time-saving first-principles calculation method for electron-transport properties

    We present a time-saving simulator within the framework of the density functional theory to calculate the transport properties of electrons through nanostructures suspended between semi-infinite electrodes. By introducing the Fourier transform and preconditioning conjugate-gradient algorithms into the simulator, a highly efficient performance can be achieved in determining scattering wave functions and electron-transport properties of nanostructures suspended between semi-infinite jellium electrodes. To demonstrate the performance of the present algorithms, we study the conductance of metallic nanowires and the origin of the oscillatory behavior in the conductance of an Ir nanowire. It is confirmed that the $s$-$d_{z^2}$ channel of the Ir nanowire exhibits the transmission oscillation with a period of two-atom length, which is also dominant in the experimentally obtained conductance trace.

    Enhanced transport through desorption-mediated diffusion

    We present a master equation approach to the study of the bulk-mediated surface diffusion mechanism in a three-dimensional confined domain. The proposed scheme allowed us to evaluate analytically a number of quantities that were used to characterize the efficiency of the bulk-mediated surface transport mechanism, for instance, the mean escape time from the domain and the mean number of distinct visited sites on the confined domain boundary. Affiliation: Rojo, Félix. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física; Argentina. Affiliation: Budde, Carlos Esteban. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Affiliation: Wio, Horacio Sergio. Universidad de Cantabria; Spain. Consejo Superior de Investigaciones Científicas; Spain. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Affiliation: Budde, Carlos Ernesto. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física; Argentina.

    Unraveling the Non-Hermitian Skin Effect in Dissipative Systems

    The non-Hermitian skin effect, i.e. eigenstate condensation at the edges in lattices with open boundaries, is an exotic manifestation of non-Hermitian systems. In Bloch theory, an effective non-Hermitian Hamiltonian is generally used to describe dissipation, which however is not norm-preserving and neglects quantum jumps. Here it is shown that in a self-consistent description of the dissipative dynamics in a one-band lattice, based on the stochastic Schrödinger equation or Lindblad master equation with a collective jump operator, the skin effect and its dynamical features are washed out. Nevertheless, both short- and long-time relaxation dynamics provide a hidden signature of the skin effect found in the semiclassical limit. In particular, relaxation toward a maximally mixed state with the largest von Neumann entropy in a lattice with open boundaries is a manifestation of the semiclassical skin effect. Comment: 14 pages, 6 figures, to appear in Phys Rev B (Rapid Communications)

    Kerr non-linearity in a superconducting Josephson metamaterial

    We present a detailed experimental and theoretical analysis of the dispersion and non-linear Kerr frequency shifts of plasma modes in a one-dimensional Josephson junction chain containing 500 SQUIDs in the regime of weak nonlinearity. The measured low-power dispersion curve agrees perfectly with the theoretical model if we take into account the Kerr renormalisation of the bare frequencies and the long-range nature of the island charge screening by a remote ground plane. We measured the self- and cross-Kerr shifts for the frequencies of the eight lowest modes in the chain. We compare the measured Kerr coefficients with theory and find good agreement.

    Investigation and modeling of viscoelastic moduli for multilayered polymeric systems using high frequency ultrasonic waves

    Mechanical characterization of both the bulk and individual layer properties of layered polymer stacks provides important information for their use in novel applications. A single technique to measure both the bulk and layer properties is attempted. Ultrasonic testing provides an opportunity to determine the mechanical characteristics for layered samples in the form of the complex mechanical moduli. These moduli express the viscoelastic properties of the materials. Using ultrasound, this can be done for the bulk and the layers in a single test. With ultrasound, the ability to determine the complex moduli in single layers has been demonstrated. The moduli were determined within the expected range. The ultrasonic testing has also allowed the determination of the speed of sound of the individual layers in a two-layer sample consisting of layers of polycarbonate and poly(methyl methacrylate). Internal interference limited the ability to measure attenuation. To allow for analysis of these complex waveforms, a secondary technique for waveform analysis has been proposed and developed. This method employs a finite element simulation to replicate the experiment. By deriving a simulation with the complex moduli as inputs, it is possible to use the simulation results to measure the moduli of multilayered samples. This is done comparatively through iteration of the simulation inputs. When a set of inputs creates a simulated result matching the experimental scans, a solution has been found. A preliminary version of the simulation is presented and demonstrated.

    Verifying raytracing/Fokker-Planck lower-hybrid current drive predictions with self-consistent full-wave/Fokker-Planck simulations

    Raytracing/Fokker-Planck (FP) simulations used to model lower-hybrid current drive (LHCD) often fail to reproduce experimental results, particularly when LHCD is weakly damped. A proposed reason for this discrepancy is the lack of "full-wave" effects, such as diffraction and interference, in raytracing simulations and the breakdown of the raytracing approximation. Previous studies of LHCD using non-Maxwellian full-wave/FP simulations have been performed, but these simulations were not self-consistent and enforced power conservation between the FP and full-wave code using a numerical rescaling factor. Here we have created a fully self-consistent full-wave/FP model for LHCD that is automatically power conserving. This was accomplished by coupling an overhauled version of the non-Maxwellian TORLH full-wave solver and the CQL3D FP code using the Integrated Plasma Simulator. We performed converged full-wave/FP simulations of Alcator C-Mod discharges and compared them to raytracing. We found that excellent agreement in the power deposition profiles from raytracing and TORLH could be obtained; however, TORLH had somewhat lower current drive efficiency and broader power deposition profiles in some cases. This discrepancy appears to be a result of numerical limitations present in the TORLH model and a small amount of diffractional broadening of the TORLH wave spectrum. Our results suggest full-wave simulation of LHCD is likely not necessary, as diffraction and interference represented only a small correction that could not account for the differences between simulations and experiment.