232 research outputs found

    Blocked algorithms for the reduction to Hessenberg-triangular form revisited

    We present two variants of Moler and Stewart's algorithm for reducing a matrix pair to Hessenberg-triangular (HT) form with increased data locality in the access to the matrices. In one of these variants, a careful reorganization and accumulation of Givens rotations enables the use of efficient level 3 BLAS. Experimental results on four different architectures, representative of current high performance processors, compare the performances of the new variants with those of the implementation of Moler and Stewart's algorithm in subroutine DGGHRD from LAPACK, Dackland and Kågström's two-stage algorithm for the HT form, and a modified version of the latter which requires considerably less flop
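For orientation, the unblocked reduction that the abstract takes as its baseline can be sketched in a few lines. This is a minimal pure-Python illustration of the textbook Givens-rotation procedure in the style of Moler and Stewart, not the blocked variants proposed in the paper and not the DGGHRD code; the helper names (`rot`, `rotate_rows`, `rotate_cols`, `hessenberg_triangular`) are hypothetical, and the orthogonal factors are not accumulated.

```python
import math

def rot(a, b):
    """Return (c, s) with c*a + s*b = hypot(a, b) and -s*a + c*b = 0."""
    r = math.hypot(a, b)
    return (1.0, 0.0) if r == 0.0 else (a / r, b / r)

def rotate_rows(M, p, q, c, s):
    """Left-multiply M in place by a rotation acting on rows p and q."""
    for col in range(len(M[0])):
        x, y = M[p][col], M[q][col]
        M[p][col] = c * x + s * y
        M[q][col] = -s * x + c * y

def rotate_cols(M, p, q, c, s):
    """Right-multiply M in place by a rotation acting on columns p and q."""
    for row in M:
        x, y = row[p], row[q]
        row[p] = c * x + s * y
        row[q] = -s * x + c * y

def hessenberg_triangular(A, B):
    """Reduce the pair (A, B) in place: A upper Hessenberg, B upper triangular."""
    n = len(A)
    # Stage 1: make B upper triangular with row rotations, applied to A as well.
    for j in range(n - 1):
        for i in range(n - 1, j, -1):
            c, s = rot(B[i - 1][j], B[i][j])
            rotate_rows(B, i - 1, i, c, s)
            rotate_rows(A, i - 1, i, c, s)
            B[i][j] = 0.0
    # Stage 2: zero A below the first subdiagonal, column by column.
    for j in range(n - 2):
        for i in range(n - 1, j + 1, -1):
            # A row rotation annihilates A[i][j] but fills in B[i][i-1].
            c, s = rot(A[i - 1][j], A[i][j])
            rotate_rows(A, i - 1, i, c, s)
            rotate_rows(B, i - 1, i, c, s)
            A[i][j] = 0.0
            # A column rotation removes the fill-in and restores triangular B.
            c, s = rot(B[i][i], B[i][i - 1])
            rotate_cols(B, i, i - 1, c, s)
            rotate_cols(A, i, i - 1, c, s)
            B[i][i - 1] = 0.0
    return A, B
```

Every rotation touches only two rows or two columns at a time, which is the poor data locality the abstract refers to.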

    Restructuring the Tridiagonal and Bidiagonal QR Algorithms for Performance

    We show how both the tridiagonal and bidiagonal QR algorithms can be restructured so that they become rich in operations that can achieve near-peak performance on a modern processor. The key is a novel, cache-friendly algorithm for applying multiple sets of Givens rotations to the eigenvector/singular vector matrix. This algorithm is then implemented with optimizations that (1) leverage vector instruction units to increase floating-point throughput, and (2) fuse multiple rotations to decrease the total number of memory operations. We demonstrate the merits of these new QR algorithms for computing the Hermitian eigenvalue decomposition (EVD) and singular value decomposition (SVD) of dense matrices when all eigenvectors/singular vectors are computed. The approach yields vastly improved performance relative to the traditional QR algorithms for these problems and is competitive with two commonly used alternatives (Cuppen’s Divide and Conquer algorithm and the Method of Multiple Relatively Robust Representations) while inheriting the more modest O(n) workspace requirements of the original QR algorithms. Since the computations performed by the restructured algorithms remain essentially identical to those performed by the original methods, robust numerical properties are preserved.
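The operation at the centre of this abstract, applying several sets of Givens rotations to a matrix of vectors, can be sketched as below. This is only the straightforward ordering (one rotation at a time, one sweep after another), included as a reference point; it is not the cache-friendly, fused algorithm the paper describes. The function names and the layout of `rotation_sets` are assumptions made for illustration.

```python
import math

def givens(a, b):
    """Return (c, s) such that [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    r = math.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0
    return a / r, b / r

def apply_rotation_to_columns(Q, j, c, s):
    """Right-multiply Q in place by a rotation acting on columns j and j+1."""
    for row in Q:
        x, y = row[j], row[j + 1]
        row[j] = c * x + s * y
        row[j + 1] = -s * x + c * y

def apply_rotation_sets(Q, rotation_sets):
    """Apply each set in turn; set k holds one (c, s) pair per adjacent column pair."""
    for rotations in rotation_sets:
        for j, (c, s) in enumerate(rotations):
            apply_rotation_to_columns(Q, j, c, s)
    return Q

# Usage: two sweeps of rotations applied to a 3x3 identity matrix.
Q = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
sweeps = [[givens(3.0, 4.0), givens(1.0, 1.0)],
          [givens(0.0, 2.0), givens(5.0, 12.0)]]
apply_rotation_sets(Q, sweeps)
```

In this naive form each rotation streams through two full columns of `Q`, so the matrix is re-read once per sweep; reducing that memory traffic is the restructuring the abstract refers to.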

    Shine bright or live long: substituent effects in [Cu(N^N)(P^P)]+-based light-emitting electrochemical cells where N^N is a 6-substituted 2,2'-bipyridine

    We report [Cu(P^P)(N^N)][PF6] complexes with P^P = bis(2-(diphenylphosphino)phenyl)ether (POP) or 4,5-bis(diphenylphosphino)-9,9-dimethylxanthene (xantphos) and N^N = 6-methyl-2,2′-bipyridine (Mebpy), 6-ethyl-2,2′-bipyridine (Etbpy), 6,6′-dimethyl-2,2′-bipyridine (Me2bpy) or 6-phenyl-2,2′-bipyridine (Phbpy). The crystal structures of [Cu(POP)(Phbpy)][PF6]·Et2O, [Cu(POP)(Etbpy)][PF6]·Et2O, [Cu(xantphos)(Me2bpy)][PF6], [Cu(xantphos)(Mebpy)][PF6]·CH2Cl2·0.4Et2O, [Cu(xantphos)(Etbpy)][PF6]·CH2Cl2·1.5H2O and [Cu(xantphos)(Phbpy)][PF6] are described; each copper(I) centre is distorted tetrahedral. In the crystallographically determined structures, the N^N domain in [Cu(xantphos)(Phbpy)]+ and [Cu(POP)(Phbpy)]+ is rotated ∼180° with respect to its orientation in [Cu(xantphos)(Mebpy)]+, [Cu(POP)(Etbpy)]+ and [Cu(xantphos)(Etbpy)]+; in each complex containing xantphos, the xanthene ‘bowl’ retains the same conformation in the solid-state structures. The two conformers resulting from the 180° rotation of the N^N ligand were optimized at the B3LYP-D3/(6-31G**+LANL2DZ) level and are close in energy for each complex. Variable temperature NMR spectroscopy evidences the presence of two conformers of [Cu(xantphos)(Phbpy)]+ in solution which are related by inversion of the xanthene unit. The complexes exhibit MLCT absorption bands in the range 378 to 388 nm, and excitation into each MLCT band leads to yellow emissions. Photoluminescence quantum yields (PLQYs) increase from solution to thin-film and powder; the highest PLQYs are observed for powdered [Cu(xantphos)(Mebpy)][PF6] (34%), [Cu(xantphos)(Etbpy)][PF6] (37%) and [Cu(xantphos)(Me2bpy)][PF6] (37%) with lifetimes of 9.6–11 μs. Density functional theory calculations predict that the emitting triplet (T1) involves an electron transfer from the Cu–P^P environment to the N^N ligand and therefore shows a 3MLCT character. T1 is calculated to be ∼0.20 eV lower in energy than the first singlet excited state (S1). 
The [Cu(P^P)(N^N)][PF6] ionic transition-metal (iTMC) complexes were tested in light-emitting electrochemical cells (LECs). Turn-on times are fast, and the LEC with [Cu(xantphos)(Me2bpy)][PF6] achieves a maximum efficacy of 3.0 cd A−1 (luminance = 145 cd m−2) with a lifetime of 1 h; on going to the [Cu(xantphos)(Mebpy)][PF6]-based LEC, the lifetime exceeds 15 h but at the expense of the efficacy (1.9 cd A−1). The lifetimes of LECs containing [Cu(xantphos)(Etbpy)][PF6] and [Cu(POP)(Etbpy)][PF6] exceed 40 and 80 h respectively

    The Casiquiare river acts as a corridor between the Amazonas and Orinoco river basins: biogeographic analysis of the genus Cichla

    The Casiquiare River is a unique biogeographic corridor between the Orinoco and Amazonas basins. We investigated the importance of this connection for Neotropical fishes using peacock cichlids (Cichla spp.) as a model system. We tested whether the Casiquiare provides a conduit for gene flow between contemporary populations, and investigated the origin of biogeographic distributions that span the Casiquiare. Using sequences from the mitochondrial control region of three focal species (C. temensis, C. monoculus, and C. orinocensis) whose distributions include the Amazonas, Orinoco, and Casiquiare, we constructed maximum likelihood phylograms of haplotypes and analyzed the populations under an isolation-with-migration coalescent model. Our analyses suggest that populations of all three species have experienced some degree of gene flow via the Casiquiare. We also generated a mitochondrial genealogy of all Cichla species using >2000 bp and performed a dispersal-vicariance analysis (DIVA) to reconstruct the historical biogeography of the genus. This analysis, when combined with the intraspecific results, supports two instances of dispersal from the Amazonas to the Orinoco. Thus, our results support the idea that the Casiquiare connection is important across temporal scales, facilitating both gene flow and the dispersal and range expansion of species. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/79059/1/j.1365-294X.2010.04540.x.pd

    Assessment of the evolution of the redox conditions in a low and intermediate level nuclear waste repository (SFR1, Sweden)

    The evaluation of the redox conditions in an intermediate and low level radioactive waste repository such as SFR1 (Sweden) is of high relevance in the assessment of its future performance. The SFR1 repository contains heterogeneous types of wastes, of different activity levels and with very different materials, both in the waste itself and as immobilisation matrices and packaging. The level of complexity also applies to the different reactivity of the materials, so that an assessment of the uncertainties in the study of how the redox conditions would evolve must consider different processes, materials and parameters. This paper provides an assessment of the evolution of the redox conditions in the SFR1. The approach followed is based on the evaluation of the evolution of the redox conditions and the reducing capacity in 15 individual waste package types, selected as being representative of most of the different waste package types present or planned to be deposited in the SFR1. The model considers different geochemical processes of redox relevance in the system. The assessment of the redox evolution of the different vaults of the repository is obtained by combining the results of the modelled individual waste package types. According to the model results, corrosion of the steel-based material present in the repository keeps the system under reducing conditions for long time periods. The simulations have considered both the presence and the absence of microbial activity. In the initial step after the repository closure, the microbial mediated oxidation of organic matter rapidly causes the depletion of oxygen in the system. The system is afterwards kept under reducing conditions, and hydrogen is generated due to the anoxic corrosion of steel. The times for exhaustion of the steel contained in the vaults vary from 5 ky to more than 60 ky in the different vaults, depending on the amount and the surface area of steel. 
After the complete corrosion of steel, the system still keeps a high reducing capacity, due to the magnetite formed as a steel corrosion product. The redox potential in the vaults is calculated to evolve from oxidising at very short times, due to the initial oxygen content, to very reducing at times shorter than 5 years after repository closure. The redox potential imposed by the anoxic corrosion of steel and hydrogen production is on the order of -0.75 V at pH 12.5. If the system is assumed to respond to the Fe(III)/magnetite system, and considering the uncertainty in the pH due to the degradation of the concrete barriers, the redox potential would be in the range -0.7 to -0.01 V. A Monte-Carlo probabilistic analysis on the rate of corrosion of steel shows that the reducing capacity of the system provided by magnetite is not exhausted at the end of the assessment period, even assuming the highest corrosion rates for steel. Simulations assuming presence of oxic water due to glacial melting, intruding the system 60 ky after repository closure, indicate that magnetite is progressively oxidised, forming Fe(III) oxides. The time at which magnetite is completely oxidised varies depending on the amount of steel initially present in the waste package. The behaviour of Np, Pu, Tc and Se under the conditions foreseen for this repository is discussed.

    Algorithm 782

    Programming matrix algorithms-by-blocks for thread-level parallelism

    With the emergence of thread-level parallelism as the primary means for continued improvement of performance, the programmability issue has reemerged as an obstacle to the use of architectural advances. We argue that evolving legacy libraries for dense and banded linear algebra is not a viable solution due to constraints imposed by early design decisions. We propose a philosophy of abstraction and separation of concerns that provides a promising solution in this problem domain. The first abstraction, FLASH, allows algorithms to express computation with matrices consisting of blocks, facilitating algorithms-by-blocks. Transparent to the library implementor, operand descriptions are registered for a particular operation a priori. A runtime system, SuperMatrix, uses this information to identify data dependencies between suboperations, allowing them to be scheduled to threads out-of-order and executed in parallel. But not all classical algorithms in linear algebra lend themselves to conversion to algorithms-by-blocks. We show how our recently proposed LU factorization with incremental pivoting and closely related algorithm-by-blocks for the QR factorization, both originally designed for out-of-core computation, overcome this difficulty. Anecdotal evidence regarding the development of routines with a core functionality demonstrates how the methodology supports high productivity while experimental results suggest that high performance is abundantly achievabl
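The idea of registering operands per block operation and letting a runtime derive the dependencies can be illustrated with a toy example. The sketch below is an assumption-laden stand-in, not the FLASH or SuperMatrix API: it enumerates the block tasks of C += A·B, derives dependencies from each task's read and write sets, and groups the tasks into waves whose members are mutually independent and could therefore be handed to separate threads. All names are hypothetical, and the execution here is sequential.

```python
def block_multiply_add(C, A, B):
    """C += A @ B for small dense blocks stored as lists of lists."""
    for i in range(len(A)):
        for k in range(len(B)):
            a = A[i][k]
            for j in range(len(B[0])):
                C[i][j] += a * B[k][j]

def reads_of(task):
    i, j, k = task
    return {("A", i, k), ("B", k, j), ("C", i, j)}

def write_of(task):
    i, j, _ = task
    return ("C", i, j)

def dependencies(tasks):
    """Task t depends on each earlier task it conflicts with (read or write overlap on a written block)."""
    deps = []
    for t, task in enumerate(tasks):
        found = set()
        for e in range(t):
            earlier = tasks[e]
            if (write_of(earlier) in reads_of(task)
                    or write_of(earlier) == write_of(task)
                    or write_of(task) in reads_of(earlier)):
                found.add(e)
        deps.append(found)
    return deps

def waves(tasks):
    """Group task indices into waves; tasks within one wave are independent."""
    deps = dependencies(tasks)
    level = []
    for t in range(len(tasks)):
        level.append(1 + max((level[e] for e in deps[t]), default=-1))
    grouped = [[] for _ in range(max(level, default=-1) + 1)]
    for t, lv in enumerate(level):
        grouped[lv].append(t)
    return grouped

def gemm_by_blocks(A, B, C, nb):
    """Run C += A @ B on an nb-by-nb grid of blocks, wave by wave."""
    tasks = [(i, j, k) for i in range(nb) for j in range(nb) for k in range(nb)]
    for wave in waves(tasks):
        for t in wave:  # independent tasks: candidates for parallel execution
            i, j, k = tasks[t]
            block_multiply_add(C[i, j], A[i, k], B[k, j])
    return C
```

For a 2-by-2 grid of blocks this yields eight tasks in two waves of four, because each block of C is updated twice and those two updates must be ordered.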

    Entropy and Action of Dilaton Black Holes

    We present a detailed calculation of the entropy and action of U(1)^2 dilaton black holes, and show that both quantities coincide with one quarter of the area of the event horizon. Our methods of calculation make it possible to find an explanation of the rule S = A/4 for all static, spherically symmetric black holes studied so far. We show that the only contribution to the entropy comes from the extrinsic curvature term at the horizon, which gives S = A/4 independently of the charge(s) of the black hole, presence of scalar fields, etc. Previously, this result did not have a general explanation, but was established on a case-by-case basis. The on-shell Lagrangian for maximally supersymmetric extreme dilaton black holes is also calculated and shown to vanish, in agreement with the result obtained by taking the limit of the expression obtained for black holes with regular horizon. The physical meaning of the entropy is discussed in relation to the issue of splitting of extreme black holes. Comment: 15 p., SU-ITP-92-2
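As a concrete instance of the rule S = A/4, the simplest static, spherically symmetric case is the Schwarzschild black hole. The lines below are a standard textbook evaluation in units G = c = ħ = k_B = 1 (an assumption about conventions, not a calculation taken from the paper); M denotes the mass and r_h the horizon radius.

```latex
\begin{align}
  r_h &= 2M, \\
  A   &= 4\pi r_h^{2} = 16\pi M^{2}, \\
  S   &= \frac{A}{4} = 4\pi M^{2}.
\end{align}
```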

    Ginkgo: A Modern Linear Operator Algebra Framework for High Performance Computing

    © ACM, YYYY. This is the author's version of the work "Anzt, H., Cojean, T., Flegar, G., Göbel, F., Grützmacher, T., Nayak, P., ... & Quintana-Ortí, E. S. (2022). Ginkgo: A modern linear operator algebra framework for high performance computing. ACM Transactions on Mathematical Software (TOMS), 48(1), 1-33". It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Mathematical Software, Vol. 48, Iss. 1 (Mar. 2022), http://doi.acm.org/10.1145/3480935
In this article, we present GINKGO, a modern C++ math library for scientific high performance computing. While classical linear algebra libraries act on matrix and vector objects, GINKGO's design principle abstracts all functionality as "linear operators," motivating the notation of a "linear operator algebra library." GINKGO's current focus is oriented toward providing sparse linear algebra functionality for high performance graphics processing unit (GPU) architectures, but given the library design, this focus can be easily extended to accommodate other algorithms and hardware architectures. We introduce this sophisticated software architecture that separates core algorithms from architecture-specific backends and provide details on extensibility and sustainability measures. We also demonstrate GINKGO's usability by providing examples on how to use its functionality inside the MFEM and deal.ii finite element ecosystems. Finally, we offer a practical demonstration of GINKGO's high performance on state-of-the-art GPU architectures.
This work was supported by the "Impuls und Vernetzungsfond" of the Helmholtz Association under grant VH-NG-1241. G. Flegar and E. S. Quintana-Orti were supported by project TIN2017-82972-R of the MINECO and FEDER and the H2020 EU FETHPC Project 732631 "OPRECOMP". This research was also supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. The experiments on the NVIDIA A100 GPU were performed on the HAICORE@KIT partition, funded by the "Impuls und Vernetzungsfond" of the Helmholtz Association. The experiments on the AMD MI100 GPU were performed on Tulip, an early-access platform hosted by HPE.
Anzt, H.; Cojean, T.; Flegar, G.; Göbel, F.; Grützmacher, T.; Nayak, P.; Ribizel, T.... (2022). Ginkgo: A Modern Linear Operator Algebra Framework for High Performance Computing. ACM Transactions on Mathematical Software. 48(1):1-33. https://doi.org/10.1145/3480935
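The design principle named in the abstract, treating matrices, preconditioners and solvers uniformly as linear operators, can be illustrated with a small sketch. The Python classes below are purely hypothetical and are not Ginkgo's C++ API; they only show how a solver can expose the same `apply` interface as the matrix it inverts, so that operators compose freely. The choice of a preconditioned Richardson iteration is an assumption made to keep the example short.

```python
class LinOp:
    """Anything that maps a vector to a vector linearly."""
    def apply(self, x):
        raise NotImplementedError

class Dense(LinOp):
    """A dense matrix: apply(x) is the matrix-vector product."""
    def __init__(self, rows):
        self.rows = rows

    def apply(self, x):
        return [sum(a * v for a, v in zip(row, x)) for row in self.rows]

class Jacobi(LinOp):
    """Diagonal preconditioner: apply(x) multiplies by the inverse diagonal."""
    def __init__(self, matrix):
        self.inv_diag = [1.0 / matrix.rows[i][i] for i in range(len(matrix.rows))]

    def apply(self, x):
        return [d * v for d, v in zip(self.inv_diag, x)]

class Richardson(LinOp):
    """Solver as an operator: apply(b) approximates the solution of A x = b."""
    def __init__(self, system, preconditioner, iterations=100):
        self.system = system
        self.preconditioner = preconditioner
        self.iterations = iterations

    def apply(self, b):
        x = [0.0] * len(b)
        for _ in range(self.iterations):
            residual = [bi - ai for bi, ai in zip(b, self.system.apply(x))]
            update = self.preconditioner.apply(residual)
            x = [xi + ui for xi, ui in zip(x, update)]
        return x

# Usage: the solver is applied exactly like the matrix it was built from.
A = Dense([[4.0, 1.0], [1.0, 3.0]])
solver = Richardson(A, Jacobi(A))
x = solver.apply([1.0, 2.0])
```

Because `Richardson` is itself a `LinOp`, it could in turn serve as the preconditioner of another solver, which is the kind of composition a linear operator abstraction makes natural.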

    Controlling the Host-Guest Interaction Mode through a Redox Stimulus

    A proof-of-concept related to the redox-control of the binding/releasing process in a host-guest system is achieved by designing a neutral and robust Pt-based redox-active metallacage involving two extended-tetrathiafulvalene (exTTF) ligands. When neutral, the cage is able to bind a planar polyaromatic guest (coronene). Remarkably, the chemical or electrochemical oxidation of the host-guest complex leads to the reversible expulsion of the guest outside the cavity, which is assigned to a drastic change of the host-guest interaction mode, illustrating the key role of counteranions along the exchange process. The reversible process is supported by various experimental data (1H NMR spectroscopy, ESI-FTICR, and spectroelectrochemistry) as well as by in-depth theoretical calculations performed at the density functional theory (DFT) level.