
    Study of laser deposited thin films Final report, 4 May 1967 - 4 May 1968

    Feasibility of laser deposited metal films for mirror production

    Hamilton's Turns for the Lorentz Group

    Hamilton, in the course of his studies on quaternions, came up with an elegant geometric picture for the group SU(2). In this picture the group elements are represented by "turns", which are equivalence classes of directed great circle arcs on the unit sphere S^2, in such a manner that the rule for composition of group elements takes the form of the familiar parallelogram law for the Euclidean translation group. It is only recently that this construction has been generalized to the simplest noncompact group SU(1,1) = Sp(2,R) = SL(2,R), the double cover of SO(2,1). The present work develops a theory of turns for SL(2,C), the double and universal cover of SO(3,1) and SO(3,C), rendering a geometric representation in the spirit of Hamilton available for all low-dimensional semisimple Lie groups of interest in physics. The geometric construction is illustrated through application to polar decomposition, and to the composition of Lorentz boosts and the resulting Wigner or Thomas rotation. (Comment: 13 pages, LaTeX)
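    The composition law the turns picture encodes is ordinary quaternion multiplication: the arc for a composite rotation is the "parallelogram" sum of the arcs of its factors. A minimal sketch of the underlying algebra, using plain Python and the standard Hamilton product (all function names here are illustrative, not from the paper):

    ```python
    # Unit quaternions double-cover SO(3); a turn is the half-angle arc of
    # its quaternion. Composing turns = multiplying quaternions.
    import math

    def qmul(q2, q1):
        """Hamilton product q2 * q1 (apply q1 first, then q2)."""
        w2, x2, y2, z2 = q2
        w1, x1, y1, z1 = q1
        return (
            w2*w1 - x2*x1 - y2*y1 - z2*z1,
            w2*x1 + x2*w1 + y2*z1 - z2*y1,
            w2*y1 - x2*z1 + y2*w1 + z2*x1,
            w2*z1 + x2*y1 - y2*x1 + z2*w1,
        )

    def rotation(axis, angle):
        """Unit quaternion for a rotation by `angle` about a unit `axis`;
        its turn is an arc of length angle/2 on the great circle normal
        to the axis."""
        h = angle / 2.0
        s = math.sin(h)
        return (math.cos(h), s * axis[0], s * axis[1], s * axis[2])

    # Two quarter-turns about z compose to a half-turn about z:
    q = rotation((0.0, 0.0, 1.0), math.pi / 2)
    qq = qmul(q, q)
    print(qq)  # ≈ (0, 0, 0, 1): rotation by pi about z
    ```

    The paper's contribution is the noncompact analogue: for SL(2,C) the "arcs" live on a complexified sphere, and the same head-to-tail composition yields, for two boosts, the extra Wigner/Thomas rotation in the polar decomposition.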

    Parallelising wavefront applications on general-purpose GPU devices

    Pipelined wavefront applications form a large portion of the high-performance scientific computing workloads at supercomputing centres. This paper investigates the viability of graphics processing units (GPUs) for the acceleration of these codes, using NVIDIA's Compute Unified Device Architecture (CUDA). We identify the optimisations suitable for this new architecture and quantify the characteristics of those wavefront codes that are likely to experience speedups.
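    The parallelism these papers exploit comes from the wavefront dependency pattern itself: each cell depends on its "upstream" neighbours, so all cells on the same anti-diagonal are independent and can be processed concurrently (on a GPU, one kernel launch per diagonal with one thread per cell). A minimal serial sketch of the sweep order, with an illustrative stencil not taken from the paper:

    ```python
    # Wavefront sweep over an n x n grid: cell (i, j) depends on (i-1, j)
    # and (i, j-1), so all cells on anti-diagonal d = i + j are independent.
    # A GPU version would launch one kernel per diagonal, one thread per cell.

    def wavefront_sweep(n):
        a = [[0] * n for _ in range(n)]
        a[0][0] = 1
        for d in range(1, 2 * n - 1):                 # diagonals in order
            for i in range(max(0, d - n + 1), min(d, n - 1) + 1):
                j = d - i                             # independent cells
                north = a[i - 1][j] if i > 0 else 0
                west = a[i][j - 1] if j > 0 else 0
                a[i][j] = north + west
        return a

    grid = wavefront_sweep(4)
    print(grid[2][2])  # 6 — this toy stencil reproduces C(i+j, i)
    ```

    Note the parallel width grows from 1 to n and shrinks back to 1 over the sweep, which is why speedups depend strongly on problem size and per-cell work, as the abstract's characterisation suggests.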

    On the acceleration of wavefront applications using distributed many-core architectures

    In this paper we investigate the use of distributed graphics processing unit (GPU)-based architectures to accelerate pipelined wavefront applications, a ubiquitous class of parallel algorithms used in the solution of a number of scientific and engineering problems. Specifically, we employ a recently developed port of the LU solver (from the NAS Parallel Benchmark suite) to investigate the performance of these algorithms on high-performance computing solutions from NVIDIA (Tesla C1060 and C2050) as well as on traditional clusters (AMD/InfiniBand and IBM BlueGene/P). Benchmark results are presented for problem classes A to C, and a recently developed performance model is used to provide projections for problem classes D and E, the latter of which represents a billion-cell problem. Our results demonstrate that while the theoretical performance of GPU solutions far exceeds that of many traditional technologies, the sustained application performance is currently comparable for scientific wavefront applications. Finally, a breakdown of the GPU solution is conducted, exposing PCIe overheads and decomposition constraints. A new k-blocking strategy is proposed to improve the future performance of this class of algorithm on GPU-based architectures.
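    The abstract does not detail the proposed k-blocking scheme; the sketch below only illustrates the general idea such blocking targets — aggregating k wavefront planes per PCIe/MPI transfer so that fewer, larger messages amortise the fixed per-transfer latency the breakdown exposes. Every parameter name and value here is an assumption for illustration:

    ```python
    # Illustrative latency/bandwidth model: moving nz boundary planes one at
    # a time versus in blocks of k planes per transfer. All numbers assumed.

    def transfer_time(nz, k, latency_s, plane_bytes, bandwidth_bps):
        """Total time to move nz planes when k planes share one transfer."""
        n_messages = -(-nz // k)                       # ceil(nz / k)
        per_message = latency_s + k * plane_bytes / bandwidth_bps
        return n_messages * per_message

    latency = 10e-6      # assumed fixed cost per PCIe transfer
    plane = 64 * 1024    # assumed bytes per plane boundary
    bw = 8e9             # assumed sustained bandwidth, bytes/s

    t1 = transfer_time(256, 1, latency, plane, bw)     # one plane per message
    t8 = transfer_time(256, 8, latency, plane, bw)     # blocked, 8 per message
    print(t1 > t8)  # True: blocking amortises the per-transfer latency
    ```

    In a real sweep, larger blocks also delay downstream processors in the pipeline, so a blocking factor is tuned rather than maximised; the trade-off is presumably what the proposed strategy balances.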

    Experiences with porting and modelling wavefront algorithms on many-core architectures

    We are currently investigating the viability of many-core architectures for the acceleration of wavefront applications, and this report focuses on graphics processing units (GPUs) in particular. To this end, we have implemented NASA's LU benchmark – a real-world, production-grade application – on GPUs employing NVIDIA's Compute Unified Device Architecture (CUDA). This GPU implementation of the benchmark has been used to investigate the performance of a selection of GPUs, ranging from workstation-grade commodity GPUs to the HPC "Tesla" and "Fermi" GPUs. We have also compared the performance of the GPU solution at scale to that of traditional high-performance computing (HPC) clusters based on a range of multi-core CPUs from a number of major vendors, including Intel (Nehalem), AMD (Opteron) and IBM (PowerPC). In previous work we developed a predictive "plug-and-play" performance model of this class of application running on such clusters, in which CPUs communicate via the Message Passing Interface (MPI). By extending this model to also capture the performance behaviour of GPUs, we are able to: (1) comment on the effects that architectural changes will have on the performance of single-GPU solutions, and (2) make projections regarding the performance of multi-GPU solutions at larger scale.
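    The "plug-and-play" model itself is not reproduced in the abstract; the fragment below sketches the standard analytic form wavefront pipeline models take — total time is pipeline fill across the processor array plus one step per remaining plane — with all parameter names and values as illustrative assumptions:

    ```python
    # Classic wavefront pipeline model: a Px x Py processor array sweeps nz
    # planes; the last processor starts after (Px - 1) + (Py - 1) fill steps,
    # then one step per plane. All terms are illustrative assumptions.

    def wavefront_model(px, py, nz, t_step, t_comm):
        """Predicted sweep time: (fill + planes) steps of compute + comms."""
        steps = (px - 1) + (py - 1) + nz
        return steps * (t_step + t_comm)

    # Quadrupling the processor grid shrinks per-step compute but adds fill
    # steps and leaves the per-step communication cost untouched:
    t_small = wavefront_model(4, 4, 256, t_step=1.0e-3, t_comm=1.0e-4)
    t_large = wavefront_model(8, 8, 256, t_step=0.25e-3, t_comm=1.0e-4)
    print(t_small, t_large)
    ```

    A model of this shape is what makes the class D/E projections in the companion paper possible: once t_step and t_comm are measured on small runs, the formula extrapolates to processor counts and problem sizes that were never executed.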