48 research outputs found
Astaroth: Ohjelmistokirjasto stensiililaskentaan grafiikkasuorittimilla
Graphics processing units (GPUs) are coprocessors, which offer higher throughput and better power efficiency than central processing units in dataparallel tasks. For this reason, graphics processors provide a good platform for high-performance computing. However, programming GPUs such that all the available performance is utilized requires in-depth knowledge of the architecture of the hardware. Additionally, the problem of high-order stencil computations on GPUs in challenging multiphysics applications has not been adequately explored in previous work. In this thesis, we address these issues by presenting a library, an efficient algorithm and a domain-specific language for solving stencil computations within a structured grid. We tested our implementation by simulating magnetohydrodynamics, which involved the computation of first, second, and cross partial derivatives using second-, fourth-, sixth-, and eight-order finite differences with single and double precision. The running time of our integration kernel was 2.8–9.1 times slower than the theoretical minimum time, which it would take to read the computational domain and write it back to device memory exactly once, without taking into account the effects of finite caches or arithmetic operations on performance. Additionally, we made a performance comparison with a CPU solver widely used for scientific computations, which we benchmarked on a total of 24 cores of two Intel Xeon E5-2690 v3 processors. Our solver, benchmarked on a Tesla P100 PCIe GPU, outperformed the CPU solver by factors of 6.7 and 10.4 when using single and double precision, respectively.Grafiikkasuorittimet ovat apusuorittimia, jotka tarjoavat rinnakkain laskettavissa tehtävissä parempaa suoritus- ja energiatehokkuutta kuin keskussuorittimet. Tästä syystä grafiikkasuorittimet tarjoavat hyvän alustan suurteholaskennan tarpeisiin. Toisaalta grafiikkasuorittimen ohjelmointi siten, että kaikki tarjolla oleva suorituskyky saadaan hyödynnettyä, vaatii syvällistä asiantuntemusta ohjelmoitavan laitteiston arkkitehtuurista. Korkean asteen stensiililaskentaa haastavissa fysiikkasovelluksissa ei ole myöskään tutkittu laajalti aiemmissa julkaisuissa. Tässä työssä otamme kantaa näihin ongelmiin esittelemällä ohjelmistokirjaston, tehokkaan algoritmin, sekä tehtävään räätälöidyn ohjelmointikielen stensiililaskujen ratkaisemiseen säännöllisessä hilassa. Testasimme toteutustamme simuloimalla magnetohydrodynamiikkaa, johon kuului ensimmäisen ja toisen kertaluvun derivaattojen lisäksi ristiderivaattojen ratkaisutoisen, neljännen, kuudennen ja kahdeksannen kertaluvun differenssimenetelmällä käyttäen sekä 32- että 64-bittisiä liukulukuja. Integrointifunktiomme suoritusaika oli 2.8–9.1 kertaa hitaampi kuin teoreettinen vähimmäisajoaika, joka menisi laskennallisen alueen lukemiseen ja kirjoittamiseen apusuorittimen muistista täsmälleen kerran, ottamatta huomioon äärellisen välimuistin tai laskennan vaikutusta suoritusaikaan. Vertasimme kirjastomme suoritusaikaa laajalti tieteellisessä laskennassa käytettyyn keskussuorittimille tarkoitettuun ratkaisijaan, jonka ajoimme kokonaisuudessaan 24:llä ytimellä kahdella Intel Xeon E5-2690 v3 -suorittimella. Tähän ratkaisijaan verrattuna Tesla P100 PCIe -grafiikkasuorittimella ajettu ratkaisijamme oli 6.7 ja 10.4 kertaa nopeampi 32- ja 64-bittisillä liukuluvuilla laskettaessa, tässä järjestyksessä
筑波大学計算科学研究センター 平成30年度 年次報告書
まえがき ...... 21 センター組織と構成員 ...... 42 平成30 年度の活動状況 ...... 83 各研究部門の報告 ...... 15I. 素粒子物理研究部門 ...... 15II. 宇宙物理研究部門 ....... 40III. 原子核物理研究部門 ...... 65IV. 量子物性研究部門 ...... 83V. 生命科学研究部門 ...... 110 V-1. 生命機能情報分野 ...... 110 V-2. 分子進化分野 ...... 125VI. 地球環境研究部門 ...... 140VII. 高性能計算システム研究部門 ...... 155VIII. 計算情報学研究部門 ...... 207 VIII-1. データ基盤分野 ...... 207 VIII-2. 計算メディア分野 ...... 22
Dense and sparse parallel linear algebra algorithms on graphics processing units
Una línea de desarrollo seguida en el campo de la supercomputación es el uso de procesadores de propósito específico para acelerar determinados tipos de cálculo. En esta tesis estudiamos el uso de tarjetas gráficas como aceleradores de la computación y lo aplicamos al ámbito del álgebra lineal. En particular trabajamos con la biblioteca SLEPc para resolver problemas de cálculo de autovalores en matrices de gran dimensión, y para aplicar funciones de matrices en los cálculos de aplicaciones científicas. SLEPc es una biblioteca paralela que se basa en el estándar MPI y está desarrollada con la premisa de ser escalable, esto es, de permitir resolver problemas más grandes al aumentar las unidades de procesado.
El problema lineal de autovalores, Ax = lambda x en su forma estándar, lo abordamos con el uso de técnicas iterativas, en concreto con métodos de Krylov, con los que calculamos una pequeña porción del espectro de autovalores. Este tipo de algoritmos se basa en generar un subespacio de tamaño reducido (m) en el que proyectar el problema de gran dimensión (n), siendo m << n. Una vez se ha proyectado el problema, se resuelve este mediante métodos directos, que nos proporcionan aproximaciones a los autovalores del problema inicial que queríamos resolver. Las operaciones que se utilizan en la expansión del subespacio varían en función de si los autovalores deseados están en el exterior o en el interior del espectro. En caso de buscar autovalores en el exterior del espectro, la expansión se hace mediante multiplicaciones matriz-vector. Esta operación la realizamos en la GPU, bien mediante el uso de bibliotecas o mediante la creación de funciones que aprovechan la estructura de la matriz. En caso de autovalores en el interior del espectro, la expansión requiere resolver sistemas de ecuaciones lineales. En esta tesis implementamos varios algoritmos para la resolución de sistemas de ecuaciones lineales para el caso específico de matrices con estructura tridiagonal a bloques, que se ejecutan en GPU.
En el cálculo de las funciones de matrices hemos de diferenciar entre la aplicación directa de una función sobre una matriz, f(A), y la aplicación de la acción de una función de matriz sobre un vector, f(A)b. El primer caso implica un cálculo denso que limita el tamaño del problema. El segundo permite trabajar con matrices dispersas grandes, y para resolverlo también hacemos uso de métodos de Krylov. La expansión del subespacio se hace mediante multiplicaciones matriz-vector, y hacemos uso de GPUs de la misma forma que al resolver autovalores. En este caso el problema proyectado comienza siendo de tamaño m, pero se incrementa en m en cada reinicio del método. La resolución del problema proyectado se hace aplicando una función de matriz de forma directa. Nosotros hemos implementado varios algoritmos para calcular las funciones de matrices raíz cuadrada y exponencial, en las que el uso de GPUs permite acelerar el cálculo.One line of development followed in the field of supercomputing is the use of specific purpose processors to speed up certain types of computations. In this thesis we study the use of graphics processing units as computer accelerators and apply it to the field of linear algebra. In particular, we work with the SLEPc library to solve large scale eigenvalue problems, and to apply matrix functions in scientific applications. SLEPc is a parallel library based on the MPI standard and is developed with the premise of being scalable, i.e. to allow solving larger problems by increasing the processing units.
We address the linear eigenvalue problem, Ax = lambda x in its standard form, using iterative techniques, in particular with Krylov's methods, with which we calculate a small portion of the eigenvalue spectrum. This type of algorithms is based on generating a subspace of reduced size (m) in which to project the large dimension problem (n), being m << n. Once the problem has been projected, it is solved by direct methods, which provide us with approximations of the eigenvalues of the initial problem we wanted to solve. The operations used in the expansion of the subspace vary depending on whether the desired eigenvalues are from the exterior or from the interior of the spectrum. In the case of searching for exterior eigenvalues, the expansion is done by matrix-vector multiplications. We do this on the GPU, either by using libraries or by creating functions that take advantage of the structure of the matrix. In the case of eigenvalues from the interior of the spectrum, the expansion requires solving linear systems of equations. In this thesis we implemented several algorithms to solve linear systems of equations for the specific case of matrices with a block-tridiagonal structure, that are run on GPU.
In the computation of matrix functions we have to distinguish between the direct application of a matrix function, f(A), and the action of a matrix function on a vector, f(A)b. The first case involves a dense computation that limits the size of the problem. The second allows us to work with large sparse matrices, and to solve it we also make use of Krylov's methods. The expansion of subspace is done by matrix-vector multiplication, and we use GPUs in the same way as when solving eigenvalues. In this case the projected problem starts being of size m, but it is increased by m on each restart of the method. The solution of the projected problem is done by directly applying a matrix function. We have implemented several algorithms to compute the square root and the exponential matrix functions, in which the use of GPUs allows us to speed up the computation.Una línia de desenvolupament seguida en el camp de la supercomputació és l'ús de processadors de propòsit específic per a accelerar determinats tipus de càlcul. En aquesta tesi estudiem l'ús de targetes gràfiques com a acceleradors de la computació i ho apliquem a l'àmbit de l'àlgebra lineal. En particular treballem amb la biblioteca SLEPc per a resoldre problemes de càlcul d'autovalors en matrius de gran dimensió, i per a aplicar funcions de matrius en els càlculs d'aplicacions científiques. SLEPc és una biblioteca paral·lela que es basa en l'estàndard MPI i està desenvolupada amb la premissa de ser escalable, açò és, de permetre resoldre problemes més grans en augmentar les unitats de processament.
El problema lineal d'autovalors, Ax = lambda x en la seua forma estàndard, ho abordem amb l'ús de tècniques iteratives, en concret amb mètodes de Krylov, amb els quals calculem una xicoteta porció de l'espectre d'autovalors. Aquest tipus d'algorismes es basa a generar un subespai de grandària reduïda (m) en el qual projectar el problema de gran dimensió (n), sent m << n. Una vegada s'ha projectat el problema, es resol aquest mitjançant mètodes directes, que ens proporcionen aproximacions als autovalors del problema inicial que volíem resoldre. Les operacions que s'utilitzen en l'expansió del subespai varien en funció de si els autovalors desitjats estan en l'exterior o a l'interior de l'espectre. En cas de cercar autovalors en l'exterior de l'espectre, l'expansió es fa mitjançant multiplicacions matriu-vector. Aquesta operació la realitzem en la GPU, bé mitjançant l'ús de biblioteques o mitjançant la creació de funcions que aprofiten l'estructura de la matriu. En cas d'autovalors a l'interior de l'espectre, l'expansió requereix resoldre sistemes d'equacions lineals. En aquesta tesi implementem diversos algorismes per a la resolució de sistemes d'equacions lineals per al cas específic de matrius amb estructura tridiagonal a blocs, que s'executen en GPU.
En el càlcul de les funcions de matrius hem de diferenciar entre l'aplicació directa d'una funció sobre una matriu, f(A), i l'aplicació de l'acció d'una funció de matriu sobre un vector, f(A)b. El primer cas implica un càlcul dens que limita la grandària del problema. El segon permet treballar amb matrius disperses grans, i per a resoldre-ho també fem ús de mètodes de Krylov. L'expansió del subespai es fa mitjançant multiplicacions matriu-vector, i fem ús de GPUs de la mateixa forma que en resoldre autovalors. En aquest cas el problema projectat comença sent de grandària m, però s'incrementa en m en cada reinici del mètode. La resolució del problema projectat es fa aplicant una funció de matriu de forma directa. Nosaltres hem implementat diversos algorismes per a calcular les funcions de matrius arrel quadrada i exponencial, en les quals l'ús de GPUs permet accelerar el càlcul.Lamas Daviña, A. (2018). Dense and sparse parallel linear algebra algorithms on graphics processing units [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/112425TESI
Towards many-body physics with Rydberg-dressed cavity polaritons
An exciting frontier in quantum information science is the creation and manipulation of
bottom-up quantum systems that are built and controlled one by one. For the past 30
years, we have witnessed signi cant progresses in harnessing strong atom- eld interactions
for critical applications in quantum computation, communication, simulation, and metrology.
By extension, we can envisage a quantum network consisting of material nodes coupled
together with in nite-dimensional bosonic quantum channels. In this context, there
has been active research worldwide to achieve quantum optical circuits, for which single
atoms are wired by freely-propagating single photons through the circuit elements. For all
these systems, the system-size expansion with atoms and photons results in a fundamental
pathologic scaling that linearizes the very atom- eld interaction, and signi cantly limits
the degree of non-classicality and entanglement in analog atom- eld quantum systems for
atom number N 1.
The long-term motivation of this MSc thesis is (i) to discover new physical mechanisms
that extend the inherent scaling behavior of atom- eld interactions and (ii) to
develop quantum optics toolkits that design dynamical gauge structures for the realization
of lattice-gauge-theoretic quantum network and the synthesis of novel quantum optically
gauged materials. The basic premise is to achieve the strong coupling regime for a quantum
many-body material system interacting with the quantized elds of an optical cavity. Our
laboratory e ort can be described as the march towards \many-body QED," where optical
elds acquire some properties of the material interactions that constrain their dynamical
processes, as with quantum eld theories. While such an e ort currently do not exist elsewhere,
we are convicted that our work will become an essential endeavor to enable cavity
quantum electrodynamics (QED) in the bona- de regime of quantum many-body physics
in this entanglement frontier.
In this context, I describe an example in Chapter 2 that utilizes strong RydbergRydberg
interactions to design dynamical gauge structures for the quantum square ice
models. Quantum
uctuations driven by cavity-mediated in nite-range interaction stabilize
the quantum-gauged system into a long-range entangled quantum spin liquid that may
be detected through the time-ordered photoelectric statistics for photons leaking out of the
cavity. Fractionalized \spinon" and \vison" excitations can be manipulated for topological
quantum computation, and the emergent photons of arti cial QED in our lattice gauge
theoretic system can be directly measured and studied.
The laboratory challenge towards strongly coupled cavity Rydberg polaritons encompasses
three daunting research milestones that push the technological boundaries beyond of the state-of-the-arts. In Chapter 3, I discuss our extreme-high-vacuum chamber (XHV)
cluster system that allows the world's lowest operating vacuum environment P ' 10
Torr for an ultracold AMO experiment with long background-limited trap lifetimes. In
Chapter 4, I discuss our ultrastable laser systems stabilized to the ultra-low-expansion
optical cavities. Coupled with a scalable eld-programmable-gate-array (FPGA) digitalanalog
control system, we can manipulate arbitrarily the phase-amplitude relationship of
several dozens of laser elds across 300 nm to 1550 nm at mHz precision. In Chapter 5,
we discuss the quantum trajectory simulations for manipulating the external degrees of
freedom of ultracold atoms with external laser elds. Electrically tunable liquid crystal
lens creates a dynamically tunable optical trap to move the ultracold atomic gases over
long distance within the ultra-high-vacuum (UHV) chamber system.
In Chapter 6, I discuss our collaborative development of two science cavity platforms
{ the \Rydberg" quantum dot and the many-body QED platforms. An important development
was the research into new high-index IBS materials, where we have utilized our
low-loss optical mirrors for extending the world's highest cavity nesse F 500k! We discuss
the unique challenges of implementing optical cavity QED for Rydberg atoms, which
required tremendous degrees of electromagnetic shielding and eld control. Single-crystal
Sapphire structure, along with Angstrom-level diamond-turned Ti blade electrodes, is utilized
for the eld compensation and extinction by > 60 dB. Single-crystal PZTs on silica
V-grooves are utilized for the stabilization of the optical cavity with length uncertainty less
than 1=100 of a single nucleon, along with extreme level of vibration isolation in a XHV
environment. The capability to perform in-situ RF plasma cleaning allows the regeneration
of optical mirrors when coated with a few Cs atoms. Lastly but not the least, we combine
single-atom resolution quantum gas microscopy technique with superpixel holographic algorithm
to project arbitrary real-time recon gurable di raction-limited optical potential
landscapes for the preparation of low-entropy atom arrays
FIAS Scientific Report 2011
In the year 2010 the Frankfurt Institute for Advanced Studies has successfully continued to follow its agenda to pursue theoretical research in the natural sciences. As stipulated in its charter, FIAS closely collaborates with extramural research institutions, like the Max Planck Institute for Brain Research in Frankfurt and the GSI Helmholtz Center for Heavy Ion Research, Darmstadt and with research groups at the science departments of Goethe University. The institute also engages in the training of young researchers and the education of doctoral students. This Annual Report documents how these goals have been pursued in the year 2010. Notable events in the scientific life of the Institute will be presented, e.g., teaching activities in the framework of the Frankfurt International Graduate School for Science (FIGSS), colloquium schedules, conferences organized by FIAS, and a full bibliography of publications by authors affiliated with FIAS. The main part of the Report consists of short one-page summaries describing the scientific progress reached in individual research projects in the year 2010..
XSEDE: The Extreme Science and Engineering Discovery Environment (OAC 15-48562) Interim Project Report 13: Report Year 5, Reporting Period 2 August 1, 2020 – October 31, 2020
This is the Interim Project Report 13 (IPR13) for the NSF XSEDE project. It includes Key Performance Indicator data and project highlights for Reporting Year 5, Report Period 2 (August 1-October 31, 2020).NSF OAC 15-48562Ope
Discovery in Physics
Volume 2 covers knowledge discovery in particle and astroparticle physics. Instruments gather petabytes of data and machine learning is used to process the vast amounts of data and to detect relevant examples efficiently. The physical knowledge is encoded in simulations used to train the machine learning models. The interpretation of the learned models serves to expand the physical knowledge resulting in a cycle of theory enhancement
Neural Radiance Fields: Past, Present, and Future
The various aspects like modeling and interpreting 3D environments and
surroundings have enticed humans to progress their research in 3D Computer
Vision, Computer Graphics, and Machine Learning. An attempt made by Mildenhall
et al in their paper about NeRFs (Neural Radiance Fields) led to a boom in
Computer Graphics, Robotics, Computer Vision, and the possible scope of
High-Resolution Low Storage Augmented Reality and Virtual Reality-based 3D
models have gained traction from res with more than 1000 preprints related to
NeRFs published. This paper serves as a bridge for people starting to study
these fields by building on the basics of Mathematics, Geometry, Computer
Vision, and Computer Graphics to the difficulties encountered in Implicit
Representations at the intersection of all these disciplines. This survey
provides the history of rendering, Implicit Learning, and NeRFs, the
progression of research on NeRFs, and the potential applications and
implications of NeRFs in today's world. In doing so, this survey categorizes
all the NeRF-related research in terms of the datasets used, objective
functions, applications solved, and evaluation criteria for these applications.Comment: 413 pages, 9 figures, 277 citation