Fast and Accurate Modeling of Transient-State Gradient-Spoiled Sequences by Recurrent Neural Networks
Fast and accurate modeling of MR signal responses is typically required for
various quantitative MRI applications, such as MR Fingerprinting and MR-STAT.
This work uses a new EPG-Bloch model for accurate simulation of transient-state
gradient-spoiled MR sequences, and proposes a Recurrent Neural Network (RNN) as
a fast surrogate of the EPG-Bloch model for computing large-scale MR signals
and derivatives. The computational efficiency of the RNN model is demonstrated
by comparing with other existing models, showing one to three orders of
magnitude acceleration compared to the latest GPU-accelerated open-source EPG package.
By using numerical and in-vivo brain data, two use cases, namely MRF dictionary
generation and optimal experimental design, are also provided. Results show
that the RNN surrogate model can be efficiently used for computing large-scale
dictionaries of transient-state signals and derivatives within tens of
seconds, resulting in several orders of magnitude acceleration with respect to
state-of-the-art implementations. The practical application of transient-state
quantitative techniques can therefore be substantially facilitated.
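As a rough illustration of the surrogate idea (not the authors' actual trained architecture), a vanilla RNN can map a flip-angle train plus tissue parameters to a signal time course; all dimensions and weights below are hypothetical stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 400 excitations per gradient-spoiled train, hidden
# state of 16 units, input = (flip angle, T1, T2) at each time step.
T, H, D_IN = 400, 16, 3

# Randomly initialized weights stand in for a trained surrogate model.
W_in = rng.normal(scale=0.1, size=(H, D_IN))
W_h = rng.normal(scale=0.1, size=(H, H))
w_out = rng.normal(scale=0.1, size=H)

def rnn_signal(flip_angles, t1, t2):
    """Unroll the RNN over the excitation train; one signal sample per TR."""
    h = np.zeros(H)
    signal = np.empty(len(flip_angles))
    for t, fa in enumerate(flip_angles):
        x = np.array([fa, t1, t2])
        h = np.tanh(W_in @ x + W_h @ h)   # recurrent state update
        signal[t] = w_out @ h             # linear read-out of the MR signal
    return signal

flips = np.deg2rad(np.linspace(10, 60, T))  # a toy flip-angle train
s = rnn_signal(flips, t1=0.8, t2=0.1)       # one (T1, T2) evaluation
```

The speed-up in the paper comes from batching many such evaluations on a GPU, so that whole dictionaries amortize into a single forward pass.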
Non-stationary wave relaxation methods for general linear systems of Volterra equations: convergence and parallel GPU implementation
In the present paper, a parallel-in-time discretization of linear systems of Volterra equations of type
ū(t) = ū₀ + ∫₀ᵗ K(t−s) ū(s) ds + f̄(t),   0 < t ≤ T,
is addressed. A sufficiently general functional setting is first stated for the analytical solution. For the numerical solution, a parallel scheme based on the Non-Stationary Wave Relaxation (NSWR) method for the time discretization is proposed, and its convergence is studied. A CUDA parallel implementation of the method is carried out in order to exploit Graphics Processing Units (GPUs), which are nowadays widely employed for reducing the computational time of general-purpose applications. The performance of these methods is compared to a sequential implementation, and several experiments of practical interest reveal the good performance of the parallel approach. Funded by the Italian Ministry of University and Research (MUR) through PRIN 2017 project No. 2017JYCLSF, "Structure-preserving approximation of evolutionary problems"; open-access publication funded by the Consortium of University Libraries of Castilla y León (BUCLE) under Operational Programme 2014ES16RFOP009 FEDER 2014-2020 de Castilla y León, action 20007-CL, Apoyo Consorcio BUCL
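For readers unfamiliar with this class of equations, a minimal sequential baseline can be sketched with plain trapezoidal time stepping (this is the kind of serial scheme the parallel NSWR method is compared against, not the NSWR method itself; the scalar case is shown for simplicity):

```python
import numpy as np

def volterra_trapezoid(K, f, u0, T=1.0, n=200):
    """Sequential trapezoidal solve of u(t) = u0 + int_0^t K(t-s) u(s) ds + f(t).

    The convolution integral is approximated by the trapezoidal rule at
    each step, keeping the unknown u[m] on the left-hand side.
    """
    h = T / n
    t = np.linspace(0.0, T, n + 1)
    u = np.empty(n + 1)
    u[0] = u0 + f(0.0)                      # the equation evaluated at t = 0
    for m in range(1, n + 1):
        # half-weight endpoints, full weight at interior quadrature nodes
        conv = 0.5 * K(t[m]) * u[0] + K(t[m] - t[1:m]) @ u[1:m]
        u[m] = (u0 + f(t[m]) + h * conv) / (1.0 - 0.5 * h * K(0.0))
    return t, u

# Sanity check: a zero kernel reduces the equation to u(t) = u0 + f(t)
t, u = volterra_trapezoid(lambda s: 0.0 * s, lambda s: np.sin(s), u0=1.0)
```

Every step of this scheme depends on all previous steps through the convolution sum, which is exactly the sequential bottleneck that motivates a parallel-in-time treatment.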
Inlet and Outlet Boundary Conditions and Uncertainty Quantification in Volumetric Lattice Boltzmann Method for Image-Based Computational Hemodynamics
Inlet and outlet boundary conditions (BCs) play an important role in the newly emerged image-based computational hemodynamics for blood flows in human arteries anatomically extracted from medical images. We developed physiological inlet and outlet BCs based on patients' medical data and integrated them into the volumetric lattice Boltzmann method. The inlet BC is a pulsatile paraboloidal velocity profile, which fits the real arterial shape, constructed from the Doppler velocity waveform. The BC of each outlet is a pulsatile pressure calculated from the three-element Windkessel model, in which three physiological parameters are tuned by the corresponding Doppler velocity waveform. Both velocity and pressure BCs are introduced into the lattice Boltzmann equations through Guo's non-equilibrium extrapolation scheme. Meanwhile, we performed uncertainty quantification for the impact of uncertainties on the computation results. An application study was conducted for six human aortorenal arterial systems. The computed pressure waveforms are in good agreement with the medical measurement data. A systematic uncertainty quantification analysis demonstrates the reliability of the computed pressure with associated uncertainties in the Windkessel model. With the developed physiological BCs, image-based computational hemodynamics is expected to provide a computational tool for the noninvasive evaluation of hemodynamic abnormalities in diseased human vessels.
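A minimal sketch of the three-element Windkessel outlet pressure computation; the parameter values and the half-sine flow waveform below are purely illustrative (the paper tunes the three parameters per patient from Doppler waveforms):

```python
import numpy as np

def windkessel3(q, dt, Zc, R, C, n_cycles=10):
    """Pressure from a periodic flow waveform via the 3-element Windkessel.

    Zc: characteristic impedance, R: distal resistance, C: compliance.
    The distal pressure obeys  dPd/dt = Q/C - Pd/(R*C),  and the outlet
    pressure is  P = Zc*Q + Pd.  Several cycles are run so that the
    start-up transient dies out before the last cycle is returned.
    """
    pd = 0.0
    for _ in range(n_cycles):
        p = np.empty_like(q)
        for i, qi in enumerate(q):
            pd += dt * (qi / C - pd / (R * C))   # explicit Euler step
            p[i] = Zc * qi + pd
    return p

# Toy inflow: half-sine systole, zero diastole (illustrative units)
dt = 1e-3
t = np.arange(0.0, 0.8, dt)
q = np.where(t < 0.3, 80.0 * np.sin(np.pi * t / 0.3), 0.0)
p = windkessel3(q, dt, Zc=0.05, R=1.0, C=1.5)
```

The same structure generalizes to each outlet of the arterial tree, with one (Zc, R, C) triple per outlet.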
Hybrid Analog-Digital Co-Processing for Scientific Computation
In the past 10 years, computer architecture research has moved toward more heterogeneity and less adherence to conventional abstractions. Scientists and engineers hold an unshakable belief that computing holds keys to unlocking humanity's Grand Challenges. Acting on that belief, they have looked deeper into computer architecture to find specialized support for their applications. Likewise, computer architects have looked deeper into circuits and devices in search of untapped performance and efficiency. The lines between computer architecture layers---applications, algorithms, architectures, microarchitectures, circuits, and devices---have blurred. Against this backdrop, a menagerie of computer architectures is on the horizon: architectures that forgo basic assumptions about computer hardware and require new thinking about how such hardware supports problems and algorithms.
This thesis is about revisiting hybrid analog-digital computing in support of diverse modern workloads. Hybrid computing had extensive applications in early computing history, and has been revisited for small-scale applications in embedded systems. But architectural support for using hybrid computing in modern workloads, at scale and with high accuracy solutions, has been lacking.
I demonstrate solving a variety of scientific computing problems, including stochastic ODEs, partial differential equations, linear algebra, and nonlinear systems of equations, as case studies in hybrid computing. I solve these problems on a system of multiple prototype analog accelerator chips built by a team at Columbia University. On that team I made contributions toward programming the chips, building the digital interface, and validating the chips' functionality. The analog accelerator chip is intended for use in conjunction with a conventional digital host computer.
The appeal and motivation for using an analog accelerator is efficiency and performance, but it comes with limitations in accuracy and problem sizes that we have to work around.
The first problem is how to express problems in this unconventional computation model. Scientific computing phrases problems as differential equations and algebraic equations. Differential equations are a continuous view of the world, while algebraic equations are a discrete one. Prior work in analog computing focused mostly on differential equations; algebraic equations played only a minor role. The secret to using the analog accelerator to support modern workloads on conventional computers is that these two viewpoints are interchangeable. The algebraic equations that underlie most workloads can be solved as differential equations,
and differential equations are naturally solvable in the analog accelerator chip. A hybrid analog-digital computer architecture can focus on solving linear and nonlinear algebra problems to support many workloads.
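The algebraic-to-differential correspondence above can be sketched in a few lines: integrating dx/dt = b − Ax drives x to the solution of Ax = b, since the linear system's solution is the stable equilibrium of the flow. Here explicit Euler stands in for the analog integrator's continuous-time evolution:

```python
import numpy as np

# Hypothetical 4x4 symmetric positive-definite system for illustration
rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4))
A = M @ M.T + 4 * np.eye(4)          # shift makes A positive definite
b = rng.normal(size=4)

x = np.zeros(4)
dt = 1.0 / np.linalg.norm(A, 2)      # stable step size for explicit Euler
for _ in range(5000):
    x += dt * (b - A @ x)            # follow the flow dx/dt = b - A x

# x has settled to (approximately) the algebraic solution of A x = b
```

An analog integrator performs the same settling in continuous time with no stepping at all, which is where the speed advantage comes from.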
The second problem is how to get accurate solutions using hybrid analog-digital computing. The reason the analog computation model gives less accurate solutions is that it gives up representing numbers as digital binary numbers, and instead uses the full range of analog voltage and current to represent real numbers. Prior work has established that encoding data in analog signals gives an energy efficiency advantage as long as the analog data precision is limited. While the analog accelerator alone may be useful for energy-constrained applications where inputs and outputs are imprecise, we are more interested in using analog in conjunction with digital for precise solutions. This thesis offers the novel insight that the trick to doing so is to solve nonlinear problems, where low-precision guesses are useful starting points for conventional digital algorithms.
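One way to sketch this recovery of precision (the algorithms in the thesis may differ in detail) is classical iterative refinement, with float16 rounding standing in for the limited-precision analog solve:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(6, 6))
A = M @ M.T + 6 * np.eye(6)           # well-conditioned SPD test matrix
b = rng.normal(size=6)

def crude_solve(r):
    """Low-precision solve of A y = r: an exact solve rounded to float16,
    standing in for the imprecise analog accelerator result."""
    y = np.linalg.solve(A, r)
    return y.astype(np.float16).astype(np.float64)

x = crude_solve(b)                    # imprecise "analog" first guess
for _ in range(20):
    r = b - A @ x                     # exact residual in double precision
    x += crude_solve(r)               # another crude solve corrects the guess
```

Each pass multiplies the remaining error by roughly the crude solver's relative precision, so a handful of cheap low-precision solves plus exact residuals recovers a high-accuracy digital answer.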
The third problem is how to solve large problems using hybrid analog-digital computing. The reason the analog computation model can't handle large problems is that it gives up step-by-step discrete-time operation, instead allowing variables to evolve smoothly in continuous time. To make that happen, the analog accelerator chains hardware for mathematical operations end-to-end. During computation, analog data flows through the hardware with no overheads in control logic and memory accesses. The downside is that the needed hardware size grows alongside problem size. While scientific computing researchers have long split large problems into smaller subproblems to fit digital computer constraints, this thesis is a first attempt to treat these divide-and-conquer algorithms as an essential tool in using the analog model of computation.
As we enter the post-Moore's law era of computing, unconventional architectures will offer specialized models of computation that uniquely support specific problem types. Two prominent examples are deep neural networks and quantum computers. Recent trends in computer science research show these unconventional architectures will soon have broad adoption. In this thesis I show that another specialized, unconventional architecture, the analog accelerator, can solve problems in scientific computing. Computer architecture researchers will discover other important models of computation in the future. This thesis is an example of the discovery process, implementation, and evaluation of how an unconventional architecture supports specialized workloads.
Rapid determination of LISA sensitivity to extreme mass ratio inspirals with machine learning
Gravitational wave observations of the inspiral of stellar-mass compact
objects into massive black holes (MBHs), extreme mass ratio inspirals (EMRIs),
enable precision measurements of parameters such as the MBH mass and spin. The
Laser Interferometer Space Antenna is expected to detect sufficient EMRIs to
probe the underlying source population, testing theories of the formation and
evolution of MBHs and their environments. Population studies are subject to
selection effects that vary across the EMRI parameter space, which bias
inference results if unaccounted for. This bias can be corrected, but
evaluating the detectability of many EMRI signals is computationally expensive.
We mitigate this cost by (i) constructing a rapid and accurate neural network
interpolator capable of predicting the signal-to-noise ratio of an EMRI from
its parameters, and (ii) further accelerating detectability estimation with a
neural network that learns the selection function, leveraging our first neural
network for data generation. The resulting framework rapidly estimates the
selection function, enabling a full treatment of EMRI detectability in
population inference analyses. We apply our method to an astrophysically
motivated EMRI population model, demonstrating the potential selection biases
and subsequently correcting for them. Accounting for selection effects, we
predict that LISA will measure the MBH mass function slope to a precision of
8.8%, the CO mass function slope to a precision of 4.6%, the width of the MBH
spin magnitude distribution to a precision of 10% and the event rate to a
precision of 12% with EMRIs at redshifts below z=6.
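A toy sketch of the two-stage idea, with an invented smooth function standing in for the trained SNR interpolator and a hard SNR cut as the detection criterion (both the functional form and the thresholds are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)

def snr_surrogate(theta):
    """Stand-in for the trained SNR interpolator: a smooth toy function of
    two hypothetical source parameters (say, log MBH mass and redshift)."""
    log_m, z = theta[..., 0], theta[..., 1]
    return 40.0 * np.exp(-0.5 * (log_m - 6.0) ** 2) / (1.0 + z)

def selection_function(theta, snr_threshold=20.0):
    """P(detected | theta) under a hard SNR cut: the quantity the second
    network would learn from samples generated with the surrogate."""
    return (snr_surrogate(theta) > snr_threshold).astype(float)

# Population-averaged detection fraction for a toy parameter distribution
thetas = np.column_stack([rng.normal(6.0, 0.5, 100_000),
                          rng.uniform(0.0, 6.0, 100_000)])
pdet = selection_function(thetas).mean()
```

In a real analysis the selection function enters the population likelihood as a normalization over detectable sources, which is why rapid evaluation across the full parameter space matters.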
Graphics Processing Unit-Accelerated Numerical Simulations and Theoretical Study of Qubit Dynamics in Realistic Systems
Quantum computers are thought to be the future of computation, using the properties of quantum mechanics to solve problems intractable to classical computers.
Quantum computing leverages non-classical properties, such as entanglement, to achieve an exponential improvement in computational power. A quantum computer would enable us to address many real-world problems, such as how to synthesize fertilizers more efficiently, how to combat global warming, or how to simulate protein folding in biological systems.
Although much work has been done to describe the use and implementation of entanglement generation theoretically, it is still a challenge to develop such protocols experimentally.
The bulk of this work is focused on creating Graphics Processing Unit (GPU)-accelerated computer simulations of quantum systems with advanced numerical and analytical techniques. Simulations can guide experiments attempting to create the building blocks of quantum computers: qubits and their control devices. However, simulation of more realistic device setups in two-dimensional systems has faced problems owing to the space and time domain scaling associated with solutions of the many-particle time-dependent Schrödinger equation (TDSE). Nevertheless, recent advances in computer hardware performance have made previously intractable two-particle problems readily solvable. I have developed custom GPU-accelerated software based on a staggered-leapfrog algorithm that opens up new possibilities for simulating two-dimensional two-particle systems accurately.
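The staggered-leapfrog family of TDSE schemes can be illustrated in one dimension (a Visscher-style split of the wavefunction into real and imaginary parts on staggered half steps); the grid sizes and free-particle potential below are illustrative only, not the thesis's two-particle setup:

```python
import numpy as np

# Minimal 1D staggered-leapfrog TDSE integrator (hbar = m = 1) for a
# free Gaussian wave packet on a small grid with hard-wall boundaries.
n, dx, dt = 512, 0.1, 0.002
x = (np.arange(n) - n // 2) * dx
V = np.zeros(n)                        # free particle

def H(psi):
    """Apply H = -0.5 d^2/dx^2 + V with a 3-point Laplacian stencil."""
    lap = np.zeros_like(psi)
    lap[1:-1] = (psi[2:] - 2 * psi[1:-1] + psi[:-2]) / dx**2
    return -0.5 * lap + V * psi

psi0 = np.exp(-x**2) * np.exp(1j * 2.0 * x)   # Gaussian packet, momentum 2
psi0 /= np.sqrt(np.sum(np.abs(psi0)**2) * dx)
R, I = psi0.real.copy(), psi0.imag.copy()

I -= 0.5 * dt * H(R)                   # stagger I by half a time step
for _ in range(500):
    R += dt * H(I)                     # dR/dt =  H I
    I -= dt * H(R)                     # dI/dt = -H R

norm = np.sum(R**2 + I**2) * dx        # approximately conserved by the scheme
```

Each update touches only nearest-neighbor grid points, which is why the scheme maps so naturally onto massively parallel GPU threads.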
I focus on three research projects. Firstly, optimally defining a charge-based solid state qubit, and controlling it in a simple and experimentally achievable way, while accounting for imperfections of the waveform generators. I simulate the physical qubit on a fine-grained lattice, and propose an innovative control scheme that accounts for finite rise/fall time of the experimental apparatus, while being relatively fast and resulting in very high operation fidelity. An optimal pulsing scheme with rise time-dependent parameters is found, and shown to be able to achieve an arbitrary qubit rotation. Since the proposed pulse sequence reduces to sine waves to minimize total pulse duration, it is straightforward to implement experimentally, and easily generalisable to different systems. I also show how the fidelity remains sufficiently high independently of the initial qubit state. The proposed sequence can even reduce errors caused by charge noise under certain conditions. Readout techniques are discussed as well, and found to not present significant issues.
Secondly, I aid the effort to create a Surface Acoustic Wave quantum computer prototype by describing how to produce a universal quantum gate set with a Root-of-SWAP operation used as the physical two-qubit gate. Using realistic parameters, it is shown how this operation can be performed with high fidelity.
Previous work simulated a proposed Root-of-SWAP method in one dimension; this work focuses on extending it to two dimensions.
We find that this method of generating Root-of-SWAP breaks down in two dimensions: unwanted excitations are introduced in the extra dimension, causing a phase difference to appear and thus ruining the coherence of the state.
I propose to implement the Root-of-SWAP operation via a tunneling interaction across the effective double dot instead. This had been considered previously but was thought to be unstable against variations in tunnel barrier height, which has an exponential impact on the speed of the quantum operation. Using newly available computing power, we were able to run detailed two-dimensional simulations investigating this method and its robustness against variations in the double-dot potential. We find that the method produces high-fidelity Root-of-SWAP states and is robust against small variations in the tunnel barrier. Additionally, we find a relation between the tunnel barrier height and the spin measurement probability, providing a way for experimentalists to estimate an actual device barrier indirectly.
Finally, I theoretically model and simulate transport through a single-electron transistor (SET) device. It is shown that a single-donor structure can reliably be engineered from doped quantum dots by taking advantage of the tunability of the electron tunneling rates, as well as the interplay, at low temperatures, between disorder conferred by randomness in the dopant distribution and electron-electron interaction originating from the high doping concentration. It is possible to electrostatically isolate a single donor from the large ensemble of dopants. I investigate how such a complex system is expected to conduct, and verify a hypothesis that two donors take part in the transport by numerically reproducing the experimental measurements. It is also shown that this device can be used as a single-atom detector of the charge occupancy of a nearby capacitively coupled double quantum dot. While this final part does not make use of the GPU-accelerated software, it is still closely related to the rest of this work and its theme of modeling realistic quantum devices.
Project for Developing Innovation Systems of the Ministry of Education, Culture, Sports, Science and Technology (MEXT)
Engineering and Physical Sciences Research Council (EPSRC) and Hitachi via CASE studentships RG 9463
Simulation for Reliability, Hardware Security, and Ising Computing in VLSI Chip Design
The continued scaling of VLSI circuits has provided a wealth of opportunities and challenges to the VLSI circuit design area. Both the challenges and the opportunities, however, require new simulation tools that can enable their solution or exploitation, as classical methods typically dealt with problem domains of smaller scale or less complexity. In this dissertation, simulation methods are presented to address the emerging VLSI design topics of electromigration-induced aging and Ising computing, and are then applied to the application areas of hardware security and graph partitioning, respectively.
The electromigration (EM) aging effect in VLSI circuits is a long-term reliability issue affecting current-carrying metal wires and leading to IR-drop degradation. Typically, simple analytical equations can determine a wire's effective age, or whether it will be affected by EM aging at all. However, these classical methods are overly conservative and can lead to overdesign or unnecessary design iterations. Furthermore, the EM aging effect is expected to become more severe in future integrated circuits (ICs) due to increasing current densities and the prevalence of polycrystalline copper atom structures seen at small wire dimensions. For this reason, more comprehensive simulation techniques that can efficiently simulate the EM effect with less conservative results can help mitigate overdesign and increase design margins while reducing design iterations.
The area of hardware security is becoming increasingly important as the chip supply chain becomes more globalized and the integrity of chips becomes more difficult to verify. Utilizing the accurate simulation techniques for EM, we demonstrate how a reliability-based attack could be perpetrated. Furthermore, we can utilize this aging effect as a defense mechanism to help validate the integrity of an IC and detect counterfeit chips in the component supply chain market.
Ising computing is an emerging method of solving combinatorial optimization problems by simulating the interactions of so-called spin glasses. Borrowing concepts from quantum computing, this method mimics the interaction between spins in such a way that finding a ground state of the spin-glass model yields the solution of a particular problem. In this dissertation, effective methods of simulating the spin interactions using General Purpose Graphics Processing Units (GPGPUs) and finding their ground state are developed.
In addition to the GPU-based Ising model simulations, important combinatorial problems can be mapped to the Ising model. In this dissertation the Ising solver is applied to graph partitioning, which can be utilized in VLSI design and many other domains as well. Specifically, solvers for the max-cut problem and the balanced min-cut partitioning problem are developed.