6 research outputs found
NLSEmagic: Nonlinear Schrödinger Equation Multidimensional Matlab-based GPU-accelerated Integrators using Compact High-order Schemes
We present a simple to use, yet powerful code package called NLSEmagic to
numerically integrate the nonlinear Schrödinger equation in one, two, and
three dimensions. NLSEmagic is a high-order finite-difference code package
which utilizes graphics processing unit (GPU) parallel architectures. The codes
running on the GPU are many times faster than their serial counterparts, and
are much cheaper to run than on standard parallel clusters. The codes are
developed with usability and portability in mind, and therefore are written to
interface with MATLAB utilizing custom GPU-enabled C codes with the
MEX-compiler interface. The packages are freely distributed, including user
manuals and set-up files. Comment: 37 pages, 13 figures
GPU Integration into a Software Defined Radio Framework
Software Defined Radio (SDR) was brought about by moving processing done on specific hardware components to reconfigurable software. Hardware components like General Purpose Processors (GPPs), Digital Signal Processors (DSPs) and Field Programmable Gate Arrays (FPGAs) are used to make the software and hardware processing of the radio more portable and as efficient as possible. Graphics Processing Units (GPUs), designed years ago for video rendering, are now finding new uses in research. The parallel architecture provided by the GPU gives developers the ability to speed up computationally intense programs. An open source tool for SDR, Open Source Software Communications Architecture (SCA) Implementation: Embedded (OSSIE), is a free waveform development environment for any developer who wants to experiment with SDR. In this work, OSSIE is integrated with a GPU computing framework to show how performance can be improved through GPU parallelization. GPU research performed with SDR ranges from improving SDR simulations to implementing specific wireless protocols. In this thesis, we aim to show performance improvement within an SCA-architected SDR implementation. The software components within OSSIE gained significant performance increases with few software changes due to the natural parallelism of the GPU, using Compute Unified Device Architecture (CUDA), Nvidia's GPU programming API. Using sample data sizes for the I and Q channel inputs, performance improvements were seen with as few as 512 samples when using the GPU-optimized version of OSSIE. As the sample size increased, the CUDA performance improved as well. Porting OSSIE components onto the CUDA architecture showed that improved performance can be achieved in SDR-related software through the use of GPU technology.
Implementation of numerical algorithms on a graphics card
The objectives of the monograph will be defined; it will consist of assessing the existing tools
for using these cards as a coprocessor and developing a library to use them. A development plan
and the working environment will be established.
The existing tools for programming these graphics cards will be compared from the point of
view of a layperson in the field of graphics, along with their ease of use; the main outlines
of the library will be defined, and we will implement it.
Finally, the results obtained will be analyzed and future lines of work will be listed.
 
Investigation of general-purpose computing on graphics processing units and its application to the finite element analysis of electromagnetic problems
In this dissertation, the hardware and API architectures of GPUs are investigated, and the corresponding acceleration techniques are applied to the traditional frequency-domain finite element method (FEM), the element-level time-domain methods, and the nonlinear discontinuous Galerkin method. First, the assembly and the solution phases of the FEM are parallelized and mapped onto the granular GPU processors. Efficient parallelization strategies for the finite element matrix assembly on a single GPU and on multiple GPUs are proposed. The parallelization strategies for the finite element matrix solution, in conjunction with parallelizable preconditioners, are investigated to reduce the total solution time. Second, the element-level dual-field domain decomposition (DFDD-ELD) method is parallelized on GPU. The element-level algorithms treat each finite element as a subdomain, where the elements march the fields in time by exchanging fields and fluxes on the element boundary interfaces with the neighboring elements. The proposed parallelization framework is readily applicable to similar element-level algorithms, and its application to the discontinuous Galerkin time-domain (DGTD) methods shows good acceleration results. Third, the element-level parallelization framework is further adapted to accelerate the nonlinear DGTD algorithm, which has potential applications in the field of optics. The proposed nonlinear DGTD algorithm describes the third-order instantaneous nonlinear effect between the electromagnetic field and the medium permittivity. The Newton-Raphson method is incorporated to reduce the number of nonlinear iterations through its quadratic convergence. Various nonlinear examples are presented to show the different Kerr effects observed through the third-order nonlinearity. With the acceleration using MPI+GPU under large cluster environments, the solution times for the various linear and nonlinear examples are significantly reduced.
Quantum Chemistry in Nanoscale Environments: Insights on Surface-Enhanced Raman Scattering and Organic Photovoltaics
The understanding of molecular effects in nanoscale environments is becoming increasingly relevant for various emerging fields. These include spectroscopy for molecular identification as well as the search for molecules for energy harvesting. Theoretical quantum chemistry has been increasingly useful in addressing these phenomena and yielding an understanding of these effects. In the first part of this dissertation, we study the chemical effect of surface-enhanced Raman scattering (SERS). We use quantum chemistry simulations to study the metal-molecule interactions present in these systems. We find that the excitations that provide a chemical enhancement contain a mixed contribution from the metal and the molecule. Moreover, using atomistic studies we propose an additional source of enhancement, in which a surface doped with a transition metal could provide further enhancement. We also develop methods to study the electrostatic effects of molecules in metallic environments. We study the importance of image-charge effects, as well as field bias, for molecules interacting with perfect conductors. The atomistic modeling and the electrostatic approximation enable us to study the effects of the metal interacting with the molecule in a complementary fashion, which provides a better understanding of the complex effects present in SERS. In the second part of this dissertation, we present the Harvard Clean Energy Project, a high-throughput approach for large-scale computational screening and design of organic photovoltaic materials. We create molecular libraries to search for candidate structures and use quantum chemistry, machine learning and cheminformatics methods to characterize these systems and find structure-property relations. The scale of this study requires an equally large computational resource. We rely on distributed volunteer computing to obtain these properties.
In the third part of this dissertation, we present our work on accelerating electronic structure methods using graphics processing units. This hardware represents a paradigm change with respect to typical CPU device architectures. We accelerate the resolution-of-the-identity Møller-Plesset second-order perturbation theory algorithm using graphics cards. We also provide detailed tools to address the memory and single-precision issues that these cards often present.