46,678 research outputs found

From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation

Author: Blazewicz Marek
Brandt Steven R.
Ciznicki Milosz
Hinder Ian
Kierzynka Michal
Koppelman David M.
Löffler Frank
Schnetter Erik
Tao Jian
Publication venue: 'IOS Press'
Publication date: 01/01/2013

Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high-performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient manner for complex applications, without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization is based on higher-order finite differences on multi-block domains. Chemora's capabilities are demonstrated by simulations of black hole collisions. This problem provides an acid test of the framework, as the Einstein equations contain hundreds of variables and thousands of terms.

Comment: 18 pages, 4 figures, accepted for publication in Scientific Programming

arXiv.org e-Print Archive

CiteSeerX

Directory of Open Access Journals

Louisiana State University

MPG.PuRe

Developing efficient web-based GIS applications

Author: Adnan M.
Longley P.
Singleton A.
Publication venue: Centre for Advanced Spatial Analysis (UCL)
Publication date: 01/02/2010

The number of web-based GIS applications has increased in recent years. This paper describes the mapping technologies, database standards, and web application development standards that are relevant to the development of web-based GIS applications. Different mapping technologies for displaying geo-referenced data are available and can be used in different situations. This paper also explains why Oracle is the system of choice for geospatial applications that need to handle large amounts of data. Wireframing and design patterns have been shown to be useful in making GIS web applications efficient, scalable and usable, and should be an important part of every web-based GIS application. A range of development technologies is available, and their use in different operating environments is discussed here in some detail.

UCL Discovery

PyCUDA and PyOpenCL: A Scripting-Based Approach to GPU Run-Time Code Generation

Author: Ahmed Fasih
Andreas Klöckner
Bryan Catanzaro
Nicolas Pinto
Paul Ivanov
Yunsup Lee
Publication venue: 'Elsevier BV'
Publication date: 29/03/2011

High-performance computing has recently seen a surge of interest in heterogeneous systems, with an emphasis on modern Graphics Processing Units (GPUs). These devices offer tremendous potential for performance and efficiency in important large-scale applications of computational science. However, exploiting this potential can be challenging, as one must adapt to the specialized and rapidly evolving computing environment currently exhibited by GPUs. One way of addressing this challenge is to embrace better techniques and develop tools tailored to their needs. This article presents one simple technique, GPU run-time code generation (RTCG), along with PyCUDA and PyOpenCL, two open-source toolkits that support this technique. In introducing PyCUDA and PyOpenCL, this article proposes the combination of a dynamic, high-level scripting language with the massive performance of a GPU as a compelling two-tiered computing platform, potentially offering significant performance and productivity advantages over conventional single-tier, static systems. The concept of RTCG is simple and easily implemented using existing, robust infrastructure. Nonetheless, it is powerful enough to support (and encourage) the creation of custom application-specific tools by its users. The premise of the paper is illustrated by a wide range of examples where the technique has been applied with considerable success.

Comment: Submitted to Parallel Computing, Elsevier
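The idea of run-time code generation described in this abstract can be sketched without a GPU. The following is only a CPU-side analogy in plain Python, not the PyCUDA or PyOpenCL API: the template, the `make_axpy` helper, and the `axpy` name are hypothetical and chosen for illustration. It shows source code being built as a string at run time, specialized to a value known only then, and compiled on the spot.

```python
# Run-time code generation sketch in plain Python (CPU only, no GPU toolkit).
# KERNEL_TEMPLATE, make_axpy, and axpy are hypothetical names for illustration.

KERNEL_TEMPLATE = """
def axpy(x, y):
    # The scale factor was substituted as a literal when this source was generated.
    return [{a!r} * xi + yi for xi, yi in zip(x, y)]
"""


def make_axpy(a):
    """Generate, compile, and return an axpy function specialized to `a`."""
    source = KERNEL_TEMPLATE.format(a=a)
    namespace = {}
    # Compile the generated source and execute it to define axpy in `namespace`.
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["axpy"], source


axpy2, generated_source = make_axpy(2.0)
result = axpy2([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
print(generated_source)
print(result)
```

In a GPU toolkit the generated string would typically be device code handed to a compiler at run time; the scripting-language side only assembles and launches it.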

arXiv.org e-Print Archive

Crossref

A terahertz grid frequency doubler

Author: Alina Moussessian
David B. Rutledge
Jung-Chih Chiao
Michael C. Wanke
S. James Allen
Thomas W. Crowe
Yongjun Li
Publication date: 01/01/1998

We present a 144-element terahertz quasi-optical grid frequency doubler. The grid is a planar structure with bow-tie antennas as a unit cell, each loaded with a planar Schottky diode. The maximum output power measured for this grid is 24 mW at 1 THz for 3.1-μs 500-GHz input pulses with a peak input power of 47 W. An efficiency of 0.17% for an input power of 6.3 W and output power of 10.8 mW is measured. To date, this is the largest recorded output power for a multiplier at terahertz frequencies. Input and output tuning curves are presented and an output pattern is measured and compared to theory.
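The quoted efficiency can be checked from the powers given in the abstract, taking efficiency as output power divided by input power. The helper name `conversion_efficiency` is hypothetical, and the second figure (efficiency at the maximum-output operating point) is derived here from the stated 24 mW and 47 W rather than quoted from the paper.

```python
def conversion_efficiency(p_out_watts, p_in_watts):
    """Return output power divided by input power (dimensionless)."""
    return p_out_watts / p_in_watts


# Operating point quoted with the efficiency: 10.8 mW out for 6.3 W in.
eff_quoted = conversion_efficiency(10.8e-3, 6.3)

# Maximum-output operating point: 24 mW out for 47 W peak input (derived value).
eff_at_peak = conversion_efficiency(24e-3, 47.0)

print(f"quoted point: {eff_quoted * 100:.2f} %")
print(f"peak point:   {eff_at_peak * 100:.3f} %")
```

The first value reproduces the 0.17% stated in the abstract; the second comes out lower, at roughly 0.05%.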

CiteSeerX

Caltech Authors

Devito: Towards a generic Finite Difference DSL using Symbolic Python

Author: Gorman Gerard
Kazakas Paulius
Kukreja Navjot
Lange Michael
Louboutin Mathias
Luporini Fabio
Pandolfo Vincenzo
Velesko Paulius
Vieira Felippe
Publication date: 01/01/2016

Domain-specific languages (DSLs) have been used in a variety of fields to express complex scientific problems in a concise manner and provide automated performance optimization for a range of computational architectures. As such, DSLs provide a powerful mechanism to speed up scientific Python computation that goes beyond traditional vectorization and pre-compilation approaches, while allowing domain scientists to build applications within the comforts of the Python software ecosystem. In this paper we present Devito, a new finite difference DSL that provides optimized stencil computation from high-level problem specifications based on symbolic Python expressions. We demonstrate Devito's symbolic API and performance advantages over traditional Python acceleration methods before highlighting its use in the scientific context of seismic inversion problems.

Comment: pyHPC 2016 conference submission
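To make "stencil computation from a high-level specification" concrete, here is a minimal pure-Python sketch. It does not use Devito or its API: the `apply_stencil` helper and the offset-to-weight dictionary are hypothetical, and the stencil is the standard second-order central difference for a second derivative, applied to sin(x) on a uniform 1D grid.

```python
import math


def apply_stencil(u, weights, h, derivative_order):
    """Apply a 1D finite-difference stencil to the interior points of u.

    `weights` maps integer grid offsets to coefficients; the weighted sum is
    divided by h**derivative_order.
    """
    reach = max(abs(offset) for offset in weights)
    return [
        sum(w * u[i + offset] for offset, w in weights.items()) / h**derivative_order
        for i in range(reach, len(u) - reach)
    ]


# Second-order central difference for d2u/dx2: (u[i-1] - 2 u[i] + u[i+1]) / h^2
LAPLACIAN_1D = {-1: 1.0, 0: -2.0, 1: 1.0}

n = 101
h = math.pi / (n - 1)
x = [i * h for i in range(n)]
u = [math.sin(xi) for xi in x]

d2u = apply_stencil(u, LAPLACIAN_1D, h, derivative_order=2)

# The exact second derivative of sin(x) is -sin(x); compare on interior points.
max_error = max(abs(d + math.sin(xi)) for d, xi in zip(d2u, x[1:-1]))
print(max_error)
```

A DSL of the kind described would derive such weights symbolically and emit optimized loops; here the weights are written by hand and the loop is an ordinary list comprehension.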

arXiv.org e-Print Archive

Crossref

Spiral - Imperial College Digital Repository

Dynamics estimation and generalized tuning of stationary frame current controller for grid-tied power converters

Author: Luna Alloza Álvaro
Mir Cantarellas Antonio
Remón Rodríguez Daniel
Rodríguez Cortés Pedro
Zhang Weiyi
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2016

The integration of AC-DC power converters to manage the connection of generation to the grid has increased exponentially in recent years. PV and wind generation plants are among the main applications showing this trend. High-power converters are increasingly installed to integrate renewables on a larger scale. The control design for these converters becomes more challenging due to the reduced control bandwidth and the increased complexity of the grid connection filter, so a generalized and optimized control tuning approach for converters is increasingly favored. This paper proposes an algorithm for estimating the dynamic performance of stationary frame current controllers, and based on it a generalized and optimized tuning approach is developed. The tuning approach does not require experience-based specification of the tuning inputs. Simulation and experimental results in different scenarios are shown to evaluate the proposal.

Peer Reviewed. Postprint (published version)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

An Extensible Timing Infrastructure for Adaptive Large-scale Applications

Author: Allen Gabrielle
Goodale Tom
Radke Thomas
Schnetter Erik
Stark Dylan
Publication date: 01/01/2007

Real-time access to accurate and reliable timing information is necessary to profile scientific applications, and crucial as simulations become increasingly complex, adaptive, and large-scale. The Cactus Framework provides flexible and extensible capabilities for timing information through a well-designed infrastructure and timing API. Applications built with Cactus automatically gain access to built-in timers, such as gettimeofday and getrusage, system-specific hardware clocks, and high-level interfaces such as PAPI. We describe the Cactus timer interface, its motivation, and its implementation. We then demonstrate how this timing information can be used by an example scientific application to profile itself, and to dynamically adapt itself to a changing environment at run time.
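An extensible timer interface of the kind described, where named timers report values from several pluggable clocks, might look roughly like the sketch below. This is not the Cactus timing API: the `TimerRegistry` class and its methods are hypothetical, and Python's `time.perf_counter` and `time.process_time` stand in for wall-clock and CPU-time sources such as gettimeofday and getrusage.

```python
import time


class TimerRegistry:
    """Named timers that accumulate readings from every registered clock."""

    def __init__(self):
        self.clocks = {}    # clock name -> zero-argument function returning seconds
        self.totals = {}    # timer name -> {clock name: accumulated seconds}
        self._started = {}  # timer name -> {clock name: reading taken at start}

    def register_clock(self, name, read):
        """Add a new clock; later start/stop calls sample it automatically."""
        self.clocks[name] = read

    def start(self, timer):
        self._started[timer] = {name: read() for name, read in self.clocks.items()}

    def stop(self, timer):
        begin = self._started.pop(timer)
        totals = self.totals.setdefault(timer, {})
        for name, start_value in begin.items():
            elapsed = self.clocks[name]() - start_value
            totals[name] = totals.get(name, 0.0) + elapsed


registry = TimerRegistry()
registry.register_clock("walltime", time.perf_counter)  # wall-clock stand-in
registry.register_clock("cputime", time.process_time)   # CPU-time stand-in

registry.start("evolve")
checksum = sum(i * i for i in range(100_000))  # stand-in for a simulation step
registry.stop("evolve")

print(registry.totals["evolve"])
```

An application could read `registry.totals` between steps and change its behavior, for example by adjusting how often it writes output, which is the kind of self-profiling and adaptation the abstract refers to.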

arXiv.org e-Print Archive

MPG.PuRe