6,061 research outputs found
AD in Fortran, Part 2: Implementation via Prepreprocessor
We describe an implementation of the Farfel Fortran AD extensions. These
extensions integrate forward and reverse AD directly into the programming
model, with attendant benefits to flexibility, modularity, and ease of use. The
implementation we describe is a "prepreprocessor" that generates input to
existing Fortran-based AD tools. In essence, blocks of code which are targeted
for AD by Farfel constructs are put into subprograms which capture their
lexical variable context, and these are closure-converted into top-level
subprograms and specialized to eliminate EXTERNAL arguments, rendering them
amenable to existing AD preprocessors, which are then invoked, possibly
repeatedly if the AD is nested
Julia: A Fresh Approach to Numerical Computing
Bridging cultures that have often been distant, Julia combines expertise from
the diverse fields of computer science and computational science to create a
new approach to numerical computing. Julia is designed to be easy and fast.
Julia questions notions generally held as "laws of nature" by practitioners of
numerical computing:
1. High-level dynamic programs have to be slow.
2. One must prototype in one language and then rewrite in another language
for speed or deployment, and
3. There are parts of a system for the programmer, and other parts best left
untouched as they are built by the experts.
We introduce the Julia programming language and its design --- a dance
between specialization and abstraction. Specialization allows for custom
treatment. Multiple dispatch, a technique from computer science, picks the
right algorithm for the right circumstance. Abstraction, what good computation
is really about, recognizes what remains the same after differences are
stripped away. Abstractions in mathematics are captured as code through another
technique from computer science, generic programming.
Julia shows that one can have machine performance without sacrificing human
convenience.Comment: 37 page
Loo.py: From Fortran to performance via transformation and substitution rules
A large amount of numerically-oriented code is written and is being written
in legacy languages. Much of this code could, in principle, make good use of
data-parallel throughput-oriented computer architectures. Loo.py, a
transformation-based programming system targeted at GPUs and general
data-parallel architectures, provides a mechanism for user-controlled
transformation of array programs. This transformation capability is designed to
not just apply to programs written specifically for Loo.py, but also those
imported from other languages such as Fortran. It eases the trade-off between
achieving high performance, portability, and programmability by allowing the
user to apply a large and growing family of transformations to an input
program. These transformations are expressed in and used from Python and may be
applied from a variety of settings, including a pragma-like manner from other
languages.Comment: ARRAY 2015 - 2nd ACM SIGPLAN International Workshop on Libraries,
Languages and Compilers for Array Programming (ARRAY 2015
A C++11 implementation of arbitrary-rank tensors for high-performance computing
This article discusses an efficient implementation of tensors of arbitrary
rank by using some of the idioms introduced by the recently published C++ ISO
Standard (C++11). With the aims at providing a basic building block for
high-performance computing, a single Array class template is carefully crafted,
from which vectors, matrices, and even higher-order tensors can be created. An
expression template facility is also built around the array class template to
provide convenient mathematical syntax. As a result, by using templates, an
extra high-level layer is added to the C++ language when dealing with algebraic
objects and their operations, without compromising performance. The
implementation is tested running on both CPU and GPU.Comment: 21 pages, 6 figures, 1 tabl
- …