Managing Communication Latency-Hiding at Runtime for Parallel Programming Languages and Libraries
This work introduces a runtime model for managing communication with support
for latency-hiding. The model enables non-computer science researchers to
exploit communication latency-hiding techniques seamlessly. For compiled
languages, it is often possible to create efficient schedules for
communication, but this is not the case for interpreted languages. By
maintaining data dependencies between scheduled operations, it is possible to
aggressively initiate communication and lazily evaluate tasks to allow maximal
time for the communication to finish before entering a wait state. We implement
a heuristic of this model in DistNumPy, an auto-parallelizing version of
numerical Python that allows sequential NumPy programs to run on distributed
memory architectures. Furthermore, we present performance comparisons for eight
benchmarks with and without automatic latency-hiding. The results show that
our model reduces the time spent waiting for communication by as much as a
factor of 27, from a maximum of 54% to only 2% of the total execution time, in
a stencil application.
Comment: PREPRINT
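To make the scheduling idea concrete, here is a minimal Python sketch (illustrative only, not DistNumPy's implementation) of the pattern the abstract describes: communication is initiated aggressively through a nonblocking handle, while dependent tasks are evaluated lazily so the wait state is deferred as long as possible. CommHandle and LazyTask are invented names, and a thread stands in for the network.

    # Minimal sketch (not DistNumPy's code): data dependencies between
    # scheduled operations let us start communication eagerly and wait lazily.
    import threading
    import time

    class CommHandle:
        """Stands in for a nonblocking communication request."""
        def __init__(self, payload, delay=0.2):
            self._result = None
            self._done = threading.Event()
            def _transfer():          # simulated network transfer
                time.sleep(delay)
                self._result = payload
                self._done.set()
            threading.Thread(target=_transfer, daemon=True).start()

        def wait(self):               # enter a wait state only when needed
            self._done.wait()
            return self._result

    class LazyTask:
        """A scheduled operation, evaluated only when its value is needed."""
        def __init__(self, fn, *deps):
            self.fn, self.deps = fn, deps
            self._value, self._evaluated = None, False

        def value(self):
            if not self._evaluated:
                args = [d.value() if isinstance(d, LazyTask)
                        else d.wait() if isinstance(d, CommHandle)
                        else d
                        for d in self.deps]
                self._value, self._evaluated = self.fn(*args), True
            return self._value

    # Initiate the transfer aggressively, then describe dependent work lazily;
    # the wait happens only when combined.value() finally needs the data.
    remote = CommHandle(payload=[1, 2, 3])
    local = LazyTask(lambda: sum(range(1000)))
    combined = LazyTask(lambda xs, s: sum(xs) + s, remote, local)
    print(combined.value())           # 499506

The point of the sketch is the inversion of control: the transfer begins at construction time, but no task blocks on it until the dependency graph actually demands the result.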
Array operators using multiple dispatch: a design methodology for array implementations in dynamic languages
Arrays are such a rich and fundamental data type that they tend to be built
into a language, either in the compiler or in a large low-level library.
Defining this functionality at the user level instead provides greater
flexibility for application domains not envisioned by the language designer.
Only a few languages, such as C++ and Haskell, provide the necessary power to
define n-dimensional arrays, but these systems rely on compile-time
abstraction, sacrificing some flexibility. In contrast, dynamic languages make
it straightforward for the user to define any behavior they might want, but at
the possible expense of performance.
As part of the Julia language project, we have developed an approach that
yields a novel trade-off between flexibility and compile-time analysis. The
core abstraction we use is multiple dispatch. We have come to believe that
while multiple dispatch has not been especially popular in most kinds of
programming, technical computing is its killer application. By expressing key
functions such as array indexing using multi-method signatures, a surprising
range of behaviors can be obtained, in a way that is both relatively easy to
write and amenable to compiler analysis. The compact factoring of concerns
provided by these methods makes it easier for user-defined types to behave
consistently with types in the standard library.Comment: 6 pages, 2 figures, workshop paper for the ARRAY '14 workshop, June
11, 2014, Edinburgh, United Kingdo
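As a loose illustration of the multi-method idea (in Python rather than Julia, and far simpler than the paper's design), the sketch below dispatches an indexing function on the runtime types of both arguments. The getindex name echoes Julia's indexing function; the registry and decorator are hypothetical, and a real system would also order candidate methods by specificity rather than by registration order.

    # Minimal sketch (assumed, not Julia's implementation): multi-method
    # dispatch keyed on the types of all arguments.
    _METHODS = {}

    def multimethod(*types):
        """Register one implementation for a given type signature."""
        def register(fn):
            _METHODS[types] = fn
            return fn
        return register

    def getindex(a, i):
        """Dispatch on the runtime types of *both* arguments."""
        for (ta, ti), fn in _METHODS.items():
            if isinstance(a, ta) and isinstance(i, ti):
                return fn(a, i)
        raise TypeError(f"no method for ({type(a).__name__}, {type(i).__name__})")

    @multimethod(list, int)
    def _(a, i):
        return a[i]                   # scalar indexing

    @multimethod(list, slice)
    def _(a, i):
        return a[i]                   # range indexing returns a sub-array

    @multimethod(list, list)
    def _(a, idxs):
        return [a[j] for j in idxs]   # user-defined "vectorized" indexing

    print(getindex([10, 20, 30], 1))        # 20
    print(getindex([10, 20, 30], [0, 2]))   # [10, 30]

Even this toy version shows the factoring the abstract describes: each indexing behavior is one small method over a type signature, and a user-defined index type would only need to register one more method.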
Automatic differentiation in machine learning: a survey
Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in
machine learning. Automatic differentiation (AD), also called algorithmic
differentiation or simply "autodiff", is a family of techniques similar to but
more general than backpropagation for efficiently and accurately evaluating
derivatives of numeric functions expressed as computer programs. AD is a small
but established field with applications in areas including computational fluid
dynamics, atmospheric sciences, and engineering design optimization. Until very
recently, the fields of machine learning and AD have largely been unaware of
each other and, in some cases, have independently discovered each other's
results. Despite its relevance, general-purpose AD has been missing from the
machine learning toolbox, a situation slowly changing with its ongoing adoption
under the names "dynamic computational graphs" and "differentiable
programming". We survey the intersection of AD and machine learning, cover
applications where AD has direct relevance, and address the main implementation
techniques. By precisely defining the main differentiation techniques and their
interrelationships, we aim to bring clarity to the usage of the terms
"autodiff", "automatic differentiation", and "symbolic differentiation" as
these are encountered more and more in machine learning settings.Comment: 43 pages, 5 figure
solveME: fast and reliable solution of nonlinear ME models
Background: Genome-scale models of metabolism and macromolecular expression (ME) significantly expand the scope and predictive capabilities of constraint-based modeling. ME models present considerable computational challenges: they are much (>30 times) larger than corresponding metabolic reconstructions (M models), they are multiscale, and growth maximization is a nonlinear programming (NLP) problem, mainly due to macromolecule dilution constraints.
Results: Here, we address these computational challenges. We develop a fast and numerically reliable solution method for growth maximization in ME models using a quad-precision NLP solver (Quad MINOS). Our method was up to 45% faster than binary search for six significant digits in growth rate. We also develop a fast, quad-precision flux variability analysis that is accelerated (up to 60× speedup) via solver warm-starts. Finally, we employ the tools developed to investigate growth-coupled succinate overproduction, accounting for proteome constraints.
Conclusions: Just as genome-scale metabolic reconstructions have become an invaluable tool for computational and systems biologists, we anticipate that these fast and numerically reliable ME solution methods will accelerate the widespread adoption of ME models for researchers in these fields.
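For intuition about the binary-search baseline the paper benchmarks against, here is a hedged Python sketch: growth maximization reduces to bisection on the growth rate mu, with a feasibility check of the model at each trial value. The oracle below is a toy stand-in for solving the ME model at fixed mu, and max_growth_rate with its tolerance rule is our illustrative choice, not solveME's code.

    # Minimal sketch (assumed): bisection on growth rate mu against a
    # feasibility oracle, the baseline approach for ME growth maximization.
    def max_growth_rate(feasible, lo=0.0, hi=2.0, digits=6):
        """Bisect until mu is resolved to roughly the requested digits."""
        while (hi - lo) > 10.0 ** (-digits) * max(hi, 1.0):
            mid = 0.5 * (lo + hi)
            if feasible(mid):      # the model admits a solution at this mu
                lo = mid
            else:
                hi = mid
        return lo

    # Toy oracle: pretend the model is feasible for mu <= 0.7371 per hour.
    print(max_growth_rate(lambda mu: mu <= 0.7371))   # ~0.7371

Each oracle call corresponds to one optimization solve, which is why a direct NLP formulation that avoids the bisection loop, as the paper proposes, can be substantially faster.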
A Taxonomy of Automatic Differentiation Tools
Many of the current automatic differentiation (AD) tools have similar characteristics. Unfortunately, the similarities between these various AD tools often cannot be easily ascertained by reading the corresponding documentation. To clarify this situation, a taxonomy of AD tools is presented. The taxonomy places AD tools into the Elemental, Extensional, Integral, Operational, and Symbolic classes. This taxonomy is used to classify twenty-nine AD tools. Each tool is examined individually with respect to the mode of differentiation used and the degree of derivatives computed. A list detailing the availability of the surveyed AD tools is provided as an appendix.
Computing and Information Science (CIS)
Cornell University Courses of Study, Vol. 97, 2005/2006