8 research outputs found
Automatic Generation of Efficient Linear Algebra Programs
The level of abstraction at which application experts reason about linear
algebra computations and the level of abstraction used by developers of
high-performance numerical linear algebra libraries do not match. The former is
conveniently captured by high-level languages and libraries such as Matlab and
Eigen, while the latter expresses the kernels included in the BLAS and LAPACK
libraries. Unfortunately, the translation from a high-level computation to an
efficient sequence of kernels is a task, far from trivial, that requires
extensive knowledge of both linear algebra and high-performance computing.
Internally, almost all high-level languages and libraries use efficient
kernels; however, the translation algorithms are too simplistic and thus lead
to a suboptimal use of said kernels, with significant performance losses. In
order to both achieve the productivity that comes with high-level languages,
and make use of the efficiency of low level kernels, we are developing Linnea,
a code generator for linear algebra problems. As input, Linnea takes a
high-level description of a linear algebra problem and produces as output an
efficient sequence of calls to high-performance kernels. In 25 application
problems, the code generated by Linnea always outperforms Matlab, Julia, Eigen
and Armadillo, with speedups up to and exceeding 10x
The Linear Algebra Mapping Problem
We observe a disconnect between the developers and the end users of linear
algebra libraries. On the one hand, the numerical linear algebra and the
high-performance communities invest significant effort in the development and
optimization of highly sophisticated numerical kernels and libraries, aiming at
the maximum exploitation of both the properties of the input matrices, and the
architectural features of the target computing platform. On the other hand, end
users are progressively less likely to go through the error-prone and time
consuming process of directly using said libraries by writing their code in C
or Fortran; instead, languages and libraries such as Matlab, Julia, Eigen and
Armadillo, which offer a higher level of abstraction, are becoming more and
more popular. Users are given the opportunity to code matrix computations with
a syntax that closely resembles the mathematical description; it is then a
compiler or an interpreter that internally maps the input program to lower
level kernels, as provided by libraries such as BLAS and LAPACK. Unfortunately,
our experience suggests that in terms of performance, this translation is
typically vastly suboptimal.
In this paper, we first introduce the Linear Algebra Mapping Problem, and
then investigate how effectively a benchmark of test problems is solved by
popular high-level programming languages. Specifically, we consider Matlab,
Octave, Julia, R, Armadillo (C++), Eigen (C++), and NumPy (Python); the
benchmark is meant to test both standard compiler optimizations such as common
subexpression elimination and loop-invariant code motion, as well as linear
algebra specific optimizations such as optimal parenthesization of a matrix
product and kernel selection for matrices with properties. The aim of this
study is to give concrete guidelines for the development of languages and
libraries that support linear algebra computations