BLASFEO is a dense linear algebra library providing high-performance
implementations of BLAS- and LAPACK-like routines for use in embedded
optimization. A key difference with respect to existing high-performance
implementations of BLAS is that computational performance is optimized for
small- to medium-scale matrices, i.e., for sizes up to a few hundred. BLASFEO
comes with three different implementations: a high-performance implementation
aiming at providing the highest performance for matrices fitting in cache, a
reference implementation that provides portability and embeddability and is
optimized for very small matrices, and a wrapper to standard BLAS and LAPACK
providing high performance on large matrices. The three implementations of BLASFEO
together provide high-performance dense linear algebra routines for matrices
ranging from very small to large. Compared to both open-source and proprietary
highly tuned BLAS libraries, for matrices of size up to about one hundred, the
high-performance implementation of BLASFEO is about 20-30% faster than the
corresponding level 3 BLAS routines and 2-3 times faster than the corresponding
LAPACK routines.