A parallel fast direct solver for rank-compressible block tridiagonal linear
systems is presented. Algorithmic synergies between Cyclic Reduction and
Hierarchical matrix arithmetic operations result in a solver with
O(N log^2 N) arithmetic complexity and O(N log N) memory footprint. We provide a
baseline for performance and applicability by comparing a parallel
implementation that leverages the concurrency features of the method against
well-known implementations of the H-LU factorization and algebraic multigrid.
Numerical experiments reveal that this method is comparable with other
fast direct solvers based on Hierarchical Matrices such as H-LU and
that it can tackle problems where algebraic multigrid fails to converge.