Many problems in static program analysis can be modeled as the context-free
language (CFL) reachability problem on directed labeled graphs. The CFL
reachability problem can be solved in time $O(n^3)$ in general, where $n$ is the
number of vertices in the graph, with some specific cases that can be solved
faster. In this work, we ask the following question: given a specific CFL, what
is the exact exponent in the monomial of the running time? In other words, for
which cases do we have linear, quadratic or cubic algorithms, and are there
problems with intermediate runtimes? This question is inspired by recent
efforts to classify classic problems in terms of their exact polynomial
complexity, known as {\em fine-grained complexity}. Although prior work
has established some conditional lower bounds (mostly for the class of combinatorial
algorithms), a general picture of the fine-grained complexity landscape for CFL
reachability is missing.
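
To make the problem statement concrete, the following is a minimal sketch of a textbook-style worklist algorithm for CFL reachability, specialized to Dyck-$k$. It is an illustration only, not the construction used in this work. The function names `cfl_reach` and `dyck_reach`, the edge-label encoding (`"(0"`, `")0"`, ...) and the helper nonterminals `A0`, `A1`, ... are assumptions introduced here. For a fixed grammar, a procedure of this kind derives $O(n^2)$ facts per nonterminal and joins each with $O(n)$ neighbors, which is where the cubic bound comes from.

```python
from collections import defaultdict, deque


def cfl_reach(n, edges, binary, eps):
    """Worklist CFL reachability for a grammar with rules A -> B C and A -> eps.

    n      : number of vertices, named 0..n-1
    edges  : iterable of (u, label, v) with terminal labels
    binary : list of (A, B, C) meaning A -> B C
    eps    : set of nonterminals that derive the empty string
    Returns the set of all derived facts (u, X, v).
    """
    out = defaultdict(set)   # (u, X) -> {v : X(u, v) holds}
    inc = defaultdict(set)   # (v, X) -> {u : X(u, v) holds}
    facts = set()
    work = deque()

    def add(u, x, v):
        if (u, x, v) not in facts:
            facts.add((u, x, v))
            out[(u, x)].add(v)
            inc[(v, x)].add(u)
            work.append((u, x, v))

    for u, a, v in edges:
        add(u, a, v)
    for x in eps:
        for v in range(n):
            add(v, x, v)     # the empty path from v to v

    by_left = defaultdict(list)    # B -> [(A, C)] for A -> B C
    by_right = defaultdict(list)   # C -> [(A, B)] for A -> B C
    for a, b, c in binary:
        by_left[b].append((a, c))
        by_right[c].append((a, b))

    while work:
        u, x, v = work.popleft()
        # x is the left symbol: extend to the right through C(v, w).
        for a, c in by_left[x]:
            for w in list(out[(v, c)]):
                add(u, a, w)
        # x is the right symbol: extend to the left through B(w, u).
        for a, b in by_right[x]:
            for w in list(inc[(u, b)]):
                add(w, a, v)
    return facts


def dyck_reach(n, edges, k):
    """Pairs (u, v) joined by a path whose labels form a Dyck-k word."""
    # S -> eps | S S | (_i S )_i, with the last rule split via A_i -> (_i S.
    binary = [("S", "S", "S")]
    for i in range(k):
        binary.append((f"A{i}", f"({i}", "S"))
        binary.append(("S", f"A{i}", f"){i}"))
    facts = cfl_reach(n, edges, binary, {"S"})
    return {(u, v) for (u, x, v) in facts if x == "S"}


if __name__ == "__main__":
    # Path 0 -> 1 -> 2 -> 3 -> 4 spelling "(0 (1 )1 )0".
    path = [(0, "(0", 1), (1, "(1", 2), (2, ")1", 3), (3, ")0", 4)]
    print(sorted(dyck_reach(5, path, k=2)))
```

The sketch favors readability over speed; it does not attempt any of the faster special-case algorithms alluded to above.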
Our main contribution is a set of lower bounds that pinpoint the exact running
time of several classes of CFLs, or of specific CFLs, under widely believed
lower-bound conjectures (Boolean Matrix Multiplication and $k$-Clique). We
focus in particular on the family of Dyck-$k$ languages (strings of
well-matched parentheses of $k$ types), a fundamental class of CFL reachability problems. We
present new lower bounds for the case of sparse input graphs where the number
of edges $m$ is the input parameter, a common setting in the database
literature. For this setting, we show a cubic lower bound for Andersen's
Pointer Analysis, which significantly strengthens prior known results.
Comment: Appeared in POPL 2023. Please note the erratum on the first page.