A Behavioural Theory of Recursive Algorithms
"What is an algorithm?" is a fundamental question of computer science.
Gurevich's behavioural theory of sequential algorithms (aka the sequential ASM
thesis) gives a partial answer by defining (non-deterministic) sequential
algorithms axiomatically, without referring to a particular machine model or
programming language, and showing that they are captured by (non-deterministic)
sequential Abstract State Machines (nd-seq ASMs). Moschovakis pointed out that
recursive algorithms such as mergesort are not covered by this theory. In this
article we propose an axiomatic definition of the notion of sequential
recursive algorithm which extends Gurevich's axioms for sequential algorithms
by a Recursion Postulate and allows us to prove that sequential recursive
algorithms are captured by recursive Abstract State Machines, an extension of
nd-seq ASMs by a CALL rule. Applying this recursive ASM thesis yields a
characterization of sequential recursive algorithms as finitely composed
concurrent algorithms all of whose concurrent runs are partial-order runs.
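For intuition, mergesort is the canonical example raised by Moschovakis: it calls itself on subproblems, which is exactly the step the sequential postulates alone do not account for. The paper works over abstract state machines rather than any concrete programming language, so the following Python sketch is illustrative only:

```python
def mergesort(xs):
    # Base case: lists of length <= 1 are already sorted.
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    # Recursive calls on sub-lists -- the step not covered by
    # Gurevich's sequential postulates, and what the Recursion
    # Postulate / CALL rule is meant to capture.
    left = mergesort(xs[:mid])
    right = mergesort(xs[mid:])
    # Merge the two sorted halves.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```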
Transformer-Based Models Are Not Yet Perfect At Learning to Emulate Structural Recursion
This paper investigates the ability of transformer-based models to learn
structural recursion from examples. Recursion is a universal concept in both
natural and formal languages. Structural recursion is central to the
programming language and formal mathematics tasks where symbolic tools
currently excel beyond neural models, such as inferring semantic relations
between datatypes and emulating program behavior. We introduce a general
framework that connects the abstract concepts of structural recursion in
the programming language domain to concrete sequence modeling problems and
learned models' behavior. The framework includes a representation that captures
the general syntax of structural recursion, coupled with two different
frameworks for understanding its semantics: one that is more
natural from a programming languages perspective and one that helps bridge that
perspective with a mechanistic understanding of the underlying transformer
architecture.
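To make "structural recursion" concrete: a structurally recursive function recurses only on the immediate subterms of an inductively defined datatype, so every call shrinks its argument. The paper's own representation encodes such functions as sequences for transformers; the hypothetical Python sketch below merely illustrates the concept and its step-wise reduction, using Peano numerals as the inductive datatype:

```python
from dataclasses import dataclass

# An inductive datatype: a Peano natural is either Zero or Succ(n).
class Nat: pass

@dataclass
class Zero(Nat): pass

@dataclass
class Succ(Nat):
    pred: Nat

def add(m: Nat, n: Nat) -> Nat:
    # Structural recursion: the recursive call is on the immediate
    # subterm m.pred, so the argument shrinks and termination is
    # guaranteed.
    if isinstance(m, Zero):
        return n
    return Succ(add(m.pred, n))

# Step-wise reduction of add(S(S(Z)), S(Z)):
#   add(S(S(Z)), S(Z)) -> S(add(S(Z), S(Z)))
#                      -> S(S(add(Z, S(Z)))) -> S(S(S(Z)))
# Emulating these intermediate steps is what "reduction
# (step-wise computation)" refers to below.
print(add(Succ(Succ(Zero())), Succ(Zero())))
```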
With our framework as a conceptual tool, we identify different
issues under various setups. Models trained to emulate recursive
computations do not fully capture the recursion but instead fit shortcut
algorithms, and thus cannot solve certain edge cases that are under-represented
in the training distribution. In addition, state-of-the-art
large language models (LLMs) struggle to mine recursive rules from in-context
demonstrations. Meanwhile, these LLMs fail in interesting ways when emulating
reduction (step-wise computation) of recursive functions.