    A Behavioural Theory of Recursive Algorithms

    "What is an algorithm?" is a fundamental question of computer science. Gurevich's behavioural theory of sequential algorithms (aka the sequential ASM thesis) gives a partial answer by defining (non-deterministic) sequential algorithms axiomatically, without referring to a particular machine model or programming language, and showing that they are captured by (non-deterministic) sequential Abstract State Machines (nd-seq ASMs). Moschovakis pointed out that recursive algorithms such as mergesort are not covered by this theory. In this article we propose an axiomatic definition of the notion of sequential recursive algorithm which extends Gurevich's axioms for sequential algorithms by a Recursion Postulate and allows us to prove that sequential recursive algorithms are captured by recursive Abstract State Machines, an extension of nd-seq ASMs by a CALL rule. Applying this recursive ASM thesis yields a characterization of sequential recursive algorithms as finitely composed concurrent algorithms all of whose concurrent runs are partial-order runs.Comment: 34 page

    Transformer-Based Models Are Not Yet Perfect At Learning to Emulate Structural Recursion

    This paper investigates the ability of transformer-based models to learn structural recursion from examples. Recursion is a universal concept in both natural and formal languages. Structural recursion is central to the programming language and formal mathematics tasks where symbolic tools currently excel beyond neural models, such as inferring semantic relations between datatypes and emulating program behavior. We introduce a general framework that connects the abstract concepts of structural recursion in the programming language domain to concrete sequence modeling problems and learned models' behavior. The framework includes a representation that captures the general syntax of structural recursion, coupled with two different frameworks for understanding its semantics: one that is more natural from a programming languages perspective, and one that helps bridge that perspective with a mechanistic understanding of the underlying transformer architecture. With our framework as a powerful conceptual tool, we identify different issues under various setups. Models trained to emulate recursive computations do not fully capture the recursion but instead fit shortcut algorithms, and thus cannot solve certain edge cases that are under-represented in the training distribution. In addition, state-of-the-art large language models (LLMs) find it difficult to mine recursive rules from in-context demonstrations. Meanwhile, these LLMs fail in interesting ways when emulating the reduction (step-wise computation) of recursive functions.
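    To make "structural recursion" concrete: it refers to functions defined by one equation per constructor of an inductive datatype, recursing only on structurally smaller pieces. The sketch below uses an illustrative datatype and function of our own choosing (not the paper's benchmarks) to show both the syntax of such a definition and a step-wise reduction of the sort the models are asked to emulate.

        -- Illustrative structural recursion (our sketch; the datatype and
        -- function are hypothetical, not the paper's benchmarks).
        data Tree = Leaf | Node Tree Tree

        -- One equation per constructor, recursing only on sub-trees.
        depth :: Tree -> Int
        depth Leaf       = 0
        depth (Node l r) = 1 + max (depth l) (depth r)

        -- Step-wise reduction ("emulating reduction" in the paper's sense):
        --   depth (Node Leaf (Node Leaf Leaf))
        --   = 1 + max (depth Leaf) (depth (Node Leaf Leaf))
        --   = 1 + max 0 (1 + max (depth Leaf) (depth Leaf))
        --   = 1 + max 0 (1 + max 0 0)
        --   = 2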