1,210 research outputs found

    Learning with Latent Language

    Full text link
    The named concepts and compositional operators present in natural language provide a rich source of information about the kinds of abstractions humans use to navigate the world. Can this linguistic background knowledge improve the generality and efficiency of learned classifiers and control policies? This paper aims to show that using the space of natural language strings as a parameter space is an effective way to capture natural task structure. In a pretraining phase, we learn a language interpretation model that transforms inputs (e.g. images) into outputs (e.g. labels) given natural language descriptions. To learn a new concept (e.g. a classifier), we search directly in the space of descriptions to minimize the interpreter's loss on training examples. Crucially, our models do not require language data to learn these concepts: language is used only in pretraining to impose structure on subsequent learning. Results on image classification, text editing, and reinforcement learning show that, in all settings, models with a linguistic parameterization outperform those without

    The Road to General Intelligence

    Get PDF
    Humans have always dreamed of automating laborious physical and intellectual tasks, but the latter has proved more elusive than naively suspected. Seven decades of systematic study of Artificial Intelligence have witnessed cycles of hubris and despair. The successful realization of General Intelligence (evidenced by the kind of cross-domain flexibility enjoyed by humans) will spawn an industry worth billions and transform the range of viable automation tasks.The recent notable successes of Machine Learning has lead to conjecture that it might be the appropriate technology for delivering General Intelligence. In this book, we argue that the framework of machine learning is fundamentally at odds with any reasonable notion of intelligence and that essential insights from previous decades of AI research are being forgotten. We claim that a fundamental change in perspective is required, mirroring that which took place in the philosophy of science in the mid 20th century. We propose a framework for General Intelligence, together with a reference architecture that emphasizes the need for anytime bounded rationality and a situated denotational semantics. We given necessary emphasis to compositional reasoning, with the required compositionality being provided via principled symbolic-numeric inference mechanisms based on universal constructions from category theory. • Details the pragmatic requirements for real-world General Intelligence. • Describes how machine learning fails to meet these requirements. • Provides a philosophical basis for the proposed approach. • Provides mathematical detail for a reference architecture. • Describes a research program intended to address issues of concern in contemporary AI. The book includes an extensive bibliography, with ~400 entries covering the history of AI and many related areas of computer science and mathematics.The target audience is the entire gamut of Artificial Intelligence/Machine Learning researchers and industrial practitioners. There are a mixture of descriptive and rigorous sections, according to the nature of the topic. Undergraduate mathematics is in general sufficient. Familiarity with category theory is advantageous for a complete understanding of the more advanced sections, but these may be skipped by the reader who desires an overall picture of the essential concepts This is an open access book

    Efficient instance and hypothesis space revision in Meta-Interpretive Learning

    Get PDF
    Inductive Logic Programming (ILP) is a form of Machine Learning. The goal of ILP is to induce hypotheses, as logic programs, that generalise training examples. ILP is characterised by a high expressivity, generalisation ability and interpretability. Meta-Interpretive Learning (MIL) is a state-of-the-art sub-field of ILP. However, current MIL approaches have limited efficiency: the sample and learning complexity respectively are polynomial and exponential in the number of clauses. My thesis is that improvements over the sample and learning complexity can be achieved in MIL through instance and hypothesis space revision. Specifically, we investigate 1) methods that revise the instance space, 2) methods that revise the hypothesis space and 3) methods that revise both the instance and the hypothesis spaces for achieving more efficient MIL. First, we introduce a method for building training sets with active learning in Bayesian MIL. Instances are selected maximising the entropy. We demonstrate this method can reduce the sample complexity and supports efficient learning of agent strategies. Second, we introduce a new method for revising the MIL hypothesis space with predicate invention. Our method generates predicates bottom-up from the background knowledge related to the training examples. We demonstrate this method is complete and can reduce the learning and sample complexity. Finally, we introduce a new MIL system called MIGO for learning optimal two-player game strategies. MIGO learns from playing: its training sets are built from the sequence of actions it chooses. Moreover, MIGO revises its hypothesis space with Dependent Learning: it first solves simpler tasks and can reuse any learned solution for solving more complex tasks. We demonstrate MIGO significantly outperforms both classical and deep reinforcement learning. The methods presented in this thesis open exciting perspectives for efficiently learning theories with MIL in a wide range of applications including robotics, modelling of agent strategies and game playing.Open Acces

    What underlies rapid learning and systematic generalization in humans

    Full text link
    Despite the groundbreaking successes of neural networks, contemporary models require extensive training with massive datasets and exhibit poor out-of-sample generalization. One proposed solution is to build systematicity and domain-specific constraints into the model, echoing the tenets of classical, symbolic cognitive architectures. In this paper, we consider the limitations of this approach by examining human adults' ability to learn an abstract reasoning task from a brief instructional tutorial and explanatory feedback for incorrect responses, demonstrating that human learning dynamics and ability to generalize outside the range of the training examples differ drastically from those of a representative neural network model, and that the model is brittle to changes in features not anticipated by its authors. We present further evidence from human data that the ability to consistently solve the puzzles was associated with education, particularly basic mathematics education, and with the ability to provide a reliably identifiable, valid description of the strategy used. We propose that rapid learning and systematic generalization in humans may depend on a gradual, experience-dependent process of learning-to-learn using instructions and explanations to guide the construction of explicit abstract rules that support generalizable inferences.Comment: 22 pages, 48 references, 6 Figures, and one Table, plus S

    A Pre-expectation Calculus for Probabilistic Sensitivity

    Get PDF
    Sensitivity properties describe how changes to the input of a program affect the output, typically by upper bounding the distance between the outputs of two runs by a monotone function of the distance between the corresponding inputs. When programs are probabilistic, the distance between outputs is a distance between distributions. The Kantorovich lifting provides a general way of defining a distance between distributions by lifting the distance of the underlying sample space; by choosing an appropriate distance on the base space, one can recover other usual probabilistic distances, such as the Total Variation distance. We develop a relational pre-expectation calculus to upper bound the Kantorovich distance between two executions of a probabilistic program. We illustrate our methods by proving algorithmic stability of a machine learning algorithm, convergence of a reinforcement learning algorithm, and fast mixing for card shuffling algorithms. We also consider some extensions: using our calculus to show convergence of Markov chains to the uniform distribution over states and an asynchronous extension to reason about pairs of program executions with different control flow
    • …
    corecore