54 research outputs found
Neural-Symbolic Recursive Machine for Systematic Generalization
Despite their tremendous success, existing machine learning models still fall
short of human-like systematic generalization -- learning compositional rules
from limited data and applying them to unseen combinations across domains.
We propose the Neural-Symbolic Recursive Machine (NSR) to tackle this deficiency.
The core representation of NSR is a Grounded Symbol System (GSS) with
combinatorial syntax and semantics, which entirely emerges from training data.
Akin to the neuroscience studies suggesting separate brain systems for
perceptual, syntactic, and semantic processing, NSR implements analogous
separate modules of neural perception, syntactic parsing, and semantic
reasoning, which are jointly learned by a deduction-abduction algorithm. We
prove that NSR is expressive enough to model various sequence-to-sequence
tasks. Superior systematic generalization is achieved via the inductive biases
of equivariance and recursiveness embedded in NSR. In experiments, NSR achieves
state-of-the-art performance in three benchmarks from different domains: SCAN
for semantic parsing, PCFG for string manipulation, and HINT for arithmetic
reasoning. Specifically, NSR achieves 100% generalization accuracy on SCAN and
PCFG and outperforms state-of-the-art models on HINT by about 23%. Our NSR
demonstrates stronger generalization than pure neural networks due to its
symbolic representation and inductive biases. NSR also demonstrates better
transferability than existing neural-symbolic approaches because it requires
less domain-specific knowledge.
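The compositional rules at stake in benchmarks like SCAN can be made concrete with a toy recursive interpreter for a small SCAN-like command fragment. This sketch illustrates the task itself, not the NSR architecture; the exact rule set is a simplified excerpt.

```python
# Toy interpreter for a small SCAN-like command fragment, illustrating the
# kind of compositional rules a model must learn (e.g. "twice" duplicates a
# subcommand, "after" reverses execution order). This is an illustration of
# the benchmark task, not of the NSR model.

PRIMITIVES = {"jump": ["JUMP"], "walk": ["WALK"], "run": ["RUN"], "look": ["LOOK"]}

def interpret(command):
    words = command.split()
    # "X after Y" executes Y first, then X
    if "after" in words:
        i = words.index("after")
        return interpret(" ".join(words[i + 1:])) + interpret(" ".join(words[:i]))
    # "X and Y" executes X, then Y
    if "and" in words:
        i = words.index("and")
        return interpret(" ".join(words[:i])) + interpret(" ".join(words[i + 1:]))
    # "X twice" / "X thrice" repeat the subcommand
    if words[-1] == "twice":
        return interpret(" ".join(words[:-1])) * 2
    if words[-1] == "thrice":
        return interpret(" ".join(words[:-1])) * 3
    return PRIMITIVES[" ".join(words)]

print(interpret("jump twice"))       # ['JUMP', 'JUMP']
print(interpret("walk after run"))   # ['RUN', 'WALK']
```

Systematic generalization means recovering such rules from limited examples and applying them to novel combinations (e.g. a primitive never seen with "twice" in training).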
Towards Artificial Language Learning in a Potts Attractor Network
It remains a mystery how children acquire natural languages, which go far beyond the
few symbols that a young chimp struggles to learn, and whose complex rules incomparably
surpass the repetitive structure of bird songs. How should one explain the emergence of
such a capacity from the basic elements of the nervous system, namely neuronal networks?
To understand the brain mechanisms underlying the language phenomenon, specifically
sentence construction, different approaches have been attempted to implement an artificial
neural network that encodes words and constructs sentences (see e.g. Hummel and
Holyoak, 1997; Huyck, 2009; Velde and de Kamps, 2006; Stewart and Eliasmith, 2009).
These attempts differ in how the sentence constituents (parts) are represented (either
individually and locally, or in a distributed fashion) and in how these constituents are
bound together.
In LISA (Hummel and Holyoak, 1997), each sentence constituent (a word, a phrase, or
even a proposition) is represented individually by a unit, intended to be a population
of neurons (Hummel and Holyoak, 2003), and the relevant constituents are activated
synchronously during the construction of a sentence (or the inference of a proposition).
Given the productivity of language, that is, the ability of humans to create many possible
sentences out of a limited vocabulary, this representation results in an exponential growth in the number of units needed for structure representation.
To avoid this problem, Neural Blackboard Architectures (Velde and de Kamps,
2006) were proposed as systems endowed with dynamic bindings between assemblies of
words, roles (e.g. theme or agent), and word categories (e.g. nouns or verbs). A neural
blackboard architecture resembles a switchboard (a blackboard) that wires sentence
constituents together via circuits, using highly complex and meticulously organized,
arguably unrealistic, connections.
As opposed to these localist approaches, in a Vector Symbolic Architecture (Gayler, 2003;
Plate, 1991) words are represented in a fully distributed fashion as vectors. The words are
bound (and merged) together by algebraic operations in the vector space, e.g. tensor
products (Smolensky, 1990) or circular convolution (Plate, 1991). To give these operations
a biological account, some steps have been taken towards their neural implementation
(Stewart and Eliasmith, 2009).
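Circular-convolution binding can be sketched in a few lines of NumPy. In Plate's Holographic Reduced Representations, words are random vectors, binding is circular convolution, and unbinding convolves with an approximate inverse; the dimension and normalisation choices below are illustrative.

```python
import numpy as np

# Sketch of binding/unbinding in a Holographic Reduced Representation
# (Plate, 1991): role and filler are bound by circular convolution, and
# convolving with the role's approximate inverse recovers a noisy copy
# of the filler. Vector dimension and scaling are illustrative choices.

rng = np.random.default_rng(0)
d = 2048

def circ_conv(a, b):
    # Circular convolution via FFT: elementwise product in frequency space.
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def approx_inverse(a):
    # Plate's involution a*[i] = a[(d - i) mod d]; convolving with it
    # approximately undoes binding with a.
    return np.concatenate(([a[0]], a[1:][::-1]))

def cosine(x, y):
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

agent = rng.normal(0, 1 / np.sqrt(d), d)   # role vector
john = rng.normal(0, 1 / np.sqrt(d), d)    # filler vector
mary = rng.normal(0, 1 / np.sqrt(d), d)    # unrelated distractor

bound = circ_conv(agent, john)                        # bind role and filler
recovered = circ_conv(bound, approx_inverse(agent))   # noisy copy of john

print(cosine(recovered, john))  # clearly positive: filler is recoverable
print(cosine(recovered, mary))  # near zero: distractor is not
```

The recovered vector is noisy, which is why a "clean-up memory" of stored vectors is usually assumed alongside the algebra.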
Another distributed approach implemented a simple recurrent neural network
that predicts the next word in a sentence (Elman, 1991). Apart from the limited language
size that the network could handle (Elman, 1993), this system lacked an explicit
representation of syntactic constituents, and therefore lacked grammatical knowledge
(Borensztajn, 2011; Velde and de Kamps, 2006).
Despite all these attempts, we still lack a neural model that addresses
the challenges of language size, the distinction between semantics and syntax,
word binding, and word representation in a neurally plausible manner.
We explore a novel approach to these challenges: first constructing
an artificial language of intermediate complexity, and then implementing a neural
network, as a simplified cortical model of sentence production, that stores the vocabulary
and the grammar of the artificial language in a neurally inspired manner in two
components: one semantic and one syntactic.
As the training language of the network, we have constructed BLISS (Pirmoradian and
Treves, 2011), a scaled-down synthetic language of intermediate complexity, with about
150 words, 40 production rules, and a definition of semantics that is reduced to statistical
dependence between words. In Chapter 2, we will explain the details of the implementation of BLISS.
As a sentence production model, we have implemented a Potts attractor neural network,
whose units hypothetically represent patches of cortex. The choice of the Potts network,
for sentence production, has been mainly motivated by the latching dynamics it exhibits
(Kropff and Treves, 2006); that is, an ability to spontaneously hop, or latch, across memory
patterns, which have been stored as dynamical attractors, thus producing a long or even
infinite sequence of patterns, at least in some regimes (Russo and Treves, 2012). The goal
is to train the Potts network with a corpus of sentences in BLISS. This involves first
setting the structure of the network, then the algorithm for generating word
representations, and finally the protocol for training the network on the specific
transitions present in the BLISS corpus, using both auto- and hetero-associative
learning rules. In Chapter 3, we will explain the details of the procedure we have
adopted for word representation in the network.
The last step involves utilizing the spontaneous latching dynamics exhibited by the
Potts network, the word representations we have developed, and, crucially,
hetero-associative weights favouring specific transitions, to generate, with a suitable
associative training procedure, sentences "uttered" by the network. This last stage of
spontaneous sentence production by the network is explained in Chapter 4.
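The interplay of auto- and hetero-associative weights described above can be illustrated with a binary Hopfield-style sketch. The actual model uses Potts units with several local states per patch; this binary simplification is ours and only shows the mechanism by which hetero-associative terms drive transitions between stored patterns.

```python
import numpy as np

# Minimal binary (Hopfield-style) illustration of auto- plus hetero-
# associative storage: auto-associative weights stabilise each stored
# pattern, while hetero-associative weights push the network state from
# one pattern toward the next in a sequence, the ingredient behind
# latching transitions. The real model uses multi-state Potts units;
# this binary version is a deliberate simplification.

rng = np.random.default_rng(1)
N, P = 400, 3
xi = rng.choice([-1, 1], size=(P, N))      # stored patterns ("words")

# Hebbian auto-associative weights: each pattern becomes an attractor.
J_auto = sum(np.outer(x, x) for x in xi) / N
# Hetero-associative weights: map pattern mu onto pattern mu+1.
J_hetero = sum(np.outer(xi[m + 1], xi[m]) for m in range(P - 1)) / N

def overlap(s, x):
    return float(s @ x) / N

s = xi[0].copy()
s = np.sign(J_hetero @ s)   # hetero term alone maps pattern 0 -> pattern 1
print(overlap(s, xi[1]))    # close to 1
```

In the full model, the relative strengths of the two terms (and the adaptation of active units) determine whether the state stays in an attractor or latches onward, producing a sequence of patterns.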
The integration of syntax and semantic plausibility in a wide-coverage model of human sentence processing
Models of human sentence processing have paid much attention to three key characteristics of the sentence processor: its robust and accurate processing of unseen input (wide coverage); its immediate, incremental interpretation of partial input; and its sensitivity to structural frequencies in previous language experience. In this thesis, we propose a model of human sentence processing that accounts for these three characteristics and also models a fourth key characteristic, namely the influence of semantic plausibility on sentence processing.
The precondition for such a sentence processing model is a general model of human plausibility intuitions. We therefore begin by presenting a probabilistic model of the plausibility of verb-argument relations, which we estimate as the probability of encountering a verb-argument pair in the relation specified by a thematic role in a role-annotated training corpus. This model faces a significant sparse data problem, which we alleviate by combining two orthogonal smoothing methods. We show that the smoothed model's predictions are significantly correlated with human plausibility judgements for a range of test sets. We also demonstrate that our semantic plausibility model outperforms selectional preference models and a standard role labeller, which solve tasks from computational linguistics that are related to the prediction of human judgements.
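A count-based estimate over (verb, role, argument) triples with backoff smoothing can be sketched as follows. The thesis combines two specific, orthogonal smoothing methods; the linear-interpolation backoff below is a generic stand-in, and the toy corpus is invented for illustration.

```python
from collections import Counter

# Sketch of a count-based plausibility estimate for (verb, role, argument)
# triples. Sparse triple counts are interpolated with a verb-independent
# backoff over (role, argument) pairs. The thesis uses two more
# sophisticated smoothing methods; this scheme is a generic stand-in.

triples = [  # toy role-annotated corpus: (verb, role, argument)
    ("eat", "patient", "apple"), ("eat", "patient", "bread"),
    ("eat", "agent", "child"), ("drink", "patient", "water"),
]

joint = Counter(triples)
role_arg = Counter((r, a) for _, r, a in triples)
n = len(triples)

def plausibility(verb, role, arg, lam=0.7):
    # Interpolate the sparse triple estimate with the backoff distribution,
    # so unseen triples still receive non-zero plausibility.
    p_triple = joint[(verb, role, arg)] / n
    p_backoff = role_arg[(role, arg)] / n
    return lam * p_triple + (1 - lam) * p_backoff

print(plausibility("eat", "patient", "apple"))   # seen triple: higher score
print(plausibility("drink", "patient", "apple")) # unseen triple: backoff > 0
```

The point of smoothing is visible in the second call: the triple never occurs in the corpus, yet the estimate stays above zero because "apple" is attested as a patient.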
We then integrate this semantic plausibility model with an incremental, wide-coverage, probabilistic model of syntactic processing to form the Syntax/Semantics (SynSem) Integration model of sentence processing. The SynSem-Integration model combines preferences for candidate syntactic structures from two sources: syntactic probability estimates from a probabilistic parser and our semantic plausibility model's estimates of the verb-argument relations in each syntactic analysis. The model uses these preferences to determine a globally preferred structure and predicts difficulty in human sentence processing either if syntactic and semantic preferences conflict, or if the interpretation of the preferred analysis changes non-monotonically. In a thorough evaluation against the patterns of processing difficulty found for four ambiguity phenomena in eight reading-time studies, we demonstrate that the SynSem-Integration model reliably predicts human reading-time behaviour.
Semantic Entropy in Language Comprehension
Language is processed on a more or less word-by-word basis, and the processing difficulty
induced by each word is affected by our prior linguistic experience as well as our general knowledge
about the world. Surprisal and entropy reduction have been independently proposed as linking
theories between word processing difficulty and probabilistic language models. Extant models, however,
are typically limited to capturing linguistic experience and hence cannot account for the influence of
world knowledge. A recent comprehension model by Venhuizen, Crocker, and Brouwer (2019, Discourse
Processes) improves upon this situation by instantiating a comprehension-centric metric of surprisal that
integrates linguistic experience and world knowledge at the level of interpretation and combines them in
determining online expectations. Here, we extend this work by deriving a comprehension-centric metric
of entropy reduction from this model. In contrast to previous work, which has found that surprisal and
entropy reduction are not easily dissociated, we do find a clear dissociation in our model. While both
surprisal and entropy reduction derive from the same cognitive process, the word-by-word updating
of the unfolding interpretation, they reflect different aspects of this process: state-by-state expectation
(surprisal) versus end-state confirmation (entropy reduction).
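The two linking theories can be stated in a few lines: surprisal is the negative log probability of a word given its context, and entropy reduction is the decrease in uncertainty over the remaining interpretations. The sketch below uses invented toy distributions, not the Venhuizen et al. model itself.

```python
import math

# Surprisal and entropy reduction as linking theories (Hale and subsequent
# work): surprisal(w) = -log2 P(w | context); entropy reduction =
# max(0, H_before - H_after), where H is the entropy of the distribution
# over candidate interpretations. The distributions below are toy examples.

def surprisal(p_word):
    return -math.log2(p_word)

def entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def entropy_reduction(before, after):
    # Hale's formulation counts only decreases in uncertainty as effort.
    return max(0.0, entropy(before) - entropy(after))

# Distribution over four candidate interpretations before and after a word
# that rules out two of them.
before = {"i1": 0.25, "i2": 0.25, "i3": 0.25, "i4": 0.25}
after = {"i1": 0.5, "i2": 0.5, "i3": 0.0, "i4": 0.0}

print(surprisal(0.25))                    # 2.0 bits
print(entropy_reduction(before, after))   # 2.0 - 1.0 = 1.0 bit
```

The dissociation discussed above arises because the same update step can yield a high surprisal with little entropy change, or vice versa, depending on how probability mass is redistributed over interpretations.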
- …