
    Connectionist natural language parsing

    The key developments of two decades of connectionist parsing are reviewed. Connectionist parsers are assessed according to their ability to learn to represent syntactic structures automatically from examples, without being presented with symbolic grammar rules. This review also considers the extent to which connectionist parsers offer computational models of human sentence processing and provide plausible accounts of psycholinguistic data. In considering these issues, special attention is paid to the level of realism, the nature of the modularity, and the type of processing found in a wide range of parsers.

    Empirical Lessons for Philosophical Theories of Mental Content

    This thesis concerns the content of mental representations. It draws lessons for philosophical theories of content from empirical findings about brains and behaviour in experimental psychology (cognitive, developmental, comparative), cognitive neuroscience and cognitive science (computational modelling). Chapter 1 motivates a naturalist and realist approach to mental representation. Chapter 2 sets out and defends a theory of content for static feedforward connectionist networks, and explains how the theory can be extended to other supervised networks. The theory takes forward Churchland’s state space semantics by making a new and clearer proposal about the syntax of connectionist networks, one which nicely accounts for representational development. Chapter 3 argues that the same theoretical approach can be extended to unsupervised connectionist networks, and to some of the representational systems found in real brains. The approach can also show why connectionist systems sometimes show typicality effects, explaining them without relying upon prototype structure. That is discussed in chapter 4, which also argues that prototype structure, where it does exist, does not determine content. The thesis goes on to defend some unorthodox features of the foregoing theory: that a role is assigned to external samples in specifying syntax, that both inputs to and outputs from the system have a role in determining content, and that the content of a representation is partly determined by the circumstances in which it developed. Each, it is argued, may also be a fruitful way of thinking about mental content more generally. Reliance on developmental factors prompts a swampman-type objection. This is rebutted by reference to three possible reasons why content is attributed at all. Two of these motivations support the idea that content is partly determined by historical factors, and the third is consistent with it. The result: some empirical lessons for philosophical theories of mental content.

    Generalization and Systematicity in Echo State Networks

    Echo state networks (ESNs) are recurrent neural networks that can be trained efficiently because the weights of the recurrent connections remain fixed at random values. Investigations of these networks' ability to generalize in sentence-processing tasks have produced mixed outcomes. Here, we argue that ESNs do generalize but that they are not systematic, which we define as the ability to generally outperform Markov models on test sentences that violate the grammar of the training sentences. Moreover, we show that systematicity in ESNs can easily be obtained by switching from arbitrary to informative representations of words, suggesting that the information provided by such representations facilitates connectionist systematicity.
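    As a rough illustration of the architecture described above (not the authors' implementation), the sketch below builds a minimal echo state network in plain NumPy: the recurrent reservoir weights are set randomly, rescaled to a target spectral radius, and never updated, while only a ridge-regression readout is trained. The reservoir size, spectral radius and ridge penalty are illustrative assumptions.

        import numpy as np

        rng = np.random.default_rng(0)

        class ESN:
            def __init__(self, n_in, n_res=100, spectral_radius=0.9, ridge=1e-4):
                self.W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
                W = rng.uniform(-0.5, 0.5, (n_res, n_res))
                # Rescale so the largest eigenvalue magnitude equals the target
                # spectral radius; after this, the recurrent weights are never updated.
                self.W = W * (spectral_radius / np.max(np.abs(np.linalg.eigvals(W))))
                self.ridge = ridge
                self.W_out = None

            def _states(self, inputs):
                x = np.zeros(self.W.shape[0])
                states = []
                for u in inputs:                      # one word vector per time step
                    x = np.tanh(self.W_in @ u + self.W @ x)
                    states.append(x.copy())
                return np.array(states)

            def fit(self, inputs, targets):
                # Only the linear readout is trained, via ridge regression.
                X = self._states(inputs)
                A = X.T @ X + self.ridge * np.eye(X.shape[1])
                self.W_out = np.linalg.solve(A, X.T @ targets)

            def predict(self, inputs):
                return self._states(inputs) @ self.W_out

    Under this kind of setup, the switch from arbitrary to informative word representations discussed above would only change the rows of inputs (for example, one-hot vectors versus vectors encoding lexical or grammatical features); the fixed reservoir itself is left untouched.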

    Connectionist language production: distributed representations and the uniform information density hypothesis

    This dissertation approaches the task of modeling human sentence production from a connectionist point of view, using distributed semantic representations. The main questions it addresses are: (i) whether the distributed semantic representations defined by Frank et al. (2009) are suitable for modeling sentence production with artificial neural networks, (ii) what the behavior and internal mechanism of a model that uses these representations and recurrent neural networks look like, and (iii) how to give a mechanistic account of the Uniform Information Density Hypothesis (UID; Jaeger, 2006; Levy and Jaeger, 2007). Regarding the first point, the semantic representations of Frank et al. (2009), called situation vectors, are points in a vector space where each vector contains information about the observations in which an event and a corresponding sentence are true. These representations have been successfully used to model language comprehension (e.g., Frank et al., 2009; Venhuizen et al., 2018). During the construction of these vectors, however, a dimensionality reduction process introduces some loss of information, which makes some aspects no longer recognizable and reduces the performance of a model that utilizes them. To address this issue, belief vectors are introduced as an alternative way to obtain semantic representations of manageable dimensionality. These two types of representations (situation and belief vectors) are evaluated by using them as input to a sentence production model that implements an extension of a Simple Recurrent Network (SRN; Elman, 1990). The model was tested under different conditions corresponding to different levels of systematicity, i.e., the ability of a model to generalize from a set of known items to a set of novel ones. Systematicity is an essential attribute for a model of sentence processing, considering that the number of sentences that can be generated for a given language is infinite, so it is not feasible to memorize all possible message-sentence pairs. The results showed that the model generalized with very high performance in all test conditions, demonstrating systematic behavior. Furthermore, the errors it elicited involved very similar semantic representations, in line with the speech error literature, which holds that speech errors involve elements with semantic or phonological similarity. This further demonstrates the systematic behavior of the model, as it processes similar semantic representations in a similar way, even when they are new to it. Regarding the second point, the sentence production model was analyzed in two ways. First, by looking at the sentences it produces, including the errors elicited, highlighting the difficulties and preferences of the model. The results revealed that the model learns the syntactic patterns of the language, reflecting its statistical nature, and that its main difficulty lies with very similar semantic representations, for which it sometimes produces unintended sentences that are nevertheless semantically very close to the intended ones. Second, the connection weights and activation patterns of the model were analyzed, yielding an algorithmic account of its internal processing.
According to this account, the input semantic representation activates the words related to its content and gives an indication of their order by providing relatively more activation to words that are likely to appear early in the sentence. Then, at each time step, the previously produced word activates syntactic and semantic constraints on the next word to be produced, while the context units of the recurrence preserve information through time, allowing the model to enforce long-distance dependencies. We propose that these results can inform our understanding of the internal processing of models with similar architectures. Regarding the third point, an extension of the model is proposed with the goal of modeling UID. According to UID, language production is an efficient process shaped by a tendency to produce linguistic units that distribute information as uniformly as possible and close to the capacity of the communication channel, given the encoding possibilities of the language, thus optimizing the amount of information transmitted per unit of time. The extension approaches UID by balancing two production strategies: one in which the model produces the word with the highest probability given the semantics and the previously produced words, and one in which it produces the word that would minimize sentence length given the semantic representation and the previously produced words. By combining these two strategies, the model was able to produce sentences with different levels of information density and uniformity, providing a first step toward modeling UID at the algorithmic level of analysis. In sum, the results show that the distributed semantic representations of Frank et al. (2009) can be used to model sentence production while exhibiting systematicity. Moreover, an algorithmic account of the internal behavior of the model was reached, with the potential to generalize to other models with similar architecture. Finally, a model of UID is presented, highlighting some important aspects of UID that need to be addressed in order to go from its formulation at the computational level of analysis to a mechanistic account at the algorithmic level.
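    As a purely illustrative sketch of the two mechanisms described in this abstract, the fragment below shows an Elman-style recurrent step in which context units carry information across time steps, and a decoding rule that balances the most probable next word against the word expected to shorten the remaining sentence. The weight matrices, the expected_remaining_length estimate and the lambda_uid weighting are assumptions introduced here for illustration, not the dissertation's implementation.

        import numpy as np

        def srn_step(semantics, prev_word_vec, context, W_sem, W_in, W_rec, W_out):
            """One production step: semantics + previous word + context -> word probabilities."""
            hidden = np.tanh(W_sem @ semantics + W_in @ prev_word_vec + W_rec @ context)
            logits = W_out @ hidden
            probs = np.exp(logits - logits.max())
            probs /= probs.sum()
            return probs, hidden          # 'hidden' is fed back as the next step's context

        def choose_next_word(probs, expected_remaining_length, lambda_uid=0.5):
            """Blend word probability with a preference for shorter continuations."""
            # Turn (positive) length estimates into a score where shorter is better.
            shorter_is_better = 1.0 - expected_remaining_length / expected_remaining_length.max()
            score = (1.0 - lambda_uid) * probs + lambda_uid * shorter_is_better
            return int(np.argmax(score))

    Setting lambda_uid to 0 recovers the pure probability strategy, while values closer to 1 favor shorter continuations; varying this weight is one simple way to trade off the two production strategies described above.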

    Compositional Linguistic Generalization in Artificial Neural Networks

    Compositionality---the principle that the meaning of a complex expression is built from the meanings of its parts---is considered a central property of human language. This dissertation focuses on compositional generalization, a key benefit of compositionality that enables the production and comprehension of novel expressions. Specifically, this dissertation develops a test of compositional generalization for sequence-to-sequence artificial neural networks (ANNs). Before doing so, I start by developing a test of grammatical category abstraction, an important precondition to compositional generalization, because category membership determines the applicability of compositional rules. I then construct a test of compositional generalization based on human generalization patterns discussed in existing linguistic and developmental studies. The test takes the form of semantic parsing (translation from natural language expressions to semantic representations) in which the training and generalization sets have systematic gaps that can be filled by composing known parts. The generalization cases fall into two broad categories, lexical and structural, depending on whether generalization to novel combinations of known lexical items and known structures is required, or generalization to novel structures is required. The ANNs evaluated on this test exhibit limited degrees of compositional generalization, implying that the inductive biases of ANNs and human learners differ substantially. An error analysis reveals that all ANNs tested frequently make generalizations that violate faithfulness constraints (e.g., Emma saw Lina ↝ see'(Emma', Audrey') instead of see'(Emma', Lina')). Adding a glossing task (word-by-word translation)---a task that requires maximally faithful input-output mappings---as an auxiliary objective to the Transformer model (Vaswani et al. 2017) greatly improves generalization, demonstrating that a faithfulness bias can be injected through auxiliary training. However, the improvement is limited to lexical generalization; all models struggle to assign appropriate semantic representations to novel structures regardless of auxiliary training. This difficulty with structural generalization leaves open questions for both ANN and human learners. I discuss promising directions for improving structural generalization in ANNs, and furthermore propose an artificial language learning study for human subjects, analogous to the tests posed to ANNs, which will lead to a more detailed characterization of the patterns of structural generalization in human learners.
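    The faithfulness violations mentioned in the error analysis can be made concrete with a small, hypothetical check: flag any constant in the predicted logical form that has no counterpart among the input words. The regular expression for constants and the toy lemma map below are assumptions for illustration; the dissertation's semantic representations are richer than this.

        import re

        def unfaithful_constants(sentence, predicted_lf, lemma_map):
            """Return logical-form constants with no counterpart among the input words."""
            input_lemmas = {lemma_map.get(w.lower(), w.lower()) for w in sentence.split()}
            constants = re.findall(r"([A-Za-z]+)'", predicted_lf)
            return {c for c in constants if c.lower() not in input_lemmas}

        # The abstract's example of an unfaithful generalization:
        # "Emma saw Lina" parsed as see'(Emma', Audrey') instead of see'(Emma', Lina').
        print(unfaithful_constants("Emma saw Lina", "see'(Emma', Audrey')",
                                   lemma_map={"saw": "see"}))    # prints {'Audrey'}

    A faithfulness bias of the kind injected by the glossing auxiliary task would be expected to reduce the number of constants flagged by a check like this one.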

    Linguistic Competence and New Empiricism in Philosophy and Science

    The topic of this dissertation is the nature of linguistic competence, the capacity to understand and produce sentences of natural language. I defend the empiricist account of linguistic competence embedded in connectionist cognitive science. This strand of cognitive science has been opposed to traditional symbolic cognitive science which, coupled with transformational-generative grammar, was committed to nativism due to the view that human cognition, including the language capacity, should be construed in terms of symbolic representations and hardwired rules. Similarly, linguistic competence in this framework was regarded as innate, rule-governed, domain-specific, and fundamentally different from performance, i.e., the idiosyncrasies and factors governing linguistic behavior. I analyze state-of-the-art connectionist, deep learning models of natural language processing, most notably large language models, to see what they can tell us about linguistic competence. Deep learning is a statistical technique for the classification of patterns through which artificial intelligence researchers train artificial neural networks containing multiple layers on gargantuan amounts of textual and/or visual data. I argue that these models suggest that linguistic competence should be construed as stochastic, pattern-based, and stemming from domain-general mechanisms. Moreover, I distinguish syntactic from semantic competence, and for each I show the ramifications of endorsing a connectionist research program as opposed to traditional symbolic cognitive science and transformational-generative grammar. I present a unifying front, consisting of usage-based theories, a construction grammar approach, and an embodied approach to cognition, to show that the more multimodal and diverse models are in terms of architectural features and training data, the stronger the case for connectionist linguistic competence. I also propose to discard the competence vs. performance distinction as theoretically inferior, so that the novel and integrative account of linguistic competence originating in connectionism and empiricism that I propose and defend in the dissertation can be put forward in the scientific and philosophical literature.

    Integrative (Synchronization) Mechanisms of (Neuro-)Cognition against the Background of (Neo-)Connectionism, the Theory of Nonlinear Dynamical Systems, Information Theory, and the Self-Organization Paradigm

    Building on its main theme, namely the presentation and examination of a solution to the binding problem by means of temporal integrative (synchronization) mechanisms within the cognitive (neuro-)architectures of (neo-)connectionism, with reference to perceptual and language cognition and in particular to the problems of compositionality and systematicity that arise there, the aim of the present work is to sketch the construction of a yet-to-be-developed integrative theory of (neuro-)cognition based on the representational format of a so-called "vectorial form", against the background of (neo-)connectionism, the theory of nonlinear dynamical systems, information theory, and the self-organization paradigm.

    The Boltzmann Machine: a Connectionist Model for Supra-Classical Logic

    This thesis moves towards a reconciliation of two of the major paradigms of artificial intelligence by exploring the representation of symbolic logic in an artificial neural network. Previous attempts at the machine representation of classical logic are reviewed. We, however, consider the requirements of inference in the broader realm of supra-classical, non-monotonic logic. This logic is concerned with the tolerance of exceptions and is thought to be associated with common-sense reasoning. Biological plausibility extends these requirements in the context of human cognition. The thesis identifies the requirements of supra-classical, non-monotonic logic in relation to the properties of candidate neural networks. Previous research has theoretically identified the Boltzmann machine as a potential candidate. We provide experimental evidence supporting a version of the Boltzmann machine as a practical representation of this logic. The theme is pursued by looking at the benefits of utilising the relationship between the logic and the Boltzmann machine in two areas. We report adaptations to the machine architecture that select for different information distributions. These distributions correspond to state preference in traditional logic versus the concept of atomic typicality in contemporary approaches to logic. We also show that the learning algorithm of the Boltzmann machine can be adapted to implement pseudo-rehearsal during retraining. The results of machine retraining are then used to assess the plausibility of some current theories of belief revision in logic. Furthermore, we propose an alternative approach to belief revision based on the experimental results of retraining the Boltzmann machine.
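    As a hedged sketch of how pseudo-rehearsal during retraining might look for a small, fully visible binary Boltzmann machine: pseudo-patterns are sampled from the current machine by Gibbs sampling and mixed with the new training patterns, so that retraining revises earlier knowledge rather than simply overwriting it. The function names, the fully visible architecture and the hyperparameters are assumptions for illustration, not the thesis's implementation.

        import numpy as np

        rng = np.random.default_rng(1)

        def gibbs_sample(W, b, n_steps=50):
            """Draw one approximate sample from a fully visible binary Boltzmann machine."""
            n = len(b)
            s = rng.integers(0, 2, n).astype(float)
            for _ in range(n_steps):
                for i in range(n):
                    # Probability that unit i switches on given all the other units.
                    net = W[i] @ s - W[i, i] * s[i] + b[i]
                    s[i] = float(rng.random() < 1.0 / (1.0 + np.exp(-net)))
            return s

        def retraining_set_with_pseudo_rehearsal(W, b, new_patterns, n_pseudo=20):
            """Mix pseudo-patterns sampled from the current machine with the new data,
            so that retraining revises old knowledge instead of simply overwriting it."""
            pseudo = [gibbs_sample(W, b) for _ in range(n_pseudo)]
            return np.vstack([new_patterns] + pseudo)

    The sketch deliberately omits the Boltzmann learning rule itself (the positive and negative phases of the weight update); it only shows how a rehearsal set could be assembled before the next round of training.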