
    Semantic Systematicity in Connectionist Language Production

    Decades of studies attempting to determine the extent to which artificial neural networks can exhibit systematicity suggest that systematicity can be achieved by connectionist models, but not by default. Here we present a novel connectionist model of sentence production that employs the rich situation model representations originally proposed for modeling systematicity in comprehension. The high performance of our model demonstrates that such representations are also well suited to modeling language production. Furthermore, the model can produce multiple novel sentences for previously unseen situations, including in a different voice (active vs. passive) and with words in new syntactic roles, thus demonstrating semantic and syntactic generalization and, arguably, systematicity. Our results provide further evidence that such connectionist approaches can achieve systematicity in production as well as in comprehension. We propose that these positive results are a consequence of the regularities of the microworld from which the semantic representations are derived, which provide sufficient structure for the neural network to interpret novel inputs.
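    (A minimal sketch, not the authors' implementation: the vocabulary, layer sizes, and weights below are hypothetical and untrained, and the code only illustrates how a recurrent producer can be conditioned on a situation-model vector of the kind the abstract describes.)

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = ["the", "woman", "plays", "chess", "is", "played", "by", "<eos>"]
V, H, S = len(VOCAB), 16, 10          # vocab, hidden, situation-vector sizes

# Randomly initialised weights stand in for trained parameters.
W_sit = rng.normal(0, 0.1, (H, S))    # situation vector -> hidden
W_in  = rng.normal(0, 0.1, (H, V))    # previous word -> hidden
W_rec = rng.normal(0, 0.1, (H, H))    # recurrent (context) connections
W_out = rng.normal(0, 0.1, (V, H))    # hidden -> word activations

def produce(situation, max_len=10):
    """Greedily unroll a word sequence conditioned on one situation vector."""
    hidden = np.zeros(H)
    prev = np.zeros(V)                # no word produced yet
    words = []
    for _ in range(max_len):
        hidden = np.tanh(W_sit @ situation + W_in @ prev + W_rec @ hidden)
        logits = W_out @ hidden
        idx = int(np.argmax(logits))  # pick the most active word unit
        if VOCAB[idx] == "<eos>":
            break
        words.append(VOCAB[idx])
        prev = np.eye(V)[idx]         # feed the produced word back in
    return words

print(produce(rng.normal(size=S)))    # untrained, so the output is arbitrary
```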

    Connectionist Inference Models

    The performance of symbolic inference tasks has long been a challenge to connectionists. In this paper, we present an extended survey of this area. Existing connectionist inference systems are reviewed, with particular reference to how they perform variable binding and rule-based reasoning, and whether they involve distributed or localist representations. The benefits and disadvantages of different representations and systems are outlined, and conclusions are drawn regarding the capabilities of connectionist inference systems when compared with symbolic inference systems or when used for cognitive modeling.
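    (As a concrete illustration of the variable-binding problem the survey discusses, here is a minimal sketch of one classic connectionist answer, Smolensky-style tensor-product binding; the role and filler vectors are random stand-ins chosen for this example.)

```python
import numpy as np

rng = np.random.default_rng(1)

def unit(n):
    v = rng.normal(size=n)
    return v / np.linalg.norm(v)

d = 64
roles   = {"agent": unit(d), "patient": unit(d)}   # variables
fillers = {"john": unit(d), "mary": unit(d)}       # values

# Bind each filler to its role with an outer product, sum into one trace.
trace = (np.outer(roles["agent"], fillers["john"])
         + np.outer(roles["patient"], fillers["mary"]))

# Unbind: probing the trace with a role vector approximately recovers the
# bound filler (exactly so if the role vectors were orthonormal; random
# high-dimensional roles are nearly orthogonal, so crosstalk is small).
probe = roles["agent"] @ trace
best = max(fillers, key=lambda k: fillers[k] @ probe)
print(best)  # -> "john"
```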

    Connectionist language production: distributed representations and the uniform information density hypothesis

    This dissertation approaches the task of modeling human sentence production from a connectionist point of view, using distributed semantic representations. The main questions it addresses are: (i) whether the distributed semantic representations defined by Frank et al. (2009) are suitable for modeling sentence production with artificial neural networks; (ii) the behavior and internal mechanism of a model that uses these representations together with recurrent neural networks; and (iii) a mechanistic account of the Uniform Information Density Hypothesis (UID; Jaeger, 2006; Levy and Jaeger, 2007).

    Regarding the first point, the semantic representations of Frank et al. (2009), called situation vectors, are points in a vector space in which each vector contains information about the observations in which an event and a corresponding sentence are true. These representations have been used successfully to model language comprehension (e.g., Frank et al., 2009; Venhuizen et al., 2018). During the construction of these vectors, however, a dimensionality reduction step introduces some loss of information, leaving some aspects no longer recognizable and reducing the performance of models that use them. To address this issue, belief vectors are introduced as an alternative way to obtain semantic representations of manageable dimensionality. Both types of representations (situation and belief vectors) are evaluated by using them as input to a sentence production model that implements an extension of a Simple Recurrent Network (SRN; Elman, 1990). The model was tested under conditions corresponding to different levels of systematicity, i.e., the ability of a model to generalize from a set of known items to a set of novel ones. Systematicity is an essential attribute of a model of sentence processing, given that the number of sentences that can be generated in a language is infinite, so it is not feasible to memorize all possible message-sentence pairs. The results showed that the model generalized with very high performance in all test conditions, demonstrating systematic behavior. Furthermore, the errors it produced involved very similar semantic representations, in line with the speech error literature, according to which speech errors involve elements with semantic or phonological similarity. This further demonstrates the systematic behavior of the model, as it processes similar semantic representations in a similar way even when they are new to it.

    Regarding the second point, the sentence production model was analyzed in two ways. First, by examining the sentences it produces, including its errors, highlighting the difficulties and preferences of the model. The results revealed that the model learns the syntactic patterns of the language, reflecting its statistical nature, and that its main difficulty lies with very similar semantic representations, which sometimes yield unintended sentences that are nevertheless semantically very close to the intended ones. Second, the connection weights and activation patterns of the model were analyzed, yielding an algorithmic account of its internal processing.
    According to this account, the input semantic representation activates the words related to its content and gives an indication of their order, providing relatively more activation to words likely to appear early in the sentence. Then, at each time step, the previously produced word activates syntactic and semantic constraints on the next word, while the context units of the recurrence preserve information through time, allowing the model to enforce long-distance dependencies. We propose that these results can inform the analysis of other models with similar architectures.

    Regarding the third point, an extension of the model is proposed with the goal of modeling UID. According to UID, language production is an efficient process shaped by a tendency to distribute information as uniformly as possible and close to the capacity of the communication channel, given the encoding possibilities of the language, thus optimizing the amount of information transmitted per time unit. The extension approaches UID by balancing two production strategies: one in which the model produces the word with the highest probability given the semantics and the previously produced words, and one in which it produces the word that would minimize sentence length given the semantic representation and the previously produced words. By combining these two strategies, the model was able to produce sentences with different levels of information density and uniformity, providing a first step toward modeling UID at the algorithmic level of analysis (see the sketch after this abstract).

    In sum, the results show that the distributed semantic representations of Frank et al. (2009) can be used to model sentence production, exhibiting systematicity. Moreover, an algorithmic account of the internal behavior of the model was reached, with the potential to generalize to other models with similar architectures. Finally, a model of UID is presented, highlighting important aspects that must be addressed in order to move from the formulation of UID at the computational level of analysis to a mechanistic account at the algorithmic level.
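    (A toy sketch of the two production strategies described above and their combination; the probabilities and length estimates are invented for illustration, and in the actual model such scores would come from the trained recurrent network.)

```python
import numpy as np

def blended_step(next_word_probs, expected_remaining_len, lam=0.5):
    """Score each candidate word by mixing (i) its log-probability given the
    semantics and the words produced so far with (ii) a penalty for the
    sentence length its choice is expected to lead to."""
    scores = {}
    for w, p in next_word_probs.items():
        info = -np.log2(p)                         # surprisal of w, in bits
        scores[w] = -lam * info - (1 - lam) * expected_remaining_len[w]
    return max(scores, key=scores.get)

# Hypothetical numbers for one production step: "that" is the likelier
# continuation, but the alternative is expected to shorten the sentence.
probs  = {"that": 0.6, "she": 0.4}
length = {"that": 5.0, "she": 4.0}
print(blended_step(probs, length, lam=1.0))   # pure probability -> "that"
print(blended_step(probs, length, lam=0.0))   # pure shortness   -> "she"
```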

    High level cognitive information processing in neural networks

    Two related research efforts were addressed: (1) high-level connectionist cognitive modeling and (2) local neural circuit modeling. The goals of the first effort were to develop connectionist models of high-level cognitive processes such as problem solving or natural language understanding, and to understand the computational requirements of such models. The goals of the second effort were to develop biologically realistic models of local neural circuits and to understand the computational behavior of such models. In keeping with the nature of NASA's Innovative Research Program, all the work conducted under the grant was highly innovative. For instance, the following ideas, all summarized here, are contributions to the study of connectionist/neural networks: (1) the temporal-winner-take-all, relative-position encoding, and pattern-similarity association techniques; (2) the importation of logical combinators into connectionism; (3) the use of analogy-based reasoning as a bridge across the gap between the traditional symbolic paradigm and the connectionist paradigm; and (4) the application of connectionism to the domain of belief representation and reasoning. The work on local neural circuit modeling also departs significantly from that of related researchers. In particular, its concentration on low-level neural phenomena that could support high-level cognitive processing is unusual within the area of biological local circuit modeling, and it also serves to expand the horizons of the artificial neural net field.
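    (The temporal-winner-take-all technique is only named in this summary; the sketch below shows a generic reading of the idea, in which the first unit whose integrated input crosses a threshold wins and suppresses its competitors. All parameters are illustrative assumptions, not the grant's actual formulation.)

```python
import numpy as np

def temporal_wta(inputs, threshold=1.0, dt=0.1):
    """inputs: (units, steps) drive matrix; returns (winner, time) or None."""
    act = np.zeros(inputs.shape[0])
    for t in range(inputs.shape[1]):
        act += dt * inputs[:, t]           # integrate drive over time
        crossed = np.flatnonzero(act >= threshold)
        if crossed.size:                   # first unit over threshold wins;
            return int(crossed[0]), t      # the rest are suppressed
    return None

drive = np.array([[0.5] * 30,              # weak, steady input
                  [0.9] * 30])             # stronger input -> earlier winner
print(temporal_wta(drive))                 # -> (1, 11)
```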

    A Defense of Pure Connectionism

    Connectionism is an approach to neural-networks-based cognitive modeling that encompasses the recent deep learning movement in artificial intelligence. It came of age in the 1980s, with its roots in cybernetics and earlier attempts to model the brain as a system of simple parallel processors. Connectionist models center on statistical inference within neural networks with empirically learnable parameters, which can be represented as graphical models. More recent approaches focus on learning and inference within hierarchical generative models. Contra influential and ongoing critiques, I argue in this dissertation that the connectionist approach to cognitive science possesses in principle (and, as is becoming increasingly clear, in practice) the resources to model even the most rich and distinctly human cognitive capacities, such as abstract, conceptual thought and natural language comprehension and production. Consonant with much previous philosophical work on connectionism, I argue that a core principle—that proximal representations in a vector space have similar semantic values—is the key to a successful connectionist account of the systematicity and productivity of thought, language, and other core cognitive phenomena. My work here differs from preceding work in philosophy in several respects: (1) I compare a wide variety of connectionist responses to the systematicity challenge and isolate two main strands that are both historically important and reflected in ongoing work today: (a) vector symbolic architectures and (b) (compositional) vector space semantic models; (2) I consider very recent applications of these approaches, including their deployment on large-scale machine learning tasks such as machine translation; (3) I argue, again on the basis mostly of recent developments, for a continuity in representation and processing across natural language, image processing and other domains; (4) I explicitly link broad, abstract features of connectionist representation to recent proposals in cognitive science similar in spirit, such as hierarchical Bayesian and free energy minimization approaches, and offer a single rebuttal of criticisms of these related paradigms; (5) I critique recent alternative proposals that argue for a hybrid Classical (i.e. serial symbolic)/statistical model of mind; (6) I argue that defending the most plausible form of a connectionist cognitive architecture requires rethinking certain distinctions that have figured prominently in the history of the philosophy of mind and language, such as that between word- and phrase-level semantic content, and between inference and association.
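    (A toy illustration of the core principle defended here, that nearby points in a vector space carry similar semantic values; the three-dimensional "embeddings" are fabricated for the example.)

```python
import numpy as np

emb = {
    "cat":   np.array([0.9, 0.8, 0.1]),
    "dog":   np.array([0.8, 0.9, 0.2]),
    "truck": np.array([0.1, 0.2, 0.9]),
}

def cosine(a, b):
    """Cosine similarity: 1.0 for parallel vectors, ~0 for orthogonal ones."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(emb["cat"], emb["dog"]))    # high: semantically close
print(cosine(emb["cat"], emb["truck"]))  # low: semantically distant
```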

    Design for a Darwinian Brain: Part 1. Philosophy and Neuroscience

    Physical symbol systems are needed for open-ended cognition. A good way to understand physical symbol systems is by comparing thought to chemistry: both have systematicity, productivity, and compositionality. The state of the art in cognitive architectures for open-ended cognition is critically assessed. I conclude that a cognitive architecture that evolves symbol structures in the brain is a promising candidate for explaining open-ended cognition. Part 2 of the paper presents such a cognitive architecture. Comment: Darwinian Neurodynamics; submitted as a two-part paper to Living Machines 2013, Natural History Museum, London.

    A Short Survey of Systematic Generalization

    This survey covers systematic generalization and the history of how machine learning has addressed it. We aim to summarize and organize related work, both conventional approaches and recent improvements. We first look at the definition of systematic generalization, then introduce the Classicist and Connectionist positions. We then discuss different types of Connectionist models and how they approach generalization. Two crucial problems, variable binding and causality, are discussed. We look into systematic generalization in the language, vision, and VQA fields, and discuss recent improvements from different perspectives. Systematic generalization has a long history in artificial intelligence, and we could cover only a small portion of the many contributions. We hope this paper provides useful background and is beneficial for future work.

    Classical Computational Models

