15 research outputs found

    Localist representation can improve efficiency for detection and counting

    Get PDF
    Almost all representations have both distributed and localist aspects, depending upon what properties of the data are being considered. With noisy data, features represented in a localist way can be detected very efficiently, and in binary representations they can be counted more efficiently than those represented in a distributed way. Brains operate in noisy environments, so the localist representation of behaviourally important events is advantageous, and fits what has been found experimentally. Distributed representations require more neurons to perform as efficiently, but they do have greater versatility.
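    The detection/counting contrast is easy to make concrete. Below is a minimal sketch, my own illustration rather than code from the paper, in which a binary localist code lets a feature be counted with a single column sum over samples, while a distributed code must first match each noisy sample against the feature's whole pattern (the Hamming threshold here is an arbitrary choice):

```python
# Toy comparison (illustrative only): counting a feature's occurrences
# under a localist vs a distributed binary code.
import numpy as np

rng = np.random.default_rng(0)
n_units, n_samples, feature = 64, 1000, 7

# Localist: the feature owns one unit, so counting it is one column sum.
localist = rng.integers(0, 2, size=(n_samples, n_units))
localist_count = localist[:, feature].sum()

# Distributed: the feature is a pattern over all units; each sample must
# be matched against the whole pattern before it can be counted.
pattern = rng.integers(0, 2, size=n_units)
distributed = rng.integers(0, 2, size=(n_samples, n_units))
hamming = (distributed != pattern).sum(axis=1)
distributed_count = (hamming < n_units // 4).sum()  # arbitrary threshold

print(localist_count, distributed_count)
```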

    ToyArchitecture: Unsupervised Learning of Interpretable Models of the World

    Full text link
    Research in Artificial Intelligence (AI) has focused mostly on two extremes: either on small improvements in narrow AI domains, or on universal theoretical frameworks which are usually uncomputable, incompatible with theories of biological intelligence, or lacking practical implementations. The goal of this work is to combine the main advantages of the two: to follow a big picture view, while providing a particular theory and its implementation. In contrast with purely theoretical approaches, the resulting architecture should be usable in realistic settings, but also form the core of a framework containing all the basic mechanisms, into which it should be easier to integrate additional required functionality. In this paper, we present a novel, purposely simple, and interpretable hierarchical architecture which combines multiple different mechanisms into one system: unsupervised learning of a model of the world, learning the influence of one's own actions on the world, model-based reinforcement learning, hierarchical planning and plan execution, and symbolic/sub-symbolic integration in general. The learned model is stored in the form of hierarchical representations with the following properties: 1) they are increasingly more abstract, but can retain details when needed, and 2) they are easy to manipulate in their local and symbolic-like form, thus also allowing one to observe the learning process at each level of abstraction. On all levels of the system, the representation of the data can be interpreted in both a symbolic and a sub-symbolic manner. This enables the architecture to learn efficiently using sub-symbolic methods and to employ symbolic inference.

    Combinatorial Generalisation in Machine Vision

    Get PDF
    The human capacity for generalisation, i.e. the fact that we are able to successfully perform a familiar task in novel contexts, is one of the hallmarks of our intelligent behaviour. But what mechanisms enable this capacity, which is at once so impressive and yet comes so naturally to us? This is a question that has driven copious amounts of research in both Cognitive Science and Artificial Intelligence for almost a century, with some advocating the need for symbolic systems and others the benefits of distributed representations. In this thesis we will explore which principles help AI systems to generalise to novel combinations of previously observed elements (such as color and shape) in the context of machine vision. We will show that while approaches such as disentangled representation learning showed initial promise, they are fundamentally unable to solve this generalisation problem. In doing so we will illustrate the need to perform severe tests of models in order to properly assess their limitations. We will also see how such failures are robust across different datasets, training modalities and in the internal representations of the models. We then show that a different type of system that attempts to learn object-centric representations is capable of solving the generalisation challenges that previous models could not. We conclude by discussing the implications of these results for long-standing questions regarding the kinds of cognitive systems that are required to solve generalisation problems.
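    The generalisation problem being probed can be stated concretely as a dataset split. The sketch below uses my own assumed factors (colour and shape), not the thesis's exact materials: every colour and every shape appears during training, but certain combinations are reserved for test.

```python
# Hypothetical combinatorial-generalisation split: each factor value is
# seen during training, but some combinations appear only at test time.
from itertools import product

colours = ["red", "green", "blue"]
shapes = ["square", "circle", "triangle"]
all_combos = list(product(colours, shapes))

# Held-out pairs: their parts occur in training, just never together.
held_out = {("red", "circle"), ("blue", "square")}
train = [c for c in all_combos if c not in held_out]
test = sorted(held_out)

print("train:", train)
print("test (novel combinations):", test)
```

    A model that has factorised the two generative variables should handle the held-out pairs; the thesis's finding is that disentanglement-based models fail on exactly such splits, while object-centric models succeed.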

    Holistic processing of hierarchical structures in connectionist networks

    Get PDF
    Despite the success of connectionist systems in modelling some aspects of cognition, critics argue that the lack of symbol processing makes them inadequate for modelling high-level cognitive tasks which require the representation and processing of hierarchical structures. In this thesis we investigate four mechanisms for encoding hierarchical structures in distributed representations that are suitable for processing in connectionist systems: Tensor Product Representation, Recursive Auto-Associative Memory (RAAM), Holographic Reduced Representation (HRR), and Binary Spatter Code (BSC). In these four schemes representations of hierarchical structures are either learned in a connectionist network or constructed by means of various mathematical operations from binary or real-valued vectors. It is argued that the resulting representations carry structural information without being themselves syntactically structured. The structural information about a represented object is encoded in the position of its representation in a high-dimensional representational space. We use Principal Component Analysis and constructivist networks to show that well-separated clusters consisting of representations for structurally similar hierarchical objects are formed in the representational spaces of RAAMs and HRRs. The spatial structure of HRRs and RAAM representations supports holistic yet structure-sensitive processing of them. Holistic operations on RAAM representations can be learned by backpropagation networks. However, holistic operators over HRRs, Tensor Products, and BSCs have to be constructed by hand, which is not a desirable situation. We propose two new algorithms for learning holistic transformations of HRRs from examples. These algorithms are able to generalise the acquired knowledge to hierarchical objects of higher complexity than the training examples. Such generalisations exhibit systematicity of a degree which, to the best of our knowledge, has not yet been achieved by any other comparable learning method. Finally, we outline how a number of holistic transformations can be learned in parallel and applied to representations of structurally different objects. The ability to distinguish and perform a number of different structure-sensitive operations is one step towards a connectionist architecture that is capable of modelling complex high-level cognitive tasks such as natural language processing and logical inference.
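    Of the four schemes, HRRs have the most compact algebraic core: binding is circular convolution and unbinding is circular correlation (Plate's construction). The sketch below is a minimal numpy illustration of that operation pair, not code from the thesis; the dimensionality and Gaussian random vectors are my assumptions.

```python
# Minimal HRR binding/unbinding sketch (illustrative, not thesis code).
import numpy as np

rng = np.random.default_rng(1)
n = 1024
role = rng.normal(0, 1 / np.sqrt(n), n)    # random Gaussian code vectors
filler = rng.normal(0, 1 / np.sqrt(n), n)

def bind(a, b):
    # HRR binding: circular convolution, computed via the FFT.
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def unbind(trace, cue):
    # HRR unbinding: circular correlation with the cue returns a noisy
    # reconstruction of whatever was bound to it.
    return np.real(np.fft.ifft(np.fft.fft(trace) * np.conj(np.fft.fft(cue))))

trace = bind(role, filler)
recovered = unbind(trace, role)
cos = recovered @ filler / (np.linalg.norm(recovered) * np.linalg.norm(filler))
print(f"cosine(filler, recovered) = {cos:.2f}")  # well above chance (~0)
```

    Because binding here is a fixed algebraic operation rather than learned weights, holistic transformations over HRRs traditionally had to be constructed by hand, which is precisely the gap the thesis's two learning algorithms address.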

    Statistical language learning

    Get PDF
    Theoretical arguments based on the "poverty of the stimulus" have denied a priori the possibility that abstract linguistic representations can be learned inductively from exposure to the environment, given that the linguistic input available to the child is both underdetermined and degenerate. I reassess such learnability arguments by exploring a) the type and amount of statistical information implicitly available in the input in the form of distributional and phonological cues; b) psychologically plausible inductive mechanisms for constraining the search space; c) the nature of linguistic representations, algebraic or statistical. To do so I use three methodologies: experimental procedures, linguistic analyses based on large corpora of naturally occurring speech and text, and computational models implemented in computer simulations. In Chapters 1, 2, and 5, I argue that long-distance structural dependencies - traditionally hard to explain with simple distributional analyses based on n-gram statistics - can indeed be learned associatively provided the amount of intervening material is highly variable or invariant (the Variability effect). In Chapter 3, I show that simple associative mechanisms instantiated in Simple Recurrent Networks can replicate the experimental findings under the same conditions of variability. Chapter 4 presents successes and limits of such results across perceptual modalities (visual vs. auditory) and perceptual presentation (temporal vs. sequential), as well as the impact of long and short training procedures. In Chapter 5, I show that generalisation to abstract categories from stimuli framed in non-adjacent dependencies is also modulated by the Variability effect. In Chapter 6, I show that the putative separation of algebraic and statistical styles of computation based on successful speech segmentation versus unsuccessful generalisation experiments (as published in a recent Science paper) is premature and is the effect of a preference for phonological properties of the input. In Chapter 7, computer simulations of learning irregular constructions suggest that it is possible to learn from positive evidence alone, despite Gold's celebrated arguments on the unlearnability of natural languages. Evolutionary simulations in Chapter 8 show that irregularities in natural languages can emerge from full regularity and remain stable across generations of simulated agents. In Chapter 9 I conclude that the brain may be endowed with a powerful statistical device for detecting structure, generalising, segmenting speech, and recovering from overgeneralisations. The experimental and computational evidence gathered here suggests that statistical language learning is more powerful than heretofore acknowledged by the current literature.
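    The Variability effect admits a toy demonstration. In the sketch below (my own construction, not the thesis's experimental materials), a stream of a_X_b triples with many possible middle items makes every adjacent bigram beginning with "a" individually rare, while the non-adjacent pair a...b remains perfectly predictive:

```python
# Toy non-adjacent dependency stream: high variability of the middle item.
import random
from collections import Counter

random.seed(0)
middles = [f"x{i}" for i in range(24)]   # 24 possible middle items
stream = []
for _ in range(500):
    stream += ["a", random.choice(middles), "b"]

adjacent = Counter(zip(stream, stream[1:]))      # bigram counts
nonadjacent = Counter(zip(stream, stream[2:]))   # skip-one pair counts

# The best adjacent predictor from "a" is weak (spread over 24 middles);
# the skip-one dependency a..b holds every single time.
print(max(v for (w1, _), v in adjacent.items() if w1 == "a") / 500)
print(nonadjacent[("a", "b")] / 500)   # 1.0
```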

    A Connectionist Defence of the Inscrutability Thesis and the Elimination of the Mental

    Get PDF
    This work consists of two parts. In Part I (chapters 1-5), I shall produce a Connectionist Defence of Quine's Thesis of the Inscrutability of Reference, according to which there is no objective fact of the matter as to what the ontological commitments of the speakers of a language are. I shall start by reviewing Quine's project in his original behaviouristic setting. Chapters 1 and 2 will be devoted to addressing several criticisms that Gareth Evans and Crispin Wright have put forward on behalf of the friend of semantic realism. Evans (1981) and, more recently, Wright (1997) have argued on different grounds that, under certain conditions, structural simplicity may become alethic, i.e., truth-conducive, for semantic theories. Being structurally more complex than the standard semantic theory, Quine's perverse semantic route (see chapter 1) is easy prey for Evans' and Wright's considerations. I shall argue that both Evans' and Wright's criticisms are unmotivated, and do not jeopardize Quine's overall enterprise. I shall then propose a perverse theory of reference (chapter 3) which differs substantially from the ones advanced in the previous literature on the issue. The motivation for pursuing a different perverse semantic proposal resides in the fact that the route I shall be offering is as simple, structurally speaking, as our sanctioned theory of reference is meant to be. Thanks to this feature, my strategy is not subject to certain criticisms which may put perverse proposals a la Quine in jeopardy, thereby becoming an overall better candidate for the Quinean to fulfill her goal. In chapter 4, I shall introduce and develop a criterion recently produced by Wright (1997) in terms of 'psychological simplicity' which threatens the perverse semantic proposal I offered in chapter 3. I shall argue that a Language-of-Thought (LOT) model of human cognition could motivate Wright's criterion. I shall then introduce the reader to some basic aspects of connectionist theory, and elaborate on a particularly promising neurocomputational approach to language processing put forward by Jeff Elman (1992; 1998). I shall argue that if instead of endorsing a LOT hypothesis, we model human cognition by a recurrent neural network a la Elman, then Wright's criterion is unmotivated. In particular, I shall argue that considerations regarding 'psychological simplicity' are neutral, favouring neither a standard theory of reference, nor a perverse one. In the remainder of Part I, I shall focus upon certain problems for the defender of the Inscrutability Thesis highlighted by the friend of connectionist theory. In chapter 5 I shall introduce a mathematical technique for measuring conceptual similarity across networks that Aarre Laakso and Gary Cottrell (1998; 2000) have recently developed (a sketch of the measure follows this abstract). I shall show how Paul Churchland makes use of Laakso and Cottrell's results to argue that connectionism can furnish us with all we need to construct a robust theory of semantics, and a robust theory of translation, a robustness that may potentially be exploited by a connectionist foe of Quine to argue against the Inscrutability Thesis. The bulk of the chapter will be devoted to showing that the notion of conceptual similarity available to the connectionist leaves room for a "connectionist Quinean" to kick in with a one-to-many translational mapping across networks.
In Part II (chapters 6 and 7), I shall produce a Connectionist Defence of the Thesis of Eliminative Materialism, according to which propositional attitudes don't exist (see chapter 7). I shall start by replying to two arguments that Stephen Stich has recently put forward against the thesis of eliminative materialism. In a nutshell, Stich (1990; 1991) argues that (i) the thesis of eliminative materialism is neither true nor false, and that (ii) even if it were true, that would be philosophically uninteresting. To support (i) and (ii) Stich relies on two premises: (a) that the job of a theory of reference is to make explicit the tacit theory of reference which underlies our intuitions about the notion of reference itself; and (b) that our intuitive notion of reference is a highly idiosyncratic one. In chapter 6 I shall address Stich's anti-eliminativist claims (i) and (ii). I shall argue that even if we agreed with premises (a) and (b), that would lend no support whatsoever to (i) and (ii). Finally, in chapter 7, I shall introduce a connectionist-inspired conditional argument for the elimination of the posits of folk psychology put forward by William Ramsey, Stephen Stich, and Joseph Garon. I shall consider an objection to the eliminativist argument raised by Andy Clark. I shall then review a counter that Stephen Stich and Ted Warfield produce on behalf of the eliminativist. The discussion in chapter 5 on 'state space semantics and conceptual similarity' will be used to show that Clark's argument is not threatened by Stich and Warfield's considerations. Then, in the remainder of Part II, I shall offer a different line of argument to counter Clark, one that focuses on the notion of causal efficacy. I hope to show that the thesis of eliminative materialism is correct. Conclusions, and directions for future research, will follow.
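The conceptual-similarity measure mentioned for chapter 5 can be given a brief computational gloss. The code below is my own minimal rendering of the Laakso and Cottrell idea, not their implementation: two networks count as representing similarly when the pairwise distances among their hidden activations for the same inputs correlate, which makes the measure indifferent to how individual hidden units happen to be arranged.

```python
# Sketch of a Laakso-and-Cottrell-style similarity measure (illustrative).
import numpy as np

def conceptual_similarity(h1, h2):
    """h1, h2: (n_inputs, n_hidden) activations of two networks on the
    same inputs; the hidden sizes may differ."""
    def pairwise_distances(h):
        diff = h[:, None, :] - h[None, :, :]
        d = np.sqrt((diff ** 2).sum(-1))
        iu = np.triu_indices(len(h), k=1)
        return d[iu]                      # upper triangle, flattened
    return np.corrcoef(pairwise_distances(h1), pairwise_distances(h2))[0, 1]

rng = np.random.default_rng(2)
h = rng.normal(size=(20, 10))
q = np.linalg.qr(rng.normal(size=(10, 10)))[0]      # random rotation
print(conceptual_similarity(h, h @ q))              # ~1.0: same geometry
print(conceptual_similarity(h, rng.normal(size=(20, 10))))  # near 0
```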

    Revisiting lexical ambiguity effects in visual word recognition

    Get PDF
    The aim of this work is to focus on how lexically ambiguous words are represented in the mental lexicon of speakers. The existence of words with multiple meanings/senses (e.g., credenza, mora, etc. in Italian) is a pervasive feature of natural language. Speakers of almost all languages routinely encounter ambiguous words, whose correct interpretation requires recourse to the linguistic context in which they occur... [edited by author]

    Introduction to Psycholinguistics

    Get PDF