12 research outputs found

    Towards Lifelong Reasoning with Sparse and Compressive Memory Systems

    Humans have a remarkable ability to remember information over long time horizons. When reading a book, we build up a compressed representation of the past narrative, such as the characters and events that have built up the story so far. We can do this even if they are separated by thousands of words from the current text, or long stretches of time between readings. During our life, we build up and retain memories that tell us where we live, what we have experienced, and who we are. Adding memory to artificial neural networks has been transformative in machine learning, allowing models to extract structure from temporal data, and more accurately model the future. However, the capacity for long-range reasoning in current memory-augmented neural networks is considerably limited in comparison to humans, despite access to powerful modern computers. This thesis explores two prominent approaches towards scaling artificial memories to lifelong capacity: sparse access and compressive memory structures. With sparse access, the inspection, retrieval, and updating of only a very small subset of pertinent memory is considered. It is found that sparse memory access is beneficial for learning, allowing for improved data-efficiency and improved generalisation. From a computational perspective, sparsity allows scaling to memories with millions of entities on a simple CPU-based machine. It is shown that memory systems that compress the past to a smaller set of representations reduce redundancy, can speed up the learning of rare classes, and can improve upon classical data-structures in database systems. Compressive memory architectures are also devised for sequence prediction tasks and are observed to significantly advance the state of the art in modelling natural language.

    A complex systems approach to education in Switzerland

    The insights gained from the study of complex systems in biological, social, and engineered systems enable us not only to observe and understand, but also to actively design systems which will be capable of successfully coping with complex and dynamically changing situations. The methods and mindset required for this approach have been applied to educational systems with their diverse levels of scale and complexity. Based on the general case made by Yaneer Bar-Yam, this paper applies the complex systems approach to the educational system in Switzerland. It confirms that the complex systems approach is valid. Indeed, many recommendations made for the general case have already been implemented in the Swiss education system. To address existing problems and difficulties, further steps are recommended. This paper contributes to the further establishment of the complex systems approach by shedding light on an area which concerns us all, which is a frequent topic of discussion and dispute among politicians and the public, where billions of dollars have been spent without achieving the desired results, and where it is difficult to directly derive consequences from actions taken. The analysis of the education system's different levels, their complexity and scale will clarify how such a dynamic system should be approached, and how it can be guided towards the desired performance.

    Brain Computations and Connectivity [2nd edition]

    This is an open access title available under the terms of a CC BY-NC-ND 4.0 International licence. It is free to read on the Oxford Academic platform and offered as a free PDF download from OUP and selected open access locations. Brain Computations and Connectivity is about how the brain works. In order to understand this, it is essential to know what is computed by different brain systems, and how the computations are performed. The aim of this book is to elucidate what is computed in different brain systems, and to describe current biologically plausible computational approaches and models of how each of these brain systems computes. Understanding the brain in this way has enormous potential for understanding ourselves better in health and in disease. Potential applications of this understanding are to the treatment of the brain in disease, and to artificial intelligence, which will benefit from knowledge of how the brain performs many of its extraordinarily impressive functions. This book is pioneering in taking this approach to brain function: to consider what is computed by many of our brain systems, and how it is computed. It updates, with much new evidence including the connectivity of the human brain, the earlier book: Rolls (2021) Brain Computations: What and How, Oxford University Press. Brain Computations and Connectivity will be of interest to all scientists interested in brain function and how the brain works, whether they are from neuroscience, or from medical sciences including neurology and psychiatry, or from the area of computational science including machine learning and artificial intelligence, or from areas such as theoretical physics.

    Advances in Multiple Viewpoint Systems and Applications in Modelling Higher Order Musical Structure

    PhD thesis. Statistical approaches are capable of underpinning strong models of musical structure, perception, and cognition. Multiple viewpoint systems are probabilistic models of sequential prediction that aim to capture the multidimensional aspects of a symbolic domain with predictions from multiple finite-context models combined in an information-theoretically informed way. Information theory provides an important grounding for such models. In computational terms, information content is an empirical measure of compressibility for model evaluation, and entropy a powerful weighting system for combining predictions from multiple models. In perceptual terms, clear parallels can be drawn between information content and surprise, and between entropy and certainty. In cognitive terms, information theory underpins explanatory models of both musical representation and expectation. The thesis makes two broad contributions to the field of statistical modelling of music cognition: firstly, advancing the general understanding of multiple viewpoint systems, and, secondly, developing bottom-up, statistical learning methods capable of capturing higher order structure. In the first category, novel methods for predicting multiple basic attributes are empirically tested, significantly outperforming established methods, and refuting the assumption found in the literature that basic attributes are statistically independent from one another. Additionally, novel techniques for improving the prediction of derived viewpoints (viewpoints that abstract information away from whatever musical surface is under consideration) are introduced and analysed, and their relation with cognitive representations explored. Finally, the performance and suitability of an established algorithm that automatically constructs locally optimal multiple viewpoint systems is tested.
In the second category, the current research brings together a number of existing statistical methods for segmentation and modelling musical surfaces with the aim of representing higher-order structure. A comprehensive review and empirical evaluation of these information theoretic segmentation methods is presented. Methods for labelling higher order segments, akin to layers of abstraction in a representation, are empirically evaluated and the cognitive implications explored. The architecture and performance of the models are assessed from cognitive and musicological perspectives. Media and Arts Technology programme, EPSRC Doctoral Training Centre EP/G03723X/1.

    Computations and Computers in the Sciences of Mind and Brain

    Computationalism says that brains are computing mechanisms, that is, mechanisms that perform computations. At present, there is no consensus on how to formulate computationalism precisely or adjudicate the dispute between computationalism and its foes, or between different versions of computationalism. An important reason for the current impasse is the lack of a satisfactory philosophical account of computing mechanisms. The main goal of this dissertation is to offer such an account. I also believe that the history of computationalism sheds light on the current debate. By tracing different versions of computationalism to their common historical origin, we can see how the current divisions originated and understand their motivation. Reconstructing debates over computationalism in the context of their own intellectual history can contribute to philosophical progress on the relation between brains and computing mechanisms and help determine how brains and computing mechanisms are alike, and how they differ. Accordingly, my dissertation is divided into a historical part, which traces the early history of computationalism up to 1946, and a philosophical part, which offers an account of computing mechanisms. The two main ideas developed in this dissertation are that (1) computational states are to be identified functionally, not semantically, and (2) computing mechanisms are to be studied by functional analysis. The resulting account of computing mechanisms, which I call the functional account of computing mechanisms, can be used to identify computing mechanisms and the functions they compute. I use the functional account of computing mechanisms to taxonomize computing mechanisms based on their different computing power, and I use this taxonomy of computing mechanisms to taxonomize different versions of computationalism based on the functional properties that they ascribe to brains.
By doing so, I begin to tease out empirically testable statements about the functional organization of the brain that different versions of computationalism are committed to. I submit that when computationalism is reformulated in the more explicit and precise way I propose, the disputes about computationalism can be adjudicated on the grounds of empirical evidence from neuroscience.

    Uncertainty in Artificial Intelligence: Proceedings of the Thirty-Fourth Conference


    Aspects of emergent cyclicity in language and computation

    This thesis has four parts, which correspond to the presentation and development of a theoretical framework for the study of cognitive capacities qua physical phenomena, and a case study of locality conditions over natural languages. Part I deals with computational considerations, setting the tone of the rest of the thesis, and introducing and defining critical concepts like ‘grammar’, ‘automaton’, and the relations between them. Fundamental questions concerning the place of formal language theory in linguistic inquiry, as well as the expressibility of linguistic and computational concepts in common terms, are raised in this part. Part II further explores the issues addressed in Part I with particular emphasis on how grammars are implemented by means of automata, and the properties of the formal languages that these automata generate. We will argue against the equation between effective computation and function-based computation, and introduce examples of computable procedures which are nevertheless impossible to capture using traditional function-based theories. The connection with cognition will be made in the light of dynamical frustrations: the irreconcilable tension between mutually incompatible tendencies that hold for a given dynamical system. We will provide arguments in favour of analyzing natural language as emerging from a tension between different systems (essentially, semantics and morpho-phonology) which impose orthogonal requirements over admissible outputs. The concept of level of organization or scale comes to the foreground here, and apparent contradictions and incommensurabilities between concepts and theories are revisited in a new light: that of dynamical nonlinear systems which are fundamentally frustrated.
We will also characterize the computational system that emerges from such an architecture: the goal is to get a syntactic component which assigns the simplest possible structural description to sub-strings, in terms of its computational complexity. A system which can oscillate back and forth in the hierarchy of formal languages in assigning structural representations to local domains will be referred to as a computationally mixed system. Part III is where the really fun stuff starts. Field theory is introduced, and its applicability to neurocognitive phenomena is made explicit, with all due scale considerations. Physical and mathematical concepts are permanently interacting as we analyze phrase structure in terms of pseudo-fractals (in Mandelbrot’s sense) and define syntax as a (possibly unary) set of topological operations over completely Hausdorff (CH) ultrametric spaces. These operations, which make field perturbations interfere, transform that initial completely Hausdorff ultrametric space into a metric, Hausdorff space with a weaker separation axiom. Syntax, in this proposal, is not ‘generative’ in any traditional sense (except the ‘fully explicit theory’ one): rather, it partitions (technically, ‘parametrizes’) a topological space. Syntactic dependencies are defined as interferences between perturbations over a field, which reduce the total entropy of the system per cycles, at the cost of introducing further dimensions where attractors corresponding to interpretations for a phrase marker can be found. Part IV is a sample of what we can gain by further pursuing the physics of language approach, both in terms of empirical adequacy and theoretical elegance, not to mention the unlimited possibilities of interdisciplinary collaboration.
In this section we set our focus on island phenomena as defined by Ross (1967), critically revisiting the most relevant literature on this topic, and establishing a typology of constructions that are strong islands, which cannot be violated. These constructions are particularly interesting because they limit the phase space of what is expressible via natural language, and thus reveal crucial aspects of its underlying dynamics. We will argue that a dynamically frustrated system which is characterized by displaying mixed computational dependencies can provide straightforward characterizations of cyclicity in terms of changes in dependencies in local domains.