
    Binding and Normalization of Binary Sparse Distributed Representations by Context-Dependent Thinning

    Distributed representations have often been criticized as inappropriate for encoding data with complex structure. However, Plate's Holographic Reduced Representations and Kanerva's Binary Spatter Codes are recent schemes that allow on-the-fly encoding of nested compositional structures by real-valued or dense binary vectors of fixed dimensionality. In this paper we consider the Context-Dependent Thinning procedures, which were developed for the representation of complex hierarchical items in the architecture of Associative-Projective Neural Networks. These procedures provide binding of items represented by sparse binary codevectors (with a low probability of 1s). Such an encoding is biologically plausible and allows a high storage capacity in the distributed associative memory where the codevectors may be stored. In contrast to known binding procedures, Context-Dependent Thinning preserves the same low density (or sparseness) of the bound codevector for a varying number of component codevectors. Moreover, a bound codevector is not only similar to other bound codevectors with similar components (as in other schemes), but is also similar to the component codevectors themselves. This allows the similarity of structures to be estimated simply by the overlap of their codevectors, without retrieval of the component codevectors, and it also allows easy retrieval of the component codevectors. Examples of algorithmic and neural-network implementations of the thinning procedures are considered. We also present representation examples for various types of nested structured data (propositions using role-filler and predicate-argument representation schemes, trees, directed acyclic graphs) using sparse codevectors of fixed dimension. Such representations may provide a fruitful alternative to the symbolic representations of traditional AI, as well as to localist and microfeature-based connectionist representations.
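
    To make the binding step concrete, here is a minimal Python sketch of additive Context-Dependent Thinning (the function name, density target, and use of seeded random permutations are illustrative assumptions, not the paper's exact procedure). Because the result is a subset of the superposition's 1s, it stays similar to each component codevector, which is the property the abstract highlights:

```python
import numpy as np

def cdt_bind(vectors, target_density=0.01, max_rounds=100, seed=0):
    """Additive Context-Dependent Thinning (sketch).

    Superimpose sparse binary codevectors by elementwise OR, then thin
    the (too dense) superposition by AND-ing it with permuted copies of
    itself, OR-ing the survivors until the target density is reached.
    """
    z = np.bitwise_or.reduce(np.stack(vectors))   # superposition of components
    rng = np.random.default_rng(seed)             # fixed permutations => reproducible binding
    out = np.zeros_like(z)
    for _ in range(max_rounds):
        perm = rng.permutation(z.size)            # one random permutation per round
        out |= z & z[perm]                        # keep 1s that survive the permuted mask
        if out.mean() >= target_density:          # density of the bound codevector
            break
    return out

# Example: bind three sparse 10,000-bit codevectors; the result keeps
# roughly the density of a single component and overlaps each of them.
rng = np.random.default_rng(1)
comps = [(rng.random(10_000) < 0.01).astype(np.uint8) for _ in range(3)]
bound = cdt_bind(comps, target_density=0.01)
print(bound.mean(), [int((bound & c).sum()) for c in comps])
```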

    Why Neurons Have Thousands of Synapses, A Theory of Sequence Memory in Neocortex

    Neocortical neurons have thousands of excitatory synapses. It is a mystery how neurons integrate the input from so many synapses and what kind of large-scale network behavior this enables. It has previously been proposed that non-linear properties of dendrites enable neurons to recognize multiple patterns. In this paper we extend this idea by showing that a neuron with several thousand synapses arranged along active dendrites can learn to accurately and robustly recognize hundreds of unique patterns of cellular activity, even in the presence of large amounts of noise and pattern variation. We then propose a neuron model where some of the patterns recognized by a neuron lead to action potentials and define the classic receptive field of the neuron, whereas the majority of the patterns recognized by a neuron act as predictions by slightly depolarizing the neuron without immediately generating an action potential. We then present a network model based on neurons with these properties and show that the network learns a robust model of time-based sequences. Given the similarity of excitatory neurons throughout the neocortex and the importance of sequence memory in inference and behavior, we propose that this form of sequence memory is a universal property of neocortical tissue. We further propose that cellular layers in the neocortex implement variations of the same sequence memory algorithm to achieve different aspects of inference and behavior. The neuron and network models we introduce are robust over a wide range of parameters as long as the network uses a sparse distributed code of cellular activations. The sequence capacity of the network scales linearly with the number of synapses on each neuron. Thus neurons need thousands of synapses to learn the many temporal patterns in sensory stimuli and motor sequences. (Comment: Submitted for publication.)
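
    As a toy illustration of the neuron model described above (names, thresholds, and data structures are invented for illustration; the paper's actual model is more detailed), a dendritic segment can be treated as a threshold detector over a set of presynaptic cells, which is what makes recognition tolerant to noise and pattern variation:

```python
def segment_matches(segment, active_cells, threshold=10):
    """A dendritic segment 'recognizes' a pattern when enough of its
    synapses coincide with the currently active cells."""
    return len(segment & active_cells) >= threshold

def neuron_state(proximal_segments, distal_segments, active_cells):
    """Proximal matches drive an action potential and define the classic
    receptive field; distal matches merely depolarize the cell, putting
    it in a predictive state for the next time step."""
    fires = any(segment_matches(s, active_cells) for s in proximal_segments)
    predicted = any(segment_matches(s, active_cells) for s in distal_segments)
    return fires, predicted

# Example: cells are integers; one distal segment predicts pattern {1..15}.
distal = [set(range(1, 16))]
proximal = [set(range(100, 140))]
print(neuron_state(proximal, distal, active_cells=set(range(1, 13))))  # (False, True)
```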

    Towards Lifelong Reasoning with Sparse and Compressive Memory Systems

    Humans have a remarkable ability to remember information over long time horizons. When reading a book, we build up a compressed representation of the past narrative, such as the characters and events that have shaped the story so far. We can do this even if they are separated from the current text by thousands of words, or by long stretches of time between readings. During our lives, we build up and retain memories that tell us where we live, what we have experienced, and who we are. Adding memory to artificial neural networks has been transformative in machine learning, allowing models to extract structure from temporal data and model the future more accurately. However, the capacity for long-range reasoning in current memory-augmented neural networks is considerably limited in comparison to humans, despite access to powerful modern computers. This thesis explores two prominent approaches towards scaling artificial memories to lifelong capacity: sparse access and compressive memory structures. With sparse access, only a very small subset of pertinent memories is inspected, retrieved, and updated. Sparse memory access is found to be beneficial for learning, improving both data efficiency and generalisation. From a computational perspective, sparsity allows scaling to memories with millions of entities on a simple CPU-based machine. It is shown that memory systems that compress the past into a smaller set of representations reduce redundancy, can speed up the learning of rare classes, and improve upon classical data structures in database systems. Compressive memory architectures are also devised for sequence prediction tasks and are observed to significantly advance the state of the art in modelling natural language.
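
    A rough numpy sketch of the sparse-access idea (illustrative only; the thesis's actual architectures differ, and a real system would replace the dense scoring below with an approximate nearest-neighbour index so that reads, too, avoid touching every slot):

```python
import numpy as np

def sparse_read(memory, query, k=4):
    """Score all slots but read from (and later update) only the top-k
    most similar ones, so per-step writes touch O(k) rows rather than
    the whole memory."""
    scores = memory @ query                      # similarity of query to every slot
    top = np.argpartition(scores, -k)[-k:]       # indices of the k best slots
    w = np.exp(scores[top] - scores[top].max())
    w /= w.sum()                                 # softmax over the k slots only
    return w @ memory[top], top                  # weighted read + slots to update

# Example: a memory of 1,000,000 slots, read sparsely.
mem = np.random.default_rng(0).standard_normal((1_000_000, 32)).astype(np.float32)
read, slots = sparse_read(mem, mem[123], k=4)
print(slots)  # should include slot 123 among the winners
```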

    Statistical physics of neural systems

    The ability to process and store information is considered a characteristic trait of intelligent systems. In biological neural networks, learning is strongly believed to take place at the synaptic level, through modulation of synaptic efficacy. It can thus be interpreted as a collective phenomenon, emerging when neurons connect to each other and form a complex network of interactions. In this work, we represent learning as an optimization problem: a local search, in synaptic space, for specific configurations, known as solutions, that make a neural network able to accomplish a series of different tasks. For instance, we would like the network to adapt the strengths of its synaptic connections so that it can classify a series of objects, assigning to each object its corresponding class label. A series of experiments has suggested that synapses may exploit only a small number of synaptic states for encoding information. It is known that this feature makes learning in neural networks a challenging task. Extending the large-deviation analysis performed in the extreme case of binary synaptic couplings, we prove the existence of regions of the phase space where solutions are organized in extremely dense clusters. This picture turns out to be invariant to the tuning of all the parameters of the model. Solutions within the clusters are more robust to noise, thus enhancing learning performance. This has inspired the design of new learning algorithms and has clarified the effectiveness of previously proposed ones. We further provide quantitative evidence that the gain achievable by allowing more synaptic states for encoding information is significant only up to a very small number of bits, in line with the above-mentioned experimental results. Beyond the challenge of low-precision synaptic connections, it is also known that the neuronal environment is extremely noisy. Whether stochasticity enhances or worsens learning performance is currently a matter of debate. In this work, we consider a neural network model where the synaptic connections are random variables, sampled according to a parametrized probability distribution. We prove that this source of stochasticity naturally drives the system towards regions of the phase space with high densities of solutions. These regions are directly accessible by gradient-descent strategies over the parameters of the synaptic coupling distribution. We further set up a statistical physics analysis, through which we show that solutions in the dense regions are characterized by robustness and good generalization performance. Stochastic neural networks are also capable of building abstract representations of input stimuli and then generating new input samples, according to the inferred statistics of the input signal. In this regard, we propose a new learning rule, called Delayed Correlation Matching (DCM), which, by matching time-delayed activity correlations, enables a neural network to store patterns of neuronal activity. When considering hidden neuronal states, the DCM learning rule is also able to train Restricted Boltzmann Machines as generative models. In this work, we further require the DCM learning rule to fulfil some biological constraints, such as locality, sparseness of the neural coding, and Dale's principle.
    While satisfying all these biological requirements, the DCM learning rule has proved effective for different network topologies, both in online learning regimes and in the presence of correlated patterns. We further show that it is also able to prevent the creation of spurious attractor states.
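
    As a loose illustration of the delayed-correlation-matching idea (the exact DCM rule in the thesis may differ; the two-phase form and all names below are assumptions made for the sketch):

```python
import numpy as np

def dcm_step(w, pre_prev, post_clamped, post_free, lr=0.01):
    """One update: move the network's own delayed pre/post correlation
    toward the correlation imposed by the training pattern. The rule is
    local: each dw[i, j] depends only on the activity of neurons i and j,
    matching the locality constraint mentioned in the abstract."""
    target = np.outer(post_clamped, pre_prev)  # delayed correlation, pattern clamped
    model = np.outer(post_free, pre_prev)      # delayed correlation, free dynamics
    return w + lr * (target - model)
```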

    AI of Brain and Cognitive Sciences: From the Perspective of First Principles

    In recent years, we have witnessed the great success of AI in various applications, including image classification, game playing, protein structure analysis, language translation, and content generation. Despite these powerful applications, there are still many tasks in our daily life that are rather simple for humans but pose great challenges to AI. These include image and language understanding, few-shot learning, abstract concepts, and low-energy-cost computing. Thus, learning from the brain remains a promising route to shed light on the development of next-generation AI. The brain is arguably the only known intelligent machine in the universe, the product of evolution as animals survived in the natural environment. At the behavior level, psychology and cognitive sciences have demonstrated that human and animal brains can execute very intelligent high-level cognitive functions. At the structure level, cognitive and computational neurosciences have unveiled that the brain has extremely complicated but elegant network forms to support its functions. Over the years, knowledge about the structure and functions of the brain has accumulated, and this process has accelerated recently with the initiation of giant brain projects worldwide. Here, we argue that the general principles of brain function are the most valuable things to inspire the development of AI. These general principles are the standard rules by which the brain extracts, represents, manipulates, and retrieves information, and here we call them the first principles of the brain. This paper collects six such first principles: attractor networks, criticality, random networks, sparse coding, relational memory, and perceptual learning. On each topic, we review its biological background, fundamental properties, potential applications to AI, and future developments. (Comment: 59 pages, 5 figures, review article.)
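
    To give one of the six principles a concrete face, here is a textbook-style Hopfield attractor network in Python (a standard sketch of the attractor-network principle, not code from the paper):

```python
import numpy as np

def hopfield_store(patterns):
    """Hebbian outer-product storage of +/-1 patterns in a symmetric
    weight matrix with zero diagonal: each pattern becomes an attractor."""
    n = patterns.shape[1]
    w = (patterns.T @ patterns) / n
    np.fill_diagonal(w, 0.0)
    return w

def hopfield_recall(w, state, steps=20):
    """Iterate the update rule until a fixed point: a noisy cue falls
    into the nearest stored attractor."""
    for _ in range(steps):
        new = np.sign(w @ state)
        new[new == 0] = 1
        if np.array_equal(new, state):
            break
        state = new
    return state

# Example: corrupt 10 bits of a stored pattern and recover it.
rng = np.random.default_rng(0)
pats = rng.choice([-1.0, 1.0], size=(3, 200))
w = hopfield_store(pats)
cue = pats[0].copy(); cue[:10] *= -1
print(np.array_equal(hopfield_recall(w, cue), pats[0]))  # True (with high probability)
```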