Binding and Normalization of Binary Sparse Distributed Representations by Context-Dependent Thinning
Distributed representations have often been criticized as inappropriate for encoding data with complex structure. However, Plate's Holographic Reduced Representations and Kanerva's Binary Spatter Codes are recent schemes that allow on-the-fly encoding of nested compositional structures by real-valued or dense binary vectors of fixed dimensionality.
In this paper we consider the procedures of Context-Dependent Thinning, developed for the representation of complex hierarchical items in the architecture of Associative-Projective Neural Networks. These procedures bind items represented by sparse binary codevectors (with a low probability of 1s). Such an encoding is biologically plausible and allows a high storage capacity in the distributed associative memory where the codevectors may be stored.
In contrast to known binding procedures, Context-Dependent Thinning preserves the same low density (or sparseness) of the bound codevector for a varying number of component codevectors. Moreover, a bound codevector is not only similar to other codevectors with similar components (as in other schemes), but also to the component codevectors themselves. This allows the similarity of structures to be estimated simply by the overlap of their codevectors, without retrieving the components, and it also makes retrieval of the component codevectors easy.
Examples of algorithmic and neural-network implementations of the thinning procedures are considered. We also present representation examples for various types of nested structured data (propositions using role-filler and predicate-arguments representation schemes, trees, directed acyclic graphs) using sparse codevectors of fixed dimension. Such representations may provide a fruitful alternative to the symbolic representations of traditional AI, as well as to localist and microfeature-based connectionist representations.
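The additive variant of the thinning procedure can be sketched in a few lines. The NumPy sketch below is an illustration, not the paper's reference implementation: component codevectors are superimposed by elementwise OR, and the superposition is then conjoined with the union of K fixed random permutations of itself, so the result stays sparse yet overlaps each component. The dimensionality, density, and K used here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 10_000      # codevector dimensionality (illustrative)
P_ONE = 0.01    # probability of a 1 in a component codevector (illustrative)
K = 13          # number of fixed permutations; tune so that binding a few
                # components keeps roughly the density of a single component

# Fixed random permutations, shared across all binding operations.
PERMS = [rng.permutation(N) for _ in range(K)]

def random_codevector():
    """A sparse binary codevector with ~P_ONE density."""
    return (rng.random(N) < P_ONE).astype(np.uint8)

def cdt_bind(components):
    """Additive Context-Dependent Thinning (sketch).

    Superimpose the components by elementwise OR, then keep a 1 only
    where the superposition coincides with the union of its own
    permuted copies.  The result stays sparse for any number of
    components and overlaps each component codevector.
    """
    z = np.bitwise_or.reduce(np.stack(components))        # superposition
    permuted = np.bitwise_or.reduce(
        np.stack([z[perm] for perm in PERMS]))            # union of permuted copies
    return z & permuted                                   # thinning

a, b, c = (random_codevector() for _ in range(3))
bound = cdt_bind([a, b, c])
print("density:", bound.mean())                   # comparable to one component
print("overlap with a:", int((bound & a).sum()))  # bound vector resembles components
```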
Why Neurons Have Thousands of Synapses, A Theory of Sequence Memory in Neocortex
Neocortical neurons have thousands of excitatory synapses. It is a mystery how neurons integrate the input from so many synapses and what kind of large-scale network behavior this enables. It has been previously proposed that non-linear properties of dendrites enable neurons to recognize multiple patterns. In this paper we extend this idea by showing that a neuron with several thousand synapses arranged along active dendrites can learn to accurately and robustly recognize hundreds of unique patterns of cellular activity, even in the presence of large amounts of noise and pattern variation. We then propose a neuron model where some of the patterns recognized by a neuron lead to action potentials and define the classic receptive field of the neuron, whereas the majority of the patterns recognized by a neuron act as predictions by slightly depolarizing the neuron without immediately generating an action potential. We then present a network model based on neurons with these properties and show that the network learns a robust model of time-based sequences. Given the similarity of excitatory neurons throughout the neocortex and the importance of sequence memory in inference and behavior, we propose that this form of sequence memory is a universal property of neocortical tissue. We further propose that cellular layers in the neocortex implement variations of the same sequence memory algorithm to achieve different aspects of inference and behavior. The neuron and network models we introduce are robust over a wide range of parameters as long as the network uses a sparse distributed code of cellular activations. The sequence capacity of the network scales linearly with the number of synapses on each neuron. Thus neurons need thousands of synapses to learn the many temporal patterns in sensory stimuli and motor sequences.
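As a toy illustration of the dendritic idea (not the authors' published implementation), the sketch below models a neuron as a set of dendritic segments, each holding synapses onto a sparse binary activity vector: a segment that matches enough active cells fires, with proximal matches driving an action potential and distal matches merely depolarizing (predicting). All sizes and thresholds here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

N_CELLS = 2048            # presynaptic population size (illustrative)
SYNAPSES_PER_SEGMENT = 40
SEGMENT_THRESHOLD = 15    # active synapses needed to trigger a segment

class Neuron:
    """Toy point neuron with active dendritic segments.

    Each segment stores the indices of its synapses.  A segment
    'recognizes' a pattern when enough of its synapses fall on
    currently active cells.  Proximal segments cause firing; distal
    segments only predict (depolarize).
    """
    def __init__(self, n_proximal=1, n_distal=20):
        self.proximal = [self._new_segment() for _ in range(n_proximal)]
        self.distal = [self._new_segment() for _ in range(n_distal)]

    @staticmethod
    def _new_segment():
        return rng.choice(N_CELLS, SYNAPSES_PER_SEGMENT, replace=False)

    @staticmethod
    def _segment_active(segment, active_cells):
        return np.isin(segment, active_cells).sum() >= SEGMENT_THRESHOLD

    def learn_distal(self, segment_idx, active_cells):
        """Point a distal segment's synapses at a sample of the active cells."""
        self.distal[segment_idx] = rng.choice(
            active_cells, SYNAPSES_PER_SEGMENT, replace=False)

    def step(self, active_cells):
        fires = any(self._segment_active(s, active_cells) for s in self.proximal)
        predicts = any(self._segment_active(s, active_cells) for s in self.distal)
        return fires, predicts

# A sparse pattern of 40 active cells out of 2048 (~2% activity).
pattern = rng.choice(N_CELLS, 40, replace=False)
neuron = Neuron()
neuron.learn_distal(0, pattern)   # store the pattern on one distal segment
print(neuron.step(pattern))       # typically (False, True): predicted, not driven
```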
Towards Lifelong Reasoning with Sparse and Compressive Memory Systems
Humans have a remarkable ability to remember information over long time horizons. When reading a book, we build up a compressed representation of the past narrative, such as the characters and events that have built up the story so far. We can do this even if they are separated from the current text by thousands of words, or by long stretches of time between readings. During our lives, we build up and retain memories that tell us where we live, what we have experienced, and who we are. Adding memory to artificial neural networks has been transformative in machine learning, allowing models to extract structure from temporal data and model the future more accurately. However, the capacity for long-range reasoning in current memory-augmented neural networks is considerably limited in comparison to humans, despite access to powerful modern computers. This thesis explores two prominent approaches towards scaling artificial memories to lifelong capacity: sparse access and compressive memory structures. With sparse access, only a very small subset of pertinent memories is inspected, retrieved, and updated. Sparse memory access is found to be beneficial for learning, allowing improved data efficiency and generalisation. From a computational perspective, sparsity allows scaling to memories with millions of entities on a simple CPU-based machine. It is shown that memory systems that compress the past into a smaller set of representations reduce redundancy, can speed up the learning of rare classes, and can improve upon classical data structures in database systems. Compressive memory architectures are also devised for sequence prediction tasks and are observed to significantly advance the state of the art in modelling natural language.
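The sparse-access idea can be sketched in a few lines: instead of attending to every slot of a large memory, read only the top-k most similar slots and write back only to those. The NumPy sketch below is an illustrative approximation, not the thesis's architecture; the memory size, key width, and k are assumptions, and a production system would replace the brute-force search with an approximate nearest-neighbour index.

```python
import numpy as np

rng = np.random.default_rng(2)

N_SLOTS, D, K = 100_000, 64, 8   # memory size, key width, slots touched per step

memory = rng.standard_normal((N_SLOTS, D)).astype(np.float32)

def sparse_read(query, k=K):
    """Read only the top-k most similar slots instead of the whole memory."""
    scores = memory @ query                 # brute force; use an ANN index at scale
    top = np.argpartition(scores, -k)[-k:]  # indices of the k best-matching slots
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()                # softmax over just the k slots
    return weights @ memory[top], top

def sparse_write(slot_indices, value, lr=0.1):
    """Update only the slots that were just read."""
    memory[slot_indices] += lr * (value - memory[slot_indices])

query = rng.standard_normal(D).astype(np.float32)
read_value, touched = sparse_read(query)
sparse_write(touched, query)   # move only the touched slots toward the query
```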
Statistical physics of neural systems
The ability to process and store information is considered a characteristic trait of intelligent systems. In biological neural networks, learning is strongly believed to take place at the synaptic level, through the modulation of synaptic efficacy. It can thus be interpreted as the expression of a collective phenomenon, emerging when neurons connect to each other and constitute a complex network of interactions. In this work, we represent learning as an optimization problem, implemented as a local search in synaptic space for specific configurations, known as solutions, that make a neural network able to accomplish a series of different tasks. For instance, we would like the network to adapt the strength of its synaptic connections so as to classify a series of objects, assigning to each object its corresponding class label.
A series of experiments has suggested that synapses may exploit only a small number of synaptic states for encoding information, a feature known to make learning in neural networks a challenging task. Extending the large-deviation analysis performed in the extreme case of binary synaptic couplings, we prove the existence of regions of the phase space where solutions are organized in extremely dense clusters. This picture turns out to be invariant under tuning of all the parameters of the model. Solutions within the clusters are more robust to noise, which enhances learning performance. This has inspired the design of new learning algorithms and has clarified the effectiveness of previously proposed ones. We further provide quantitative evidence that the gain achievable by allowing a greater number of synaptic states for encoding information saturates after only a few bits, in line with the above-mentioned experimental results.
Besides the challenge of low-precision synaptic connections, it is also known that the neuronal environment is extremely noisy. Whether stochasticity enhances or worsens learning performance is currently a matter of debate. In this work, we consider a neural network model where the synaptic connections are random variables, sampled according to a parametrized probability distribution. We prove that this source of stochasticity naturally drives the system towards regions of the phase space with a high density of solutions. These regions are directly accessible by gradient-descent strategies over the parameters of the synaptic-coupling distribution. We further set up a statistical-physics analysis through which we show that solutions in the dense regions are characterized by robustness and good generalization performance. Stochastic neural networks are also capable of building abstract representations of input stimuli and then generating new input samples according to the inferred statistics of the input signal. In this regard, we propose a new learning rule, called Delayed Correlation Matching (DCM), which, by matching time-delayed activity correlations, makes a neural network able to store patterns of neuronal activity. When hidden neuronal states are considered, the DCM learning rule is also able to train Restricted Boltzmann Machines as generative models. We further require the DCM learning rule to fulfil biological constraints such as locality, sparseness of the neural coding, and Dale's principle. While retaining all these biological requirements, the DCM learning rule has proved effective for different network topologies, in online learning regimes, and in the presence of correlated patterns. We further show that it is also able to prevent the creation of spurious attractor states.
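The abstract does not spell out the DCM update, so the sketch below is a speculative toy reading of "matching time-delayed activity correlations," not the thesis's exact formulation: each weight is nudged by the local difference between the one-step-delayed pair correlation when the network is clamped to a pattern and the same correlation when it runs freely. The network size, noise level, and sign conventions are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

N, LR, STEPS = 100, 0.05, 200   # neurons, learning rate, training steps (illustrative)
W = np.zeros((N, N))

def step(state, noise=0.1):
    """One synchronous update of a +/-1 recurrent network with noise."""
    field = W @ state + noise * rng.standard_normal(N)
    return np.where(field >= 0, 1.0, -1.0)

def dcm_update(pattern):
    """Toy delayed-correlation-matching step (illustrative reading of DCM).

    Compare the delayed correlation s_i(t+1) s_j(t) when the network is
    clamped to the pattern with the same correlation when it evolves
    freely, and nudge each weight by the local difference.
    """
    clamped = np.outer(pattern, pattern)   # clamped: s(t+1) = s(t) = pattern
    free_next = step(pattern)              # free one-step evolution from the pattern
    free = np.outer(free_next, pattern)
    return LR * (clamped - free)

pattern = rng.choice([-1.0, 1.0], N)
for _ in range(STEPS):
    W += dcm_update(pattern)
    np.fill_diagonal(W, 0.0)               # no self-connections

recalled = step(step(pattern))             # the pattern should now be a fixed point
print("recall overlap:", float(recalled @ pattern) / N)
```

The update vanishes exactly when the free evolution reproduces the clamped one, i.e. when the stored pattern has become an attractor, which is the sense in which delayed correlations are "matched."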
AI of Brain and Cognitive Sciences: From the Perspective of First Principles
In recent years, we have witnessed the great success of AI in various applications, including image classification, game playing, protein structure analysis, language translation, and content generation. Despite these powerful applications, many tasks in our daily life that are rather simple for humans still pose great challenges to AI, including image and language understanding, few-shot learning, abstract concepts, and low-energy-cost computing. Thus, learning from the brain remains a promising way to shed light on the development of next-generation AI. The brain is arguably the only known intelligent machine in the universe, the product of evolution for animals surviving in the natural environment. At the behavioral level, psychology and the cognitive sciences have demonstrated that human and animal brains can execute very intelligent high-level cognitive functions. At the structural level, cognitive and computational neurosciences have unveiled that the brain has extremely complicated but elegant network forms to support its functions. Over the years, researchers have been gathering knowledge about the structure and functions of the brain, and this process has accelerated recently with the initiation of giant brain projects worldwide. Here, we argue that the general principles of brain function are the most valuable things to inspire the development of AI. These general principles are the standard rules by which the brain extracts, represents, manipulates, and retrieves information, and here we call them the first principles of the brain. This paper collects six such first principles: attractor networks, criticality, random networks, sparse coding, relational memory, and perceptual learning. On each topic, we review its biological background, fundamental properties, potential applications to AI, and future developments.
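Of the six principles, sparse coding is the most compact to state: represent a stimulus x as a sparse combination of dictionary atoms by minimizing ||x - Da||^2 / 2 + lambda * ||a||_1. The ISTA iteration below is a standard textbook solver shown purely to make the principle concrete; it is not taken from the review, and the dictionary and sizes are made up.

```python
import numpy as np

rng = np.random.default_rng(4)

D_IN, N_ATOMS, LAM = 64, 256, 0.1   # input dim, dictionary size, sparsity weight

dictionary = rng.standard_normal((D_IN, N_ATOMS))
dictionary /= np.linalg.norm(dictionary, axis=0)    # unit-norm atoms

def sparse_code(x, n_iter=200):
    """ISTA: minimize ||x - D a||^2 / 2 + LAM * ||a||_1 over the code a."""
    a = np.zeros(N_ATOMS)
    step = 1.0 / np.linalg.norm(dictionary, 2) ** 2  # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = dictionary.T @ (dictionary @ a - x)   # gradient of the fit term
        a = a - step * grad
        a = np.sign(a) * np.maximum(np.abs(a) - step * LAM, 0.0)  # soft threshold
    return a

x = rng.standard_normal(D_IN)
a = sparse_code(x)
print("active atoms:", int((a != 0).sum()), "of", N_ATOMS)
print("reconstruction error:", float(np.linalg.norm(dictionary @ a - x)))
```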