Making decisions based on context: models and applications in cognitive sciences and natural language processing
It is known that humans are capable of making decisions based on context and generalizing what they have learned. This dissertation considers two related problem areas and proposes different models that take context information into account. By including the context, the proposed models exhibit strong performance in each of the problem areas considered.
The first problem area focuses on a context association task studied in cognitive science, which evaluates the ability of a learning agent to associate specific stimuli with an appropriate response in particular spatial contexts. Four neural circuit models are proposed to model how the stimulus and context information are processed to produce a response. The neural networks are trained by modifying the strength of neural connections (weights) using principles of Hebbian learning. Such learning is considered biologically plausible, in contrast to backpropagation techniques, which do not have a solid neurophysiological basis. A series of theoretical results for the neural circuit models is established, guaranteeing convergence to an optimal configuration when all the stimulus-context pairs are provided during training. Among all the models, a specific model based on ideas from recommender systems and trained with a primal-dual update rule achieves perfect performance in learning and generalizing the mapping from context-stimulus pairs to correct responses.
The second problem area considered in the thesis focuses on clinical natural language processing (NLP). A particular application is the development of deep-learning models for analyzing radiology reports. Four NLP tasks are considered: anatomy named entity recognition, negation detection, incidental finding detection, and clinical concept extraction. A hierarchical Recurrent Neural Network (RNN) is proposed for anatomy named entity recognition, which is then used to produce a set of features for incidental finding detection of pulmonary nodules. A clinical context word embedding model is obtained, which is used with an RNN to model clinical concept extraction. Finally, feature-enriched RNN and transformer-based models with contextual word embedding are proposed for negation detection. All these models take the (clinical) context information into account. The models are evaluated on different datasets and are shown to achieve strong performance, largely outperforming the state of the art.
Experience-driven formation of parts-based representations in a model of layered visual memory
Growing neuropsychological and neurophysiological evidence suggests that the
visual cortex uses parts-based representations to encode, store and retrieve
relevant objects. In such a scheme, objects are represented as a set of
spatially distributed local features, or parts, arranged in stereotypical
fashion. To encode the local appearance and to represent the relations between
the constituent parts, there has to be an appropriate memory structure formed
by previous experience with visual objects. Here, we propose a model of how a
hierarchical memory structure supporting efficient storage and rapid recall of
parts-based representations can be established by an experience-driven process
of self-organization. The process is based on the collaboration of slow
bidirectional synaptic plasticity and homeostatic unit activity regulation,
both running on top of fast activity dynamics with winner-take-all
character modulated by an oscillatory rhythm. These neural mechanisms lay down
the basis for cooperation and competition between the distributed units and
their synaptic connections. Choosing human face recognition as a test task, we
show that, under the condition of open-ended, unsupervised incremental
learning, the system is able to form memory traces for individual faces in a
parts-based fashion. On a lower memory layer the synaptic structure is
developed to represent local facial features and their interrelations, while
the identities of different persons are captured explicitly on a higher layer.
An additional property of the resulting representations is the sparseness of
both the activity during the recall and the synaptic patterns comprising the
memory traces.
Comment: 34 pages, 12 Figures, 1 Table, published in Frontiers in
Computational Neuroscience (Special Issue on Complex Systems Science and
Brain Dynamics),
http://www.frontiersin.org/neuroscience/computationalneuroscience/paper/10.3389/neuro.10/015.2009
Neural Distributed Autoassociative Memories: A Survey
Introduction. Neural network models of autoassociative, distributed memory
allow storage and retrieval of many items (vectors) where the number of stored
items can exceed the vector dimension (the number of neurons in the network).
This opens the possibility of a sublinear time search (in the number of stored
items) for approximate nearest neighbors among vectors of high dimension. The
purpose of this paper is to review models of autoassociative, distributed
memory that can be naturally implemented by neural networks (mainly with local
learning rules and iterative dynamics based on information locally available to
neurons). Scope. The survey focuses mainly on the networks of Hopfield,
Willshaw and Potts, which have connections between pairs of neurons and operate
on sparse binary vectors. We discuss not only autoassociative memory, but also
the generalization properties of these networks. We also consider neural
networks with higher-order connections and networks with a bipartite graph
structure for non-binary data with linear constraints. Conclusions. We
discuss the relations to similarity search, the advantages and
drawbacks of these techniques, and topics for further research. An interesting
and still not completely resolved question is whether neural autoassociative
memories can search for approximate nearest neighbors faster than other index
structures for similarity search, in particular for the case of very high
dimensional vectors.
Comment: 31 pages
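To make the idea of an autoassociative memory with a local learning rule and iterative dynamics concrete, here is a minimal sketch of a Hopfield-style network. It is an illustration only, not code from the survey: it assumes dense bipolar (+1/-1) patterns rather than the sparse binary vectors the survey concentrates on, and the function names `train_hebbian` and `recall` are hypothetical.

```python
def train_hebbian(patterns):
    """Build a symmetric weight matrix with the Hebbian outer-product rule."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j] / n
    return weights


def recall(weights, probe, max_sweeps=10):
    """Update neurons one at a time until the state stops changing."""
    state = list(probe)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            # Each neuron only uses its own incoming weights and the current state.
            field = sum(weights[i][j] * state[j] for j in range(n))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Two orthogonal bipolar patterns (illustrative data, not from the survey).
stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
weights = train_hebbian(stored)

noisy = list(stored[0])
noisy[0] = -noisy[0]  # corrupt one component of the first pattern
recovered = recall(weights, noisy)
```

With these two orthogonal patterns, the corrupted probe settles back onto the first stored pattern, which is the basic retrieval behavior the survey reviews at much larger scale.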
A Review of Findings from Neuroscience and Cognitive Psychology as Possible Inspiration for the Path to Artificial General Intelligence
This review aims to contribute to the quest for artificial general
intelligence by examining neuroscience and cognitive psychology methods for
potential inspiration. Despite the impressive advancements achieved by deep
learning models in various domains, they still have shortcomings in abstract
reasoning and causal understanding. Such capabilities should ultimately be
integrated into artificial intelligence systems in order to surpass data-driven
limitations and support decision making in a way more similar to human
intelligence. This work is a vertical review that attempts a wide-ranging
exploration of brain function, spanning from lower-level biological neurons,
spiking neural networks, and neuronal ensembles to higher-level concepts such
as brain anatomy, vector symbolic architectures, cognitive and categorization
models, and cognitive architectures. The hope is that these concepts may offer
insights for solutions in artificial general intelligence.
Comment: 143 pages, 49 figures, 244 references
Brain-Inspired Computational Intelligence via Predictive Coding
Artificial intelligence (AI) is rapidly becoming one of the key technologies
of this century. The majority of results in AI thus far have been achieved
using deep neural networks trained with the error backpropagation learning
algorithm. However, the ubiquitous adoption of this approach has highlighted
some important limitations such as substantial computational cost, difficulty
in quantifying uncertainty, lack of robustness, unreliability, and biological
implausibility. It is possible that addressing these limitations may require
schemes that are inspired and guided by neuroscience theories. One such theory,
called predictive coding (PC), has shown promising performance in machine
intelligence tasks, exhibiting exciting properties that make it potentially
valuable for the machine learning community: PC can model information
processing in different brain areas, can be used in cognitive control and
robotics, and has a solid mathematical grounding in variational inference,
offering a powerful inversion scheme for a specific class of continuous-state
generative models. With the hope of foregrounding research in this direction,
we survey the literature that has contributed to this perspective, highlighting
the many ways that PC might play a role in the future of machine learning and
computational intelligence at large.
Comment: 37 Pages, 9 Figures
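As a rough illustration of how predictive coding can invert a continuous-state generative model, the sketch below infers a single latent value by gradient descent on precision-weighted prediction errors. It is a toy example under stated assumptions, not the survey's formulation: the model is scalar and linear (observation x predicted as w * z, with a Gaussian prior on z), and every name and parameter value here is hypothetical.

```python
def infer_latent(x, w, prior_mean, var_x=1.0, var_z=1.0, step=0.1, n_steps=200):
    """Infer z by descending the energy
    F = (x - w*z)^2 / (2*var_x) + (z - prior_mean)^2 / (2*var_z)."""
    z = prior_mean
    for _ in range(n_steps):
        err_x = (x - w * z) / var_x       # precision-weighted sensory prediction error
        err_z = (z - prior_mean) / var_z  # precision-weighted prior prediction error
        z += step * (w * err_x - err_z)   # step along the negative gradient of F
    return z


def closed_form(x, w, prior_mean, var_x=1.0, var_z=1.0):
    """Exact minimizer of the same energy, used only as a check."""
    return (w * x / var_x + prior_mean / var_z) / (w * w / var_x + 1.0 / var_z)


estimate = infer_latent(x=4.0, w=2.0, prior_mean=0.0)
reference = closed_form(x=4.0, w=2.0, prior_mean=0.0)
```

In this linear Gaussian case the iterative estimate agrees with the closed-form posterior mode; the appeal of the iterative scheme is that the update uses only locally available error signals, which is what makes it a candidate for neural implementation.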
New Learning and Control Algorithms for Neural Networks.
Neural networks offer distributed processing power, error-correcting capability, and structural simplicity of the basic computing element. They have been found to be attractive for applications such as associative memory, robotics, image processing, speech understanding, and optimization. Neural networks are self-adaptive systems that try to configure themselves to store new information. This dissertation investigates two approaches to improve performance: better learning and supervisory control. A new learning algorithm called the Correlation Continuous Unlearning (CCU) algorithm is presented. It is based on the idea of removing undesirable information that is encountered during the learning period. The control methods proposed in the dissertation improve the convergence by affecting the order of updates using a controller. Most previous studies have focused on monolithic structures. However, the human brain is known to have a bicameral nature at the gross level, and it also has several specialized structures. In this dissertation, we investigate the computing characteristics of non-monolithic neural networks that are enhanced by a controller able to run algorithms that take advantage of the known global characteristics of the stored information. Such networks have been called bicameral neural networks. Stinson and Kak considered elementary bicameral models that used asynchronous control. New control methods, the method of iteration and the bicameral classifier, are now proposed. The method of iteration uses the Hamming distance between the probe and the answer to control the convergence to a correct answer, whereas the bicameral classifier takes advantage of global characteristics using a clustering algorithm. The bicameral classifier is applied to two different models of equiprobable patterns as well as the more realistic situation where patterns can have different probabilities.
The CCU algorithm has also been applied to a bidirectional associative memory with greatly improved performance. For multilayered networks, indexing of patterns to enhance system performance has been studied.
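The abstract says the method of iteration relies on the Hamming distance between the probe and the answer, without giving details. The sketch below shows only that distance measure and one possible way a controller might use it. The acceptance rule and the names `accept_answer` and `max_distance` are assumptions made for illustration, not the dissertation's algorithm.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def accept_answer(probe, answer, max_distance):
    """Hypothetical controller check: accept the network's answer only if
    it stays within max_distance of the probe that produced it."""
    return hamming_distance(probe, answer) <= max_distance


probe = [1, -1, 1, 1, -1, 1]
answer = [1, 1, 1, 1, -1, -1]
distance = hamming_distance(probe, answer)
```

A controller built along these lines could, for example, re-run recall with a different update order whenever the answer drifts too far from the probe, though whether the dissertation does this is not stated in the abstract.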