A practical Bayesian framework for backpropagation networks
A quantitative and practical Bayesian framework is described for learning of mappings in feedforward networks. The framework makes possible (1) objective comparisons between solutions using alternative network architectures, (2) objective stopping rules for network pruning or growing procedures, (3) objective choice of magnitude and type of weight decay terms or additive regularizers (for penalizing large weights, etc.), (4) a measure of the effective number of well-determined parameters in a model, (5) quantified estimates of the error bars on network parameters and on network output, and (6) objective comparisons with alternative learning and interpolation models such as splines and radial basis functions. The Bayesian "evidence" automatically embodies "Occam's razor," penalizing overflexible and overcomplex models. The Bayesian approach helps detect poor underlying assumptions in learning models. For learning models well matched to a problem, a good correlation between generalization ability and the Bayesian evidence is obtained.
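For orientation, the "evidence" and the "effective number of well-determined parameters" named in the abstract are usually written as below. This is a sketch in standard notation, not a quotation from the paper: the symbols $\mathcal{H}_i$ (model), $D$ (data), $\mathbf{w}$ (weights), $\alpha$ (weight decay constant), $\lambda_a$ (eigenvalues of the Hessian of the scaled data-misfit term) and $k$ (number of weights) are assumptions introduced here for illustration.

```latex
% Model comparison: the posterior over models is proportional to evidence times prior.
P(\mathcal{H}_i \mid D) \propto P(D \mid \mathcal{H}_i)\, P(\mathcal{H}_i),
\qquad
P(D \mid \mathcal{H}_i) = \int P(D \mid \mathbf{w}, \mathcal{H}_i)\,
                               P(\mathbf{w} \mid \mathcal{H}_i)\, d\mathbf{w}

% Effective number of well-determined parameters under a quadratic regularizer:
\gamma = \sum_{a=1}^{k} \frac{\lambda_a}{\lambda_a + \alpha}
```

Because the evidence integrates over all weight settings, a model that spreads its prior over many functions inconsistent with the data receives a lower evidence, which is the Occam's razor effect the abstract mentions.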
Artificial neural networks as models of stimulus control
We evaluate the ability of artificial neural network models (multi-layer perceptrons) to predict stimulus-response relationships. A variety of empirical results are considered, such as generalization, peak-shift (supernormality) and stimulus intensity effects. The networks were trained on the same tasks as the animals in the considered experiments. The subsequent generalization tests on the networks showed that the models correctly replicate the empirical results. It is concluded that these models are valuable tools in the study of animal behaviour.
Deep speckle correlation: a deep learning approach toward scalable imaging through scattering media
Imaging through scattering is an important yet challenging problem. Tremendous progress has been made by exploiting the deterministic input–output “transmission matrix” for a fixed medium. However, this “one-to-one” mapping is highly susceptible to speckle decorrelations – small perturbations to the scattering medium lead to model errors and severe degradation of the imaging performance. Our goal here is to develop a new framework that is highly scalable to both medium perturbations and measurement requirements. To do so, we propose a statistical “one-to-all” deep learning (DL) technique that encapsulates a wide range of statistical variations for the model to be resilient to speckle decorrelations. Specifically, we develop a convolutional neural network (CNN) that is able to learn the statistical information contained in the speckle intensity patterns captured on a set of diffusers having the same macroscopic parameter. We then show for the first time, to the best of our knowledge, that the trained CNN is able to generalize and make high-quality object predictions through an entirely different set of diffusers of the same class. Our work paves the way to a highly scalable DL approach for imaging through scattering media. Funding: National Science Foundation (NSF) (1711156); Directorate for Engineering (ENG).
Local Difference Measures between Complex Networks for Dynamical System Model Evaluation
Acknowledgments: We thank Reik V. Donner for inspiring suggestions that initialized the work presented herein. Jan H. Feldhoff is credited for providing us with the STARS simulation data and for his contributions to fruitful discussions. Comments by the anonymous reviewers are gratefully acknowledged as they led to substantial improvements of the manuscript. Peer reviewed.
Acquiring Word-Meaning Mappings for Natural Language Interfaces
This paper focuses on a system, WOLFIE (WOrd Learning From Interpreted
Examples), that acquires a semantic lexicon from a corpus of sentences paired
with semantic representations. The lexicon learned consists of phrases paired
with meaning representations. WOLFIE is part of an integrated system that
learns to transform sentences into representations such as logical database
queries. Experimental results are presented demonstrating WOLFIE's ability to
learn useful lexicons for a database interface in four different natural
languages. The usefulness of the lexicons learned by WOLFIE is compared to
those acquired by a similar system, with results favorable to WOLFIE. A second
set of experiments demonstrates WOLFIE's ability to scale to larger and more
difficult, albeit artificially generated, corpora. In natural language
acquisition, it is difficult to gather the annotated data needed for supervised
learning; however, unannotated data is fairly plentiful. Active learning
methods attempt to select for annotation and training only the most informative
examples, and therefore are potentially very useful in natural language
applications. However, most results to date for active learning have only
considered standard classification tasks. To reduce annotation effort while
maintaining accuracy, we apply active learning to semantic lexicons. We show
that active learning can significantly reduce the number of annotated examples
required to achieve a given level of performance.
Representing the bilingual's two lexicons
A review of empirical work suggests that the lexical representations of a bilingual’s two languages are independent (Smith, 1991), but may also be sensitive to between-language similarity patterns (e.g. Cristoffanini, Kirsner, and Milech, 1986). Some researchers hold that infant bilinguals do not initially differentiate between their two languages (e.g. Redlinger & Park, 1980). Yet by the age of two they appear to have acquired separate linguistic systems for each language (Lanza, 1992). This paper explores the hypothesis that the separation of lexical representations in bilinguals is a functional rather than an architectural one. It suggests that the separation may be driven by differences in the structure of the input to a common architectural system. Connectionist simulations are presented modelling the representation of two sets of lexical information. These simulations explore the conditions required to create functionally independent lexical representations in a single neural network. It is shown that a single network may acquire a second language after learning a first (avoiding the traditional problem of catastrophic interference in these networks). Further, it is shown that in a single network, the functional independence of representations is dependent on inter-language similarity patterns. The latter finding is difficult to account for in a model that postulates architecturally separate lexical representations.
Generalization from correlated sets of patterns in the perceptron
Generalization is a central aspect of learning theory. Here, we propose a
framework that explores an auxiliary task-dependent notion of generalization,
and attempts to quantitatively answer the following question: given two sets of
patterns with a given degree of dissimilarity, how easily will a network be
able to "unify" their interpretation? This is quantified by the volume of the
configurations of synaptic weights that classify the two sets in a similar
manner. To show the applicability of our idea in a concrete setting, we compute
this quantity for the perceptron, a simple binary classifier, using the
classical statistical physics approach in the replica-symmetric ansatz. In this
case, we show how an analytical expression measures the "distance-based
capacity", the maximum load of patterns sustainable by the network, at fixed
dissimilarity between patterns and fixed allowed number of errors. This curve
indicates that generalization is possible at any distance, but with decreasing
capacity. We propose that a distance-based definition of generalization may be
useful in numerical experiments with real-world neural networks, and to explore
computationally sub-dominant sets of synaptic solutions.
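The quantity described above, the volume of weight configurations that classify two sets of patterns in a similar manner, can be illustrated numerically. The following is a minimal Monte Carlo sketch, not the replica-symmetric calculation of the abstract: it assumes binary ±1 patterns, Gaussian weight vectors, and dissimilarity measured as the number of flipped entries. All function and parameter names (`make_pattern_pair`, `agreement_fraction`, `flips`, `max_errors`) are hypothetical.

```python
import random


def sign(x):
    """Perceptron output: +1 for non-negative activation, -1 otherwise."""
    return 1 if x >= 0 else -1


def make_pattern_pair(n, flips, rng):
    """Return a random +/-1 pattern and a copy with `flips` entries negated."""
    xi = [rng.choice((-1, 1)) for _ in range(n)]
    xi_bar = list(xi)
    for i in rng.sample(range(n), flips):
        xi_bar[i] = -xi_bar[i]
    return xi, xi_bar


def agreement_fraction(pairs, n, samples, max_errors, rng):
    """Estimate the fraction of Gaussian weight vectors that assign the same
    label to both members of each pair, tolerating up to `max_errors`
    disagreeing pairs."""
    hits = 0
    for _ in range(samples):
        w = [rng.gauss(0.0, 1.0) for _ in range(n)]
        errors = 0
        for xi, xi_bar in pairs:
            a = sum(wi * x for wi, x in zip(w, xi))
            b = sum(wi * x for wi, x in zip(w, xi_bar))
            if sign(a) != sign(b):
                errors += 1
        if errors <= max_errors:
            hits += 1
    return hits / samples


if __name__ == "__main__":
    rng = random.Random(0)
    n = 50
    for flips in (0, 5, 15, 25):
        pairs = [make_pattern_pair(n, flips, rng) for _ in range(10)]
        frac = agreement_fraction(pairs, n, samples=2000, max_errors=0, rng=rng)
        print(f"flips={flips:2d}  estimated fraction={frac:.3f}")
```

Sampling weights at random only probes the typical volume; it does not enforce a labelling of the patterns or locate the capacity curve, which is what the analytical treatment addresses.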