2,761 research outputs found

    A practical Bayesian framework for backpropagation networks

    A quantitative and practical Bayesian framework is described for learning of mappings in feedforward networks. The framework makes possible (1) objective comparisons between solutions using alternative network architectures, (2) objective stopping rules for network pruning or growing procedures, (3) objective choice of magnitude and type of weight decay terms or additive regularizers (for penalizing large weights, etc.), (4) a measure of the effective number of well-determined parameters in a model, (5) quantified estimates of the error bars on network parameters and on network output, and (6) objective comparisons with alternative learning and interpolation models such as splines and radial basis functions. The Bayesian "evidence" automatically embodies "Occam's razor," penalizing overflexible and overcomplex models. The Bayesian approach helps detect poor underlying assumptions in learning models. For learning models well matched to a problem, a good correlation between generalization ability and the Bayesian evidence is obtained.
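
    A minimal sketch of the evidence computation under the Gaussian (Laplace) approximation, using a model that is linear in its parameters as a stand-in for a trained network near a cost minimum; Phi, alpha, and beta are illustrative names for the basis matrix, the weight-decay scale, and the noise precision.

```python
import numpy as np

# Sketch: Laplace-approximation log evidence for y = Phi @ w + Gaussian
# noise, standing in for a trained network near a cost minimum. alpha is
# the weight-decay scale, beta the noise precision (illustrative names).

def log_evidence(Phi, t, alpha, beta):
    N, k = Phi.shape
    A = alpha * np.eye(k) + beta * Phi.T @ Phi     # Hessian of the total cost
    w_mp = beta * np.linalg.solve(A, Phi.T @ t)    # most probable weights
    E_D = 0.5 * np.sum((t - Phi @ w_mp) ** 2)      # data misfit
    E_W = 0.5 * w_mp @ w_mp                        # weight "energy"
    lam = np.linalg.eigvalsh(beta * Phi.T @ Phi)
    gamma = np.sum(lam / (lam + alpha))            # well-determined parameters
    log_ev = (-alpha * E_W - beta * E_D
              - 0.5 * np.linalg.slogdet(A)[1]
              + 0.5 * k * np.log(alpha)
              + 0.5 * N * np.log(beta)
              - 0.5 * N * np.log(2 * np.pi))
    return log_ev, gamma
```

    In the same approximation, error bars on the output at an input x come from the same pieces: the predictive variance is 1/beta + phi(x)^T A^{-1} phi(x).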

    Bayesian interpolation

    Although Bayesian analysis has been in use since Laplace, the Bayesian method of model-comparison has only recently been developed in depth. In this paper, the Bayesian approach to regularization and model-comparison is demonstrated by studying the inference problem of interpolating noisy data. The concepts and methods described are quite general and can be applied to many other data modeling problems. Regularizing constants are set by examining their posterior probability distribution. Alternative regularizers (priors) and alternative basis sets are objectively compared by evaluating the evidence for them. “Occam's razor” is automatically embodied by this process. The way in which Bayes infers the values of regularizing constants and noise levels has an elegant interpretation in terms of the effective number of parameters determined by the data set. This framework is due to Gull and Skilling.
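
    A minimal sketch of the fixed-point re-estimation this implies, assuming a fixed basis matrix Phi (all names illustrative): alpha <- gamma / 2E_W and beta <- (N - gamma) / 2E_D, where gamma is the effective number of parameters determined by the data.

```python
import numpy as np

# Sketch of the evidence-framework fixed-point updates for the regularising
# constant alpha and the noise precision beta, for a model with a fixed
# basis matrix Phi. Names and defaults are illustrative.

def reestimate(Phi, t, alpha=1.0, beta=1.0, iters=50):
    N, k = Phi.shape
    for _ in range(iters):
        A = alpha * np.eye(k) + beta * Phi.T @ Phi
        w_mp = beta * np.linalg.solve(A, Phi.T @ t)
        lam = np.linalg.eigvalsh(beta * Phi.T @ Phi)
        gamma = np.sum(lam / (lam + alpha))                 # effective parameters
        alpha = gamma / (w_mp @ w_mp)                       # gamma / 2E_W
        beta = (N - gamma) / np.sum((t - Phi @ w_mp) ** 2)  # (N - gamma) / 2E_D
    return alpha, beta, gamma
```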

    Information-based objective functions for active data selection

    Learning can be made more efficient if we can actively select particularly salient data points. Within a Bayesian learning framework, objective functions are discussed that measure the expected informativeness of candidate measurements. Three alternative specifications of what we want to gain information about lead to three different criteria for data selection. All these criteria depend on the assumption that the hypothesis space is correct, which may prove to be their main weakness.
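
    A minimal sketch of one such criterion, assuming a linear-Gaussian model in which the expected information gain about the weights from measuring at x reduces to a function of the posterior variance of the predicted output; candidates, features, A_inv, and beta are illustrative names.

```python
import numpy as np

# Sketch: rank candidate measurements by expected information gain about
# the weights of a linear-Gaussian model. For such a model the gain from
# observing at x is 0.5 * log(1 + beta * phi(x)^T A^{-1} phi(x)), i.e. a
# monotone function of the posterior variance of the predicted output.

def next_query(candidates, features, A_inv, beta):
    gains = []
    for x in candidates:
        phi = features(x)
        var = phi @ A_inv @ phi       # posterior variance of the mean output at x
        gains.append(0.5 * np.log1p(beta * var))
    return candidates[int(np.argmax(gains))]
```

    As the abstract warns, the ranking is computed entirely under the model, so a misspecified hypothesis space can be confidently wrong about where data would be informative.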

    Analysis of Linsker's simulations of Hebbian rules

    Linsker has reported the development of center-surround receptive fields and oriented receptive fields in simulations of a Hebb-type equation in a linear network. The dynamics of the learning rule are analyzed in terms of the eigenvectors of the covariance matrix of cell activities. Analytic and computational results for Linsker's covariance matrices, and some general theorems, lead to an explanation of the emergence of center-surround and certain oriented structures. We estimate criteria for the parameter regime in which center-surround structures emerge.
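
    A minimal sketch of the eigenvector analysis, assuming linearised dynamics of the form dw/dt ≈ (C + k2·J)w; C, k2, and J are stand-ins for the activity covariance matrix, one of Linsker's parameters, and a uniform matrix arising from the constant term.

```python
import numpy as np

# Sketch: for linearised Hebbian dynamics dw/dt ~ (C + k2 * J) w, the
# component of w along the leading eigenvector grows fastest, so the
# emergent receptive-field shape can be read off an eigendecomposition.
# C is a covariance matrix of input activities; k2 and J are stand-ins.

def leading_mode(C, k2=0.0):
    n = C.shape[0]
    J = np.ones((n, n)) / n
    vals, vecs = np.linalg.eigh(C + k2 * J)   # symmetric, so eigh applies
    return vals[-1], vecs[:, -1]              # largest eigenvalue and its mode
```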

    The Role of Constraints in Hebbian Learning

    Models of unsupervised, correlation-based (Hebbian) synaptic plasticity are typically unstable: either all synapses grow until each reaches the maximum allowed strength, or all synapses decay to zero strength. A common method of avoiding these outcomes is to use a constraint that conserves or limits the total synaptic strength over a cell. We study the dynamic effects of such constraints. Two methods of enforcing a constraint are distinguished, multiplicative and subtractive. For otherwise linear learning rules, multiplicative enforcement of a constraint results in dynamics that converge to the principal eigenvector of the operator determining unconstrained synaptic development. Subtractive enforcement, in contrast, typically leads to a final state in which almost all synaptic strengths reach either the maximum or minimum allowed value. This final state is often dominated by weight configurations other than the principal eigenvector of the unconstrained operator. Multiplicative enforcement yields a “graded” receptive field in which most mutually correlated inputs are represented, whereas subtractive enforcement yields a receptive field that is “sharpened” to a subset of maximally correlated inputs. If two equivalent input populations (e.g., two eyes) innervate a common target, multiplicative enforcement prevents their segregation (ocular dominance segregation) when the two populations are weakly correlated; whereas subtractive enforcement allows segregation under these circumstances. These results may be used to understand constraints both over output cells and over input cells. A variety of rules that can implement constrained dynamics are discussed.
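
    A minimal sketch contrasting the two enforcement schemes for an otherwise linear rule with weights clipped to [0, w_max]; the learning rate, step count, and toy covariance below are illustrative.

```python
import numpy as np

# Sketch: linear Hebbian rule dw/dt = C w with the total synaptic strength
# conserved either multiplicatively (rescale every synapse by a common
# factor) or subtractively (remove the same amount from every synapse).

def evolve(C, w, rule, lr=0.01, steps=5000, w_max=1.0):
    total = w.sum()                      # total strength to be conserved
    for _ in range(steps):
        dw = C @ w
        if rule == "subtractive":
            dw -= dw.mean()              # remove equal amount from every synapse
        w = w + lr * dw
        if rule == "multiplicative":
            w *= total / w.sum()         # rescale all synapses by a common factor
        w = np.clip(w, 0.0, w_max)       # hard bounds on each synapse
    return w

rng = np.random.default_rng(0)
C = 0.5 + 0.5 * rng.random((8, 8))       # positively correlated toy inputs
C = (C + C.T) / 2
w0 = 0.1 + 0.01 * rng.random(8)
print(evolve(C, w0.copy(), "multiplicative"))  # graded: near the principal eigenvector
print(evolve(C, w0.copy(), "subtractive"))     # sharpened: weights pile up at 0 or w_max
```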

    Sparse Graph Codes for Quantum Error-Correction

    We present sparse graph codes appropriate for use in quantum error-correction. Quantum error-correcting codes based on sparse graphs are of interest for three reasons. First, the best codes currently known for classical channels are based on sparse graphs. Second, sparse graph codes keep the number of quantum interactions associated with the quantum error-correction process small: a constant number per quantum bit, independent of the blocklength. Third, sparse graph codes often offer great flexibility with respect to blocklength and rate. We believe some of the codes we present are unsurpassed by previously published quantum error-correcting codes.

    Comment: Version 7.3e: 42 pages. Extended version, Feb 2004. A shortened version was resubmitted to IEEE Transactions on Information Theory, Jan 20, 2004.
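
    The paper's constructions go well beyond it, but the basic classical-to-quantum constraint can be seen in the CSS recipe; a minimal sketch, assuming the dual-containing case in which one parity-check matrix H serves for both X- and Z-type stabilisers.

```python
import numpy as np

# Sketch: in the dual-containing CSS construction, a parity-check matrix H
# must satisfy H @ H.T = 0 over GF(2) so that the X- and Z-type stabilisers
# built from its rows commute. This checks that property.

def css_compatible(H):
    H = np.asarray(H, dtype=np.int64) % 2
    return not np.any((H @ H.T) % 2)

# The [7,4] Hamming code's check matrix passes, yielding Steane's 7-qubit code.
H_hamming = [[0, 0, 0, 1, 1, 1, 1],
             [0, 1, 1, 0, 0, 1, 1],
             [1, 0, 1, 0, 1, 0, 1]]
print(css_compatible(H_hamming))   # True
```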

    Fast Hands-free Writing by Gaze Direction

    We describe a method for text entry based on inverse arithmetic coding that relies on gaze direction and is faster and more accurate than using an on-screen keyboard. These benefits are derived from two innovations: the writing task is matched to the capabilities of the eye, and a language model is used to make predictable words and phrases easier to write.

    Comment: 3 pages. Final version.
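
    A toy sketch of the interval picture behind such writing: each symbol owns a slice of the current interval sized by its probability under the language model, and text entry is repeated zooming into the chosen slice; the unigram model below is a stand-in for the real adaptive model.

```python
# Sketch: text entry as navigation into nested intervals, each symbol
# owning a slice of [low, high) proportional to its model probability.

def intervals(probs, low=0.0, high=1.0):
    """Partition [low, high) among symbols in proportion to probability."""
    out, left = {}, low
    for sym, p in probs.items():
        right = left + p * (high - low)
        out[sym] = (left, right)
        left = right
    return out

lm = {"a": 0.5, "b": 0.3, "c": 0.2}   # toy model: predictable symbols get big slices
level1 = intervals(lm)
low, high = level1["b"]
level2 = intervals(lm, low, high)     # sub-slices for "b" followed by each symbol
print(level1["b"], level2["a"])       # "b" owns [0.5, 0.8); "ba" owns [0.5, 0.65)
```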

    Elliptical slice sampling

    Many probabilistic models introduce strong dependencies between variables using a latent multivariate Gaussian distribution or a Gaussian process. We present a new Markov chain Monte Carlo algorithm for performing inference in models with multivariate Gaussian priors. Its key properties are: 1) it has simple, generic code applicable to many models, 2) it has no free parameters, 3) it works well for a variety of Gaussian process based models. These properties make our method ideal for use while model building, removing the need to spend time deriving and tuning updates for more complex algorithms.

    Comment: 8 pages, 6 figures, appearing in AISTATS 2010 (JMLR: W&CP volume 6). Differences from first submission: some minor edits in response to feedback.
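
    The transition itself is short; below is a compact NumPy rendering, assuming a zero-mean prior N(0, Σ) supplied via its Cholesky factor and a log-likelihood callable.

```python
import numpy as np

# One elliptical-slice-sampling transition for a model with prior N(0, Sigma)
# and log-likelihood log_lik. `chol` is the Cholesky factor of Sigma and
# `rng` a numpy Generator (both assumed supplied by the caller).

def ess_step(f, chol, log_lik, rng):
    nu = chol @ rng.standard_normal(len(f))      # auxiliary draw from the prior
    log_y = log_lik(f) + np.log(rng.uniform())   # slice height under the likelihood
    theta = rng.uniform(0.0, 2.0 * np.pi)        # initial angle on the ellipse
    theta_min, theta_max = theta - 2.0 * np.pi, theta
    while True:
        f_prop = f * np.cos(theta) + nu * np.sin(theta)
        if log_lik(f_prop) > log_y:              # proposal is on the slice: accept
            return f_prop
        if theta < 0.0:                          # otherwise shrink the bracket
            theta_min = theta
        else:
            theta_max = theta
        theta = rng.uniform(theta_min, theta_max)
```

    The bracket-shrinking loop always terminates, since θ → 0 recovers the current state, and there are no step sizes to tune, which is property 2 above.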

    Nested sampling for Potts models

    Nested sampling is a new Monte Carlo method by Skilling [1] intended for general Bayesian computation. Nested sampling provides a robust alternative to annealing-based methods for computing normalizing constants. It can also generate estimates of other quantities such as posterior expectations. The key technical requirement is an ability to draw samples uniformly from the prior subject to a constraint on the likelihood. We provide a demonstration with the Potts model, an undirected graphical model.
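
    A minimal sketch of the nested-sampling loop itself; sample_prior, sample_constrained, and log_lik are assumed callables, with sample_constrained standing in for the constrained draw from the prior (the step a Potts-model implementation would specialise).

```python
import numpy as np

# Sketch of Skilling's nested-sampling loop. sample_prior() draws one point
# from the prior; sample_constrained(threshold) must draw from the prior
# subject to log_lik(x) > threshold -- the key technical requirement the
# abstract mentions. All three callables are assumed.

def nested_sampling(sample_prior, sample_constrained, log_lik,
                    n_live=100, steps=1000):
    live = [sample_prior() for _ in range(n_live)]
    log_L = np.array([log_lik(x) for x in live])
    log_Z, log_X = -np.inf, 0.0                  # running evidence, prior volume
    for _ in range(steps):
        worst = int(np.argmin(log_L))
        log_X_new = log_X - 1.0 / n_live         # expected shrinkage per step
        log_w = np.log(np.exp(log_X) - np.exp(log_X_new))  # width of the shell
        log_Z = np.logaddexp(log_Z, log_w + log_L[worst])  # accumulate evidence
        live[worst] = sample_constrained(log_L[worst])     # replace worst point
        log_L[worst] = log_lik(live[worst])
        log_X = log_X_new
    return log_Z                                 # log of the normalizing constant
```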