
    The Information Complexity of Learning Tasks, their Structure and their Distance

    We introduce an asymmetric distance in the space of learning tasks, and a framework to compute their complexity. These concepts are foundational for the practice of transfer learning, whereby a parametric model is pre-trained for one task and then fine-tuned for another. The framework we develop is non-asymptotic, captures the finite nature of the training dataset, and allows distinguishing learning from memorization. It encompasses, as special cases, classical notions from Kolmogorov complexity, Shannon information, and Fisher information. However, unlike some of those frameworks, it can be applied to large-scale models and real-world datasets. Our framework is the first to measure complexity in a way that accounts for the effect of the optimization scheme, which is critical in Deep Learning.
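    Since the abstract names Fisher information as a special case of the framework, the following minimal sketch shows one common way such a quantity is estimated in practice: the trace of the empirical Fisher Information Matrix of a small classifier, computed from squared gradients of the log-likelihood. The model, data, and use of the empirical (label-conditioned) Fisher are illustrative assumptions, not the paper's exact construction.

```python
# Sketch: empirical Fisher trace as a rough complexity proxy (assumption:
# this is NOT the paper's full framework, only one ingredient it generalizes).
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy "task": 200 random inputs with random binary labels.
X = torch.randn(200, 10)
y = torch.randint(0, 2, (200,))

model = torch.nn.Sequential(torch.nn.Linear(10, 16), torch.nn.ReLU(),
                            torch.nn.Linear(16, 2))

def fisher_trace(model, X, y):
    """Empirical Fisher trace: average of ||grad log p(y|x)||^2 over the data."""
    trace = 0.0
    for xi, yi in zip(X, y):
        model.zero_grad()
        logp = F.log_softmax(model(xi.unsqueeze(0)), dim=-1)[0, yi]
        logp.backward()
        trace += sum((p.grad ** 2).sum().item() for p in model.parameters())
    return trace / len(X)

print(f"Empirical Fisher trace: {fisher_trace(model, X, y):.4f}")
```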

    Noise suppressing sensor encoding and neural signal orthonormalization

    In this paper we first consider the situation where parallel channels are disturbed by noise. With the goal of maximal information conservation, we derive the conditions for a transform that "immunizes" the channels against noise before the signals are used in later operations. It turns out that the signals have to be decorrelated and normalized by the filter, which in the single-channel case corresponds to the classical result of Shannon. Additional simulations of image encoding and decoding show that this constitutes an efficient approach to noise suppression. Furthermore, from a corresponding objective function we derive the stochastic and deterministic learning rules for a neural network that implements the data orthonormalization. Compared with existing normalization networks, our network performs approximately the same in the stochastic case but, owing to its generic derivation, has guaranteed convergence and can be used as an independent building block in other contexts, e.g. whitening for independent component analysis. Keywords: information conservation, whitening filter, data orthonormalization network, image encoding, noise suppression.
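    The decorrelate-and-normalize transform the abstract describes is a whitening filter. A minimal sketch, assuming zero-mean signals and a closed-form (rather than learned) solution, is to whiten via an eigendecomposition of the sample covariance; the channel count and sample size below are illustrative.

```python
# Sketch: ZCA whitening of correlated parallel channels (closed-form stand-in
# for the learned orthonormalization network described in the abstract).
import numpy as np

rng = np.random.default_rng(0)

# Correlated "parallel channels": 5 channels, 10,000 samples.
mixing = rng.normal(size=(5, 5))
signals = mixing @ rng.normal(size=(5, 10_000))

cov = np.cov(signals)                                # channel covariance
eigvals, eigvecs = np.linalg.eigh(cov)               # symmetric eigendecomposition
W = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T   # ZCA whitening matrix

whitened = W @ signals
# After whitening, channels are decorrelated with unit variance:
print(np.round(np.cov(whitened), 2))                 # approximately the identity
```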

    Hyperparameter Learning via Distributional Transfer

    Bayesian optimisation is a popular technique for hyperparameter learning, but it typically requires initial exploration even when similar prior tasks have already been solved. We propose to transfer information across tasks using learnt representations of the training datasets used in those tasks. This results in a joint Gaussian process model over hyperparameters and data representations. The representations are built using the framework of distribution embeddings into reproducing kernel Hilbert spaces. The developed method converges faster than existing baselines, in some cases requiring only a few evaluations of the target objective.
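    To make the "distribution embeddings into reproducing kernel Hilbert spaces" concrete, the sketch below compares datasets through their kernel mean embeddings via the squared maximum mean discrepancy (MMD), the kind of dataset representation a joint GP over hyperparameters and datasets could condition on. The kernel choice, bandwidth, and toy datasets are assumptions, not the paper's exact model.

```python
# Sketch: dataset similarity via kernel mean embeddings (squared MMD).
import numpy as np

def rbf(a, b, gamma=1.0):
    """RBF kernel matrix between rows of a and rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mmd2(x, y, gamma=1.0):
    """Squared MMD between the empirical RKHS mean embeddings of x and y."""
    return rbf(x, x, gamma).mean() + rbf(y, y, gamma).mean() - 2 * rbf(x, y, gamma).mean()

rng = np.random.default_rng(0)
task_a = rng.normal(0.0, 1.0, size=(100, 3))   # a source task's training data
task_b = rng.normal(0.1, 1.0, size=(100, 3))   # a similar task
task_c = rng.normal(3.0, 2.0, size=(100, 3))   # a dissimilar task

print(f"MMD^2(A, B) = {mmd2(task_a, task_b):.4f}")  # small: tasks look alike
print(f"MMD^2(A, C) = {mmd2(task_a, task_c):.4f}")  # large: transfer less useful
```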

    Detecting and Estimating Signals over Noisy and Unreliable Synapses: Information-Theoretic Analysis

    The temporal precision with which neurons respond to synaptic inputs has a direct bearing on the nature of the neural code. A characterization of the neuronal noise sources associated with different sub-cellular components (synapse, dendrite, soma, axon, and so on) is needed to understand the relationship between noise and information transfer. Here we study the effect of the unreliable, probabilistic nature of synaptic transmission on information transfer in the absence of interaction among presynaptic inputs. We derive theoretical lower bounds on the capacity of a simple model of a cortical synapse under two different paradigms. In signal estimation, the signal is assumed to be encoded in the mean firing rate of the presynaptic neuron, and the objective is to estimate the continuous input signal from the postsynaptic voltage. In signal detection, the input is binary, and the presence or absence of a presynaptic action potential is to be detected from the postsynaptic voltage. The efficacy of information transfer in synaptic transmission is characterized by deriving optimal strategies under these two paradigms. On the basis of parameter values derived from neocortex, we find that single cortical synapses cannot transmit information reliably, but that redundancy obtained using a small number of synapses in parallel leads to a significant improvement in the information capacity of synaptic transmission.
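    A much-simplified version of the signal detection paradigm can be computed directly: if a presynaptic spike is transmitted with release probability s and there is no spontaneous release, the synapse acts like a Z-channel, and its capacity is the mutual information maximized over the input spike probability. This channel model is an assumption for illustration, far simpler than the paper's biophysical synapse model.

```python
# Sketch: capacity of a Z-channel model of an unreliable synapse
# (illustrative stand-in for the paper's signal detection analysis).
import numpy as np

def mutual_information(p_spike, s):
    """I(X; Y) in bits for input P(X=1)=p_spike and release probability s."""
    h = lambda p: 0.0 if p in (0.0, 1.0) else -p*np.log2(p) - (1-p)*np.log2(1-p)
    p_y1 = p_spike * s                       # P(Y=1): release observed postsynaptically
    # I(X;Y) = H(Y) - H(Y|X), with H(Y|X=0)=0 and H(Y|X=1)=h(s)
    return h(p_y1) - p_spike * h(s)

s = 0.3                                      # low release probability, as in neocortex
ps = np.linspace(0.0, 1.0, 1001)
capacity = max(mutual_information(p, s) for p in ps)
print(f"Single-synapse capacity at s={s}: {capacity:.3f} bits per use")
```

    With a low release probability the capacity comes out well under one bit per use, which is consistent with the abstract's conclusion that single synapses are unreliable and that redundancy across parallel synapses is needed.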