477 research outputs found

    Sparse Coding Predicts Optic Flow Specificities of Zebrafish Pretectal Neurons

    Full text link
    Zebrafish pretectal neurons exhibit specificities for large-field optic flow patterns associated with rotatory or translatory body motion. We investigate the hypothesis that these specificities reflect the input statistics of natural optic flow. Realistic motion sequences were generated using computer graphics simulating self-motion in an underwater scene. Local retinal motion was estimated with a motion detector and encoded in four populations of directionally tuned retinal ganglion cells, represented as two signed input variables. This activity was then used as input into one of two learning networks: a sparse coding network (competitive learning) and backpropagation network (supervised learning). Both simulations develop specificities for optic flow which are comparable to those found in a neurophysiological study (Kubo et al. 2014), and relative frequencies of the various neuronal responses are best modeled by the sparse coding approach. We conclude that the optic flow neurons in the zebrafish pretectum do reflect the optic flow statistics. The predicted vectorial receptive fields show typical optic flow fields but also "Gabor" and dipole-shaped patterns that likely reflect difference fields needed for reconstruction by linear superposition.Comment: Published Conference Paper from ICANN 2018, Rhode

    A high-quality video denoising algorithm based on reliable motion estimation

    Get PDF
    11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5-11, 2010, Proceedings, Part IIIAlthough the recent advances in the sparse representations of images have achieved outstanding denosing results, removing real, structured noise in digital videos remains a challenging problem. We show the utility of reliable motion estimation to establish temporal correspondence across frames in order to achieve high-quality video denoising. In this paper, we propose an adaptive video denosing framework that integrates robust optical flow into a non-local means (NLM) framework with noise level estimation. The spatial regularization in optical flow is the key to ensure temporal coherence in removing structured noise. Furthermore, we introduce approximate K-nearest neighbor matching to significantly reduce the complexity of classical NLM methods. Experimental results show that our system is comparable with the state of the art in removing AWGN, and significantly outperforms the state of the art in removing real, structured noise

    Transformation Equivariant Boltzmann Machines

    Get PDF
    Abstract. We develop a novel modeling framework for Boltzmann machines, augmenting each hidden unit with a latent transformation assignment variable which describes the selection of the transformed view of the canonical connection weights associated with the unit. This enables the inferences of the model to transform in response to transformed input data in a stable and predictable way, and avoids learning multiple features differing only with respect to the set of transformations. Extending prior work on translation equivariant (convolutional) models, we develop translation and rotation equivariant restricted Boltzmann machines (RBMs) and deep belief nets (DBNs), and demonstrate their effectiveness in learning frequently occurring statistical structure from artificial and natural images

    Riemannian Sparse Coding for Positive Definite Matrices

    Get PDF
    International audienceInspired by the great success of sparse coding for vector valued data, our goal is to represent symmetric positive definite (SPD) data matrices as sparse linear combinations of atoms from a dictionary, where each atom itself is an SPD matrix. Since SPD matrices follow a non-Euclidean (in fact a Riemannian) geometry, existing sparse coding techniques for Euclidean data cannot be directly extended. Prior works have approached this problem by defining a sparse coding loss function using either extrinsic similarity measures (such as the log-Euclidean distance) or kernelized variants of statistical measures (such as the Stein divergence, Jeffrey's divergence, etc.). In contrast, we propose to use the intrinsic Riemannian distance on the manifold of SPD matrices. Our main contribution is a novel mathematical model for sparse coding of SPD matrices; we also present a computationally simple algorithm for optimizing our model. Experiments on several computer vision datasets showcase superior classification and retrieval performance compared with state-of-the-art approaches

    Combining visual and textual systems within the context of user feedback

    Get PDF
    It has been proven experimentally, that a combination of textual and visual representations can improve the retrieval performance ([20], [23]). It is due to the fact, that the textual and visual feature spaces often represent complementary yet correlated aspects of the same image, thus forming a composite system. In this paper, we present a model for the combination of visual and textual sub-systems within the user feedback context. The model was inspired by the measurement utilized in quantum mechanics (QM) and the tensor product of co-occurrence (density) matrices, which represents a density matrix of the composite system in QM. It provides a sound and natural framework to seamlessly integrate multiple feature spaces by considering them as a composite system, as well as a new way of measuring the relevance of an image with respect to a context. The proposed approach takes into account both intra (via co-occurrence matrices) and inter (via tensor operator) relationships between features’ dimensions. It is also computationally cheap and scalable to large data collections. We test our approach on ImageCLEF2007photo data collection and present interesting findings

    Bayesian Painting by Numbers: Flexible Priors for Colour-Invariant Object Recognition

    Get PDF
    Generative models of images should take into account transformations of geometry and reflectance. Then, they can provide explanations of images that are factorized into intrinsic properties that are useful for subsequent tasks, such as object classification. It was previously shown how images and objects within images could be described as compositions of regions called structural elements or ‘stels’. In this way, transformations of the reflectance and illumination of object parts could be accounted for using a hidden variable that is used to ‘paint’ the same stel differently in different images. For example, the stel corresponding to the petals of a flower can be red in one image and yellow in another. Previous stel models have used a fixed number of stels per image and per image class. Here, we introduce a Bayesian stel model, the colour − invariant admixture (CIA) model, which can infer different numbers of stels for different object types, as appropriate. Results on Caltech101 images show that this method is capable of automatically selecting a number of stels that reflects the complexity of the object class and that these stels are useful for object recognition.Engineering and Applied Science

    Reinforcement learning or active inference?

    Get PDF
    This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimize their free-energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming; namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain

    Parametric study of EEG sensitivity to phase noise during face processing

    Get PDF
    <b>Background: </b> The present paper examines the visual processing speed of complex objects, here faces, by mapping the relationship between object physical properties and single-trial brain responses. Measuring visual processing speed is challenging because uncontrolled physical differences that co-vary with object categories might affect brain measurements, thus biasing our speed estimates. Recently, we demonstrated that early event-related potential (ERP) differences between faces and objects are preserved even when images differ only in phase information, and amplitude spectra are equated across image categories. Here, we use a parametric design to study how early ERP to faces are shaped by phase information. Subjects performed a two-alternative force choice discrimination between two faces (Experiment 1) or textures (two control experiments). All stimuli had the same amplitude spectrum and were presented at 11 phase noise levels, varying from 0% to 100% in 10% increments, using a linear phase interpolation technique. Single-trial ERP data from each subject were analysed using a multiple linear regression model. <b>Results: </b> Our results show that sensitivity to phase noise in faces emerges progressively in a short time window between the P1 and the N170 ERP visual components. The sensitivity to phase noise starts at about 120–130 ms after stimulus onset and continues for another 25–40 ms. This result was robust both within and across subjects. A control experiment using pink noise textures, which had the same second-order statistics as the faces used in Experiment 1, demonstrated that the sensitivity to phase noise observed for faces cannot be explained by the presence of global image structure alone. A second control experiment used wavelet textures that were matched to the face stimuli in terms of second- and higher-order image statistics. Results from this experiment suggest that higher-order statistics of faces are necessary but not sufficient to obtain the sensitivity to phase noise function observed in response to faces. <b>Conclusion: </b> Our results constitute the first quantitative assessment of the time course of phase information processing by the human visual brain. We interpret our results in a framework that focuses on image statistics and single-trial analyses
    corecore