
    EquiMod: An Equivariance Module to Improve Self-Supervised Learning

    Self-supervised visual representation methods are closing the gap with supervised learning performance. These methods rely on maximizing the similarity between embeddings of related synthetic inputs created through data augmentations. This can be seen as a task that encourages embeddings to discard the factors modified by these augmentations, i.e. to be invariant to them. However, this addresses only one side of the trade-off in the choice of augmentations: they must modify the images strongly enough to prevent shortcut learning of trivial solutions (e.g. relying only on color histograms), but on the other hand, augmentation-related information may then be missing from the representations for some downstream tasks (e.g. color matters for bird and flower classification). A few recent works have proposed to mitigate the limits of a purely invariance-based task by exploring some form of equivariance to augmentations. This has been done by learning additional embedding space(s) in which some augmentation(s) cause embeddings to differ, yet in an uncontrolled way. In this work, we introduce EquiMod, a generic equivariance module that structures the learned latent space, in the sense that the module learns to predict the displacement in the embedding space caused by the augmentations. We show that adding this module to state-of-the-art invariance models, such as SimCLR and BYOL, improves performance on the CIFAR10 and ImageNet datasets. Moreover, while our model could collapse to a trivial equivariance, i.e. invariance, we observe that it instead automatically learns to keep some augmentation-related information beneficial to the representations.
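The core idea, predicting the displacement that an augmentation causes in embedding space, can be sketched with a toy linear predictor. Everything below is illustrative, not the paper's architecture: embeddings are 2-D and the "augmentation" is a single scalar parameter shifting embeddings along a fixed direction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: each augmentation parameter a displaces an
# embedding z along a fixed direction, scaled by a.
true_dir = np.array([1.0, -0.5])

def augment_embedding(z, a):
    return z + a * true_dir

# EquiMod-style predictor: given (z, a), predict the displaced embedding.
# Here it is just a linear map W over [z, a]; the actual module is a
# small network trained jointly with the invariance objective.
W = rng.normal(scale=0.1, size=(2, 3))
lr = 0.1
for _ in range(500):
    z = rng.normal(size=2)
    a = rng.uniform(-1, 1)
    x = np.append(z, a)            # predictor input [z, a]
    err = W @ x - augment_embedding(z, a)
    W -= lr * np.outer(err, x)     # squared-error gradient step

# The trained predictor should now track the true displacement.
z, a = np.array([0.3, 0.7]), 0.8
pred_err = np.linalg.norm(W @ np.append(z, a) - augment_embedding(z, a))
```

Because the target map here is itself linear, the predictor can fit it exactly; in the paper, the displacement is learned in the nonlinear latent space of a SimCLR or BYOL encoder.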

    Discrimination of visual pedestrians data by combining projection and prediction learning

    PROPRE is a generic, semi-supervised neural learning paradigm that extracts meaningful concepts from multimodal data flows based on predictability across modalities. It combines two computational paradigms. First, each data flow is topologically projected onto a self-organizing map (SOM) to reduce the input dimension. Second, the activity of each SOM is used to predict the activities of all other SOMs. A predictability measure, which compares predicted and real activities, modulates the SOM learning so as to favor mutually predictable stimuli. In this article, we study PROPRE applied to a classical visual pedestrian classification task. The SOM learning modulation introduced in PROPRE significantly improves classification performance.
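The projection/modulation loop can be sketched with a minimal 1-D SOM whose learning rate is scaled by a per-sample predictability gain. The names and the two-cluster data below are hypothetical, not the pedestrian data of the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

n_units, dim = 10, 2
weights = rng.uniform(size=(n_units, dim))   # 1-D map of 2-D prototypes

def som_step(weights, x, predictability, base_lr=0.3, sigma=1.0):
    """One SOM update; the predictability gain in [0, 1] scales learning."""
    bmu = np.argmin(np.linalg.norm(weights - x, axis=1))  # best matching unit
    grid_dist = np.abs(np.arange(len(weights)) - bmu)
    h = np.exp(-grid_dist**2 / (2 * sigma**2))            # neighborhood
    weights += (base_lr * predictability) * h[:, None] * (x - weights)

# Stimuli judged predictable (gain 1.0) come from one cluster,
# unpredictable ones (gain 0.05) from another: the map's resources are
# biased towards the predictable region.
for _ in range(2000):
    if rng.random() < 0.5:
        som_step(weights, rng.normal([0.2, 0.2], 0.02), predictability=1.0)
    else:
        som_step(weights, rng.normal([0.8, 0.8], 0.02), predictability=0.05)

# At least the winning unit should have settled on the predictable cluster.
d_pred = np.min(np.linalg.norm(weights - np.array([0.2, 0.2]), axis=1))
```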

    Learning of local predictable representations in partially learnable environments

    PROPRE is a generic, cortically inspired framework that provides online learning of input/output relationships. The input data flow is projected onto a self-organizing map that provides an internal representation of the current stimulus. From this representation, the system predicts the value of the output target. A predictability measure, based on monitoring the prediction quality, modulates the projection learning so as to favor representations that help predict the output. In this article, we study PROPRE when the input/output relationship is only defined on a small subspace of the input space, which we call a partially learnable environment. This problem is not typical of machine learning, but it is crucial for developmental robotics: robots face high-dimensional sensory-motor environments in which large areas are not learnable, since a motor action does not have a consequence on every perception at every moment. We show that the use of the predictability measure in PROPRE leads to an autonomous gathering of local representations in the regions where the input data are related to the output value, thus providing good classification performance, as the system learns the input/output function only where it is defined.

    Learning to be attractive: probabilistic computation with dynamic attractor networks

    In the context of sensory or higher-level cognitive processing, we present a recurrent neural network model, similar to the popular dynamic neural field (DNF) model, for performing approximate probabilistic computations. The model is biologically plausible, avoids impractical schemes such as log-encoding and noise assumptions, and is well suited to working in stacked hierarchies. By Lyapunov analysis, we make it very plausible that the model computes the maximum a posteriori (MAP) estimate given an input that may be corrupted by noise. Key points of the model are its capability to learn the required posterior distributions and represent them in its lateral weights, the interpretation of stable neural activities as MAP estimates, and of latency as the probability associated with those estimates. We demonstrate in simple experiments that learning of posterior distributions is feasible and results in correct MAP estimates. Furthermore, a pre-activation of field sites can modify attractor states when the data model is ambiguous, effectively providing an approximate implementation of Bayesian inference.
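As a much-simplified analogue of "stable activities as MAP estimates", a Hopfield-style network with Hebbian lateral weights relaxes from a corrupted input to the nearest stored pattern. This is only a sketch of the attractor idea, not the DNF-like model of the paper:

```python
import numpy as np

# Two binary patterns stored in Hebbian lateral weights.
patterns = np.array([
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, 1, 1, -1, -1, -1, -1],
])
n = patterns.shape[1]
W = (patterns.T @ patterns) / n          # Hebbian lateral weights
np.fill_diagonal(W, 0)                   # no self-connections

def relax(state, steps=10):
    """Synchronous relaxation towards an attractor (a stored pattern)."""
    s = state.astype(float).copy()
    for _ in range(steps):
        s = np.sign(W @ s)
        s[s == 0] = 1.0                  # break ties deterministically
    return s

noisy = patterns[0].copy()
noisy[0] = -noisy[0]                     # corrupt one bit of pattern 0
recalled = relax(noisy)                  # settles back on pattern 0
```

Under a simple bit-flip noise model, the nearest stored pattern is exactly the MAP estimate of the uncorrupted input, which is the sense in which relaxation implements MAP recall here.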

    PROPRE: PROjection and PREdiction for multimodal correlations learning. An application to pedestrians visual data discrimination

    PROPRE is a generic, modular, unsupervised neural learning paradigm that extracts meaningful concepts from multimodal data flows based on predictability across modalities. It combines three modules. First, a topological projection of each data flow onto a self-organizing map. Second, a decentralized prediction of each projection's activity from the activities of the other maps. Third, a predictability measure that compares predicted and real activities. This measure modulates the projection learning so as to favor the mapping of stimuli that are predictable across modalities. In this article, we use a Kohonen map for the projection module and linear regression for the prediction module, and we propose several generic predictability measures. We illustrate the properties and performance of the PROPRE paradigm on a challenging supervised classification task on visual pedestrian data. The modulation of the projection learning by the predictability measure significantly improves the classification performance of the system, independently of the measure used. Moreover, PROPRE provides a combination of interesting functional properties, such as dynamical adaptation to variations in input statistics, that is rarely available in other machine learning algorithms.

    Active learning of local predictable representations with artificial curiosity

    In this article, we present preliminary work on integrating an artificial curiosity mechanism into PROPRE, a generic and modular neural architecture, to obtain online, open-ended and active learning of a sensory-motor space in which large areas can be unlearnable. PROPRE combines the projection of the input motor flow onto a self-organizing map with the regression of the sensory output flow from this projected representation, using a linear regression. The main feature of PROPRE is a predictability module that provides an interestingness measure for the current motor stimulus, based on a simple evaluation of the sensory prediction quality. This measure modulates the projection learning so as to favor representations that predict the output better than a local average; in particular, this leads to learning local representations where an input/output relationship is defined. Here, we propose an artificial curiosity mechanism based on monitoring the learning progress in the neighborhood of each local representation. PROPRE thus simultaneously learns interesting representations of the input flow (according to their capacity to predict the output) and actively explores the regions of the input space where learning progress is highest. We illustrate the architecture on learning a direct model of an arm whose hand can only be perceived in a restricted visual space. The modulation of the projection learning leads to better performance, and the curiosity mechanism provides quicker learning and even improves the final performance.
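The learning-progress heuristic can be sketched as a two-region toy problem (entirely hypothetical: one region is learnable, the other has constant chance-level error), where the agent greedily samples the region whose recent error drop, i.e. its learning progress, is largest:

```python
import numpy as np

rng = np.random.default_rng(2)

def observe(region, visits):
    # Learnable region 0: prediction error decays with practice.
    # Unlearnable region 1: error stays at a constant chance level.
    return 1.0 / (1 + 0.2 * visits) if region == 0 else 1.0

errors = {0: [1.0], 1: [1.0]}
visits = {0: 0, 1: 0}
choices = []

def progress(region):
    e = errors[region][-5:]              # recent error history
    return max(e[0] - e[-1], 0.0)        # drop in error = learning progress

for _ in range(200):
    # epsilon-greedy on progress, so both regions keep being probed
    if rng.random() < 0.1:
        region = int(rng.integers(2))
    else:
        region = max((0, 1), key=progress)
    visits[region] += 1
    errors[region].append(observe(region, visits[region]))
    choices.append(region)

frac_learnable = choices.count(0) / len(choices)   # mostly region 0
```

Since the unlearnable region yields zero progress, the greedy choice concentrates sampling on the learnable region, which is the qualitative behavior the curiosity mechanism is meant to produce.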

    Feedback modulation of BCM's neurons in multi modal environment

    In Gibson's theory, an object is defined by its possible interactions with people, named affordances. Affordances arise from the association of the object's sensory perceptions. Bienenstock, Cooper and Munro (BCM) neurons converge to one of their input patterns through decentralized, unsupervised learning. We want to extend this mechanism in order to develop a generic, incremental paradigm for associating modalities. We introduce a feedback modulation of BCM neurons to obtain spatial self-organization. This feedback reflects the multimodal constraints.
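The selectivity property this relies on, a BCM neuron converging to one of its input patterns, can be shown with a minimal rate-based BCM neuron; the parameters and the two orthogonal patterns below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

# Classic BCM update: dw = lr * y * (y - theta) * x, where the sliding
# threshold theta tracks a running average of y^2. Presented with two
# alternating patterns, the neuron becomes selective to one of them.
p1 = np.array([1.0, 0.0])
p2 = np.array([0.0, 1.0])
w = np.array([0.5, 0.51])                # slight initial asymmetry
theta, lr, tau = 0.0, 0.02, 0.05
for _ in range(3000):
    x = p1 if rng.random() < 0.5 else p2
    y = max(w @ x, 0.0)                  # rectified response
    w += lr * y * (y - theta) * x
    w = np.clip(w, 0.0, 2.0)             # keep weights bounded
    theta += tau * (y**2 - theta)        # sliding modification threshold

# Strong response to one pattern, near-zero to the other.
responses = np.array([max(w @ p1, 0.0), max(w @ p2, 0.0)])
```

The feedback modulation proposed in the paper would additionally scale this plasticity with a multimodal signal, biasing which pattern each neuron of a map becomes selective to.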

    Self-organization of a neural map by modulation of the BCM rule in a multimodal framework

    The cortex must permanently deal with stimuli coming from the environment, perceived through different, spatially separated sensors. These stimuli converge in the cortex to be processed together in order to construct a multisensory and coherent view of the world. When somebody hears /ba/ while simultaneously viewing lip movements corresponding to /ga/, they perceive /da/. This phenomenon, known as the McGurk effect, reveals the cross-correlation between different modalities. Some cortical areas are mainly dedicated to processing a specific perception. These areas are topographically organized, meaning that two spatially close neurons respond to close stimuli. The purpose of this paper is to modify the BCM synaptic rule to obtain the self-organization of a neuronal map. We introduce a feedback modulation of the learning rule, representing the multimodal constraints of the environment. This feedback is obtained by using a multi-map, multilevel architecture for assembling modalities.

    Multi-sensory integration by constrained self-organization

    We develop a model of multi-sensory integration for performing sensorimotor tasks. The aim of the model is to provide missing-modality recall and generalization using cortically inspired mechanisms. The architecture consists of several multilevel cortical maps with a generic structure. Each map self-organizes through continuous, decentralized and unsupervised learning, which provides robustness and adaptability. These self-organizations are constrained by the multimodal context to obtain multi-sensory generalization. More precisely, each modality is represented by a perceptive map, and each perception by an activity bump that emerges in the corresponding map thanks to a competition mechanism. All perceptive maps are reciprocally and laterally connected to a unique associative map. A competition takes place in the associative map to generate an activity bump that represents the multi-sensory perception. Multimodal constraints are relaxed, thanks to the lateral connections, to converge towards coherent perceptions within a multimodal context. We present a model of the perceptive map using a modulated BCM (Bienenstock Cooper Munro) learning rule to create self-organization at the map level, which can be influenced by the multimodal context. An unlearning mechanism adds robustness and plasticity to the architecture and makes the self-organization smoother.

    Self-organization of neural maps using a modulated BCM rule within a multimodal architecture

    Human beings interact with the environment through different modalities, i.e. perceptions and actions. Different perceptions, such as vision, audition or proprioception, are picked up by different, spatially separated sensors. They are processed in the cortex by dedicated brain areas, which are self-organized so that spatially close neurons are sensitive to close stimuli. However, the processing of these perceptive flows is not isolated; on the contrary, they constantly interact, as illustrated by the McGurk effect: when the phonetic stimulus /ba/ is presented simultaneously with a lip movement corresponding to /ga/, people perceive /da/, which corresponds to neither stimulus. Merging several stimuli into one multimodal perception reduces the ambiguity and noise of each perception, and is an essential mechanism by which the cortex interacts with the environment. The aim of this article is to propose a model for the assembling of modalities, inspired by the biological properties of the cortex. We have modified the Bienenstock Cooper Munro (BCM) rule to include it in a model consisting of interacting maps of multilayer cortical columns. Each map is able to self-organize thanks to continuous, decentralized and local learning modulated by a high-level signal. By assembling different maps corresponding to different modalities, our model creates a multimodal context which is used as the modulating signal and thus influences the self-organization of each map.