
    Spatial histograms of soft pairwise similar patches to improve the bag-of-visual-words model

    In the context of category-level scene classification, the bag-of-visual-words (BoVW) model is widely used for image representation. This model is appearance-based and does not contain any information about the arrangement of the visual words in the 2D image space. To overcome this problem, recent approaches try to capture information about either the absolute or the relative spatial location of visual words. In the first category, the so-called Spatial Pyramid Representation (SPR) is very popular thanks to its simplicity and good results. Alternatively, adding information about occurrences of relative spatial configurations of visual words has proven effective, but at the cost of higher computational complexity, specifically when relative distances and angles are taken into account. In this paper, we introduce a novel way to incorporate both distance and angle information into the BoVW representation. The novelty is, first, a computationally efficient representation that adds relative spatial information between visual words and, second, a soft pairwise voting scheme based on distances in the descriptor space. Experiments on the challenging MSRC-2, 15Scene, Caltech101, Caltech256, and Pascal VOC 2007 datasets demonstrate that our method outperforms or is competitive with competing ones. We also show that it provides important complementary information to spatial pyramid matching and can improve the overall performance.
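
    The pairwise construction above lends itself to a compact sketch. Below is a minimal, illustrative reading of the idea, assuming keypoint positions, their descriptors, and the dictionary's cluster centers are given; the bin counts and the Gaussian soft-assignment bandwidth (sigma) are hypothetical choices, not the paper's.

```python
import numpy as np

def soft_pairwise_histogram(positions, descriptors, centers,
                            n_dist_bins=4, n_angle_bins=8,
                            sigma=0.1, max_dist=1.0):
    """Histogram over (word, word, distance bin, angle bin) cells, where each
    pair of keypoints votes softly according to its descriptor-space
    similarity to every visual word."""
    k = len(centers)
    hist = np.zeros((k, k, n_dist_bins, n_angle_bins))
    # Soft assignment of each descriptor to the visual words.
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    weights = np.exp(-d2 / (2.0 * sigma ** 2))
    weights /= weights.sum(axis=1, keepdims=True)
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            diff = positions[j] - positions[i]
            dist = np.linalg.norm(diff)
            if dist >= max_dist:
                continue
            angle = np.arctan2(diff[1], diff[0]) % np.pi  # undirected segment
            db = int(dist / max_dist * n_dist_bins)
            ab = min(int(angle / np.pi * n_angle_bins), n_angle_bins - 1)
            # Soft pairwise vote: outer product of the two soft assignments.
            hist[:, :, db, ab] += np.outer(weights[i], weights[j])
    return hist.ravel() / max(hist.sum(), 1e-12)
```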

    Mixed Pooling Neural Networks for Color Constancy

    Color constancy is the ability of the human visual system to perceive constant surface colors despite changes in the spectrum of the illumination. In computer vision, the main approach consists in estimating the illuminant color and then removing its impact on the colors of the objects. Many image processing algorithms have been proposed to tackle this problem automatically. However, most of these approaches are handcrafted and rely on strong empirical assumptions, e.g., that the average reflectance in a scene is gray. State-of-the-art approaches can perform very well on some datasets but adapt poorly to others. In this paper, we investigate how neural-network-based approaches can be used to deal with the color constancy problem. We propose a new network architecture based on existing successful handcrafted approaches, together with a number of improvements, to tackle this problem by learning a suitable deep model. We show our results on most of the standard benchmarks used in the color constancy domain.
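
    The gray-world assumption cited above as a typical handcrafted baseline is easy to make concrete. The sketch below shows that classical baseline, not the paper's network: estimate the illuminant as the mean scene color, then divide it out with a von Kries-style diagonal correction.

```python
import numpy as np

def gray_world_correct(image):
    """image: float array of shape (H, W, 3) in [0, 1].
    Returns the illuminant estimate and the corrected image."""
    # Gray-world: the average reflectance is assumed gray, so the mean
    # color of the scene is taken as the illuminant color.
    illuminant = image.reshape(-1, 3).mean(axis=0)
    # Von Kries diagonal correction: rescale each channel so the scene
    # mean becomes achromatic.
    corrected = image * (illuminant.mean() / illuminant)
    return illuminant, np.clip(corrected, 0.0, 1.0)
```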

    Spatial orientations of visual word pairs to improve Bag-of-Visual-Words model

    This paper presents a novel approach to incorporating spatial information in the bag-of-visual-words model for category-level and scene classification. In the traditional bag-of-visual-words model, feature vectors are histograms of visual words. This representation is appearance-based and does not contain any information about the arrangement of the visual words in the 2D image space. In this framework, we present a simple and efficient way to infuse spatial information. In particular, we are interested in explicit global relationships among the spatial positions of visual words. We therefore take advantage of the orientation of the segments formed by Pairs of Identical visual Words (PIW). An evenly distributed normalized histogram of the angles of PIW is computed. The histograms produced by each word type constitute a powerful description of intra-type visual word relationships. Experiments on challenging datasets demonstrate that our method is competitive with competing ones. We also show that our method provides important complementary information to spatial pyramid matching and can improve the overall performance.
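
    The PIW construction can be sketched directly from this description: for each visual word, histogram the orientations of the segments joining every pair of its occurrences. The bin count and per-word normalization below are illustrative choices.

```python
import numpy as np

def piw_angle_histograms(positions, word_ids, n_words, n_bins=8):
    """positions: (N, 2) keypoint coordinates; word_ids: (N,) visual word
    assignments. Returns an (n_words, n_bins) matrix of normalized angle
    histograms, one per word type."""
    hists = np.zeros((n_words, n_bins))
    for w in range(n_words):
        pts = positions[word_ids == w]       # occurrences of word w
        for i in range(len(pts)):
            for j in range(i + 1, len(pts)):
                dx, dy = pts[j] - pts[i]
                angle = np.arctan2(dy, dx) % np.pi   # undirected segment
                b = min(int(angle / np.pi * n_bins), n_bins - 1)
                hists[w, b] += 1
    # Normalize each word's histogram so the number of keypoints
    # does not dominate the description.
    return hists / np.maximum(hists.sum(axis=1, keepdims=True), 1)
```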

    Semantic Segmentation via Multi-task, Multi-domain Learning

    We present an approach that leverages multiple datasets, possibly annotated with different classes, to improve the semantic segmentation accuracy on each individual dataset. We propose a new selective loss function that can be integrated into deep networks to exploit training data coming from multiple datasets with possibly different tasks (e.g., different label sets). We show how the gradient-reversal approach for domain adaptation can be used in this setup. Thorough experiments on semantic segmentation applications show the relevance of our approach.
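
    The gradient-reversal approach mentioned here has a well-known minimal form (Ganin and Lempitsky's layer): identity on the forward pass, negated and scaled gradient on the backward pass. A PyTorch sketch, with lambda_ as the usual scaling coefficient:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward direction; reverses (and scales) the
    gradient flowing back to the feature extractor."""

    @staticmethod
    def forward(ctx, x, lambda_):
        ctx.lambda_ = lambda_
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # One gradient per forward input; lambda_ gets none.
        return grad_output.neg() * ctx.lambda_, None

def grad_reverse(x, lambda_=1.0):
    return GradReverse.apply(x, lambda_)
```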

    Outdoor scene segmentation from label sets with varying granularity and semantics

    In this work, we present an approach that leverages multiple datasets annotated with different classes (different label sets) to improve the classification accuracy on each individual dataset. We focus on semantic full-scene labeling of outdoor scenes. To achieve our goal, we use the KITTI dataset, as it illustrates the focus of our paper very well: it has been sparsely labeled by multiple research groups over the past few years, but the semantics and the granularity of the labels differ from one set to another. We propose a method to train deep convolutional networks using multiple datasets with potentially inconsistent label sets, together with a selective loss function and several fusion approaches that exploit the correlations between label sets, so that the network is trained with all the available labeled data while remaining robust to inconsistent labelings. Experiments on all of the KITTI dataset's labeled subsets show that our approach consistently improves the classification accuracy by exploiting the correlations across datasets both at the feature level and at the label level.
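
    One plausible reading of the selective loss is a cross-entropy restricted, per dataset, to the classes its label set actually annotates, so that inconsistent label sets do not penalize one another. The masking below is an illustrative sketch of that reading, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def selective_loss(logits, targets, labelset):
    """logits: (N, C) scores over the union of all classes; targets: (N,)
    indices into the union space; labelset: list of union-class indices this
    dataset annotates (every target is assumed to belong to it)."""
    idx = torch.tensor(labelset, device=logits.device)
    restricted = logits[:, idx]               # drop classes unknown here
    # Remap union-space targets to positions within the restricted labelset.
    remap = {c: i for i, c in enumerate(labelset)}
    t = torch.tensor([remap[int(c)] for c in targets], device=logits.device)
    return F.cross_entropy(restricted, t)
```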

    Supervised Spectral Subspace Clustering for Visual Dictionary Creation in the Context of Image Classification

    When building a traditional Bag of Visual Words (BOW) representation for image classification, the K-means algorithm is usually run on a large set of high-dimensional local descriptors to build the visual dictionary. However, it is very likely that, to find a good visual vocabulary, only a sub-part of the descriptor space of each visual word is truly relevant. We propose a novel framework for creating the visual dictionary based on a spectral subspace clustering method instead of the traditional K-means algorithm. A strategy for adding supervised information during the subspace clustering process is formulated to obtain more discriminative visual words. Experimental results on real-world image datasets show that the proposed framework for dictionary creation improves the classification accuracy compared to a traditionally built BOW.
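
    The unsupervised skeleton of this pipeline, swapping K-means for spectral clustering when building the dictionary, can be sketched with scikit-learn; the supervised subspace information described above is omitted here.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def build_dictionary(descriptors, n_words=256):
    """descriptors: (N, D) local features (N should stay modest, since
    spectral clustering scales poorly). Returns (n_words, D) visual words."""
    labels = SpectralClustering(
        n_clusters=n_words, affinity="nearest_neighbors"
    ).fit_predict(descriptors)
    # A visual word = the mean descriptor of its spectral cluster.
    return np.stack([descriptors[labels == w].mean(axis=0)
                     for w in range(n_words)])
```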

    Object recognition through the analysis of color components adapted to the illumination change between two images

    In the field of image indexing, color-based object recognition methods tend to fail when the lighting conditions at acquisition time differ from one image to another. In this article, we propose a new approach for object retrieval in color image databases that is insensitive to illumination variations. To this end, we consider that a change of illuminant only very slightly disturbs the rank ordering of the color component levels of the pixels within a given image. To compare two images, we transform the color components in a way that is specific to each pair formed by a model image and a query image. The color components of the pixels of each considered image pair are transformed through a dedicated analysis of the pixels' rank measures. Tests carried out on a public image database show the improvement obtained by our method in terms of object recognition.
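
    The rank idea translates into a simple per-image sketch: replace each color component by its normalized rank within the image, so that any monotonic, illumination-induced change of the levels leaves the representation nearly unchanged. The paper's pairwise model/query analysis is richer than this simplified version.

```python
import numpy as np

def rank_normalize(channel):
    """channel: 2-D array of one color component.
    Returns the normalized ranks, in [0, 1]."""
    flat = channel.ravel()
    # Double argsort yields the rank of each value.
    ranks = np.argsort(np.argsort(flat))
    return (ranks / (flat.size - 1)).reshape(channel.shape)

def rank_normalize_rgb(image):
    """Apply rank normalization independently to each color component."""
    return np.dstack([rank_normalize(image[..., c]) for c in range(3)])
```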

    Coexistence of two sympatric cryptic bat species in French Guiana: insights from genetic, acoustic and ecological data

    Background: The distinction between lineages of neotropical bats from the Pteronotus parnellii species complex has previously been made according to mitochondrial DNA and, especially, morphology and acoustics, in order to separate them into two species. In those studies, either the sample sizes were too small when genetic and acoustic or morphological data were gathered from the same individuals, or the genetic and other data were collected from different individuals. In this study, we intensively sampled bats in four caves and combined all approaches in order to analyse the genetic, morphological, and acoustic divergence between these lineages, which live in the same caves in French Guiana.

    Colorimetric invariants for object recognition
