30 research outputs found

String representations and distances in deep Convolutional Neural Networks for image classification

Author: Barat Cécile
Ducottet Christophe
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016

Recent advances in image classification mostly rely on powerful local features combined with an adapted image representation. Although Convolutional Neural Network (CNN) features learned from ImageNet have been shown to be generic and very efficient, they still lack the flexibility to take into account variations in the spatial layout of visual elements. In this paper, we investigate the use of structural representations on top of pre-trained CNN features to improve image classification. Images are represented as strings of CNN features. Similarities between such representations are computed using two new edit distance variants adapted to the image classification domain. Our algorithms have been implemented and tested on several challenging datasets: 15 Scenes, Caltech101, Pascal VOC 2007 and MIT Indoor. The results show that using structural string representations and distances clearly improves classification performance over standard approaches based on CNN features and an SVM with a linear kernel, as well as over other recognized methods in the literature.

HAL-UJM

Crossref

Fusion of tf.idf Weighted Bag of Visual Features for Image Classification

Author: Barat Cécile
Ducottet Christophe
Moulin Christophe
Publication venue: HAL CCSD
Publication date: 23/06/2010

Image representation using the bag-of-visual-words approach is commonly used in image classification. Features are extracted from images and clustered into a visual vocabulary. Images can then be represented as a normalized histogram of visual words, similarly to textual documents represented as a weighted vector of terms. As a result, text categorization techniques are applicable to image classification. In this paper, our contribution is twofold. First, we propose a suitable Term-Frequency and Inverse Document Frequency weighting scheme to characterize the importance of visual words. Second, we present a method to fuse different bags of words obtained with different vocabularies. We show that our tf.idf normalization combined with the fusion leads to better classification rates than other normalization methods, other fusion schemes or other approaches evaluated on the SIMPLIcity collection.

HAL-UJM
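
The abstract above does not spell out its weighting scheme, so the following is only a minimal sketch of the textbook tf.idf weighting applied to visual-word histograms. The function name `tfidf_weight`, the plain logarithmic idf and the final L2 normalization are assumptions made for illustration, not the authors' method.

```python
import math


def tfidf_weight(histograms):
    """Weight raw visual-word counts with a textbook tf.idf scheme.

    histograms: list of equal-length lists, one per image; entry j is
    the number of occurrences of visual word j in that image.
    (Hypothetical helper, not taken from the paper.)
    """
    n_images = len(histograms)
    n_words = len(histograms[0])
    # Document frequency: number of images that contain each visual word.
    df = [sum(1 for h in histograms if h[j] > 0) for j in range(n_words)]
    # Plain logarithmic idf; a word absent from every image gets weight 0.
    idf = [math.log(n_images / d) if d else 0.0 for d in df]
    weighted = []
    for h in histograms:
        total = sum(h) or 1  # guard against images with no features
        vec = [(count / total) * w for count, w in zip(h, idf)]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        weighted.append([v / norm for v in vec])  # L2-normalize
    return weighted


# Three toy images over a vocabulary of three visual words.
counts = [[3, 0, 1], [1, 2, 1], [0, 4, 1]]
vectors = tfidf_weight(counts)
```

In this toy input the third visual word occurs in every image, so its idf is zero and it contributes nothing to any vector, which is the intended effect of the idf term.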

Approximate Image Matching using Strings of Bag-of-Visual Words Representation

Author: Barat Cécile
Ducottet Christophe
Nguyen Hong-Thinh
Publication venue: 'Scitepress'
Publication date: 05/01/2014

The Spatial Pyramid Matching approach has become very popular for modeling images as sets of local bags of words. The image comparison is then done region by region with an intersection kernel. Despite its success, this model has some limitations: the grid partitioning is predefined and identical for all images, and the matching is sensitive to intra- and inter-class variations. In this paper, we propose a novel approach based on approximate string matching to overcome these limitations and improve the results. First, we introduce a new image representation as strings of ordered bags of words. Second, we present a new edit distance specifically adapted to strings of histograms in the context of image comparison. This distance identifies local alignments between subregions and allows sequences of similar subregions to be removed to better match two images. Experiments on 15 Scenes and Caltech 101 show that the proposed approach outperforms the classical spatial pyramid representation and most competing classification methods presented in recent years.

HAL-UJM
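
To make the idea of an edit distance over strings of histograms concrete, here is a minimal sketch of a generic Levenshtein-style dynamic program in which the substitution cost is a histogram dissimilarity. It is not the distance proposed in the paper: the names `hist_dist` and `string_edit_distance`, the intersection-based cost and the constant `gap_cost` are all assumptions chosen for illustration.

```python
def hist_dist(a, b):
    """1 - histogram intersection; in [0, 1] for L1-normalized inputs."""
    return 1.0 - sum(min(x, y) for x, y in zip(a, b))


def string_edit_distance(s, t, gap_cost=0.5):
    """Edit distance between two strings whose symbols are histograms.

    s, t: lists of L1-normalized histograms (one per image subregion).
    gap_cost: assumed constant cost of inserting or deleting a symbol.
    """
    n, m = len(s), len(t)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * gap_cost
    for j in range(1, m + 1):
        d[0][j] = j * gap_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + gap_cost,  # delete s[i-1]
                d[i][j - 1] + gap_cost,  # insert t[j-1]
                d[i - 1][j - 1] + hist_dist(s[i - 1], t[j - 1]),  # substitute
            )
    return d[n][m]


# Two toy strings: image_b is image_a with its middle subregion missing.
image_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
image_b = [[1.0, 0.0], [0.5, 0.5]]
dist = string_edit_distance(image_a, image_b)
```

With these inputs the cheapest alignment matches the first and last subregions exactly and pays a single deletion for the middle one, so the distance equals `gap_cost`.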

Spatial orientations of visual word pairs to improve Bag-of-Visual-Words model

Author: Barat Cécile
Ducottet Christophe
Khan Rahat
Muselet Damien
Publication venue: 'British Machine Vision Association and Society for Pattern Recognition'
Publication date: 04/09/2012

This paper presents a novel approach to incorporating spatial information in the bag-of-visual-words model for category-level and scene classification. In the traditional bag-of-visual-words model, feature vectors are histograms of visual words. This representation is appearance-based and does not contain any information regarding the arrangement of the visual words in the 2D image space. In this framework, we present a simple and efficient way to infuse spatial information. In particular, we are interested in explicit global relationships among the spatial positions of visual words. Therefore, we take advantage of the orientation of the segments formed by Pairs of Identical visual Words (PIW). An evenly distributed normalized histogram of angles of PIW is computed. Histograms produced by each word type constitute a powerful description of intra-type visual word relationships. Experiments on challenging datasets demonstrate that our method is competitive with competing ones. We also show that our method provides important complementary information to spatial pyramid matching and can improve the overall performance.

HAL-UJM
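
The PIW description above can be sketched as follows for a single visual word: every pair of positions carrying that word defines a segment, and the segment orientations are accumulated into a normalized histogram with evenly spaced bins. The function name `piw_angle_histogram`, the orientation range [0, pi) and the default of four bins are assumptions for illustration; the abstract does not give these details.

```python
import math
from itertools import combinations


def piw_angle_histogram(points, n_bins=4):
    """Normalized orientation histogram for one visual word.

    points: (x, y) positions where this visual word occurs in the image.
    Orientations are folded into [0, pi) so a segment and its reverse
    fall into the same bin (an assumption of this sketch).
    """
    hist = [0] * n_bins
    for (x1, y1), (x2, y2) in combinations(points, 2):
        theta = math.atan2(y2 - y1, x2 - x1) % math.pi
        # Clamp guards against floating-point rounding at the upper edge.
        b = min(int(theta / math.pi * n_bins), n_bins - 1)
        hist[b] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * n_bins


# Three occurrences on a horizontal line: every pair has orientation 0.
horizontal = piw_angle_histogram([(0, 0), (1, 0), (2, 0)])
# Two occurrences stacked vertically: orientation pi/2.
vertical = piw_angle_histogram([(0, 0), (0, 1)])
```

One such histogram per visual word, concatenated over the vocabulary, would give a spatial descriptor to use alongside the usual appearance histogram.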

Combinaison d'information visuelle et textuelle pour la recherche d'information multimédia

Author: Barat Cécile
Ducottet Christophe
Lemaitre Cédric
Moulin Christophe
Publication venue: HAL CCSD
Publication date: 08/09/2009

In this article, we present a representation model for multimedia documents that combines textual information and visual descriptors. The text and the image that make up a document are each described by a vector of tf.idf weights, following a "bag-of-words" approach. The model makes it possible to run multimedia queries for information retrieval. Our method is evaluated on the ImageCLEF'08 collection, for which we have the ground truth. Several experiments were carried out with different descriptors and several combinations of modalities. Analysis of the results shows that a multimedia document model improves the performance of a retrieval system based on a single modality, whether textual or visual.

HAL-UJM

Scheimpflug Self-Calibration Based on Tangency Points

Author: Barat Cécile
Fournel Thierry
Louhichi Hanene
Menudet Jean-François
Publication venue: HAL CCSD
Publication date: 10/09/2006

SPIV self-calibration strongly depends on how accurately the projections of the control points are detected. A new family of control points and an image detection algorithm are proposed to overcome the bias associated with the use of dot centers as control points in SPIV self-calibration.

HAL-UJM

Fisher Linear Discriminant Analysis for Text-Image Combination in Multimedia Information Retrieval

Author: Barat Cécile
Ducottet Christophe
Géry Mathias
Largeron Christine
Moulin Christophe
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014

In multimedia information retrieval, combining different modalities (text, image, audio or video) provides additional information and generally improves overall system performance. For this purpose, the linear combination method is simple, flexible and effective. However, it requires choosing the weight assigned to each modality. This is still an open problem and is addressed in this paper. Our approach, based on Fisher Linear Discriminant Analysis, aims to learn these weights for multimedia documents composed of text and images. Text and images are both represented with the classical bag-of-words model. Our method was tested on the ImageCLEF 2008 and 2009 datasets. Results demonstrate that our combination approach not only outperforms the use of the textual modality alone but also provides nearly optimal learning of the weights with efficient computation. Moreover, the method allows combining more than two modalities without increasing the complexity and thus the computing time.

HAL-UJM

Crossref

Global Bilateral Symmetry Detection Using Multiscale Mirror Histograms

Author: Barat Cecile
Colantoni Philippe
Ducottet Christophe
Elawady Mohamed
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

In recent years, there has been renewed interest in bilateral symmetry detection in images. The task consists in detecting the main bilateral symmetry axis inside artificial or natural images. State-of-the-art methods combine feature point detection, pairwise comparison and voting in a Hough-like space. In spite of their good performance, they fail to give reliable results on challenging real-world and artistic images. In this paper, we propose a novel symmetry detection method using multi-scale edge features combined with local orientation histograms. An experimental evaluation is conducted on public datasets plus a new aesthetic-oriented dataset. The results show that our approach outperforms all other competing methods.

HAL-UJM

Crossref

Stirling Online Research Repository (RIOXX)

Stirling Online Research Repository

Mise en correspondance de formes à niveaux de gris par palpage morphologique

Author: BARAT Cécile
DUCOTTET Christophe
JOURLIN Michel
Publication venue: GRETSI, Groupe d’Etudes du Traitement du Signal et des Images
Publication date: 01/01/2003

In this paper, we present two new pattern-matching transforms for grey-level images. They are based on the principle of mechanical probing and are defined in the framework of mathematical morphology. The first transform locates all instances of a single pattern in an image and is called the SOMP transform (Single Object Matching using Probing). It has all the properties of a metric and therefore returns a similarity measure between the image and the template being sought. Other properties relating to noise and computation time are discussed. The second transform, called the MOMP transform (Multiple Objects Matching using Probing), makes it possible to locate all occurrences of several patterns of different shapes. It is particularly suited to detecting objects of different sizes or objects corrupted by noise. Results are presented for both transforms.

I-Revues

Combining text/image in WikipediaMM task 2009

Author: Barat Cécile
Ducottet Christophe
Géry Mathias
Largeron Christine
Lemaître Cédric
Moulin Christophe
Publication venue: HAL CCSD
Publication date: 30/09/2009

This paper reports our multimedia information retrieval experiments carried out for the ImageCLEF track 2009. In 2008, we proposed a multimedia document model defined as a vector of textual and visual terms weighted using a tf.idf approach [5]. For our second participation, our goal was to improve this previous model in the following ways: 1) use of additional information for the textual part (legend and image bounding text extracted from the original documents), 2) use of different image detectors and descriptors, 3) a new text/image combination approach. The results make it possible to evaluate the benefits of these different improvements.

HAL-UJM