474 research outputs found

Fusion of tf.idf Weighted Bag of Visual Features for Image Classification

Author: Barat Cécile
Ducottet Christophe
Moulin Christophe
Publication venue: HAL CCSD
Publication date: 23/06/2010

International audience. Image representation using the bag-of-visual-words approach is commonly used in image classification. Features are extracted from images and clustered into a visual vocabulary. Images can then be represented as normalized histograms of visual words, just as textual documents are represented as weighted vectors of terms. As a result, text categorization techniques are applicable to image classification. In this paper, our contribution is twofold. First, we propose a suitable term frequency-inverse document frequency (tf.idf) weighting scheme to characterize the importance of visual words. Second, we present a method to fuse different bags-of-words obtained with different vocabularies. We show that using our tf.idf normalization and the fusion leads to better classification rates than other normalization methods, other fusion schemes, or other approaches evaluated on the SIMPLIcity collection.

HAL-UJM

Simulation based estimation of branching models for LTR retrotransposons

Author: Chrétien Stéphane
Guyeux Christophe
Lerat Emmanuelle
Moulin Serge
Seux Nicolas
Publication date: 07/03/2016

Motivation: LTR retrotransposons are mobile elements that are able, like retroviruses, to copy and move inside eukaryotic genomes. In the present work, we propose a branching model for studying the propagation of LTR retrotransposons in these genomes. This model takes into account both the positions and the degradation of LTR retrotransposon copies. In our model, the duplication rate is also allowed to vary with the degradation level. Results: Various functions have been implemented to simulate their spread, and visualization tools are proposed. Based on these simulation tools, we show that the parameters of this propagation model can be estimated accurately. We applied this method to the study of the spread of the transposable elements ROO, GYPSY, and DM412 on a chromosome of Drosophila melanogaster. Availability: Our proposal has been implemented in Python. Source code is freely available on the web at https://github.com/SergeMOULIN/retrotransposons-spread. Comment: 7 pages, 3 figures, 7 tables. Submitted to "Bioinformatics" on March 1, 201

arXiv.org e-Print Archive

HAL-uB

HAL - Université de Franche-Comté

Crossref

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Combinaison d'information visuelle et textuelle pour la recherche d'information multimédia [Combining visual and textual information for multimedia information retrieval]

Author: Barat Cécile
Ducottet Christophe
Lemaitre Cédric
Moulin Christophe
Publication venue: HAL CCSD
Publication date: 08/09/2009

International audience. In this article we present a representation model for multimedia documents that combines textual information and visual descriptors. The text and the image that make up a document are each described by a vector of tf.idf weights, following a "bag-of-words" approach. The model supports multimedia queries for information retrieval. Our method is evaluated on the ImageCLEF'08 collection, for which we have the ground truth. Several experiments were carried out with different descriptors and several combinations of modalities. Analysis of the results shows that a multimedia document model improves on the performance of a retrieval system based on a single modality, whether textual or visual.

HAL-UJM

UJM at INEX 2009 XML Mining Track

Author: Géry Mathias
Largeron Christine
Moulin Christophe
Publication venue: Springer Science and Business Media LLC
Publication date: 07/12/2009

8 pages. International audience. This paper reports our experiments carried out for the INEX XML Mining track 2009, which consists of developing categorization methods for multi-labeled XML documents. We represent XML documents as vectors of indexed terms. The purpose of our experiments is twofold. First, we aim to compare strategies that reduce the index size using an improved feature selection criterion, CCD. Second, we compare a thresholding strategy that we proposed (MCut) with the common RCut and PCut strategies. The index size was reduced to such an extent that the results were worse than expected. However, we obtained good improvements with the MCut thresholding strategy.

HAL-UJM

Impact de l'information visuelle pour la Recherche d'Images par le contenu et le contexte [Impact of visual information on content- and context-based image retrieval]

Author: Géry Mathias
Largeron Christine
Moulin Christophe
Publication venue: Centre de Publication Universitaire
Publication date: 18/03/2010

15 pages. National audience. Multimedia documents composed of text and images are increasingly common thanks to the Internet and growing storage capacity. This article presents a representation model for multimedia documents that combines textual and visual information. Using a bag-of-words approach, a document composed of text and images can be described by vectors corresponding to each type of information. For a given multimedia query, a list of relevant documents is returned by linearly combining the results obtained separately on each modality. The aim of this article is to study how the weight assigned to visual information, relative to textual information, affects the results. Experiments on the ImageCLEF multimedia collection, extracted from the Wikipedia encyclopedia, show that the results can be improved after a first step in which this weight is learned.

HAL-UJM

UJM at INEX 2008 XML mining Track

Author: Géry Mathias
Largeron Christine
Moulin Christophe
Publication venue: Springer Berlin / Heidelberg
Publication date: 01/01/2008

International audience. This paper reports our experiments carried out for the INEX XML Mining track, which consists of developing categorization (or classification) and clustering methods for XML documents. We represent XML documents as vectors of index terms. For our first participation, the purpose of our experiments is twofold. First, our overall aim is to set up a text-only categorization approach that can be used as a baseline for further work that will take into account the structure of the XML documents. Second, our goal is to define two criteria based on term distribution for reducing the size of the index. The results of our baseline are good, and using our two criteria we improve these results while slightly reducing the index of terms. The results are slightly worse when we sharply reduce the index of terms.

HAL-UJM

Fisher Linear Discriminant Analysis for Text-Image Combination in Multimedia Information Retrieval

Author: Barat Cécile
Ducottet Christophe
Géry Mathias
Largeron Christine
Moulin Christophe
Publication venue: Elsevier BV
Publication date: 01/01/2014

International audience. In multimedia information retrieval, combining different modalities (text, image, audio, or video) provides additional information and generally improves overall system performance. For this purpose, the linear combination method is presented as simple, flexible, and effective. However, it requires choosing the weight assigned to each modality. This issue is still an open problem and is addressed in this paper. Our approach, based on Fisher Linear Discriminant Analysis, aims to learn these weights for multimedia documents composed of text and images. Text and images are both represented with the classical bag-of-words model. Our method was tested on the ImageCLEF 2008 and 2009 datasets. Results demonstrate that our combination approach not only outperforms the use of the textual modality alone but also provides nearly optimal learning of the weights with efficient computation. Moreover, the method allows combining more than two modalities without increasing the complexity and thus the computing time.

HAL-UJM

Crossref

UJM at ImageCLEFwiki 2008

Author: Barat Cécile
Ducottet Christophe
Géry Mathias
Largeron Christine
Moulin Christophe
Publication venue: HAL CCSD
Publication date: 15/09/2008

6 pages. This paper reports our multimedia information retrieval experiments carried out for the ImageCLEF track (ImageCLEFwiki). The task is to answer user information needs, i.e. queries that may be composed of several modalities (text, image, concept), with ranked lists of relevant documents. The purpose of our experiments is twofold. First, our overall aim is to develop a multimedia document model combining text and/or image modalities. Second, we aim to compare the results of our model using a multimedia query with those of a text-only model. Our multimedia document model is based on a vector of textual and visual terms. The textual terms correspond to words. The visual ones result from local colour descriptors that are automatically extracted and quantized by k-means, leading to an image vocabulary. They represent the colour property of an image region. To perform a query, we compute a similarity score between each document vector (textual + visual terms) and the query using the Okapi method based on the tf.idf approach. We submitted 6 runs, either automatic or manual, using textual information, visual information, or both. With these 6 runs, we aim to study several aspects of our model, such as the choice of the visual words and local features, the way of combining textual and visual words for a query, and the performance improvements obtained when adding visual information to a purely textual model. Concerning the choice of the visual words, results show that they are significant in some cases where the visualness of the query is meaningful. The conclusion about the combination of textual and visual words is surprising: we obtain worse results when we add the text directly to the visual words. Finally, results also show that visual information brings complementary relevant documents that were not found with the text query. These initial results are promising and encourage the development of our multimedia model.

HAL-UJM

Sorption of Aldrich Humic Acids onto Hematite: Insights into Fractionation Phenomena by Electrospray Ionization with Quadrupole Time-of-Flight Mass Spectrometry

Author: Amekraz Badia
Moulin Christophe
Reiller Pascal E.
Publication venue: American Chemical Society (ACS)
Publication date: 01/01/2006

International audience. Sorption-induced fractionation of purified Aldrich humic acid (PAHA) on hematite is studied through the modification of electrospray ionization (ESI) quadrupole time-of-flight (QToF) mass spectra of supernatants from retention experiments. The ESI mass spectra show an increase in the “mean molecular masses” of the molecules that constitute humic aggregates. The low molecular weight fraction (LMWF; m/z ≤ 600 Da) is preferentially sorbed compared to the two other fractions. The resolution provided by the ESI-QToF mass spectrometer in the low-mass range provided evidence of further fractionation induced by sorption within the LMWF. Of the two other fractions, the high molecular weight fraction (HMWF; m/z ≈ 1700 Da) seems to be more prone to sorption than the intermediate molecular weight fraction (IMWF; m/z ≈ 900 Da). The IMWF seems to be more hydrophilic, as it should be richer in O, N, and alkyl C judging from the proportion of even masses, and poorer in aromatic structures judging from mass defect analysis of the ESI mass spectra.

HAL-CEA