181 research outputs found

    PAC-Bayesian Majority Vote for Late Classifier Fusion

    Full text link
    A lot of attention has been devoted to multimedia indexing over the past few years. The literature usually considers two kinds of fusion schemes: early fusion and late fusion. In this paper we focus on late classifier fusion, where one combines the scores of each modality at the decision level. To tackle this problem, we investigate a recent, elegant, and well-founded quadratic program named MinCq, which comes from the PAC-Bayes theory of machine learning. MinCq looks for the weighted combination, over a set of real-valued functions seen as voters, that leads to the lowest misclassification rate while making use of the voters' diversity. We provide evidence that this method is naturally suited to the late fusion procedure. We propose an extension of MinCq that adds an order-preserving pairwise loss for ranking, which helps to improve the Mean Average Precision measure. We confirm the good behavior of the MinCq-based fusion approaches with experiments on a real image benchmark. Comment: 7 pages, research report
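
    As a rough illustration of late fusion by weighted vote, the sketch below combines per-modality scores into a single decision. It is not MinCq: the quadratic program that learns the weights is not shown, and the score values, weight values, and function names are assumptions made for this example.

```python
from typing import Sequence


def weighted_vote(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted sum of per-modality scores (each assumed in [-1, 1])."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores))


def fuse_and_classify(scores: Sequence[float], weights: Sequence[float]) -> int:
    """Predict +1 or -1 from the sign of the weighted vote (ties go to +1)."""
    return 1 if weighted_vote(scores, weights) >= 0 else -1


# Hypothetical scores from three modality-specific classifiers for one image.
modality_scores = [0.8, -0.3, 0.1]
# Hypothetical weights; a method such as MinCq would learn these from data.
voter_weights = [0.5, 0.3, 0.2]

fused = weighted_vote(modality_scores, voter_weights)
label = fuse_and_classify(modality_scores, voter_weights)
```

    Keeping the real-valued fused score, rather than only its sign, is what makes the same combination usable for ranking.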

    LIG and LIRIS at TRECVID 2008: High Level Feature Extraction and Collaborative Annotation

    Get PDF
    This paper describes the participation of LIG and LIRIS in the TRECVID 2008 High Level Features detection task. We evaluated several fusion strategies, especially rank fusion. Results show that including as many low-level and intermediate features as possible is the best strategy, that SIFT features are very important, that the way in which the various low-level and intermediate features are fused matters, and that the type of mean (arithmetic, geometric, or harmonic) matters. The best LIG and LIRIS runs have a Mean Inferred Average Precision of 0.0833 and 0.0598 respectively, both above the median performance for the TRECVID 2008 HLF detection task. LIG and LIRIS also co-organized the TRECVID 2008 collaborative annotation, in which 40 teams produced 1,235,428 annotations. The whole development collection (100%) was annotated at least once, 37.6% at least twice, 3.99% at least three times, and 0.06% at least four times. Thanks to the active learning and active cleaning approach used, the annotations that were done multiple times were those for which the risk of error was highest
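
    The three types of mean named in the abstract can be compared on a toy example. This is a minimal sketch, assuming fusion is applied to positive per-feature scores for a single shot; the abstract does not say whether the means were taken over scores or ranks, and the `fuse` helper and the values are hypothetical.

```python
from statistics import geometric_mean, harmonic_mean, mean

# Map each mean type named in the abstract to its standard-library function.
MEANS = {
    "arithmetic": mean,
    "geometric": geometric_mean,
    "harmonic": harmonic_mean,
}


def fuse(scores, kind="arithmetic"):
    """Fuse positive per-feature scores for one shot with the chosen mean."""
    if kind not in MEANS:
        raise ValueError(f"unknown mean type: {kind}")
    return MEANS[kind](scores)


# Hypothetical scores given to the same shot by two feature-specific classifiers.
scores = [0.2, 0.8]
fused = {kind: fuse(scores, kind) for kind in MEANS}
```

    For positive inputs that are not all equal, the harmonic mean is the smallest and the arithmetic mean the largest, so the choice changes how strongly one low score penalizes a shot.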

    Adaptation de domaine parcimonieuse par pondération de bonnes fonctions de similarité (Sparse domain adaptation by reweighting good similarity functions)

    No full text
    16 pages. Domain adaptation is an important problem in which the source training data and the target test data are assumed to follow two different distributions. We consider the difficult setting in which no information about the target labels is available. From a theoretical standpoint, Ben-David et al. showed that a classifier has better generalization guarantees when the marginal distributions of the source and target data are close. We present an approach based on a framework of Balcan et al. that allows linear classifiers to be learned from similarity functions that need be neither symmetric nor positive semi-definite. We exploit this property to reweight the similarity function in order to iteratively build a projection space in which the two marginal distributions are close. Our approach, formulated as a 1-norm linear program, infers very sparse models that show good adaptation performance. We evaluate it experimentally on synthetic data and on real image annotation corpora

    Parsimonious Unsupervised and Semi-Supervised Domain Adaptation with Good Similarity Functions

    No full text
    In this paper, we address the problem of domain adaptation for binary classification. This problem arises when the distributions generating the source learning data and the target test data are somewhat different. From a theoretical standpoint, a classifier has better generalization guarantees when the marginal distributions of the two domains over the input space are close. Classical approaches mainly try to build new projection spaces or to reweight the source data, with the objective of bringing the two distributions closer. We study an original direction based on a recent framework introduced by Balcan et al., which enables one to learn linear classifiers in an explicit projection space based on a similarity function that is not necessarily symmetric or positive semi-definite. We propose a well-founded general method for learning a low-error classifier on target data, made effective by an iterative procedure compatible with Balcan et al.'s framework. A reweighting scheme of the similarity function is then introduced in order to bring the distributions closer in a new projection space. The hyperparameters and the reweighting quality are controlled by a reverse validation procedure. Our approach is based on a linear programming formulation and shows good adaptation performance with very sparse models. We first consider the challenging unsupervised case where no target label is accessible, which can be helpful when no manual annotation is possible. We also propose a generalization to the semi-supervised case, which allows us to use a few target labels when they are available. Finally, we evaluate our method on a synthetic problem and on a real image annotation task
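
    The reverse validation step mentioned above can be sketched as follows, assuming the common formulation in which a classifier trained on source data pseudo-labels the target data, a second classifier is trained on those pseudo-labels, and that second classifier is scored on the labeled source data. The 1-D nearest-centroid learner, the data, and all function names are stand-ins chosen for brevity, not the paper's method.

```python
def train_centroids(xs, ys):
    """Fit a 1-D nearest-centroid classifier; return (centroid_neg, centroid_pos)."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == -1]
    if not pos or not neg:
        raise ValueError("both classes are needed to fit the centroids")
    return sum(neg) / len(neg), sum(pos) / len(pos)


def predict(model, x):
    """Return +1 if x is at least as close to the positive centroid, else -1."""
    c_neg, c_pos = model
    return 1 if abs(x - c_pos) <= abs(x - c_neg) else -1


def reverse_validation_error(src_x, src_y, tgt_x):
    """Error on the source data of a classifier trained on pseudo-labeled target data."""
    forward = train_centroids(src_x, src_y)             # 1. learn on the source
    pseudo = [predict(forward, x) for x in tgt_x]       # 2. pseudo-label the target
    reverse = train_centroids(tgt_x, pseudo)            # 3. learn on the pseudo-labels
    mistakes = sum(predict(reverse, x) != y for x, y in zip(src_x, src_y))
    return mistakes / len(src_x)                        # 4. score on the source labels


# Hypothetical data: the target is the source shifted by +1.
source_x = [0.0, 1.0, 4.0, 5.0]
source_y = [-1, -1, 1, 1]
target_x = [1.0, 2.0, 5.0, 6.0]

error = reverse_validation_error(source_x, source_y, target_x)
```

    Because the score only needs source labels, it can be computed for each hyperparameter setting even when no target label exists.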

    Étude de la généralisation de DASF à l'adaptation de domaine semi-supervisée (Study of the generalization of DASF to semi-supervised domain adaptation)

    No full text
    Adapting a model from a source distribution to a different target distribution is an important problem in machine learning. In the domain adaptation setting, Ben-David et al. showed that a classifier has better generalization guarantees when the distance between the two marginal distributions over the input space is small. In the unsupervised case, where the training data come only from the source distribution, Morvant et al. designed an algorithm, called DASF, that aims to reduce this distance by iteratively building a projection space defined explicitly by a good similarity function (in the sense of Balcan et al.). In this article we generalize DASF to the semi-supervised case, in which a few target training examples are considered. Our method relies on the theoretical framework of Ben-David et al., which proposes minimizing a convex combination of the source and target empirical errors. We study the sparsity and generalization ability of the models inferred by our method, then confirm this analysis on a toy example and a real annotation task
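
    The convex combination of source and target empirical errors can be written as alpha * err_target + (1 - alpha) * err_source with alpha in [0, 1]. The sketch below only evaluates that objective for fixed predictions; it is an illustration under the assumption of 0/1 errors, and the function names, labels, and choice of alpha are hypothetical.

```python
def empirical_error(predictions, labels):
    """Fraction of examples whose predicted label differs from the true label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    return sum(p != y for p, y in zip(predictions, labels)) / len(labels)


def combined_error(src_pred, src_true, tgt_pred, tgt_true, alpha):
    """Convex combination of the target and source empirical errors."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    err_source = empirical_error(src_pred, src_true)
    err_target = empirical_error(tgt_pred, tgt_true)
    return alpha * err_target + (1.0 - alpha) * err_source


# Hypothetical predictions: 1 mistake out of 4 on the source, 1 out of 2 on the target.
source_error_inputs = ([1, 1, -1, -1], [1, -1, -1, -1])
target_error_inputs = ([1, -1], [1, 1])

objective = combined_error(*source_error_inputs, *target_error_inputs, alpha=0.5)
```

    Setting alpha to 0 ignores the few target labels entirely, while alpha equal to 1 trains on them alone.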

    The LIG Multi-Criteria System for Video Retrieval

    No full text
    The LIG search system uses a user-controlled combination of six criteria: keywords, phonetic string, similarity to example images, semantic categories, similarity to already identified positive images, and temporal closeness to already identified positive images
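
    One simple way to picture a user-controlled combination of the six criteria is a weighted sum of per-criterion scores. The abstract does not state how the LIG system actually combines them, so the linear form, the identifiers, and the score values below are all assumptions for illustration.

```python
# The six criteria listed in the abstract, as hypothetical identifiers.
CRITERIA = (
    "keywords",
    "phonetic_string",
    "example_image_similarity",
    "semantic_categories",
    "positive_image_similarity",
    "temporal_closeness",
)


def combined_score(scores, weights):
    """Weighted sum over the six criteria; missing entries count as 0."""
    return sum(weights.get(c, 0.0) * scores.get(c, 0.0) for c in CRITERIA)


def rank_shots(shots, weights):
    """Return shot names sorted from highest to lowest combined score."""
    return sorted(shots, key=lambda name: combined_score(shots[name], weights),
                  reverse=True)


# Hypothetical per-criterion scores for two video shots.
shots = {
    "shot_a": {"keywords": 0.9, "temporal_closeness": 0.1},
    "shot_b": {"keywords": 0.2, "temporal_closeness": 0.9},
}

by_keywords = rank_shots(shots, {"keywords": 1.0})
by_time = rank_shots(shots, {"temporal_closeness": 1.0})
```

    Changing the weights reorders the results without recomputing any per-criterion score, which is what makes user control cheap at query time.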

    Context-based conceptual image indexing

    No full text
    Automatic semantic classification of image databases is very useful for users who search and browse, but it is also a very challenging research problem. Image classification based on local features is one of the key issues in bridging the semantic gap in order to detect concepts. This paper proposes a framework for incorporating contextual information into the concept detection process. The proposed method combines local and global classifiers with stacking, using SVM. We studied the impact of topologic and semantic contexts on concept detection performance and proposed solutions to handle the large number of dimensions involved in the classified data. We conducted experiments on a TRECVID'04 subset with 48,104 images and 5 concepts. We found that the use of context yields a significant improvement for both the topologic and semantic contexts

    Classifier Fusion for SVM-Based Multimedia Semantic Indexing

    Get PDF
    Concept indexing in multimedia libraries is very useful for users who search and browse, but it is also a very challenging research problem. Combining several modalities, features, or concepts is one of the key issues in bridging the gap between signal and semantics. In this paper, we present three fusion schemes inspired by the classical early and late fusion schemes. First, we present a kernel-based fusion scheme that takes advantage of the kernel basis of classifiers such as SVMs. Second, we integrate a new normalization process into the early fusion scheme. Third, we present a contextual late fusion scheme that merges the classification scores of several concepts. We conducted experiments in the framework of the official TRECVID'06 evaluation campaign and obtained significant improvements with the proposed fusion schemes relative to the usual fusion schemes
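
    Fusion at the kernel level can be illustrated by combining one kernel per feature type into a single kernel value before any classifier is trained. The sketch below uses a weighted sum of RBF kernels, which is one standard way to build a valid kernel from several; the abstract does not specify the combination used in the paper, so the formula, feature names, and parameters are assumptions.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian RBF kernel between two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def fused_kernel(sample_a, sample_b, gammas, weights):
    """Weighted sum of per-feature RBF kernels (non-negative weights assumed)."""
    return sum(
        weights[name] * rbf_kernel(sample_a[name], sample_b[name], gammas[name])
        for name in weights
    )


# Hypothetical samples described by two feature types of different dimension.
image_1 = {"color": [0.1, 0.4, 0.5], "texture": [1.0, 0.0]}
image_2 = {"color": [0.2, 0.4, 0.4], "texture": [0.0, 1.0]}
gammas = {"color": 1.0, "texture": 0.5}
weights = {"color": 0.5, "texture": 0.5}

k_self = fused_kernel(image_1, image_1, gammas, weights)
k_cross = fused_kernel(image_1, image_2, gammas, weights)
```

    Each feature type keeps its own bandwidth here, so features with different scales are not forced into one concatenated vector as in early fusion.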

    Sparse Domain Adaptation in a Good Similarity-Based Projection Space

    No full text
    We address domain adaptation (DA) for binary classification in the challenging case where no target label is available. We propose an original approach set in a recent framework of Balcan et al. that allows linear classifiers to be learned in an explicit projection space based on good similarity functions, which need not be symmetric or positive semi-definite (PSD). Following the DA framework of Ben-David et al., our method looks for a relevant projection space where the source and target distributions tend to be close. This objective is achieved by using an additional regularizer motivated by the notion of algorithmic robustness proposed by Xu and Mannor. Our approach is formulated as a linear program with a 1-norm regularization that leads to sparse models. We provide a theoretical analysis of this sparsity and a generalization bound. From a practical standpoint, to improve the efficiency of the method, we propose an iterative version based on a reweighting scheme of the similarities that brings the distributions closer in a new projection space. Hyperparameters and reweighting quality are controlled by a reverse validation process. The evaluation of our approach on a synthetic problem and on real image annotation tasks shows good adaptation performance
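
    The explicit projection space described above maps each example to its vector of similarities to a set of reference points, and the classifier is linear in that space. The following sketch shows the mapping and the prediction only; the linear program that learns the coefficients is not shown, and the similarity function, reference points, and coefficients are hypothetical.

```python
def similarity(x, landmark):
    """Hypothetical similarity in [-1, 1]; the framework does not require it
    to be symmetric or PSD, although this simple one happens to be symmetric."""
    distance = sum(abs(a - b) for a, b in zip(x, landmark))
    return 1.0 - 2.0 * min(distance, 1.0)


def project(x, landmarks):
    """Map x to its vector of similarities to each reference point."""
    return [similarity(x, r) for r in landmarks]


def predict(x, landmarks, alphas):
    """Linear classifier in the projection space; ties go to +1."""
    score = sum(a * s for a, s in zip(alphas, project(x, landmarks)))
    return 1 if score >= 0 else -1


# Hypothetical reference points and coefficients; a zero coefficient means the
# corresponding reference point is unused, which is what sparsity refers to here.
landmarks = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
alphas = [1.0, -1.0, 0.0]

label_near_first = predict([0.1, 0.0], landmarks, alphas)
label_near_second = predict([0.9, 1.0], landmarks, alphas)
```

    With a 1-norm penalty on the coefficients, many of them are driven to exactly zero, so only a few reference points need to be evaluated at prediction time.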

    Image retrieval: a first step for a human centered approach

    No full text
    Image indexing using content analysis is known to be a difficult task that draws on the vision research domain. Using these tools in the context of a retrieval system is generally frustrating for users, because of a lack of interface development and because users find it difficult to understand the low-level features managed by the system. In this paper we propose a general point of view for introducing a link between such systems and potential users. It includes image features based on visual perception models, a relevance feedback model, and a graphical interface for expressing the information need through user-system interaction