9 research outputs found

    Hierarchical image segmentation relying on a likelihood ratio test

    Hierarchical image segmentation provides a set of image segmentations at different detail levels, in which coarser detail levels can be produced by simple merges of regions from segmentations at finer detail levels. However, many image segmentation algorithms relying on similarity measures lead to no hierarchy. One interesting similarity measure is a likelihood ratio, in which each region is modelled by a Gaussian distribution to approximate the cue distributions. In this work, we propose a hierarchical graph-based image segmentation inspired by this likelihood ratio test. Furthermore, we study how the inclusion of the hierarchical property has influenced the computation of quality measures in the original method. Quantitative and qualitative assessments of the method on three well-known image databases show its efficiency.
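
    The likelihood-ratio idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's method: it assumes one-dimensional cues (for example grey levels), fits each region with a maximum-likelihood Gaussian, and compares "two separate regions" against "one merged region". The function names are hypothetical.

```python
import math

def gaussian_log_likelihood(values):
    """Maximised log-likelihood of 1-D samples under a Gaussian fitted by ML.

    Assumption: a scalar cue per pixel; the paper's cue model may differ.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    var = max(var, 1e-12)  # guard against log(0) for constant regions
    # With ML estimates the quadratic term sums to n / 2.
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)

def merge_log_likelihood_ratio(region_a, region_b):
    """Log-likelihood ratio of 'two Gaussians' against 'one Gaussian'.

    Values near zero mean a single Gaussian explains both regions about
    as well as two, which suggests the regions are similar.
    """
    separate = gaussian_log_likelihood(region_a) + gaussian_log_likelihood(region_b)
    merged = gaussian_log_likelihood(list(region_a) + list(region_b))
    return separate - merged
```

    Under these assumptions, two regions with the same mean and spread give a ratio close to zero, while regions with clearly different means give a large positive value.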

    Tag Propagation Approaches within Speaking Face Graphs for Multimodal Person Discovery

    The indexing of broadcast TV archives is a current problem in multimedia research. As the size of these databases grows continuously, meaningful features are needed to describe and connect their elements efficiently, such as the identification of speaking faces. In this context, this paper focuses on two approaches for unsupervised person discovery. Initial tagging of speaking faces is provided by an OCR-based method, and these tags propagate through a graph model based on audiovisual relations between speaking faces. Two propagation methods are proposed, one based on random walks and the other based on a hierarchical approach. To better evaluate their performance, these methods were compared with two graph clustering baselines. We also study the impact of different modality fusions on the graph-based tag propagation scenario. From a quantitative analysis, we observed that the graph propagation techniques always outperform the baselines. Among all compared strategies, the methods based on hierarchical propagation with late fusion and random walk with score fusion obtained the highest MAP values. Finally, even though these two methods produce highly equivalent results according to the Kappa coefficient, the random walk method performs better according to a paired t-test, and the computing time for the hierarchical propagation is more than 4 times lower than that of the random walk propagation.
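
    As a rough illustration of tag propagation on a graph of speaking faces, the sketch below uses a generic random-walk style label propagation with clamped seed nodes. It is an assumption-laden stand-in, not the authors' algorithm: the weight matrix, the seed dictionary, and the function name are all hypothetical.

```python
def propagate_tags(weights, seeds, n_iter=50):
    """Spread seed tags over a weighted graph by repeated averaging.

    weights: symmetric n x n list of lists of non-negative similarities
             (hypothetical audiovisual similarities between speaking faces).
    seeds:   dict mapping node index to an initial tag (e.g. from OCR).
    Returns one tag per node, or None if no tag reached the node.
    """
    n = len(weights)
    tags = sorted(set(seeds.values()))
    # scores[i][t] is the current amount of tag t at node i.
    scores = [{t: 1.0 if seeds.get(i) == t else 0.0 for t in tags}
              for i in range(n)]
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            total = sum(weights[i])
            if i in seeds or total == 0:
                updated.append(scores[i])  # seeds stay clamped
                continue
            # One random-walk step: weighted average of neighbour scores.
            updated.append({
                t: sum(weights[i][j] * scores[j][t] for j in range(n)) / total
                for t in tags
            })
        scores = updated
    return [max(tags, key=lambda t: scores[i][t]) if any(scores[i].values())
            else None
            for i in range(n)]
```

    In this sketch, an unlabeled node ends up with the tag of the seed it is most strongly connected to, directly or through other nodes.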

    Hierarchizing graph-based image segmentation algorithms relying on region dissimilarity: the case of the Felzenszwalb-Huttenlocher method

    This article is a first attempt towards a general theory for hierarchizing non-hierarchical image segmentation methods that depend on a region-dissimilarity parameter controlling the desired level of simplification: each level of the hierarchy is “as close as possible” to the result that one would obtain with the non-hierarchical method using the corresponding scale as simplification parameter. The introduction of this hierarchization problem in the form of an optimization problem, as well as the proposed tools to tackle it, is an important contribution of the present article. Indeed, with the hierarchized version of a segmentation method, the user can simply select a level in the hierarchy, thereby controlling the desired number of regions, or can leverage any of the tools introduced in hierarchical analysis. The main example investigated in this study is the criterion proposed by Felzenszwalb and Huttenlocher, for which we show that the results of the hierarchized version of the segmentation method are better than those of the original one, with the added property that it satisfies the strong causality and location principles from scale-sets image analysis. An interesting perspective of this work, considering the current trend in computer vision, is to use learning techniques on a specific application and train a criterion to choose the correct region.
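
    For readers unfamiliar with the criterion named in this abstract, here is a minimal sketch of the original, non-hierarchical Felzenszwalb-Huttenlocher merging rule on a generic weighted graph. It does not implement the hierarchization proposed in the article; the function name and the edge-list input format are assumptions made for illustration.

```python
def felzenszwalb_huttenlocher(num_nodes, edges, k):
    """Segment a graph with the Felzenszwalb-Huttenlocher merge criterion.

    edges: iterable of (weight, u, v) tuples (assumed input format).
    k:     scale parameter; larger values favour larger regions.
    Returns a list giving a region representative for each node.
    """
    parent = list(range(num_nodes))
    size = [1] * num_nodes
    internal = [0.0] * num_nodes  # largest edge weight merged into a region

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for w, u, v in sorted(edges):
        a, b = find(u), find(v)
        if a == b:
            continue
        # Merge when the edge is no heavier than both regions'
        # internal difference plus the size-dependent tolerance k / |C|.
        if w <= min(internal[a] + k / size[a], internal[b] + k / size[b]):
            parent[b] = a
            size[a] += size[b]
            internal[a] = w  # edges are visited in increasing order
    return [find(i) for i in range(num_nodes)]
```

    Running the sketch with increasing values of `k` yields coarser segmentations, but these are not guaranteed to be nested, which is the lack of hierarchy that the article addresses.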

    Hierarchical Multi-Label Propagation using Speaking Face Graphs for Multimodal Person Discovery

    TV archives are growing in size so fast that manual indexing becomes unfeasible. Automatic indexing techniques can be applied to overcome this issue, and this work proposes an unsupervised technique for multimodal person discovery. To achieve this goal, we propose a hierarchical label propagation technique based on quasi-flat zones theory that learns from labeled and unlabeled data and propagates names through a multimodal graph representation. In this representation, we combine audio, video, and text processing techniques to model the data as a graph of speaking faces. In the proposed modeling, we extract names via optical character recognition and propagate them through the graph using audiovisual relationships between speaking faces. We also use a random walk label propagation and two graph clustering strategies to serve as baselines. The proposed label propagation techniques always outperform the clustering baselines on the quantitative assessments. Our approach also outperforms all literature methods tested on the same dataset except for one, which uses a different preprocessing step. The proposed hierarchical label propagation and the random walk baseline produce highly equivalent results according to the Kappa coefficient, but the hierarchical propagation is parameter-free and over 9 times faster than the random walk under the same configurations.

    Towards large scale multimedia indexing: a case study on person discovery in broadcast news

    The rapid growth of multimedia databases and human interest in their peers make indices representing the location and identity of people in audio-visual documents essential for searching archives. Person discovery in the absence of prior identity knowledge requires accurate association of audio-visual cues and detected names. To this end, we present three different strategies to approach this problem: clustering-based naming, verification-based naming, and graph-based naming. Each of these strategies utilizes different recent advances in unsupervised face/speech representation, verification, and optimization. To give a better understanding of the approaches, this paper also provides a quantitative and qualitative comparative study of them using the associated corpus of the Person Discovery challenge at MediaEval 2016. From the results of our experiments, we can observe the pros and cons of each approach, thus paving the way for promising future research directions.