Search CORE

14,138 research outputs found

Sparse Transfer Learning for Interactive Video Search Reranking

Author: Barais Olivier
Bourcier Johann
Fouquet Francois
Gonzalez-Herrera Inti
Rudametkin Walter
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 17/06/2011
Field of study

Visual reranking is effective to improve the performance of the text-based video search. However, existing reranking algorithms can only achieve limited improvement because of the well-known semantic gap between low level visual features and high level semantic concepts. In this paper, we adopt interactive video search reranking to bridge the semantic gap by introducing user's labeling effort. We propose a novel dimension reduction tool, termed sparse transfer learning (STL), to effectively and efficiently encode user's labeling information. STL is particularly designed for interactive video search reranking. Technically, it a) considers the pair-wise discriminative information to maximally separate labeled query relevant samples from labeled query irrelevant ones, b) achieves a sparse representation for the subspace to encodes user's intention by applying the elastic net penalty, and c) propagates user's labeling information from labeled samples to unlabeled samples by using the data distribution knowledge. We conducted extensive experiments on the TRECVID 2005, 2006 and 2007 benchmark datasets and compared STL with popular dimension reduction algorithms. We report superior performance by using the proposed STL based interactive video search reranking.Comment: 17 page

HAL-CentraleSupelec

HAL - Lille 3

Hal - Université Grenoble Alpes

HAL Descartes

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Linnéuniversitetets forskningsdatabas

Wolverhampton Intellectual Repository and E-theses

Hal-Diderot

arXiv.org e-Print Archive

Lirias

INRIA a CCSD electronic archive server

OPUS - University of Technology Sydney

Digital Repository of Hellenic Managing Authority of the Operational Programme "Education and Lifelong Learning" (EDULLL)

Open Repository and Bibliography - Luxembourg

HAL-Rennes 1

Latent Fisher Discriminant Analysis

Author: Chen Gang
Publication venue
Publication date: 20/09/2013
Field of study

Linear Discriminant Analysis (LDA) is a well-known method for dimensionality reduction and classification. Previous studies have also extended the binary-class case into multi-classes. However, many applications, such as object detection and keyframe extraction cannot provide consistent instance-label pairs, while LDA requires labels on instance level for training. Thus it cannot be directly applied for semi-supervised classification problem. In this paper, we overcome this limitation and propose a latent variable Fisher discriminant analysis model. We relax the instance-level labeling into bag-level, is a kind of semi-supervised (video-level labels of event type are required for semantic frame extraction) and incorporates a data-driven prior over the latent variables. Hence, our method combines the latent variable inference and dimension reduction in an unified bayesian framework. We test our method on MUSK and Corel data sets and yield competitive results compared to the baseline approach. We also demonstrate its capacity on the challenging TRECVID MED11 dataset for semantic keyframe extraction and conduct a human-factors ranking-based experimental evaluation, which clearly demonstrates our proposed method consistently extracts more semantically meaningful keyframes than challenging baselines.Comment: 12 page

arXiv.org e-Print Archive

CiteSeerX

Bounded-Distortion Metric Learning

Author: Jia Jiaya
Liao Renjie
Ma Ziyang
Shi Jianping
Zhu Jun
Publication venue
Publication date: 10/05/2015
Field of study

Metric learning aims to embed one metric space into another to benefit tasks like classification and clustering. Although a greatly distorted metric space has a high degree of freedom to fit training data, it is prone to overfitting and numerical inaccuracy. This paper presents {\it bounded-distortion metric learning} (BDML), a new metric learning framework which amounts to finding an optimal Mahalanobis metric space with a bounded-distortion constraint. An efficient solver based on the multiplicative weights update method is proposed. Moreover, we generalize BDML to pseudo-metric learning and devise the semidefinite relaxation and a randomized algorithm to approximately solve it. We further provide theoretical analysis to show that distortion is a key ingredient for stability and generalization ability of our BDML algorithm. Extensive experiments on several benchmark datasets yield promising results

arXiv.org e-Print Archive

CiteSeerX

Regression on fixed-rank positive semidefinite matrices: a Riemannian approach

Author: Bonnabel Silvere
Meyer Gilles
Sepulchre Rodolphe
Publication venue
Publication date: 31/01/2011
Field of study

The paper addresses the problem of learning a regression model parameterized by a fixed-rank positive semidefinite matrix. The focus is on the nonlinear nature of the search space and on scalability to high-dimensional problems. The mathematical developments rely on the theory of gradient descent algorithms adapted to the Riemannian geometry that underlies the set of fixed-rank positive semidefinite matrices. In contrast with previous contributions in the literature, no restrictions are imposed on the range space of the learned matrix. The resulting algorithms maintain a linear complexity in the problem size and enjoy important invariance properties. We apply the proposed algorithms to the problem of learning a distance function parameterized by a positive semidefinite matrix. Good performance is observed on classical benchmarks

arXiv.org e-Print Archive

Open Repository and Bibliography - Liège

Co-Localization of Audio Sources in Images Using Binaural Features and Locally-Linear Regression

Author: Deleforge Antoine
Girin Laurent
Horaud Radu
Schechner Yoav
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2015
Field of study

This paper addresses the problem of localizing audio sources using binaural measurements. We propose a supervised formulation that simultaneously localizes multiple sources at different locations. The approach is intrinsically efficient because, contrary to prior work, it relies neither on source separation, nor on monaural segregation. The method starts with a training stage that establishes a locally-linear Gaussian regression model between the directional coordinates of all the sources and the auditory features extracted from binaural measurements. While fixed-length wide-spectrum sounds (white noise) are used for training to reliably estimate the model parameters, we show that the testing (localization) can be extended to variable-length sparse-spectrum sounds (such as speech), thus enabling a wide range of realistic applications. Indeed, we demonstrate that the method can be used for audio-visual fusion, namely to map speech signals onto images and hence to spatially align the audio and visual modalities, thus enabling to discriminate between speaking and non-speaking faces. We release a novel corpus of real-room recordings that allow quantitative evaluation of the co-localization method in the presence of one or two sound sources. Experiments demonstrate increased accuracy and speed relative to several state-of-the-art methods.Comment: 15 pages, 8 figure

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Hal-Diderot