7,742 research outputs found
Learning to detect video events from zero or very few video examples
In this work we deal with the problem of high-level event detection in video.
Specifically, we study the challenging problems of i) learning to detect video
events from solely a textual description of the event, without using any
positive video examples, and ii) additionally exploiting very few positive
training samples together with a small number of ``related'' videos. For
learning only from an event's textual description, we first identify a general
learning framework and then study the impact of different design choices for
various stages of this framework. For additionally learning from example
videos, when true positive training samples are scarce, we employ an extension
of the Support Vector Machine that allows us to exploit ``related'' event
videos by automatically introducing different weights for subsets of the videos
in the overall training set. Experimental evaluations performed on the
large-scale TRECVID MED 2014 video dataset provide insight on the effectiveness
of the proposed methods.Comment: Image and Vision Computing Journal, Elsevier, 2015, accepted for
publicatio
PAC-Bayesian Majority Vote for Late Classifier Fusion
A lot of attention has been devoted to multimedia indexing over the past few
years. In the literature, we often consider two kinds of fusion schemes: The
early fusion and the late fusion. In this paper we focus on late classifier
fusion, where one combines the scores of each modality at the decision level.
To tackle this problem, we investigate a recent and elegant well-founded
quadratic program named MinCq coming from the Machine Learning PAC-Bayes
theory. MinCq looks for the weighted combination, over a set of real-valued
functions seen as voters, leading to the lowest misclassification rate, while
making use of the voters' diversity. We provide evidence that this method is
naturally adapted to late fusion procedure. We propose an extension of MinCq by
adding an order- preserving pairwise loss for ranking, helping to improve Mean
Averaged Precision measure. We confirm the good behavior of the MinCq-based
fusion approaches with experiments on a real image benchmark.Comment: 7 pages, Research repor
Divide and Fuse: A Re-ranking Approach for Person Re-identification
As re-ranking is a necessary procedure to boost person re-identification
(re-ID) performance on large-scale datasets, the diversity of feature becomes
crucial to person reID for its importance both on designing pedestrian
descriptions and re-ranking based on feature fusion. However, in many
circumstances, only one type of pedestrian feature is available. In this paper,
we propose a "Divide and use" re-ranking framework for person re-ID. It
exploits the diversity from different parts of a high-dimensional feature
vector for fusion-based re-ranking, while no other features are accessible.
Specifically, given an image, the extracted feature is divided into
sub-features. Then the contextual information of each sub-feature is
iteratively encoded into a new feature. Finally, the new features from the same
image are fused into one vector for re-ranking. Experimental results on two
person re-ID benchmarks demonstrate the effectiveness of the proposed
framework. Especially, our method outperforms the state-of-the-art on the
Market-1501 dataset.Comment: Accepted by BMVC201
Autoencoding the Retrieval Relevance of Medical Images
Content-based image retrieval (CBIR) of medical images is a crucial task that
can contribute to a more reliable diagnosis if applied to big data. Recent
advances in feature extraction and classification have enormously improved CBIR
results for digital images. However, considering the increasing accessibility
of big data in medical imaging, we are still in need of reducing both memory
requirements and computational expenses of image retrieval systems. This work
proposes to exclude the features of image blocks that exhibit a low encoding
error when learned by a autoencoder (). We examine the
histogram of autoendcoding errors of image blocks for each image class to
facilitate the decision which image regions, or roughly what percentage of an
image perhaps, shall be declared relevant for the retrieval task. This leads to
reduction of feature dimensionality and speeds up the retrieval process. To
validate the proposed scheme, we employ local binary patterns (LBP) and support
vector machines (SVM) which are both well-established approaches in CBIR
research community. As well, we use IRMA dataset with 14,410 x-ray images as
test data. The results show that the dimensionality of annotated feature
vectors can be reduced by up to 50% resulting in speedups greater than 27% at
expense of less than 1% decrease in the accuracy of retrieval when validating
the precision and recall of the top 20 hits.Comment: To appear in proceedings of The 5th International Conference on Image
Processing Theory, Tools and Applications (IPTA'15), Nov 10-13, 2015,
Orleans, Franc
- …