A Multiple Component Matching Framework for Person Re-Identification
Person re-identification consists in recognizing an individual that has
already been observed over a network of cameras. It is a novel and challenging
research topic in computer vision, for which no reference framework exists yet.
Despite this, previous works share similar representations of human body based
on part decomposition and the implicit concept of multiple instances. Building
on these similarities, we propose a Multiple Component Matching (MCM) framework
for the person re-identification problem, which is inspired by Multiple
Component Learning, a framework recently proposed for object detection. We show
that previous techniques for person re-identification can be considered
particular implementations of our MCM framework. We then present a novel person
re-identification technique as a direct, simple implementation of our
framework, focused in particular on robustness to varying lighting conditions,
and show that it can attain state-of-the-art performance.
Comment: Accepted paper, 16th Int. Conf. on Image Analysis and Processing
(ICIAP 2011), Ravenna, Italy, 14/09/201
Multi-Level Visual Alphabets
A central debate in visual perception theory is the argument for indirect versus direct perception; i.e., the use of intermediate, abstract, and hierarchical representations versus direct semantic interpretation of images through interaction with the outside world. We present a content-based representation that combines both approaches. The previously developed Visual Alphabet method is extended with a hierarchy of representations, each level feeding into the next one, but based on features that are not abstract but directly relevant to the task at hand. Explorative benchmark experiments are carried out on face images to investigate and explain the impact of key parameters such as pattern size, number of prototypes, and the distance measures used. Results show that adding a middle layer improves results by encoding the spatial co-occurrence of lower-level pattern prototypes.
Object Tracking with Multiple Instance Learning and Gaussian Mixture Model
Recently, the Multiple Instance Learning (MIL) technique has been introduced for object tracking applications, where it has shown good performance in handling the drifting problem. Because some instances in positive bags contain not only the object but also background, it is not reliable to simply assume that each feature of the instances in positive bags obeys a single Gaussian distribution. In this paper, a tracker based on online multiple instance boosting is developed, which employs a Gaussian Mixture Model (GMM) and a single Gaussian distribution, respectively, to model the features of instances in positive and negative bags. The differences between samples and the model are integrated into the process of updating the GMM parameters. With Haar-like features extracted from the bags, a set of weak classifiers is trained to construct a strong classifier, which is used to locate the object in a new frame. The classifier can be updated online, frame by frame. Experimental results show that our tracker is more stable and efficient when dealing with illumination, rotation, pose, and appearance changes.
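The modelling choice described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single scalar feature, scores it by a log-likelihood ratio between a Gaussian mixture (positive bags) and a single Gaussian (negative bags), and leaves out boosting and the online parameter update. All function names and parameter values are hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def gmm_pdf(x, weights, means, stds):
    """Density of a 1-D Gaussian mixture at x (weights are assumed to sum to 1)."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds))

def weak_classifier(x, pos_gmm, neg_gaussian, eps=1e-12):
    """Log-likelihood ratio of one feature value.

    Positive output favours the object model, negative output the background.
    eps guards against log(0) when a density underflows.
    """
    p_pos = gmm_pdf(x, *pos_gmm)
    p_neg = gaussian_pdf(x, *neg_gaussian)
    return math.log(p_pos + eps) - math.log(p_neg + eps)

# Hypothetical parameters: the positive mixture has an "object" component near 2.0
# and a smaller component near -1.0 for background that leaks into positive bags.
pos_gmm = ([0.7, 0.3], [2.0, -1.0], [0.5, 0.5])  # (weights, means, stds)
neg_gaussian = (-1.0, 0.5)                        # (mean, std)

object_like = weak_classifier(2.0, pos_gmm, neg_gaussian)
background_like = weak_classifier(-1.0, pos_gmm, neg_gaussian)
```

With these made-up parameters, a feature value near the object component gets a positive score and a value near the background mean gets a negative one. The second mixture component absorbs background instances inside positive bags, which a single Gaussian could not model.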
A Convex Relaxation for Weakly Supervised Classifiers
This paper introduces a general multi-class approach to weakly supervised
classification. Inferring the labels and learning the parameters of the model
is usually done jointly through a block-coordinate descent algorithm such as
expectation-maximization (EM), which may lead to local minima. To avoid this
problem, we propose a cost function based on a convex relaxation of the
soft-max loss. We then propose an algorithm specifically designed to
efficiently solve the corresponding semidefinite program (SDP). Empirically,
our method compares favorably to standard ones on different datasets for
multiple instance learning and semi-supervised learning as well as on
clustering tasks.
Comment: Appears in Proceedings of the 29th International Conference on
Machine Learning (ICML 2012
Multimodal Visual Concept Learning with Weakly Supervised Techniques
Despite the availability of a huge amount of video data accompanied by
descriptive texts, it is not always easy to exploit the information contained
in natural language in order to automatically recognize video concepts. Towards
this goal, in this paper we use textual cues as a means of supervision,
introducing two weakly supervised techniques that extend the Multiple Instance
Learning (MIL) framework: the Fuzzy Sets Multiple Instance Learning (FSMIL) and
the Probabilistic Labels Multiple Instance Learning (PLMIL). The former encodes
the spatio-temporal imprecision of the linguistic descriptions with Fuzzy Sets,
while the latter models different interpretations of each description's
semantics with Probabilistic Labels, both formulated through a convex
optimization algorithm. In addition, we provide a novel technique to extract
weak labels in the presence of complex semantics, which consists of semantic
similarity computations. We evaluate our methods on two distinct problems,
namely face and action recognition, in the challenging and realistic setting of
movies accompanied by their screenplays, contained in the COGNIMUSE database.
We show that, on both tasks, our method considerably outperforms a
state-of-the-art weakly supervised approach, as well as other baselines.
Comment: CVPR 201
On Classification with Bags, Groups and Sets
Many classification problems can be difficult to formulate directly in terms
of the traditional supervised setting, where both training and test samples are
individual feature vectors. There are cases in which samples are better
described by sets of feature vectors, in which labels are only available for
sets rather than individual samples, or in which individual labels, if
available, are not independent. To better deal with such problems, several
extensions of supervised learning have been proposed, where the training
objects, the test objects, or both are sets of feature vectors. However, having been proposed
rather independently of each other, their mutual similarities and differences
have hitherto not been mapped out. In this work, we provide an overview of such
learning scenarios, propose a taxonomy to illustrate the relationships between
them, and discuss directions for further research in these areas.
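One of the settings this abstract refers to, where labels exist only for sets of feature vectors, can be sketched as follows. The sketch assumes the standard multiple instance learning rule (a bag is positive if at least one of its instances is positive), which is only one of the scenarios such a taxonomy would cover. The scoring function and the data are hypothetical.

```python
from typing import Callable, List, Sequence

Instance = Sequence[float]   # one feature vector
Bag = List[Instance]         # a set of feature vectors that shares a single label

def bag_score(bag: Bag, instance_score: Callable[[Instance], float]) -> float:
    """Aggregate instance scores with max: the most positive instance decides."""
    if not bag:
        raise ValueError("a bag must contain at least one instance")
    return max(instance_score(x) for x in bag)

def classify_bag(bag: Bag,
                 instance_score: Callable[[Instance], float],
                 threshold: float = 0.0) -> bool:
    """Label the bag positive if any instance scores above the threshold."""
    return bag_score(bag, instance_score) > threshold

def toy_score(x: Instance) -> float:
    """Hypothetical linear instance scorer, used for illustration only."""
    return x[0] + x[1] - 1.0

positive_bag = [[0.1, 0.2], [0.9, 0.8]]   # the second instance scores 0.7
negative_bag = [[0.1, 0.2], [0.3, 0.1]]   # every instance scores below 0

print(classify_bag(positive_bag, toy_score))  # True
print(classify_bag(negative_bag, toy_score))  # False
```

Replacing `max` with a mean or another set-level statistic gives a different bag-level assumption, which is the kind of variation between learning scenarios the abstract alludes to.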