11,988 research outputs found
Quality-based Multimodal Classification Using Tree-Structured Sparsity
Recent studies have demonstrated advantages of information fusion based on
sparsity models for multimodal classification. Among several sparsity models,
tree-structured sparsity provides a flexible framework for extraction of
cross-correlated information from different sources and for enforcing group
sparsity at multiple granularities. However, the existing algorithm only solves
an approximated version of the cost functional and the resulting solution is
not necessarily sparse at group levels. This paper reformulates the
tree-structured sparse model for multimodal classification task. An accelerated
proximal algorithm is proposed to solve the optimization problem, which is an
efficient tool for feature-level fusion among either homogeneous or
heterogeneous sources of information. In addition, a (fuzzy-set-theoretic)
possibilistic scheme is proposed to weight the available modalities, based on
their respective reliability, in a joint optimization problem for finding the
sparsity codes. This approach provides a general framework for quality-based
fusion that offers added robustness to several sparsity-based multimodal
classification algorithms. To demonstrate their efficacy, the proposed methods
are evaluated on three different applications - multiview face recognition,
multimodal face recognition, and target classification.Comment: To Appear in 2014 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR 2014
Learning Adaptive Discriminative Correlation Filters via Temporal Consistency Preserving Spatial Feature Selection for Robust Visual Tracking
With efficient appearance learning models, Discriminative Correlation Filter
(DCF) has been proven to be very successful in recent video object tracking
benchmarks and competitions. However, the existing DCF paradigm suffers from
two major issues, i.e., spatial boundary effect and temporal filter
degradation. To mitigate these challenges, we propose a new DCF-based tracking
method. The key innovations of the proposed method include adaptive spatial
feature selection and temporal consistent constraints, with which the new
tracker enables joint spatial-temporal filter learning in a lower dimensional
discriminative manifold. More specifically, we apply structured spatial
sparsity constraints to multi-channel filers. Consequently, the process of
learning spatial filters can be approximated by the lasso regularisation. To
encourage temporal consistency, the filter model is restricted to lie around
its historical value and updated locally to preserve the global structure in
the manifold. Last, a unified optimisation framework is proposed to jointly
select temporal consistency preserving spatial features and learn
discriminative filters with the augmented Lagrangian method. Qualitative and
quantitative evaluations have been conducted on a number of well-known
benchmarking datasets such as OTB2013, OTB50, OTB100, Temple-Colour, UAV123 and
VOT2018. The experimental results demonstrate the superiority of the proposed
method over the state-of-the-art approaches
An Attention-based Collaboration Framework for Multi-View Network Representation Learning
Learning distributed node representations in networks has been attracting
increasing attention recently due to its effectiveness in a variety of
applications. Existing approaches usually study networks with a single type of
proximity between nodes, which defines a single view of a network. However, in
reality there usually exists multiple types of proximities between nodes,
yielding networks with multiple views. This paper studies learning node
representations for networks with multiple views, which aims to infer robust
node representations across different views. We propose a multi-view
representation learning approach, which promotes the collaboration of different
views and lets them vote for the robust representations. During the voting
process, an attention mechanism is introduced, which enables each node to focus
on the most informative views. Experimental results on real-world networks show
that the proposed approach outperforms existing state-of-the-art approaches for
network representation learning with a single view and other competitive
approaches with multiple views.Comment: CIKM 201
- …