HPFormer: Hyperspectral image prompt object tracking
Hyperspectral imagery contains abundant spectral information beyond the
visible RGB bands, providing rich discriminative details about objects in a
scene. Leveraging such data has the potential to enhance visual tracking
performance. While prior hyperspectral trackers employ CNN or hybrid
CNN-Transformer architectures, we propose HPFormer, a novel approach built on
Transformers to capitalize on their powerful representation learning
capabilities. The core of HPFormer is a Hyperspectral Hybrid Attention (HHA)
module which unifies feature extraction and fusion within one component through
token interactions. Additionally, a Transform Band Module (TBM) is introduced
to selectively aggregate spatial details and spectral signatures from the full
hyperspectral input for injecting informative target representations. Extensive
experiments demonstrate state-of-the-art performance of HPFormer on benchmark
NIR and VIS tracking datasets. Our work provides new insights into harnessing
the strengths of transformers and hyperspectral fusion to advance robust object
tracking.
MobiFace: A Novel Dataset for Mobile Face Tracking in the Wild
Face tracking serves as the crucial initial step in mobile applications
that analyse target faces over time. However, this
problem has received little attention, mainly due to the scarcity of dedicated
face tracking benchmarks. In this work, we introduce MobiFace, the first
dataset for single face tracking in mobile situations. It consists of 80
unedited live-streaming mobile videos captured by 70 different smartphone users
in fully unconstrained environments. Over bounding boxes are manually
labelled. The videos are carefully selected to cover typical smartphone usage.
The videos are also annotated with 14 attributes, including 6 newly proposed
attributes and 8 commonly seen in object tracking. 36 state-of-the-art
trackers, including facial landmark trackers, generic object trackers and
trackers that we have fine-tuned or improved, are evaluated. The results
suggest that mobile face tracking cannot be solved through existing approaches.
In addition, we show that fine-tuning on the MobiFace training data
significantly boosts the performance of deep learning-based trackers,
suggesting that MobiFace captures the unique characteristics of mobile face
tracking. Our goal is to offer the community a diverse dataset to enable the
design and evaluation of mobile face trackers. The dataset, annotations and the
evaluation server will be available at https://mobiface.github.io/.
Comment: To appear at the 14th IEEE International Conference on Automatic Face
and Gesture Recognition (FG 2019).