Material Based Object Tracking in Hyperspectral Videos: Benchmark and Algorithms
Traditional color images only depict color intensities in red, green and blue
channels, often making object trackers fail in challenging scenarios, e.g.,
background clutter and rapid changes of target appearance. Alternatively,
material information of targets contained in a large number of bands of
hyperspectral images (HSI) is more robust to these difficult conditions. In
this paper, we conduct a comprehensive study on how material information can be
utilized to boost object tracking from three aspects: benchmark dataset,
material feature representation and material based tracking. In terms of
benchmark, we construct a dataset of fully-annotated videos, which contain both
hyperspectral and color sequences of the same scene. Material information is
represented by a spectral-spatial histogram of multidimensional gradients, which
describes the 3D local spectral-spatial structure in an HSI, and by the
fractional abundances of constituent material components, which encode the
underlying material distribution. These two types of features are embedded into
correlation filters, yielding material based tracking. Experimental results on
the collected benchmark dataset show the potential and advantages of material
based object tracking.
Comment: Update result
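The abstract does not say how the fractional abundances are computed. As a minimal sketch, assuming a linear mixing model (each pixel spectrum is a non-negative, sum-to-one combination of known material spectra), abundances can be estimated by projected gradient descent. The names `project_to_simplex`, `estimate_abundances`, and the toy endmember spectra are hypothetical and are not taken from the paper.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {a : a_k >= 0, sum(a) == 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        if u_j - candidate > 0:
            theta = candidate  # keep the threshold of the last valid index
    return [max(x - theta, 0.0) for x in v]


def estimate_abundances(pixel, endmembers, iterations=200):
    """Minimise 0.5 * ||sum_k a_k * endmembers[k] - pixel||^2 over the simplex."""
    n = len(endmembers)
    # The squared Frobenius norm upper-bounds the Lipschitz constant of the gradient.
    step = 1.0 / sum(e * e for spectrum in endmembers for e in spectrum)
    abundances = [1.0 / n] * n
    for _ in range(iterations):
        predicted = [
            sum(a * spectrum[b] for a, spectrum in zip(abundances, endmembers))
            for b in range(len(pixel))
        ]
        residual = [p - y for p, y in zip(predicted, pixel)]
        gradient = [sum(s * r for s, r in zip(spectrum, residual))
                    for spectrum in endmembers]
        abundances = project_to_simplex(
            [a - step * g for a, g in zip(abundances, gradient)]
        )
    return abundances


# Toy data (assumed): three materials observed in four spectral bands.
ENDMEMBERS = [
    [1.0, 0.0, 0.0, 0.2],
    [0.0, 1.0, 0.0, 0.2],
    [0.0, 0.0, 1.0, 0.2],
]
TRUE_ABUNDANCES = [0.7, 0.3, 0.0]
PIXEL = [
    sum(a * spectrum[b] for a, spectrum in zip(TRUE_ABUNDANCES, ENDMEMBERS))
    for b in range(4)
]
ESTIMATED = estimate_abundances(PIXEL, ENDMEMBERS)
print([round(a, 4) for a in ESTIMATED])
```

Under this assumption, the per-pixel abundance vectors form a low-dimensional material map that could be fed to a tracker as feature channels.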
SPARK: Spatial-aware Online Incremental Attack Against Visual Tracking
Adversarial attacks on deep neural networks have been intensively studied on
image, audio, natural language, patch, and pixel classification tasks.
Nevertheless, adversarial attacks on online video object tracking, a typical
yet important real-world application that traces an object's moving trajectory
instead of its category, are rarely explored. In this paper, we identify a new
task for adversarial attacks on visual tracking: generating imperceptible
perturbations online that mislead trackers along an incorrect (Untargeted
Attack, UA) or specified (Targeted Attack, TA) trajectory. To this
end, we first propose a spatial-aware basic attack by adapting
existing attack methods, i.e., FGSM, BIM, and C&W, and comprehensively analyze
the attacking performance. We identify that online object tracking poses two
new challenges: 1) it is difficult to generate imperceptible perturbations that
can transfer across frames, and 2) real-time trackers require the attack to
satisfy a certain level of efficiency. To address these challenges, we further
propose the spatial-aware online incremental attack (a.k.a. SPARK) that
performs spatial-temporal sparse incremental perturbations online and makes the
adversarial attack less perceptible. In addition, as an optimization-based
method, SPARK quickly converges to very small losses within several iterations
by considering historical incremental perturbations, making it much more
efficient than basic attacks. The in-depth evaluation on state-of-the-art
trackers (i.e., SiamRPN++ with AlexNet, MobileNetv2, and ResNet-50, and SiamDW)
on OTB100, VOT2018, UAV123, and LaSOT demonstrates the effectiveness and
transferability of SPARK in misleading the trackers under both UA and TA with
minor perturbations.
Comment: 18 pages, 5 figures. This paper has been accepted to ECCV202
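Two of the basic attacks named above, FGSM and BIM, can be illustrated on a toy problem. This is a minimal sketch and not the SPARK method: it assumes a linear model with a logistic loss instead of a tracker, and the helpers `loss_and_grad`, `fgsm`, and `bim` are hypothetical names introduced only for illustration.

```python
import math


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def loss_and_grad(x, w, y):
    """Logistic loss of a toy linear model and its gradient w.r.t. the input x."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    loss = math.log1p(math.exp(-margin))
    coeff = -y / (1.0 + math.exp(margin))  # d(loss)/d(margin) times y
    return loss, [coeff * wi for wi in w]


def fgsm(x, w, y, eps):
    """Fast Gradient Sign Method: one step of size eps along the gradient sign."""
    _, grad = loss_and_grad(x, w, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]


def bim(x, w, y, eps, alpha, steps):
    """Basic Iterative Method: small sign steps, clipped to the eps-ball around x."""
    x_adv = list(x)
    for _ in range(steps):
        _, grad = loss_and_grad(x_adv, w, y)
        x_adv = [xa + alpha * sign(gi) for xa, gi in zip(x_adv, grad)]
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv


# Toy input, weights, and label (all assumed values).
X = [0.5, -0.2, 0.1]
W = [1.0, -2.0, 0.5]
Y = 1
CLEAN_LOSS, _ = loss_and_grad(X, W, Y)
FGSM_LOSS, _ = loss_and_grad(fgsm(X, W, Y, eps=0.1), W, Y)
BIM_INPUT = bim(X, W, Y, eps=0.1, alpha=0.03, steps=5)
BIM_LOSS, _ = loss_and_grad(BIM_INPUT, W, Y)
print(CLEAN_LOSS, FGSM_LOSS, BIM_LOSS)
```

Both attacks bound the perturbation by `eps` in the max norm, which is the usual proxy for imperceptibility. In a tracking setting the loss would be defined on the tracker's response, and the perturbation would have to be recomputed or reused for every frame.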
Exploring Image Enhancement for Salient Object Detection in Low Light Images
Low light images captured in a non-uniform illumination environment usually
are degraded with the scene depth and the corresponding environment lights.
This degradation results in severe object information loss in the degraded
image modality, which makes the salient object detection more challenging due
to low contrast property and artificial light influence. However, existing
salient object detection models are developed based on the assumption that the
images are captured under a sufficient brightness environment, which is
impractical in real-world scenarios. In this work, we propose an image
enhancement approach to facilitate the salient object detection in low light
images. The proposed model directly embeds the physical lighting model into the
deep neural network to describe the degradation of low light images, in which
the environment light is treated as a point-wise variate and changes with local
content. Moreover, a Non-Local-Block Layer is utilized to capture the
difference of local content of an object against its local neighborhood
favoring regions. For quantitative evaluation, we construct a low-light image
dataset with pixel-level human-labeled ground-truth annotations and report
promising results on four public datasets and our benchmark dataset.
Comment: Appearing at ACM Transactions on Multimedia Computing,
Communications, and Application
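The abstract does not spell out the physical lighting model. As a rough sketch, assuming a common image formation model of the form I = J * t + A * (1 - t), with a per-pixel (point-wise) environment light A and transmission t, degradation and its inversion can be written as below. The model form, the function names `degrade` and `enhance`, and the `t_min` clamp are all assumptions and not the paper's network.

```python
def degrade(scene, transmission, ambient):
    """Apply I = J * t + A * (1 - t) pixel by pixel (assumed model)."""
    return [
        [j * t + a * (1.0 - t) for j, t, a in zip(j_row, t_row, a_row)]
        for j_row, t_row, a_row in zip(scene, transmission, ambient)
    ]


def enhance(observed, transmission, ambient, t_min=0.05):
    """Invert the assumed model: J = (I - A * (1 - t)) / t, with t clamped."""
    restored = []
    for i_row, t_row, a_row in zip(observed, transmission, ambient):
        row = []
        for i, t, a in zip(i_row, t_row, a_row):
            t = max(t, t_min)  # avoid dividing by a near-zero transmission
            row.append((i - a * (1.0 - t)) / t)
        restored.append(row)
    return restored


# Toy 2x2 grayscale example with point-wise ambient light (assumed values).
SCENE = [[0.8, 0.6], [0.4, 0.9]]
TRANSMISSION = [[0.3, 0.5], [0.2, 0.4]]
AMBIENT = [[0.05, 0.10], [0.02, 0.08]]
OBSERVED = degrade(SCENE, TRANSMISSION, AMBIENT)
RESTORED = enhance(OBSERVED, TRANSMISSION, AMBIENT)
print(OBSERVED)
print(RESTORED)
```

In a learned approach, t and A would be predicted by the network for each pixel instead of being given; here they are supplied directly so that the round trip can be checked.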