3,858 research outputs found
Discriminative Scale Space Tracking
Accurate scale estimation of a target is a challenging research problem in
visual object tracking. Most state-of-the-art methods employ an exhaustive
scale search to estimate the target size. The exhaustive search strategy is
computationally expensive and struggles when encountered with large scale
variations. This paper investigates the problem of accurate and robust scale
estimation in a tracking-by-detection framework. We propose a novel scale
adaptive tracking approach by learning separate discriminative correlation
filters for translation and scale estimation. The explicit scale filter is
learned online using the target appearance sampled at a set of different
scales. Contrary to standard approaches, our method directly learns the
appearance change induced by variations in the target scale. Additionally, we
investigate strategies to reduce the computational cost of our approach.
Extensive experiments are performed on the OTB and the VOT2014 datasets.
Compared to the standard exhaustive scale search, our approach achieves a gain
of 2.5% in average overlap precision on the OTB dataset. Additionally, our
method is computationally efficient, operating at a 50% higher frame rate
compared to the exhaustive scale search. Our method obtains the top rank in
performance by outperforming 19 state-of-the-art trackers on OTB and 37
state-of-the-art trackers on VOT2014.Comment: To appear in TPAMI. This is the journal extension of the
VOT2014-winning DSST tracking metho
Coding local and global binary visual features extracted from video sequences
Binary local features represent an effective alternative to real-valued
descriptors, leading to comparable results for many visual analysis tasks,
while being characterized by significantly lower computational complexity and
memory requirements. When dealing with large collections, a more compact
representation based on global features is often preferred, which can be
obtained from local features by means of, e.g., the Bag-of-Visual-Word (BoVW)
model. Several applications, including for example visual sensor networks and
mobile augmented reality, require visual features to be transmitted over a
bandwidth-limited network, thus calling for coding techniques that aim at
reducing the required bit budget, while attaining a target level of efficiency.
In this paper we investigate a coding scheme tailored to both local and global
binary features, which aims at exploiting both spatial and temporal redundancy
by means of intra- and inter-frame coding. In this respect, the proposed coding
scheme can be conveniently adopted to support the Analyze-Then-Compress (ATC)
paradigm. That is, visual features are extracted from the acquired content,
encoded at remote nodes, and finally transmitted to a central controller that
performs visual analysis. This is in contrast with the traditional approach, in
which visual content is acquired at a node, compressed and then sent to a
central unit for further processing, according to the Compress-Then-Analyze
(CTA) paradigm. In this paper we experimentally compare ATC and CTA by means of
rate-efficiency curves in the context of two different visual analysis tasks:
homography estimation and content-based retrieval. Our results show that the
novel ATC paradigm based on the proposed coding primitives can be competitive
with CTA, especially in bandwidth limited scenarios.Comment: submitted to IEEE Transactions on Image Processin
Self-Selective Correlation Ship Tracking Method for Smart Ocean System
In recent years, with the development of the marine industry, navigation
environment becomes more complicated. Some artificial intelligence
technologies, such as computer vision, can recognize, track and count the
sailing ships to ensure the maritime security and facilitates the management
for Smart Ocean System. Aiming at the scaling problem and boundary effect
problem of traditional correlation filtering methods, we propose a
self-selective correlation filtering method based on box regression (BRCF). The
proposed method mainly include: 1) A self-selective model with negative samples
mining method which effectively reduces the boundary effect in strengthening
the classification ability of classifier at the same time; 2) A bounding box
regression method combined with a key points matching method for the scale
prediction, leading to a fast and efficient calculation. The experimental
results show that the proposed method can effectively deal with the problem of
ship size changes and background interference. The success rates and precisions
were higher than Discriminative Scale Space Tracking (DSST) by over 8
percentage points on the marine traffic dataset of our laboratory. In terms of
processing speed, the proposed method is higher than DSST by nearly 22 Frames
Per Second (FPS)
Deep learning cardiac motion analysis for human survival prediction
Motion analysis is used in computer vision to understand the behaviour of
moving objects in sequences of images. Optimising the interpretation of dynamic
biological systems requires accurate and precise motion tracking as well as
efficient representations of high-dimensional motion trajectories so that these
can be used for prediction tasks. Here we use image sequences of the heart,
acquired using cardiac magnetic resonance imaging, to create time-resolved
three-dimensional segmentations using a fully convolutional network trained on
anatomical shape priors. This dense motion model formed the input to a
supervised denoising autoencoder (4Dsurvival), which is a hybrid network
consisting of an autoencoder that learns a task-specific latent code
representation trained on observed outcome data, yielding a latent
representation optimised for survival prediction. To handle right-censored
survival outcomes, our network used a Cox partial likelihood loss function. In
a study of 302 patients the predictive accuracy (quantified by Harrell's
C-index) was significantly higher (p < .0001) for our model C=0.73 (95 CI:
0.68 - 0.78) than the human benchmark of C=0.59 (95 CI: 0.53 - 0.65). This
work demonstrates how a complex computer vision task using high-dimensional
medical image data can efficiently predict human survival
- …