Optimisation of a Siamese Neural Network for Real-Time Energy Efficient Object Tracking
This paper presents research on optimising visual object tracking with a
Siamese neural network for embedded vision systems. It was assumed
that the solution shall operate in real-time, preferably for a high resolution
video stream, with the lowest possible energy consumption. To meet these
requirements, techniques such as the reduction of computational precision and
pruning were considered. Brevitas, a tool dedicated to the optimisation and
quantisation of neural networks for FPGA implementation, was used. A number of
training scenarios were tested with varying levels of optimisation, from
integer uniform quantisation with 16 bits to ternary and binary networks. Next,
the influence of these optimisations on the tracking performance was evaluated.
It was possible to reduce the size of the convolutional filters by up to 10 times
in relation to the original network. The obtained results indicate that using
quantisation can significantly reduce the memory and computational complexity
of the proposed network while still enabling precise tracking, thus allowing
its use in embedded vision systems. Moreover, quantisation of weights positively
affects the network training by decreasing overfitting.
Comment: 12 pages, accepted for ICCVG 202
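The quantisation levels named in the abstract above (uniform integer, ternary, binary) can be illustrated with a minimal pure-Python sketch. This is not the Brevitas API and not the authors' training procedure: the function names, the symmetric per-tensor scaling, and the ternary threshold (0.7 times the mean absolute weight) are all assumptions chosen for illustration.

```python
def quantise_uniform(weights, bits):
    """Symmetric uniform quantisation of floats to signed `bits`-bit integers.

    Returns the integer codes and the scale needed to map them back to floats.
    (Hypothetical helper, per-tensor scaling assumed.)
    """
    qmax = 2 ** (bits - 1) - 1                      # e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # avoid division by zero
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantise(codes, scale):
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]


def ternarise(weights, threshold_ratio=0.7):
    """Map weights to {-1, 0, +1}.

    Weights whose magnitude is at most threshold_ratio * mean(|w|) become 0.
    The ratio is an assumed heuristic, not a value taken from the paper.
    """
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    threshold = threshold_ratio * mean_abs
    return [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]


def binarise(weights):
    """Map weights to {-1, +1} by sign (zero is treated as positive)."""
    return [1 if w >= 0 else -1 for w in weights]


if __name__ == "__main__":
    w = [0.5, -1.0, 0.25, 0.0]
    codes, scale = quantise_uniform(w, bits=8)
    print(codes, dequantise(codes, scale))
    print(ternarise(w), binarise(w))
```

Fewer bits per weight mean less storage per filter, at the cost of a larger rounding error; the sketch only shows the weight mapping and leaves out quantisation-aware training.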
Deformable Object Tracking with Gated Fusion
The tracking-by-detection framework has received growing attention through its
integration with Convolutional Neural Networks (CNNs). Existing
tracking-by-detection methods, however, fail to track objects with severe
appearance variations. This is because the traditional convolutional operation
is performed on fixed grids, and thus may not be able to find the correct
response while the object is changing pose or under varying environmental
conditions. In this paper, we propose a deformable convolution layer to enrich
the target appearance representations in the tracking-by-detection framework.
We aim to capture the target appearance variations via deformable convolution,
which adaptively enhances its original features. In addition, we also propose a
gated fusion scheme to control how the variations captured by the deformable
convolution affect the original appearance. The enriched feature representation
through deformable convolution facilitates the discrimination of the CNN
classifier on the target object and background. Extensive experiments on the
standard benchmarks show that the proposed tracker performs favorably against
state-of-the-art methods.
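The gated fusion idea described above can be sketched as an element-wise blend in which a gate in (0, 1) controls how much of the deformable-convolution output is added to the original features. The abstract does not give the exact formula, so the additive form, the sigmoid gate, and all names below are assumptions for illustration only.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gated_fusion(original, variation, gate_logits):
    """Blend original features with variation features under a learned gate.

    original    : features from the standard convolution
    variation   : features from the deformable convolution (assumed given)
    gate_logits : raw gate values; sigmoid turns them into weights in (0, 1)

    Assumed form: fused = original + sigmoid(gate) * variation.
    """
    if not (len(original) == len(variation) == len(gate_logits)):
        raise ValueError("all inputs must have the same length")
    return [x + sigmoid(g) * d
            for x, d, g in zip(original, variation, gate_logits)]


if __name__ == "__main__":
    # A gate logit of 0 gives a weight of 0.5, so half of the variation is added.
    print(gated_fusion([1.0, 2.0], [0.5, -0.5], [0.0, 0.0]))
```

With a strongly negative gate logit the fused feature stays close to the original, which is one way such a gate could suppress unreliable variations.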