End-to-end representation learning for Correlation Filter based tracking
The Correlation Filter is an algorithm that trains a linear template to
discriminate between images and their translations. It is well suited to object
tracking because its formulation in the Fourier domain provides a fast
solution, enabling the detector to be re-trained once per frame. Previous works
that use the Correlation Filter, however, have adopted features that were
either manually designed or trained for a different task. This work is the
first to overcome this limitation by interpreting the Correlation Filter
learner, which has a closed-form solution, as a differentiable layer in a deep
neural network. This enables learning deep features that are tightly coupled to
the Correlation Filter. Experiments illustrate that our method has the
important practical benefit of allowing lightweight architectures to achieve
state-of-the-art performance at high frame rates.
Comment: To appear at CVPR 201
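The closed-form Fourier-domain solution that makes the Correlation Filter cheap to re-train once per frame can be sketched as follows. This is a minimal single-channel, MOSSE-style ridge-regression sketch; the function names and regularisation value are illustrative assumptions, not the paper's actual (differentiable-layer) implementation:

```python
import numpy as np

def train_correlation_filter(x, y, lam=1e-4):
    """Closed-form correlation filter for one image patch.

    x   : training patch (H x W)
    y   : desired response map (H x W), e.g. a peak at the target centre
    lam : ridge regularisation (hypothetical value)

    Each frequency is solved independently, which is what makes
    per-frame re-training fast.
    """
    X = np.fft.fft2(x)
    Y = np.fft.fft2(y)
    return np.conj(X) * Y / (np.conj(X) * X + lam)

def respond(h, z):
    """Apply the learned filter h to a search patch z."""
    return np.real(np.fft.ifft2(h * np.fft.fft2(z)))
```

Training on a patch and evaluating on the same patch reproduces the desired response up to the regularisation, with the peak at the target location.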
Fast Visual Tracking with Squeeze and Excitation Region Proposal Network
Funding Information: This work was funded by the National Natural Science Foundation of China (Grant No. 62272063, 62072056, 61902041 and 61801170), the Open Research Fund of the Key Lab of Broadband Wireless Communication and Sensor Network Technology (Nanjing University of Posts and Telecommunications), Ministry of Education, the Project of Education Department Cooperation Cultivation (Grant No. 201602011005 and No. 201702135098), and the China Postdoctoral Science Foundation (Grant No.
Peer reviewed. Publisher PD
Real-time camera operation and tracking for the streaming of teaching activities
The primary driving force of this work comes from the Lab's urgent need to offer students
the opportunity to attend a remote event from home or anywhere in the world in real time.
The main objective of this work is to build a real-time tracker to follow the movements of
the lecturer. After that, we will build a framework to handle a PTZ (Pan, Tilt and Zoom)
camera based on the lecturer's movements: if the lecturer moves to the left, the camera
will turn to the left.
To tackle this project we follow work by Gebrehiwot, A., which involved building a
real-time tracker. The problem with this tracker is that it was implemented on Ubuntu
and ran a very complex CNN that required a good GPU. As Gebrehiwot, A. rightly points
out at the end of his report, not everyone has an Ubuntu partition or a GPU on their
computer, so we started by porting the real-time tracker to Windows. To achieve this
we used Anaconda on Windows, which made our work much easier. We then implemented a
lightweight backbone for the tracker, allowing it to run on computers with less
processing power. Once all this was done, we put into practice the aforementioned
framework for handling the movement of the PTZ camera. This framework uses the
lightweight tracker to follow the lecturer's movements, and depending on these
movements the camera pans and tilts automatically. We tested this framework on
streaming platforms such as YouTube, showing that it can greatly improve the quality
of online classes.
Finally, we draw conclusions from the work done and propose future work to improve the
framework.
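The camera-steering logic described in this abstract (pan left when the lecturer moves left, and so on) can be sketched as follows. This is a minimal illustration; the function name, dead-zone threshold and command convention are assumptions, not the report's actual implementation:

```python
def pan_tilt_command(bbox, frame_w, frame_h, dead_zone=0.15):
    """Decide a pan/tilt command from the tracker's bounding box.

    bbox : (x, y, w, h) of the lecturer in pixels
    Returns (pan, tilt) in {-1, 0, 1}: -1 = left/up, +1 = right/down.
    A dead zone around the frame centre avoids jittery camera moves
    while the lecturer stays roughly centred.
    """
    cx = bbox[0] + bbox[2] / 2          # bounding-box centre
    cy = bbox[1] + bbox[3] / 2
    dx = (cx - frame_w / 2) / frame_w   # normalised offset from frame centre
    dy = (cy - frame_h / 2) / frame_h
    pan = 0 if abs(dx) < dead_zone else (1 if dx > 0 else -1)
    tilt = 0 if abs(dy) < dead_zone else (1 if dy > 0 else -1)
    return pan, tilt
```

The discrete command would then be translated into whatever motion API the PTZ camera exposes; the dead zone trades responsiveness for steadier footage.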
Optimisation of a Siamese Neural Network for Real-Time Energy Efficient Object Tracking
In this paper the research on optimisation of visual object tracking using a
Siamese neural network for embedded vision systems is presented. It was assumed
that the solution shall operate in real time, preferably on a high-resolution
video stream, with the lowest possible energy consumption. To meet these
requirements, techniques such as reduction of computational precision and
pruning were considered. Brevitas, a tool dedicated to the optimisation and
quantisation of neural networks for FPGA implementation, was used. A number of
training scenarios were tested with varying levels of optimisation, from
uniform integer quantisation with 16 bits down to ternary and binary networks.
Next, the influence of these optimisations on the tracking performance was
evaluated. It was possible to reduce the size of the convolutional filters by
up to 10 times relative to the original network. The obtained results indicate
that quantisation can significantly reduce the memory and computational
complexity of the proposed network while still enabling precise tracking, thus
allowing its use in embedded vision systems. Moreover, quantisation of weights
positively affects network training by decreasing overfitting.
Comment: 12 pages, accepted for ICCVG 202
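The two ends of the quantisation spectrum mentioned in this abstract (uniform integer quantisation down to ternary weights) can be illustrated as follows. This is a minimal NumPy sketch of the general techniques, not the Brevitas API or the paper's training pipeline; the threshold value is an assumption:

```python
import numpy as np

def quantise_uniform(w, bits):
    """Symmetric uniform quantisation of a weight tensor to `bits` bits.

    Returns the de-quantised weights (for measuring accuracy loss)
    and the integer codes (what would actually be stored on the FPGA).
    """
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale, q.astype(np.int32)

def ternarise(w, threshold=0.05):
    """Ternary weights {-alpha, 0, +alpha}: small weights are zeroed,
    the rest keep their sign and a shared magnitude alpha."""
    mask = np.abs(w) > threshold
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return np.sign(w) * mask * alpha
```

At 16 bits the de-quantised weights are nearly indistinguishable from the originals; ternary and binary networks trade that fidelity for drastically smaller filters, which is the memory reduction the paper exploits.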