
    End-to-end representation learning for Correlation Filter based tracking

    The Correlation Filter is an algorithm that trains a linear template to discriminate between images and their translations. It is well suited to object tracking because its formulation in the Fourier domain provides a fast solution, enabling the detector to be re-trained once per frame. Previous works that use the Correlation Filter, however, have adopted features that were either manually designed or trained for a different task. This work is the first to overcome this limitation by interpreting the Correlation Filter learner, which has a closed-form solution, as a differentiable layer in a deep neural network. This enables learning deep features that are tightly coupled to the Correlation Filter. Experiments illustrate that our method has the important practical benefit of allowing lightweight architectures to achieve state-of-the-art performance at high framerates. Comment: To appear at CVPR 201
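The closed-form Fourier-domain solution the abstract refers to can be sketched as a single-channel ridge regression. This is a simplified, single-feature illustration of the general idea, not the paper's multi-channel formulation; the function names and regulariser value are assumptions:

```python
import numpy as np

def train_correlation_filter(x, y, lam=1e-4):
    """Closed-form correlation filter (ridge regression in the Fourier
    domain). x: image patch, y: desired response map, same shape as x.
    Returns the conjugate filter H* in the frequency domain."""
    X = np.fft.fft2(x)
    Y = np.fft.fft2(y)
    # Element-wise solution: H* = Y . conj(X) / (X . conj(X) + lam)
    return Y * np.conj(X) / (X * np.conj(X) + lam)

def detect(H_conj, z):
    """Correlate the learned template with a new patch z; the peak of
    the response map gives the estimated (circular) translation."""
    Z = np.fft.fft2(z)
    return np.real(np.fft.ifft2(Z * H_conj))
```

Because both training and detection reduce to FFTs plus element-wise operations, re-training once per frame is cheap, which is the property that makes the Correlation Filter attractive for tracking.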

    Fast Visual Tracking with Squeeze and Excitation Region Proposal Network

    Funding Information: This work was funded by the National Natural Science Foundation of China (Grant No. 62272063, 62072056, 61902041 and 61801170); the Open Research Fund of the Key Lab of Broadband Wireless Communication and Sensor Network Technology (Nanjing University of Posts and Telecommunications), Ministry of Education; the Project of Education Department Cooperation Cultivation (Grant No. 201602011005 and No. 201702135098); and the China Postdoctoral Science Foundation. Peer reviewed. Publisher PDF.

    Real-time camera operation and tracking for the streaming of teaching activities

    The primary driving force of this work comes from the Lab's urgent need to offer students the opportunity to attend a remote event from home, or from anywhere in the world, in real time. The main objective of this work is to build a real-time tracker that follows the movements of the lecturer, and then a framework that drives a PTZ (Pan, Tilt and Zoom) camera based on those movements: if the lecturer moves to the left, the camera turns to the left. To tackle this project we build on work by Gebrehiwot, A., who developed a real-time tracker. The problem with that tracker is that it was implemented on Ubuntu and ran a very complex CNN that required a good GPU. As Gebrehiwot, A. rightly points out at the end of his report, not everyone has an Ubuntu partition or a GPU on their computer, so we ported the real-time tracker to Windows; using Anaconda for Windows made this work much easier. We then implemented a lightweight backbone for the tracker, allowing it to run on computers with less processing power. Once all this was done, we put into practice the aforementioned framework for controlling the PTZ camera: it uses the lightweight tracker to follow the lecturer's movements, and the camera pans and tilts automatically in response. We tested this framework on streaming platforms such as YouTube, showing that it can greatly improve the quality of online classes. Finally, we draw conclusions from the work done and propose future work to improve the framework.
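The core of such a "camera follows the lecturer" loop is mapping the tracker's bounding box to pan/tilt commands. A minimal sketch, assuming a `(x, y, w, h)` bounding-box format, a dead zone around the frame centre, and discrete pan/tilt steps — all illustrative choices, not the project's actual interface:

```python
DEAD_ZONE = 0.15  # fraction of frame around the centre where the camera stays still

def ptz_step(bbox, frame_w, frame_h):
    """Return (pan, tilt) in {-1, 0, 1} from a tracker bbox (x, y, w, h).

    The bbox centre is normalised to an offset in [-0.5, 0.5] from the
    frame centre; if it leaves the dead zone, the camera is told to step
    in that direction (e.g. pan -1 = left, tilt -1 = up)."""
    x, y, w, h = bbox
    cx = (x + w / 2) / frame_w - 0.5   # horizontal offset from centre
    cy = (y + h / 2) / frame_h - 0.5   # vertical offset from centre
    pan = 0 if abs(cx) < DEAD_ZONE else (1 if cx > 0 else -1)
    tilt = 0 if abs(cy) < DEAD_ZONE else (1 if cy > 0 else -1)
    return pan, tilt
```

In a real system the returned steps would be forwarded to the camera's control protocol (e.g. VISCA or ONVIF, depending on the hardware); the dead zone prevents the camera from jittering while the lecturer stays near the centre of the frame.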

    Optimisation of a Siamese Neural Network for Real-Time Energy Efficient Object Tracking

    In this paper, research on the optimisation of visual object tracking using a Siamese neural network for embedded vision systems is presented. It was assumed that the solution should operate in real time, preferably on a high-resolution video stream, with the lowest possible energy consumption. To meet these requirements, techniques such as reduced computational precision and pruning were considered. Brevitas, a tool dedicated to the optimisation and quantisation of neural networks for FPGA implementation, was used. A number of training scenarios were tested with varying levels of optimisation, from uniform integer quantisation with 16 bits down to ternary and binary networks. Next, the influence of these optimisations on tracking performance was evaluated. It was possible to reduce the size of the convolutional filters by up to 10 times relative to the original network. The obtained results indicate that quantisation can significantly reduce the memory and computational complexity of the proposed network while still enabling precise tracking, thus allowing its use in embedded vision systems. Moreover, quantisation of the weights positively affects network training by decreasing overfitting. Comment: 12 pages, accepted for ICCVG 202
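The arithmetic behind the quantisation levels mentioned above (uniform integer quantisation down to ternary weights) can be illustrated with a minimal NumPy sketch. This is a post-hoc illustration of the idea, not Brevitas's actual quantisation-aware training, and the threshold value is an assumption:

```python
import numpy as np

def quantise_uniform(w, bits):
    """Uniform symmetric quantisation of a weight tensor to `bits` bits.
    Weights are mapped to integers in [-qmax, qmax], then rescaled back
    to floats, so each weight moves by at most one quantisation step."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale, scale

def quantise_ternary(w, threshold=0.05):
    """Ternary quantisation: each weight becomes -s, 0 or +s, where s is
    the mean magnitude of the weights that survive the threshold."""
    mask = np.abs(w) > threshold
    s = np.mean(np.abs(w[mask])) if mask.any() else 0.0
    return np.sign(w) * mask * s
```

Storing a filter as ternary values plus one scale per tensor is what makes the roughly 10x reduction in filter size reported above plausible, at the cost of the coarser weight resolution the paper evaluates.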