15,863 research outputs found

Detect to Track and Track to Detect

Author: Feichtenhofer Christoph
Pinz Axel
Zisserman Andrew
Publication date: 01/01/2017

Recent approaches for high accuracy detection and tracking of object categories in video consist of complex multistage solutions that become more cumbersome each year. In this paper we propose a ConvNet architecture that jointly performs detection and tracking, solving the task in a simple and effective way. Our contributions are threefold: (i) we set up a ConvNet architecture for simultaneous detection and tracking, using a multi-task objective for frame-based object detection and across-frame track regression; (ii) we introduce correlation features that represent object co-occurrences across time to aid the ConvNet during tracking; and (iii) we link the frame level detections based on our across-frame tracklets to produce high accuracy detections at the video level. Our ConvNet architecture for spatiotemporal object detection is evaluated on the large-scale ImageNet VID dataset where it achieves state-of-the-art results. Our approach provides better single model performance than the winning method of the last ImageNet challenge while being conceptually much simpler. Finally, we show that by increasing the temporal stride we can dramatically increase the tracker speed.

Comment: ICCV 2017. Code and models: https://github.com/feichtenhofer/Detect-Track Results: https://www.robots.ox.ac.uk/~vgg/research/detect-track
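
As a rough illustration of contribution (iii), linking frame-level detections across frames, the sketch below greedily pairs boxes by intersection-over-union (IoU). It is not the authors' linking procedure, which the abstract does not specify; the names `iou` and `link_detections`, the `min_iou` threshold, and the greedy matching rule are all assumptions made for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_detections(
    tracked: List[Box], next_dets: List[Box], min_iou: float = 0.5
) -> List[Tuple[int, int]]:
    """Greedily pair boxes tracked into the next frame with that frame's detections.

    `tracked` holds boxes from frame t after being regressed into frame t+1;
    `next_dets` holds the detections found in frame t+1. Pairs are accepted in
    order of decreasing overlap, and each box is used at most once.
    (Assumed matching rule, chosen for simplicity.)
    """
    candidates = sorted(
        ((iou(p, d), i, j) for i, p in enumerate(tracked) for j, d in enumerate(next_dets)),
        reverse=True,
    )
    used_i, used_j = set(), set()
    links = []
    for score, i, j in candidates:
        if score < min_iou:
            break  # all remaining candidates overlap even less
        if i in used_i or j in used_j:
            continue
        links.append((i, j))
        used_i.add(i)
        used_j.add(j)
    return sorted(links)
```

Chaining such links from frame to frame yields tracks whose per-frame detection scores could then be combined at the video level; how the scores are combined is left open here.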

arXiv.org e-Print Archive

Oxford University Research Archive

Real time motion estimation using a neural architecture implemented on GPUs

Author: Angelopoulou Anastassia
Azorin-Lopez Jorge
Garcia-Rodriguez Jose
García-Chamizo Juan Manuel
Orts-Escolano Sergio
Psarrou Alexandra
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

This work describes a neural network based architecture that represents and estimates object motion in videos. This architecture addresses multiple computer vision tasks such as image segmentation, object representation or characterization, motion analysis and tracking. The use of a neural network architecture allows for the simultaneous estimation of global and local motion and the representation of deformable objects. This architecture also avoids the problem of finding corresponding features while tracking moving objects. Due to the parallel nature of neural networks, the architecture has been implemented on GPUs, which allows the system to meet a set of requirements such as time-constraint management, robustness, high processing speed and re-configurability. Experiments are presented that demonstrate the validity of our architecture for solving problems of mobile agent tracking and motion analysis.

This work was partially funded by the Spanish Government DPI2013-40534-R grant and Valencian Government GV/2013/005 grant.

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Real time motion estimation using a neural architecture implemented on GPUs

Author: Angelopoulou A.
Azorin-Lopez J.
Garcia-Chamizo J.M.
Garcia-Rodriguez J.
Orts Escolano S.
Psarrou A.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

WestminsterResearch

Learning Intelligent Dialogs for Bounding Box Annotation

Author: Ferrari Vittorio
Konyushkova Ksenia
Lampert Christoph
Uijlings Jasper
Publication date: 01/01/2018

We introduce Intelligent Annotation Dialogs for bounding box annotation. We train an agent to automatically choose a sequence of actions for a human annotator to produce a bounding box in a minimal amount of time. Specifically, we consider two actions: box verification, where the annotator verifies a box generated by an object detector, and manual box drawing. We explore two kinds of agents, one based on predicting the probability that a box will be positively verified, and the other based on reinforcement learning. We demonstrate that (1) our agents are able to learn efficient annotation strategies in several scenarios, automatically adapting to the image difficulty, the desired quality of the boxes, and the detector strength; (2) in all scenarios the resulting annotation dialogs speed up annotation compared to manual box drawing alone and box verification alone, while also outperforming any fixed combination of verification and drawing in most scenarios; (3) in a realistic scenario where the detector is iteratively re-trained, our agents evolve a series of strategies that reflect the shifting trade-off between verification and drawing as the detector grows stronger.

Comment: This paper appeared at CVPR 201
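
The trade-off between verification and drawing described in this abstract can be illustrated with a one-step expected-time comparison. This is a simplified sketch, not the paper's agent: it assumes a single verification attempt followed by manual drawing on rejection, and the function names and the timings used in the usage note are hypothetical.

```python
def expected_time_verify_first(p_accept: float, t_verify: float, t_draw: float) -> float:
    """Expected annotation time when one detector box is verified first.

    Verification always costs t_verify; with probability (1 - p_accept) the box
    is rejected and the annotator must also draw it manually (cost t_draw).
    Assumed model: one verification attempt, then fall back to drawing.
    """
    return t_verify + (1.0 - p_accept) * t_draw


def choose_action(p_accept: float, t_verify: float, t_draw: float) -> str:
    """Return 'verify' if verifying first is expected to be faster than drawing."""
    if expected_time_verify_first(p_accept, t_verify, t_draw) < t_draw:
        return "verify"
    return "draw"
```

Under this model, verifying first pays off exactly when `p_accept > t_verify / t_draw`. With hypothetical timings of 2 s per verification and 8 s per drawn box, an acceptance probability of 0.5 favors verification, while 0.1 favors drawing directly.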

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Crossref

IST Austria: PubRep (Institute of Science and Technology)