
Proceedings of the 1st Computer Science Student Workshop: Koc University Istinye Campus, Istanbul, Turkey, February 21, 2010

Publication venue: Sabancı University
Publication date: 01/01/2010

Deep Learning for Detecting Multiple Space-Time Action Tubes in Videos

Authors: Cuzzolin Fabio; Saha Suman; Sapienza Michael; Singh Gurkirt; Torr Philip H. S.
Publication date: 01/01/2016

In this work, we propose an approach to the spatiotemporal localisation (detection) and classification of multiple concurrent actions within temporally untrimmed videos. Our framework is composed of three stages. In stage 1, appearance and motion detection networks are employed to localise and score actions from colour images and optical flow. In stage 2, the appearance network detections are boosted by combining them with the motion detection scores, in proportion to their respective spatial overlap. In stage 3, sequences of detection boxes most likely to be associated with a single action instance, called action tubes, are constructed by solving two energy maximisation problems via dynamic programming. In the first pass, action paths spanning the whole video are built by linking detection boxes over time using their class-specific scores and their spatial overlap; in the second pass, temporal trimming is performed by ensuring label consistency for all constituting detection boxes. We demonstrate the performance of our algorithm on the challenging UCF101, J-HMDB-21 and LIRIS-HARL datasets, achieving new state-of-the-art results across the board and significantly increasing detection speed at test time. We report gains of 20% and 11% in mAP (mean average precision) on the UCF-101 and J-HMDB-21 datasets respectively when compared to the state-of-the-art.
Comment: Accepted by British Machine Vision Conference 201
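The first-pass linking described in the abstract can be illustrated with a small Viterbi-style sketch. This is not the authors' implementation: the energy used here (detection score plus a weighted overlap term), the function names `iou` and `link_path`, and the `overlap_weight` parameter are all assumptions made for illustration, and every frame is assumed to contain at least one detection.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_path(frames, overlap_weight=1.0):
    """Pick one detection per frame so that the summed energy is maximal.

    frames: list with one entry per frame; each entry is a non-empty list
            of (box, score) pairs for a single action class.
    Returns (energy, indices), where indices[t] is the chosen detection
    in frame t. The energy definition is an assumption of this sketch.
    """
    # best[t][j]: maximal energy of any path ending at detection j of frame t
    best = [[score for _, score in frames[0]]]
    back = []  # back[t - 1][j]: best predecessor of detection j of frame t
    for t in range(1, len(frames)):
        row, ptr = [], []
        for box, score in frames[t]:
            candidates = [
                best[t - 1][i] + overlap_weight * iou(prev_box, box)
                for i, (prev_box, _) in enumerate(frames[t - 1])
            ]
            i_best = max(range(len(candidates)), key=candidates.__getitem__)
            row.append(candidates[i_best] + score)
            ptr.append(i_best)
        best.append(row)
        back.append(ptr)

    # Backtrack from the highest-energy final detection.
    j = max(range(len(best[-1])), key=best[-1].__getitem__)
    energy = best[-1][j]
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return energy, path
```

Running `link_path` repeatedly, removing the chosen detections each time, would be one way to extract several paths per class; the second-pass temporal trimming is not covered by this sketch.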

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

Oxford Brookes University: RADAR

Pseudorandom number generation with self programmable cellular automata

Authors: Guan SU; Tan SK
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2004

In this paper, we propose a new class of cellular automata, self programming cellular automata (SPCA), with specific application to pseudorandom number generation. By changing a cell's state transition rules in relation to factors such as its neighboring cells' states, behavioral complexity can be increased and utilized. Interplay between the state transition neighborhood and rule selection neighborhood leads to a new composite neighborhood and state transition rule that is the linear combination of two different mappings with different temporal dependencies. It is proved that when the transitional matrices for both the state transition and rule selection neighborhood are non-singular, SPCA will not exhibit non-group behavior. Good performance can be obtained using simple neighborhoods with certain CA lengths, transition rules, etc. Certain configurations of SPCA pass all DIEHARD and ENT tests with an implementation cost lower than currently reported work. Output sampling methods are also suggested to improve output efficiency by sampling the outputs of the new rule selection neighborhoods.
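The general idea of a cell choosing its own transition rule from neighboring states can be sketched as follows. This is a toy illustration, not the SPCA construction from the paper: the choice between elementary rules 90 and 150, the `select_offset` parameter, the function names, and the single-cell output tap are all assumptions, and no claim is made about the statistical quality of the output.

```python
def spca_step(cells, select_offset=2):
    """Advance a 1-D binary cellular automaton with wrap-around by one step.

    Each cell applies rule 150 (left XOR centre XOR right) when the cell
    `select_offset` positions to its right is 1, and rule 90
    (left XOR right) otherwise. This selection scheme is hypothetical.
    """
    n = len(cells)
    nxt = []
    for i in range(n):
        left = cells[(i - 1) % n]
        centre = cells[i]
        right = cells[(i + 1) % n]
        use_150 = cells[(i + select_offset) % n]  # rule selection neighborhood
        nxt.append(left ^ right ^ (centre & use_150))
    return nxt


def generate_bits(seed_cells, count, tap=0):
    """Return `count` output bits by sampling cell `tap` after every step."""
    cells = list(seed_cells)
    out = []
    for _ in range(count):
        cells = spca_step(cells)
        out.append(cells[tap])
    return out
```

For example, `generate_bits([1, 0, 0, 1, 0, 1, 1, 0], 16)` yields 16 bits that are fully determined by the seed, as expected of a pseudorandom generator.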

Brunel University Research Archive

Dense Piecewise Planar RGB-D SLAM for Indoor Environments

Authors: Kosecka Jana; Le Phi-Hung
Publication date: 01/08/2017

The paper exploits weak Manhattan constraints to parse the structure of indoor environments from RGB-D video sequences in an online setting. We extend the previous approach for single view parsing of indoor scenes to video sequences and formulate the problem of recovering the floor plan of the environment as an optimal labeling problem solved using dynamic programming. Temporal continuity is enforced in a recursive setting, where the labeling from previous frames is used as a prior term in the objective function. In addition to recovery of the piecewise planar weak Manhattan structure of the extended environment, the orthogonality constraints are also exploited by visual odometry and pose graph optimization. This yields reliable estimates in the presence of large motions and in the absence of distinctive features to track. We evaluate our method on several challenging indoor sequences, demonstrating accurate SLAM and dense mapping of low texture environments. On the existing TUM benchmark we achieve results competitive with the alternative approaches, which fail in our environments.
Comment: International Conference on Intelligent Robots and Systems (IROS) 201

arXiv.org e-Print Archive

Crossref