Special Issue on dynamic textures in video
Cataloged from PDF version of article
Learning a perspective-embedded deconvolution network for crowd counting
© 2017 IEEE. We present a novel deep learning framework for crowd counting by learning a perspective-embedded deconvolution network. Perspective is an inherent property of most surveillance scenes. Unlike traditional approaches that exploit perspective as a separate normalization step, we propose to fuse the perspective into a deconvolution network, aiming to obtain a robust, accurate and consistent crowd density map. Through layer-wise fusion, we merge perspective maps at different resolutions into the deconvolution network. With the injection of perspective, our network is driven to learn to combine the underlying scene geometric constraints adaptively, thus enabling an accurate interpretation from high-level feature maps to the pixel-wise crowd density map. In addition, our network allows generating a density map for arbitrary-sized input in an end-to-end fashion. The proposed method achieves competitive results on the WorldExpo2010 crowd dataset.
A new approach for in-vehicle camera traffic sign detection and recognition
In this paper we discuss the theoretical foundations and a practical realization of a circular traffic sign detection and recognition system operating on board a vehicle. To initially detect sign candidates in the scene, we utilize the circular Hough transform with appropriate post-processing in the vote space. The track of an already established candidate is maintained using a function that encodes the relationship between a unique feature representation of the target object and the affine transformation it is subject to. This function is learned on-the-fly via regression from random distortions applied to the last stable image of the sign. Finally, we adopt a novel AdaBoost algorithm to learn a sign similarity measure from example image pairs labeled either "same" or "different". This enables construction of an efficient multi-class classifier. A prototype implementation has been evaluated on a video captured in crowded street scenes. Good detection and recognition performance was achieved for a 14-class problem, which reveals the high potential of our approach.
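The candidate-detection step can be illustrated with a minimal sketch of fixed-radius circular Hough voting. This is not the authors' implementation: the function names (`ring_offsets`, `hough_circle_votes`, `best_centre`), the single known radius, and the half-pixel tolerance are all assumptions made for illustration, and a plain arg-max stands in for the vote-space post-processing that the abstract mentions but does not specify.

```python
from collections import Counter


def ring_offsets(radius, tolerance=0.5):
    """Integer offsets whose distance from the origin is within `tolerance` of `radius`."""
    reach = int(radius + tolerance) + 1
    return [
        (dx, dy)
        for dx in range(-reach, reach + 1)
        for dy in range(-reach, reach + 1)
        if abs((dx * dx + dy * dy) ** 0.5 - radius) < tolerance
    ]


def hough_circle_votes(edge_pixels, radius):
    """Each edge pixel votes for every centre that would place it on a circle of `radius`."""
    votes = Counter()
    offsets = ring_offsets(radius)
    for x, y in edge_pixels:
        for dx, dy in offsets:
            votes[(x - dx, y - dy)] += 1
    return votes


def best_centre(edge_pixels, radius):
    """Hypothetical stand-in for vote-space post-processing: take the strongest peak."""
    votes = hough_circle_votes(edge_pixels, radius)
    return max(votes, key=votes.get)


if __name__ == "__main__":
    # Synthetic edge map: a rasterized circle of radius 5 centred at (20, 15).
    edges = [(20 + dx, 15 + dy) for dx, dy in ring_offsets(5)]
    print(best_centre(edges, 5))
```

A practical detector would also sweep over a range of radii and suppress nearby secondary peaks; both are omitted here to keep the voting idea visible.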
Unsupervised Activity Extraction on Long-Term Video Recordings employing Soft Computing Relations
In this work we present a novel approach for activity extraction and knowledge discovery from video employing fuzzy relations. Spatial and temporal properties of detected mobile objects are modeled with fuzzy relations. These can then be aggregated employing typical soft-computing algebra. A clustering algorithm based on calculating the transitive closure of the fuzzy relations allows finding spatio-temporal patterns of activity. We present results obtained on videos corresponding to different sequences of apron monitoring at Toulouse airport in France.
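The clustering idea can be sketched as follows, assuming the standard max-min composition for fuzzy relations and an alpha-cut to read clusters off the closure. The abstract names neither choice, so both are assumptions, as are the function names and the toy relation matrix.

```python
def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relations given as nested lists."""
    n = len(r)
    return [
        [max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(relation):
    """Repeatedly merge the relation with its self-composition until nothing changes."""
    closure = [row[:] for row in relation]
    while True:
        composed = max_min_compose(closure, closure)
        merged = [
            [max(a, b) for a, b in zip(row_c, row_m)]
            for row_c, row_m in zip(closure, composed)
        ]
        if merged == closure:
            return closure
        closure = merged


def alpha_cut_clusters(closure, alpha):
    """Group items whose closure membership is at least `alpha`.

    Assumes the input relation was reflexive and symmetric, so that the
    alpha-cut of its closure is an equivalence relation.
    """
    clusters, assigned = [], set()
    for i in range(len(closure)):
        if i in assigned:
            continue
        members = [j for j in range(len(closure)) if closure[i][j] >= alpha]
        assigned.update(members)
        clusters.append(members)
    return clusters


if __name__ == "__main__":
    # Toy similarity between four detected objects (illustrative values only).
    similarity = [
        [1.0, 0.8, 0.0, 0.0],
        [0.8, 1.0, 0.6, 0.0],
        [0.0, 0.6, 1.0, 0.1],
        [0.0, 0.0, 0.1, 1.0],
    ]
    closure = transitive_closure(similarity)
    print(alpha_cut_clusters(closure, 0.5))
```

Raising `alpha` splits the clusters more finely, which gives a simple way to inspect activity patterns at several levels of granularity.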
A robust tracking system for low frame rate video
Tracking in low frame rate (LFR) videos is one of the most important problems in the tracking literature. Most existing approaches treat LFR video tracking as an abrupt motion tracking problem. However, in LFR video tracking applications, LFR causes not only abrupt motions but also large appearance changes of objects, because the objects' poses and the illumination may undergo large changes from one frame to the next. This adds extra difficulties to LFR video tracking. In this paper, we propose a robust and general tracking system for LFR videos. The tracking system consists of four major parts: dominant color-spatial based object representation, bin-ratio based similarity measure, annealed particle swarm optimization (PSO) based searching, and an integral image based parameter calculation. The first two parts are combined to provide a good solution to the appearance changes, and the abrupt motion is effectively captured by the annealed PSO based searching. Moreover, an integral image of model parameters is constructed, which provides a look-up table for parameter calculation. This greatly reduces the computational load. Experimental results demonstrate that the proposed tracking system can effectively tackle the difficulties caused by LFR.
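The look-up-table idea behind an integral image can be shown with a generic summed-area table. The sketch below assumes a single scalar value per pixel; the paper's integral image holds model parameters whose exact form the abstract does not give, so the function names and the toy grid are purely illustrative.

```python
def integral_image(grid):
    """Summed-area table with one extra zero row and column for easy indexing."""
    rows, cols = len(grid), len(grid[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for y in range(rows):
        row_sum = 0
        for x in range(cols):
            row_sum += grid[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def box_sum(table, top, left, bottom, right):
    """Sum over rows top..bottom-1 and columns left..right-1 using four look-ups."""
    return (
        table[bottom][right]
        - table[top][right]
        - table[bottom][left]
        + table[top][left]
    )


if __name__ == "__main__":
    grid = [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
    ]
    table = integral_image(grid)
    print(box_sum(table, 1, 1, 3, 3))  # sum of the lower-right 2x2 block
```

After the table is built once per frame, every candidate region examined during the search costs a constant number of look-ups regardless of its size, which is the source of the reduced computational load.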
The visual object tracking VOT2016 challenge results
The Visual Object Tracking challenge VOT2016 aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 70 trackers are presented, with a large number of trackers being published at major computer vision conferences and journals in recent years. The number of tested state-of-the-art trackers makes VOT2016 the largest and most challenging benchmark on short-term tracking to date. For each participating tracker, a short description is provided in the Appendix. VOT2016 goes beyond its predecessors by (i) introducing a new semi-automatic ground truth bounding box annotation methodology and (ii) extending the evaluation system with the no-reset experiment. The dataset, the evaluation kit and the results are publicly available at the challenge website (http://votchallenge.net).