
    Advantages of dynamic analysis in HOG-PCA feature space for video moving object classification

    Classification of moving objects for video surveillance applications remains a challenging problem because video conditions such as lighting and resolution change inherently. This paper proposes a new approach for vehicle/pedestrian object classification based on learning a static kNN classifier and a dynamic Hidden Markov Model (HMM)-based classifier, and on defining a fusion rule that combines the two outputs. The main novelty is the study of the dynamic aspects of the moving objects by analysing the trajectories that the features follow in the HOG-PCA feature space, instead of the classical trajectory study based on frame coordinates. The complete hybrid system was tested on the VIRAT database and worked in real time, yielding a peak accuracy of up to 100% on the tested video sequences.
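The abstract does not state which fusion rule is used to combine the static and dynamic classifiers. As a minimal sketch only, the following assumes a convex combination of per-class scores; the function names, the `weight` parameter, and the example scores are all hypothetical and are not taken from the paper.

```python
from typing import Dict


def fuse_scores(static_scores: Dict[str, float],
                dynamic_scores: Dict[str, float],
                weight: float = 0.5) -> Dict[str, float]:
    """Combine two per-class score dictionaries (hypothetical fusion rule).

    `weight` is the share given to the static (kNN-style) classifier;
    the dynamic (HMM-style) classifier receives 1 - weight.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if static_scores.keys() != dynamic_scores.keys():
        raise ValueError("both classifiers must score the same classes")

    fused = {
        label: weight * static_scores[label]
        + (1.0 - weight) * dynamic_scores[label]
        for label in static_scores
    }
    total = sum(fused.values())
    if total <= 0.0:
        raise ValueError("scores must sum to a positive value")
    # Renormalise so the fused scores sum to one.
    return {label: value / total for label, value in fused.items()}


def classify(fused: Dict[str, float]) -> str:
    """Return the label with the highest fused score."""
    return max(fused, key=fused.get)


if __name__ == "__main__":
    static = {"vehicle": 0.6, "pedestrian": 0.4}    # made-up kNN output
    dynamic = {"vehicle": 0.2, "pedestrian": 0.8}   # made-up HMM output
    print(classify(fuse_scores(static, dynamic, weight=0.5)))
```

With equal weights, a confident dynamic classifier can overrule a weakly confident static one; other rules (product, maximum, learned weights) would fit the same interface.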

    Unsupervised Understanding of Location and Illumination Changes in Egocentric Videos

    Wearable cameras stand out as one of the most promising devices for the upcoming years, and as a consequence, the demand for computer algorithms that automatically understand the videos recorded with them is increasing quickly. Automatic understanding of these videos is not an easy task, and their mobile nature poses important challenges, such as changing light conditions and unrestricted recording locations. This paper proposes an unsupervised strategy based on global features and manifold learning to endow wearable cameras with contextual information regarding the light conditions and the location captured. Results show that non-linear manifold methods can capture contextual patterns from global features without requiring large computational resources. As an application case, the proposed strategy is used as a switching mechanism to improve hand detection in egocentric videos. Comment: Submitted for publication.

    Assessing the Performance of Handcrafted Features for Human action Recognition

    Recognition of human actions such as running, punching, bending and kicking plays a vital role in futuristic applications like intelligent video surveillance, health care monitoring, robotics, smart automation systems and computer gaming. This field relies on various approaches based on handcrafted features like PCA, HOG, LBPH, DWT, STIP, SWF and SWFHOG, and on deep learning techniques like CNN, RNN and their variants. Though many approaches have been proposed and implemented by researchers, the literature survey suggests that a detailed understanding of the approaches and a comparison of their advantages and limitations is required to develop more accurate action recognition methods. This paper focuses on this issue and gives a detailed analysis of results obtained by implementing algorithms on standardized open source datasets of varying complexity, namely Weizmann, KTH, UT Interaction and UCF Sports. The results are compared based on classification accuracy, as it is one of the performance measures for checking the reliability of a method. The comparison shows that the SHFHOG feature gives the best classification accuracy compared to the other handcrafted features and also outperforms the simple CNN.

    A robust and efficient video representation for action recognition

    This paper introduces a state-of-the-art video representation and applies it to efficient action recognition and detection. We first propose to improve the popular dense trajectory features by explicit camera motion estimation. More specifically, we extract feature point matches between frames using SURF descriptors and dense optical flow. The matches are used to estimate a homography with RANSAC. To improve the robustness of homography estimation, a human detector is employed to remove outlier matches from the human body, as human motion is not constrained by the camera. Trajectories consistent with the homography are attributed to camera motion and thus removed. We also use the homography to cancel out camera motion from the optical flow. This results in significant improvement on motion-based HOF and MBH descriptors. We further explore the recent Fisher vector as an alternative feature encoding approach to the standard bag-of-words histogram, and consider different ways to include spatial layout information in these encodings. We present a large and varied set of evaluations, considering (i) classification of short basic actions on six datasets, (ii) localization of such actions in feature-length movies, and (iii) large-scale recognition of complex events. We find that our improved trajectory features significantly outperform previous dense trajectories, and that Fisher vectors are superior to bag-of-words encodings for video recognition tasks. In all three tasks, we show substantial improvements over the state-of-the-art results.
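The idea of cancelling camera motion from the optical flow can be illustrated with a small sketch. It assumes the homography has already been estimated (for example with RANSAC, which is not reproduced here) and simply subtracts the displacement that the homography predicts at each point from the measured flow vector. This per-point subtraction is a simplification chosen for illustration and is not claimed to be the paper's exact procedure; all names below are hypothetical.

```python
from typing import List, Tuple

Matrix3 = List[List[float]]
Point = Tuple[float, float]


def apply_homography(H: Matrix3, p: Point) -> Point:
    """Map a point through a 3x3 homography using homogeneous coordinates."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return (
        (H[0][0] * x + H[0][1] * y + H[0][2]) / w,
        (H[1][0] * x + H[1][1] * y + H[1][2]) / w,
    )


def compensate_flow(H: Matrix3, p: Point, flow: Point) -> Point:
    """Subtract the camera-induced displacement at p from the measured flow."""
    px, py = apply_homography(H, p)
    camera_flow = (px - p[0], py - p[1])
    return (flow[0] - camera_flow[0], flow[1] - camera_flow[1])


def is_camera_motion(H: Matrix3, p: Point, flow: Point,
                     threshold: float = 1.0) -> bool:
    """Flag a displacement as camera motion if its residual is small.

    `threshold` (in pixels) is an assumed value for illustration.
    """
    rx, ry = compensate_flow(H, p, flow)
    return (rx * rx + ry * ry) ** 0.5 < threshold


if __name__ == "__main__":
    # A pure translation of the camera view by (2, -1) pixels.
    H = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0], [0.0, 0.0, 1.0]]
    print(compensate_flow(H, (10.0, 5.0), (5.0, 0.0)))
```

A background point that moves exactly as the homography predicts ends up with a zero residual, while an independently moving point keeps its own motion after compensation.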

    Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields

    This work presents a first evaluation of using spatio-temporal receptive fields from a recently proposed time-causal spatio-temporal scale-space framework as primitives for video analysis. We propose a new family of video descriptors based on regional statistics of spatio-temporal receptive field responses and evaluate this approach on the problem of dynamic texture recognition. Our approach generalises a previously used method, based on joint histograms of receptive field responses, from the spatial to the spatio-temporal domain and from object recognition to dynamic texture recognition. The time-recursive formulation enables computationally efficient time-causal recognition. The experimental evaluation demonstrates competitive performance compared to the state of the art. In particular, it is shown that binary versions of our dynamic texture descriptors achieve improved performance compared to a large range of similar methods using different primitives, either handcrafted or learned from data. Further, our qualitative and quantitative investigation into parameter choices and the use of different sets of receptive fields highlights the robustness and flexibility of our approach. Together, these results support the descriptive power of this family of time-causal spatio-temporal receptive fields, validate our approach for dynamic texture recognition and point towards the possibility of designing a range of video analysis methods based on these new time-causal spatio-temporal primitives. Comment: 29 pages, 16 figures.
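To illustrate what a time-causal and time-recursive formulation can look like, the sketch below smooths a one-dimensional temporal signal with a cascade of first-order recursive filters. The abstract does not give the filter equations, so the update rule and the time constants `mus` are assumptions made for illustration. Each output sample depends only on the current input and a small fixed state, never on future samples.

```python
from typing import Iterable, List, Sequence


def recursive_smooth(signal: Iterable[float],
                     mus: Sequence[float]) -> List[float]:
    """Smooth a signal with cascaded first-order recursive filters.

    Each stage k keeps one state value and updates it as
        state += (input - state) / (1 + mu_k)
    so memory use is constant in the signal length. The `mus` values
    are hypothetical time constants (mu_k >= 0).
    """
    if any(mu < 0.0 for mu in mus):
        raise ValueError("time constants must be non-negative")

    states = [0.0] * len(mus)
    output = []
    for sample in signal:
        value = sample
        for k, mu in enumerate(mus):
            # The output of stage k feeds stage k + 1.
            states[k] += (value - states[k]) / (1.0 + mu)
            value = states[k]
        output.append(value)
    return output


if __name__ == "__main__":
    impulse = [1.0] + [0.0] * 9
    print(recursive_smooth(impulse, mus=[1.0, 0.5]))
```

Because only the per-stage state is stored, the filter can run on a live video stream frame by frame, which is the practical appeal of a time-recursive design.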