A Survey on Content-Aware Video Analysis for Sports
Sports data analysis is becoming increasingly large-scale, diversified, and
shared, but difficulty persists in rapidly accessing the most crucial
information. Previous surveys have focused on the methodologies of sports video
analysis from the spatiotemporal viewpoint instead of a content-based
viewpoint, and few of these studies have considered semantics. This study
develops a deeper interpretation of content-aware sports video analysis by
examining the insight offered by research into the structure of content under
different scenarios. On the basis of this insight, we provide an overview of
the themes particularly relevant to the research on content-aware systems for
broadcast sports. Specifically, we focus on the video content analysis
techniques applied in sportscasts over the past decade from the perspectives of
fundamentals and general review, a content hierarchical model, and trends and
challenges. Content-aware analysis methods are discussed with respect to
object-, event-, and context-oriented groups. In each group, the gap between
sensation and content excitement must be bridged using proper strategies. In
this regard, a content-aware approach is required to determine user demands.
Finally, the paper summarizes the future trends and challenges for sports video
analysis. We believe that our findings can advance the field of research on
content-aware video analysis for broadcast sports.
Comment: Accepted for publication in IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
A Survey of Activity Recognition and Behaviour Understanding in Video Surveillance
This paper presents a review of human activity recognition and behaviour
understanding in video sequences. The key objective of this paper is to provide
a general review of the overall process of a surveillance system as used in
current practice. Visual surveillance systems are directed at the automatic
identification of events of interest, especially the tracking and
classification of moving objects. The processing pipeline of a video
surveillance system includes the following stages: surrounding modelling,
object representation, object tracking, activity recognition, and behaviour
understanding. The paper describes techniques used to define a general set of
activities that are applicable to a wide range of scenes and environments.
Comment: 14 pages, 5 figures, 5 tables
Advances in Human Action Recognition: A Survey
Human action recognition has been an important topic in computer vision due
to its many applications such as video surveillance, human machine interaction
and video retrieval. One core problem behind these applications is
automatically recognizing low-level actions and high-level activities of
interest. The former is usually the basis for the latter. This survey gives an
overview of the most recent advances in human action recognition during the
past several years, following a well-formed taxonomy proposed by a previous
survey. From this state-of-the-art survey, researchers can gain a panoramic
view of progress in this area to inform future research.
Survey on RGB, 3D, Thermal, and Multimodal Approaches for Facial Expression Recognition: History, Trends, and Affect-related Applications
Facial expressions are an important way through which humans interact
socially. Building a system capable of automatically recognizing facial
expressions from images and video has been an intense field of study in recent
years. Interpreting such expressions remains challenging and much research is
needed about the way they relate to human affect. This paper presents a general
overview of automatic RGB, 3D, thermal and multimodal facial expression
analysis. We define a new taxonomy for the field, encompassing all steps from
face detection to facial expression recognition, and describe and classify the
state of the art methods accordingly. We also present the important datasets
and the benchmarking of the most influential methods. We conclude with a
general discussion about trends, important questions, and future lines of
research.
Fast and Robust Dynamic Hand Gesture Recognition via Key Frames Extraction and Feature Fusion
Gesture recognition is a hot topic in computer vision and pattern
recognition, and plays a vital role in natural human-computer interfaces.
Although great progress has been made recently, fast and robust hand gesture
recognition remains an open problem, since existing methods have not balanced
performance and efficiency well. To bridge this gap, this work combines image
entropy and density clustering to extract key frames from hand gesture videos
for further feature extraction, which improves the efficiency of recognition.
Moreover, a feature fusion strategy is also proposed to further improve the
feature representation, which elevates the performance of recognition. To
validate our approach in a "wild" environment, we also introduce two new
datasets called HandGesture and Action3D. Experiments consistently demonstrate
that our strategy achieves competitive results on the Northwestern University,
Cambridge, HandGesture, and Action3D hand gesture datasets. Our code and
datasets will be released at
https://github.com/Ha0Tang/HandGestureRecognition.
Comment: 11 pages, 3 figures, accepted to Neurocomputing
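The entropy-driven key-frame selection this abstract describes can be sketched as follows. This is a minimal stand-in, assuming grayscale frames as NumPy arrays; the paper's density-clustering step is simplified here to a plain top-k pick by entropy, so it illustrates the idea rather than the authors' actual algorithm.

```python
import numpy as np

def frame_entropy(frame: np.ndarray, bins: int = 256) -> float:
    """Shannon entropy of a grayscale frame's intensity histogram."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256), density=True)
    p = hist[hist > 0]
    return float(-np.sum(p * np.log2(p)))

def select_key_frames(frames: list, k: int = 3) -> list:
    """Pick the k most information-rich frames by entropy score.

    A crude stand-in for the paper's entropy + density-clustering
    selection: high-entropy frames are treated as key frames.
    """
    scores = np.array([frame_entropy(f) for f in frames])
    return sorted(np.argsort(scores)[-k:].tolist())
```

A uniform frame scores zero entropy, so textured, information-rich frames are always preferred over flat ones.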
Dynamic Matrix Decomposition for Action Recognition
Designing a technique for the automatic analysis of different actions in
videos, in order to detect the presence of activities of interest, is of high
significance nowadays. In this paper, we explore a robust and dynamic
appearance technique for identifying different action activities. We also
exploit a low-rank and structured sparse matrix decomposition (LSMD) method to
better model these activities. Our method is effective in encoding localized
spatio-temporal features, which enables the analysis of local motion taking
place in the video. Our proposed model uses adjacent frame differences as
input, thereby forcing it to capture the changes occurring in the video. The
performance of our model is tested on a benchmark dataset in terms of
detection accuracy. The results show the promising capability of our model in
detecting action activities.
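The core pipeline of this abstract, adjacent-frame differences split into a low-rank part and a sparse residual, can be sketched with a plain truncated SVD. This is a crude stand-in for the structured LSMD optimization the paper actually uses: the low-rank factor absorbs slow appearance changes, and the residual keeps the localized motion.

```python
import numpy as np

def lowrank_sparse_split(frames: np.ndarray, rank: int = 1):
    """Split adjacent-frame differences into low-rank + residual parts.

    Columns of D are vectorized frame differences; L is the best
    rank-`rank` approximation of D (slow appearance changes) and
    S = D - L keeps the localized motion. A simple SVD-based stand-in
    for the paper's LSMD decomposition, not the method itself.
    """
    diffs = np.diff(frames.astype(float), axis=0)   # (T-1, H, W)
    D = diffs.reshape(diffs.shape[0], -1).T         # pixels x (T-1)
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    S = D - L
    return L, S
```

Feeding differences rather than raw frames means a static background cancels out before the decomposition even runs, which is exactly why the model is forced to encode change.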
Computational models of attention
This chapter reviews recent computational models of visual attention. We
begin with models for the bottom-up or stimulus-driven guidance of attention to
salient visual items, which we examine in seven different broad categories. We
then examine more complex models that address the top-down or goal-oriented
guidance of attention towards items that are more relevant to the task at
hand.
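The bottom-up, stimulus-driven models this chapter reviews typically score each location by how much it differs from its surround. The toy map below is a minimal sketch of that center-surround idea, assuming a single-channel image and using block averaging as the "surround"; it is not any specific model from the chapter's seven categories.

```python
import numpy as np

def bottom_up_saliency(img: np.ndarray, scale: int = 4) -> np.ndarray:
    """Toy center-surround saliency map.

    The 'surround' is a coarse block-averaged copy of the image, and
    saliency is the absolute center-surround difference, so isolated
    bright or dark items pop out against a uniform background.
    """
    h, w = img.shape
    img = img.astype(float)
    # Coarse surround: mean over non-overlapping scale x scale blocks.
    coarse = img[:h - h % scale, :w - w % scale] \
        .reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))
    surround = np.repeat(np.repeat(coarse, scale, axis=0), scale, axis=1)
    center = img[:surround.shape[0], :surround.shape[1]]
    return np.abs(center - surround)
```

A uniform image yields an all-zero map, while a single bright pixel produces the strongest response at its own location, the "pop-out" behaviour bottom-up models are built to capture.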
Spatiotemporal Knowledge Distillation for Efficient Estimation of Aerial Video Saliency
The performance of video saliency estimation techniques has achieved
significant advances along with the rapid development of Convolutional Neural
Networks (CNNs). However, devices like cameras and drones may have limited
computational capability and storage space so that the direct deployment of
complex deep saliency models becomes infeasible. To address this problem, this
paper proposes a dynamic saliency estimation approach for aerial videos via
spatiotemporal knowledge distillation. In this approach, five components are
involved, including two teachers, two students and the desired spatiotemporal
model. The knowledge of spatial and temporal saliency is first separately
transferred from the two complex and redundant teachers to their simple and
compact students, and the input scenes are also degraded from high-resolution
to low-resolution to remove the probable data redundancy so as to greatly speed
up the feature extraction process. After that, the desired spatiotemporal model
is further trained by distilling and encoding the spatial and temporal saliency
knowledge of two students into a unified network. In this manner, the
inter-model redundancy can be further removed for the effective estimation of
dynamic saliency on aerial videos. Experimental results show that the proposed
approach outperforms ten state-of-the-art models in estimating visual saliency
on aerial videos, while its speed reaches up to 28,738 FPS on a GPU platform.
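The distillation step the abstract describes, compact students trained to reproduce the saliency maps of large teachers and then fused into one spatiotemporal model, can be illustrated with a toy objective. This is an assumption-laden sketch: the blend weight `alpha` and the plain MSE loss are placeholders, not the paper's actual training losses.

```python
import numpy as np

def distill_loss(student_map, spatial_teacher, temporal_teacher, alpha=0.5):
    """Toy distillation objective for a unified spatiotemporal student.

    The student's saliency map is pushed toward a blend of the spatial
    and temporal teachers' maps; `alpha` (a placeholder, not from the
    paper) weights the two sources of distilled knowledge.
    """
    target = alpha * spatial_teacher + (1 - alpha) * temporal_teacher
    return float(np.mean((student_map - target) ** 2))
```

Minimizing this drives one compact network to encode both teachers' knowledge, which is what lets the inter-model redundancy be removed at inference time.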
Enabling Edge Cloud Intelligence for Activity Learning in Smart Home
We propose a novel activity learning framework based on Edge Cloud
architecture for the purpose of recognizing and predicting human activities.
Although activity recognition has been vastly studied by many researchers, the
temporal features that constitute an activity, which can provide useful
insights for activity models, have not been exploited to their full potential
by mining algorithms. In this paper, we utilize temporal features for activity
recognition and prediction in a single smart-home setting. We discover
activity patterns and temporal relations, such as the order of activities,
from real data to develop a prompting system. Analysis of real data collected
from smart homes was used to validate the proposed method.
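Mining the order of activities for a prompting system, as this abstract describes, can be sketched with a first-order transition count over the event log. The activity names and the "most frequent successor" rule below are illustrative assumptions, not the paper's mining algorithm.

```python
from collections import Counter

def next_activity_model(log):
    """Learn first-order activity transitions from a smart-home log.

    Counts consecutive (activity, next_activity) pairs; the returned
    predictor proposes the most frequent successor, the kind of
    temporal-order relation a prompting system could act on.
    """
    trans = Counter(zip(log, log[1:]))

    def predict(activity):
        candidates = {b: c for (a, b), c in trans.items() if a == activity}
        return max(candidates, key=candidates.get) if candidates else None

    return predict
```

Richer temporal features (time of day, durations, gaps between events) would slot in as extra keys on the counted pairs; the first-order count is just the simplest usable order relation.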
Human Action Recognition and Prediction: A Survey
Derived from rapid advances in computer vision and machine learning, video
analysis tasks have been moving from inferring the present state to predicting
the future state. Vision-based action recognition and prediction from videos
are such tasks, where action recognition is to infer human actions (present
state) based upon complete action executions, and action prediction to predict
human actions (future state) based upon incomplete action executions. These two
tasks have become particularly prevalent topics recently because of their
explosively emerging real-world applications, such as visual surveillance,
autonomous driving vehicle, entertainment, and video retrieval, etc. Many
attempts have been devoted in the last a few decades in order to build a robust
and effective framework for action recognition and prediction. In this paper,
we survey the complete state-of-the-art techniques in the action recognition
and prediction. Existing models, popular algorithms, technical difficulties,
popular action databases, evaluation protocols, and promising future directions
are also provided with systematic discussions