A Survey on Content-Aware Video Analysis for Sports
Sports data analysis is becoming increasingly large-scale, diversified, and
shared, but difficulty persists in rapidly accessing the most crucial
information. Previous surveys have focused on the methodologies of sports video
analysis from the spatiotemporal viewpoint instead of a content-based
viewpoint, and few of these studies have considered semantics. This study
develops a deeper interpretation of content-aware sports video analysis by
examining the insight offered by research into the structure of content under
different scenarios. On the basis of this insight, we provide an overview of
the themes particularly relevant to the research on content-aware systems for
broadcast sports. Specifically, we focus on the video content analysis
techniques applied in sportscasts over the past decade from the perspectives of
fundamentals and general review, a content hierarchical model, and trends and
challenges. Content-aware analysis methods are discussed with respect to
object-, event-, and context-oriented groups. In each group, the gap between
sensation and content excitement must be bridged using proper strategies. In
this regard, a content-aware approach is required to determine user demands.
Finally, the paper summarizes the future trends and challenges for sports video
analysis. We believe that our findings can advance the field of research on
content-aware video analysis for broadcast sports.
Comment: Accepted for publication in IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
A Survey of Activity Recognition and Behaviour Understanding in Video Surveillance
This paper presents a review of human activity recognition and behaviour
understanding in video sequences. The key objective of this paper is to provide
a general review of the overall process of a surveillance system as used in
current practice. Visual surveillance systems are directed at the automatic
identification of events of interest, especially the tracking and
classification of moving objects. The processing pipeline of a video
surveillance system includes the following stages: surrounding modelling,
object representation, object tracking, activity recognition, and behaviour
understanding. The paper describes techniques used to define a general set of
activities that are applicable to a wide range of scenes and environments.
Comment: 14 pages, 5 figures, 5 tables
Advances in Human Action Recognition: A Survey
Human action recognition has been an important topic in computer vision due
to its many applications such as video surveillance, human machine interaction
and video retrieval. One core problem behind these applications is
automatically recognizing low-level actions and high-level activities of
interest. The former is usually the basis for the latter. This survey gives an
overview of the most recent advances in human action recognition during the
past several years, following a well-formed taxonomy proposed by a previous
survey. From this state-of-the-art survey, researchers can gain a panoramic
view of progress in this area to inform future research.
Survey on RGB, 3D, Thermal, and Multimodal Approaches for Facial Expression Recognition: History, Trends, and Affect-related Applications
Facial expressions are an important way through which humans interact
socially. Building a system capable of automatically recognizing facial
expressions from images and video has been an intense field of study in recent
years. Interpreting such expressions remains challenging and much research is
needed about the way they relate to human affect. This paper presents a general
overview of automatic RGB, 3D, thermal and multimodal facial expression
analysis. We define a new taxonomy for the field, encompassing all steps from
face detection to facial expression recognition, and describe and classify the
state of the art methods accordingly. We also present the important datasets
and the benchmarking of the most influential methods. We conclude with a
general discussion about trends, important questions, and future lines of
research.
Fast and Robust Dynamic Hand Gesture Recognition via Key Frames Extraction and Feature Fusion
Gesture recognition is a hot topic in computer vision and pattern
recognition, and plays a vital role in natural human-computer interfaces.
Although great progress has been made recently, fast and robust hand gesture
recognition remains an open problem, since existing methods have not balanced
performance and efficiency well. To bridge this gap, this work combines image
entropy and density clustering to extract key frames from hand gesture videos
for further feature extraction, which improves the efficiency of recognition.
Moreover, a feature fusion strategy is also proposed to further improve the
feature representation, which elevates the performance of recognition. To
validate our approach in a "wild" environment, we also introduce two new
datasets called HandGesture and Action3D. Experiments consistently demonstrate
that our strategy achieves competitive results on the Northwestern University,
Cambridge, HandGesture, and Action3D hand gesture datasets. Our code and
datasets will be released at
https://github.com/Ha0Tang/HandGestureRecognition.
Comment: 11 pages, 3 figures, accepted to Neurocomputing
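The entropy-driven key-frame selection this abstract describes can be sketched as follows. This is a minimal stand-in, assuming grayscale frames as NumPy arrays; the paper's density-clustering step is simplified here to a plain top-k pick by entropy, so it illustrates the idea rather than the authors' actual algorithm.

```python
import numpy as np

def frame_entropy(frame: np.ndarray, bins: int = 256) -> float:
    """Shannon entropy of a grayscale frame's intensity histogram."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256), density=True)
    p = hist[hist > 0]
    return float(-np.sum(p * np.log2(p)))

def select_key_frames(frames: list, k: int = 3) -> list:
    """Pick the k most information-rich frames by entropy score.

    A crude stand-in for the paper's entropy + density-clustering
    selection: high-entropy frames are treated as key frames.
    """
    scores = np.array([frame_entropy(f) for f in frames])
    return sorted(np.argsort(scores)[-k:].tolist())
```

A uniform frame scores zero entropy, so textured, information-rich frames are always preferred over flat ones.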
Dynamic Matrix Decomposition for Action Recognition
Designing a technique for the automatic analysis of different actions in
videos, in order to detect the presence of activities of interest, is of high
significance nowadays. In this paper, we explore a robust and dynamic
appearance technique for identifying different action activities. We also
exploit a low-rank and structured sparse matrix decomposition (LSMD) method to
better model these activities. Our method is effective in encoding localized
spatio-temporal features, which enables the analysis of local motion taking
place in the video. Our proposed model uses adjacent frame differences as
input, thereby forcing it to capture the changes occurring in the video. The
performance of our model is tested on a benchmark dataset in terms of
detection accuracy. The results show the promising capability of our model in
detecting action activities.
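The core pipeline of this abstract, adjacent-frame differences split into a low-rank part and a sparse residual, can be sketched with a plain truncated SVD. This is a crude stand-in for the structured LSMD optimization the paper actually uses: the low-rank factor absorbs slow appearance changes, and the residual keeps the localized motion.

```python
import numpy as np

def lowrank_sparse_split(frames: np.ndarray, rank: int = 1):
    """Split adjacent-frame differences into low-rank + residual parts.

    Columns of D are vectorized frame differences; L is the best
    rank-`rank` approximation of D (slow appearance changes) and
    S = D - L keeps the localized motion. A simple SVD-based stand-in
    for the paper's LSMD decomposition, not the method itself.
    """
    diffs = np.diff(frames.astype(float), axis=0)   # (T-1, H, W)
    D = diffs.reshape(diffs.shape[0], -1).T         # pixels x (T-1)
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    S = D - L
    return L, S
```

Feeding differences rather than raw frames means a static background cancels out before the decomposition even runs, which is exactly why the model is forced to encode change.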
Computational models of attention
This chapter reviews recent computational models of visual attention. We
begin with models for the bottom-up or stimulus-driven guidance of attention to
salient visual items, which we examine in seven different broad categories. We
then examine more complex models that address the top-down or goal-oriented
guidance of attention towards items that are more relevant to the task at
hand.
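The bottom-up, stimulus-driven models this chapter reviews typically score each location by how much it differs from its surround. The toy map below is a minimal sketch of that center-surround idea, assuming a single-channel image and using block averaging as the "surround"; it is not any specific model from the chapter's seven categories.

```python
import numpy as np

def bottom_up_saliency(img: np.ndarray, scale: int = 4) -> np.ndarray:
    """Toy center-surround saliency map.

    The 'surround' is a coarse block-averaged copy of the image, and
    saliency is the absolute center-surround difference, so isolated
    bright or dark items pop out against a uniform background.
    """
    h, w = img.shape
    img = img.astype(float)
    # Coarse surround: mean over non-overlapping scale x scale blocks.
    coarse = img[:h - h % scale, :w - w % scale] \
        .reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))
    surround = np.repeat(np.repeat(coarse, scale, axis=0), scale, axis=1)
    center = img[:surround.shape[0], :surround.shape[1]]
    return np.abs(center - surround)
```

A uniform image yields an all-zero map, while a single bright pixel produces the strongest response at its own location, the "pop-out" behaviour bottom-up models are built to capture.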
Spatiotemporal Knowledge Distillation for Efficient Estimation of Aerial Video Saliency
The performance of video saliency estimation techniques has achieved
significant advances along with the rapid development of Convolutional Neural
Networks (CNNs). However, devices like cameras and drones may have limited
computational capability and storage space so that the direct deployment of
complex deep saliency models becomes infeasible. To address this problem, this
paper proposes a dynamic saliency estimation approach for aerial videos via
spatiotemporal knowledge distillation. In this approach, five components are
involved, including two teachers, two students and the desired spatiotemporal
model. The knowledge of spatial and temporal saliency is first separately
transferred from the two complex and redundant teachers to their simple and
compact students, and the input scenes are also degraded from high-resolution
to low-resolution to remove the probable data redundancy so as to greatly speed
up the feature extraction process. After that, the desired spatiotemporal model
is further trained by distilling and encoding the spatial and temporal saliency
knowledge of two students into a unified network. In this manner, the
inter-model redundancy can be further removed for the effective estimation of
dynamic saliency on aerial videos. Experimental results show that the proposed
approach outperforms ten state-of-the-art models in estimating visual saliency
on aerial videos, while its speed reaches up to 28,738 FPS on a GPU platform.
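The distillation step the abstract describes, compact students trained to reproduce the saliency maps of large teachers and then fused into one spatiotemporal model, can be illustrated with a toy objective. This is an assumption-laden sketch: the blend weight `alpha` and the plain MSE loss are placeholders, not the paper's actual training losses.

```python
import numpy as np

def distill_loss(student_map, spatial_teacher, temporal_teacher, alpha=0.5):
    """Toy distillation objective for a unified spatiotemporal student.

    The student's saliency map is pushed toward a blend of the spatial
    and temporal teachers' maps; `alpha` (a placeholder, not from the
    paper) weights the two sources of distilled knowledge.
    """
    target = alpha * spatial_teacher + (1 - alpha) * temporal_teacher
    return float(np.mean((student_map - target) ** 2))
```

Minimizing this drives one compact network to encode both teachers' knowledge, which is what lets the inter-model redundancy be removed at inference time.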
Enabling Edge Cloud Intelligence for Activity Learning in Smart Home
We propose a novel activity learning framework based on Edge Cloud
architecture for the purpose of recognizing and predicting human activities.
Although activity recognition has been vastly studied by many researchers, the
temporal features that constitute an activity, which can provide useful
insights for activity models, have not been exploited to their full potential
by mining algorithms. In this paper, we utilize temporal features for activity
recognition and prediction in a single smart-home setting. We discover
activity patterns and temporal relations, such as the order of activities,
from real data to develop a prompting system. Analysis of real data collected
from smart homes was used to validate the proposed method.
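Mining the order of activities for a prompting system, as this abstract describes, can be sketched with a first-order transition count over the event log. The activity names and the "most frequent successor" rule below are illustrative assumptions, not the paper's mining algorithm.

```python
from collections import Counter

def next_activity_model(log):
    """Learn first-order activity transitions from a smart-home log.

    Counts consecutive (activity, next_activity) pairs; the returned
    predictor proposes the most frequent successor, the kind of
    temporal-order relation a prompting system could act on.
    """
    trans = Counter(zip(log, log[1:]))

    def predict(activity):
        candidates = {b: c for (a, b), c in trans.items() if a == activity}
        return max(candidates, key=candidates.get) if candidates else None

    return predict
```

Richer temporal features (time of day, durations, gaps between events) would slot in as extra keys on the counted pairs; the first-order count is just the simplest usable order relation.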
Human Action Recognition and Prediction: A Survey
Derived from rapid advances in computer vision and machine learning, video
analysis tasks have been moving from inferring the present state to predicting
the future state. Vision-based action recognition and prediction from videos
are such tasks, where action recognition is to infer human actions (present
state) based upon complete action executions, and action prediction to predict
human actions (future state) based upon incomplete action executions. These two
tasks have become particularly prevalent topics recently because of their
explosively emerging real-world applications, such as visual surveillance,
autonomous driving vehicle, entertainment, and video retrieval, etc. Many
attempts have been devoted in the last a few decades in order to build a robust
and effective framework for action recognition and prediction. In this paper,
we survey the complete state-of-the-art techniques in the action recognition
and prediction. Existing models, popular algorithms, technical difficulties,
popular action databases, evaluation protocols, and promising future directions
are also provided with systematic discussions