Crowded Scene Analysis: A Survey
Automated scene analysis has been a topic of great interest in computer
vision and cognitive science. Recently, with the growth of crowd phenomena in
the real world, crowded scene analysis has attracted much attention. However,
the visual occlusions and ambiguities in crowded scenes, as well as the complex
behaviors and scene semantics, make the analysis a challenging task. In the
past few years, an increasing number of works on crowded scene analysis have
been reported, covering different aspects including crowd motion pattern
learning, crowd behavior and activity analysis, and anomaly detection in
crowds. This paper surveys the state-of-the-art techniques on this topic. We
first provide the background knowledge and the available features related to
crowded scenes. Then, existing models, popular algorithms, evaluation
protocols, as well as system performance are provided corresponding to
different aspects of crowded scene analysis. We also outline the available
datasets for performance evaluation. Finally, some research problems and
promising future directions are presented with discussions.
Comment: 20 pages in IEEE Transactions on Circuits and Systems for Video
Technology, 201
Crowd Behavior Analysis: A Review where Physics meets Biology
Although the traits that emerge in a mass gathering are often non-deliberative,
the act of mass impulse may lead to irrevocable crowd disasters. The two-fold
increase in crowd carnage over the past two decades has spurred significant
advances in the field of computer vision, towards effective and proactive crowd
surveillance. Computer vision studies related to crowds are observed to
resonate with the understanding of the emergent behavior in physics (complex
systems) and biology (animal swarm). These studies, which are inspired by
biology and physics, share surprisingly common insights, and interesting
contradictions. However, this aspect of discussion has not been fully explored.
Therefore, this survey provides the readers with a review of the
state-of-the-art methods in crowd behavior analysis from the physics and
biologically inspired perspectives. We provide insights and comprehensive
discussions for a broader understanding of the underlying prospect of blending
physics and biology studies in computer vision.
Comment: Accepted in Neurocomputing, 31 pages, 180 references
A diffusion and clustering-based approach for finding coherent motions and understanding crowd scenes
This paper addresses the problem of detecting coherent motions in crowd
scenes and presents its two applications in crowd scene understanding: semantic
region detection and recurrent activity mining. It processes input motion
fields (e.g., optical flow fields) and produces a coherent motion field, named
the thermal energy field. The thermal energy field is able to capture both
the motion correlation among particles and the motion trends of individual
particles, which helps to discover coherency among them. We further
introduce a two-step clustering process to construct stable semantic regions
from the extracted time-varying coherent motions. These semantic regions can be
used to recognize pre-defined activities in crowd scenes. Finally, we introduce
a cluster-and-merge process which automatically discovers recurrent activities
in crowd scenes by clustering and merging the extracted coherent motions.
Experiments on various videos demonstrate the effectiveness of our approach.
Comment: This manuscript is the accepted version for TIP (IEEE Transactions on
Image Processing), 201
High-frequency crowd insights for public safety and congestion control
We present results from several projects aimed at enabling the real-time
understanding of crowds and their behaviour in the built environment. We make
use of CCTV video cameras that are ubiquitous throughout the developed and
developing world and as such are able to play the role of a reliable sensing
mechanism. We outline the novel methods developed for our crowd insights
engine, and illustrate examples of its use in different contexts in the urban
landscape. Applications of the technology range from maintaining security in
public spaces to quantifying the adequacy of public transport level of service.
Review on Computer Vision Techniques in Emergency Situation
In emergency situations, actions that save lives and limit the impact of
hazards are crucial. In order to act, situational awareness is needed to decide
what to do. Geolocalized photos and video of the situations as they evolve can
be crucial in better understanding them and making decisions faster. Cameras
are almost everywhere these days, whether in smartphones, installed
CCTV cameras, UAVs or others. However, this poses challenges in big data and
information overflow. Moreover, most of the time there are no disasters at any
given location, so humans aiming to detect sudden situations may not be as
alert as needed at any point in time. Consequently, computer vision tools can
be an excellent decision support. The range of emergencies where computer
vision tools have been considered or used is very wide, and there is a great
overlap across related emergency research. Researchers tend to focus on
state-of-the-art systems that cover the same emergency as they are studying,
overlooking important research in other fields. In order to unveil this overlap,
the survey is divided along four main axes: the types of emergencies that have
been studied in computer vision, the objective that the algorithms can address,
the type of hardware needed and the algorithms used. Therefore, this review
provides a broad overview of the progress of computer vision covering all sorts
of emergencies.
Comment: 25 pages
An Intelligent Extraversion Analysis Scheme from Crowd Trajectories for Surveillance
In recent years, crowd analysis has become important for applications such as smart
cities, intelligent transportation systems, customer behavior prediction, and
visual surveillance. Understanding the characteristics of individual motion
in a crowd can be beneficial for social event detection and abnormality detection,
but it has rarely been studied. In this paper, we focus on the extraversion
measure of individual motions in crowds based on trajectory data. Extraversion
is a typical personality trait that is often observed in human crowd behaviors,
and it can reflect not only the characteristics of the individual motion but
also those of the holistic crowd motion. To the best of our knowledge, this is the
first attempt to analyze individual extraversion of crowd motions based on
trajectories. To accomplish this, we first present an effective composite motion
descriptor, which integrates the basic individual motion information and social
metrics, to describe the extraversion of each individual in a crowd. The social
metrics consider both the neighboring distribution and their interaction
pattern. Since our major goal is to learn a universal scoring function that can
measure the degrees of extraversion across varied crowd scenes, we incorporate
and adapt the active learning technique to the relative attribute approach.
Specifically, we assume that the social groups in any crowd contain individuals
with a similar degree of extraversion. Based on this assumption, we
significantly reduce the computation cost by clustering and ranking the
trajectories actively. Finally, we demonstrate the performance of our proposed
method by measuring the degree of extraversion for real individual trajectories
in crowds and analyzing crowd scenes from a real-world dataset.
Comment: require modification
Energy-based Models for Video Anomaly Detection
Automated detection of abnormalities in data has been widely studied in
recent years because of its diverse practical applications, including
video surveillance, industrial damage detection and network intrusion
detection. However, building an effective anomaly detection system is a
non-trivial task since it requires tackling the challenging issues of the shortage
of annotated data, the inability to define anomalous objects explicitly and the
high cost of feature engineering. Unlike existing approaches,
which only partially solve these problems, we develop a unique framework to
cope with the problems above simultaneously. Instead of handling the ambiguous
definition of anomalous objects, we propose to work with regular patterns, whose
unlabeled data is abundant and usually easy to collect in practice. This allows
our system to be trained in a completely unsupervised manner and liberates
us from the need for costly data annotation. By learning a generative model that
captures the normality distribution in data, we can isolate abnormal data points
that result in low normality scores (high abnormality scores). Moreover, by
leveraging the power of generative networks, i.e., energy-based models, we are
also able to learn the feature representation automatically rather than
relying on hand-crafted features, which have dominated anomaly detection
research for many decades. We demonstrate our proposal on the specific
application of video anomaly detection, and the experimental results indicate
that our method performs better than the baselines and is comparable with
state-of-the-art methods on many benchmark video anomaly detection datasets.
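The scoring idea in the abstract above (fit a model to unlabeled regular data, then flag points with low normality, i.e. high energy) can be illustrated with a minimal sketch. It is not the authors' model: a diagonal Gaussian stands in for the learned energy-based network, and the function names, the synthetic data and the 0.99 quantile threshold are all assumptions made for illustration.

```python
import random
import statistics

def fit_normality_model(regular_samples):
    """Estimate a per-feature mean and variance from unlabeled regular data."""
    columns = list(zip(*regular_samples))
    means = [statistics.fmean(c) for c in columns]
    variances = [max(statistics.pvariance(c), 1e-9) for c in columns]
    return means, variances

def energy(x, model):
    """Energy of a point: low for regular data, high for unusual data.

    Stand-in energy (assumption): half the squared, variance-scaled
    distance to the mean of the regular data.
    """
    means, variances = model
    return 0.5 * sum((xi - m) ** 2 / v for xi, m, v in zip(x, means, variances))

def energy_threshold(regular_samples, model, quantile=0.99):
    """Pick the threshold as a high quantile of the training energies."""
    energies = sorted(energy(x, model) for x in regular_samples)
    index = min(int(quantile * len(energies)), len(energies) - 1)
    return energies[index]

def is_anomalous(x, model, threshold):
    """Flag a point whose energy exceeds the threshold."""
    return energy(x, model) > threshold

# Synthetic "regular" data (hypothetical two-feature descriptors).
rng = random.Random(0)
regular = [(rng.gauss(0.0, 1.0), rng.gauss(5.0, 0.5)) for _ in range(500)]
model = fit_normality_model(regular)
threshold = energy_threshold(regular, model)
```

Only regular data is needed to fit the model and set the threshold, which is what makes the procedure unsupervised; a point far from the regular data, such as `(10.0, 5.0)` here, ends up above the threshold.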
Modeling and Inferring Human Intents and Latent Functional Objects for Trajectory Prediction
This paper is about detecting functional objects and inferring human
intentions in surveillance videos of public spaces. People in the videos are
expected to intentionally take shortest paths toward functional objects subject
to obstacles, where people can satisfy certain needs (e.g., a vending machine
can quench thirst), by following one of three possible intent behaviors: reach
a single functional object and stop, or sequentially visit several functional
objects, or initially start moving toward one goal but then change the intent
to move toward another. Since detecting functional objects in low-resolution
surveillance videos is typically unreliable, we call them "dark matter"
characterized by the functionality to attract people. We formulate the
Agent-based Lagrangian Mechanics wherein human trajectories are
probabilistically modeled as motions of agents in many layers of "dark-energy"
fields, where each agent can select a particular force field to affect its
motions, and thus define the minimum-energy Dijkstra path toward the
corresponding source "dark matter". For evaluation, we compiled and annotated a
new dataset. The results demonstrate the effectiveness of our approach in
predicting human intent behaviors and trajectories, and localizing functional
objects, as well as discovering distinct functional classes of objects by
clustering human motion behavior in the vicinity of functional objects.
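The "minimum-energy Dijkstra path" mentioned in the abstract above can be illustrated with a plain Dijkstra search on a grid. This is a sketch under assumptions rather than the paper's formulation: the grid, the per-cell costs standing in for a "dark-energy" field, the 4-connected neighbourhood and the function name `min_energy_path` are all hypothetical.

```python
import heapq

def min_energy_path(cost, start, goal):
    """Dijkstra shortest path on a 4-connected grid.

    cost[r][c] is the (hypothetical) energy of stepping onto a cell;
    None marks an obstacle. Returns (path, total_energy), or
    (None, inf) when the goal cannot be reached.
    """
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}          # lowest known energy to reach each cell
    parent = {}                  # back-pointers for path reconstruction
    frontier = [(0.0, start)]
    while frontier:
        dist, cell = heapq.heappop(frontier)
        if cell == goal:
            break
        if dist > best.get(cell, float("inf")):
            continue             # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and cost[nr][nc] is not None:
                new_dist = dist + cost[nr][nc]
                if new_dist < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_dist
                    parent[(nr, nc)] = cell
                    heapq.heappush(frontier, (new_dist, (nr, nc)))
    if goal not in best:
        return None, float("inf")
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1], best[goal]

# An agent at (1, 0) heads for a functional object at (1, 3); the two
# middle cells are obstacles, so the path must go around them.
grid = [
    [1, 1, 1, 1],
    [1, None, None, 1],
    [1, 1, 1, 1],
]
path, total = min_energy_path(grid, (1, 0), (1, 3))
```

In this toy grid the straight route is blocked, so the search detours through the top or bottom row, which mirrors the abstract's assumption that people take shortest paths toward a goal subject to obstacles.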
Understanding People Flow in Transportation Hubs
In this paper, we aim to monitor the flow of people in large public
infrastructures. We propose an unsupervised methodology to cluster people flow
patterns into the most typical and meaningful configurations. By processing 3D
images from a network of depth cameras, we build a descriptor for the flow
pattern. We define a data-irregularity measure that assesses how well each
descriptor fits a data model. This allows us to rank flow patterns from highly
distinctive (outliers) to very common ones. By discarding outliers, we obtain
more reliable key configurations (classes). Synthetic experiments show that the
proposed method is superior to standard clustering methods. We applied it in an
operational scenario for 14 days in the X-ray screening area of an
international airport. Results show that our methodology is able to
successfully summarize the representative patterns for such a long observation
period, providing relevant information for airport management. Beyond regular
flows, our method identifies a set of rare events corresponding to uncommon
activities (cleaning, special security and circulating staff).
Comment: 10 pages, 19 figures, 1 table
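The ranking step described in the abstract above (score each flow descriptor for irregularity, rank from outliers to common patterns, discard the outliers before clustering) might be sketched as follows. The abstract does not specify the data-irregularity measure, so this sketch substitutes the mean distance to the k nearest neighbours; that measure, the function names, the toy descriptors and the `outlier_fraction` parameter are assumptions.

```python
import math

def irregularity(descriptors, k=3):
    """Score each descriptor by its mean distance to its k nearest neighbours.

    Stand-in measure (assumption); expects more than k descriptors.
    """
    scores = []
    for i, d in enumerate(descriptors):
        dists = sorted(
            math.dist(d, other)
            for j, other in enumerate(descriptors)
            if j != i
        )
        scores.append(sum(dists[:k]) / k)
    return scores

def split_outliers(descriptors, outlier_fraction=0.1, k=3):
    """Rank descriptors from most to least irregular and cut off the top fraction.

    Returns (inlier_indices, outlier_indices), both sorted ascending.
    """
    scores = irregularity(descriptors, k)
    order = sorted(range(len(descriptors)), key=lambda i: scores[i], reverse=True)
    n_out = int(round(outlier_fraction * len(descriptors)))
    return sorted(order[n_out:]), sorted(order[:n_out])

# Toy 2-D descriptors: two common flow patterns plus one rare event.
descriptors = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (0.05, 0.05),
    (20.0, -20.0),
]
inliers, outliers = split_outliers(descriptors, outlier_fraction=0.1)
```

The inlier indices would then be passed to whatever clustering method is used to extract the key configurations, while the outlier indices point to the rare events worth inspecting separately.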
A new network-based algorithm for human activity recognition in video
In this paper, a new network-transmission-based (NTB) algorithm is proposed
for human activity recognition in videos. The proposed NTB algorithm models the
entire scene as an error-free network. In this network, each node corresponds
to a patch of the scene and each edge represents the activity correlation
between the corresponding patches. Based on this network, we further model
people in the scene as packages while human activities can be modeled as the
process of package transmission in the network. By analyzing these specific
"package transmission" processes, various activities can be effectively
detected. The application of our NTB algorithm to abnormal activity
detection and group activity recognition is described in detail in the paper.
Experimental results demonstrate the effectiveness of our proposed algorithm.
Comment: This manuscript is the accepted version for TCSVT (IEEE Transactions
on Circuits and Systems for Video Technology)