9,986 research outputs found
Unsupervised Synthesis of Anomalies in Videos: Transforming the Normal
Abnormal activity recognition requires detecting anomalous events, which
suffer from a severe imbalance in data. In a video, normal describes
activities that conform to usual events, while the irregular events which do
not conform to the normal are referred to as abnormal. It is far more
common to observe normal data than to obtain abnormal data in visual
surveillance. In this paper, we propose an approach where we can obtain
abnormal data by transforming normal data. This is a challenging task that is
solved through a multi-stage pipeline approach. We utilize a number of
techniques from unsupervised segmentation in order to synthesize new samples of
data that are transformed from an existing set of normal examples. Further,
this synthesis approach has useful applications as a data augmentation
technique. An incrementally trained Bayesian convolutional neural network (CNN)
is used to carefully select the set of abnormal samples that can be added.
Finally, through this synthesis approach, we obtain a comparable set of abnormal
samples that can be used for training the CNN for the classification of normal
vs abnormal samples. We show that this method generalizes to multiple settings
by evaluating it on two real world datasets and achieves improved performance
over other probabilistic techniques that have been used in the past for this
task.
Comment: Accepted in IJCNN 201
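The abstract does not specify how the Bayesian CNN scores candidate synthesized samples; a common proxy for such Bayesian selection (an assumption here, not the paper's stated criterion) is the predictive entropy of the mean class distribution over Monte-Carlo forward passes, keeping only low-entropy, confidently classified candidates:

```python
import numpy as np

def predictive_entropy(mc_probs):
    """Entropy of the mean class distribution over MC forward passes.

    mc_probs: array of shape (T, N, C) -- T stochastic passes,
    N candidate samples, C classes."""
    p = mc_probs.mean(axis=0)
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

def select_confident(mc_probs, max_entropy=0.3):
    """Indices of candidates the model classifies confidently."""
    return np.where(predictive_entropy(mc_probs) < max_entropy)[0]

# Placeholder probabilities: candidate 0 is confidently classified
# (entropy ~0.20), candidate 1 is ambiguous (entropy ln 2 ~0.69).
mc = np.tile(np.array([[0.95, 0.05], [0.5, 0.5]]), (5, 1, 1))
keep = select_confident(mc)  # -> array([0])
```

The entropy threshold here is illustrative; any calibrated uncertainty score could play the same gatekeeping role.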
Background subtraction on depth videos with convolutional neural networks
Background subtraction is a significant component of computer vision systems.
It is widely used in video surveillance, object tracking, anomaly detection,
etc. A new data source for background subtraction appeared with the emergence
of low-cost depth sensors like Microsoft Kinect, Asus Xtion PRO, etc. In this
paper, we propose a background subtraction approach on depth videos, which is
based on convolutional neural networks (CNNs), called BGSNet-D (BackGround
Subtraction neural Networks for Depth videos). The method can be used in
scenarios where color is unavailable, such as poor lighting, and can also be
combined with existing RGB background subtraction methods. A preprocessing
strategy is designed to reduce the influence of noise from depth sensors. The
experimental results on the SBM-RGBD dataset show that the proposed method
outperforms existing methods on depth data.
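BGSNet-D itself is a trained CNN and is not reproduced here; for orientation only, a minimal classical baseline (an illustrative sketch with made-up depth values in millimetres) maintains a running-average depth background and thresholds each pixel's deviation from it:

```python
import numpy as np

def update_background(bg, frame, alpha=0.05):
    """Exponential running average of the depth background model."""
    return (1.0 - alpha) * bg + alpha * frame

def foreground_mask(bg, frame, thresh_mm=50.0):
    """Pixels whose depth deviates strongly from the background."""
    return np.abs(frame.astype(np.float64) - bg) > thresh_mm

# Toy scene: flat wall at 1000 mm, a 2x2 object appears at 400 mm.
bg = np.full((4, 4), 1000.0)
frame = bg.copy()
frame[1:3, 1:3] = 400.0
mask = foreground_mask(bg, frame)   # True exactly on the object
bg = update_background(bg, frame)   # background slowly absorbs the scene
```

A real depth pipeline would add the kind of preprocessing the paper mentions (e.g. filtering invalid-depth holes) before differencing.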
Comparative study of motion detection methods for video surveillance systems
The objective of this study is to compare several change detection methods
for a single static camera and to identify the best method for different complex
environments and backgrounds in indoor and outdoor scenes. To this end, we used
the CDnet video dataset as a benchmark that consists of many challenging
problems, ranging from basic simple scenes to complex scenes affected by bad
weather and dynamic backgrounds. Twelve change detection methods, ranging from
simple temporal differencing to more sophisticated methods, were tested and
several performance metrics were used to precisely evaluate the results.
Because most of the considered methods have not previously been evaluated on
this recent large-scale dataset, this work compares them to fill a gap in the
literature, complementing previous comparative evaluations. Our experimental
results show that there is no perfect method for all challenging cases: each
method performs well in certain cases and fails in others. However, this study
enables the user to identify the most suitable method for his or her needs.
Comment: 69 pages, 18 figures, journal paper
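As a point of reference for the simplest method in such comparisons, temporal differencing thresholds the absolute difference between consecutive frames (a minimal sketch on a synthetic pair of grayscale frames):

```python
import numpy as np

def temporal_difference(prev, curr, thresh=25):
    """Flag pixels whose intensity changed by more than `thresh`."""
    diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
    return diff > thresh

prev = np.zeros((5, 5), dtype=np.uint8)
curr = prev.copy()
curr[2, 2] = 200                       # a single "moving" pixel
motion = temporal_difference(prev, curr)
```

The cast to a signed type before subtracting avoids uint8 wrap-around; this fragility under noise and dynamic backgrounds is exactly why the study compares it against more sophisticated models.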
A Review on Deep Learning Techniques Applied to Semantic Segmentation
Image semantic segmentation is of increasing interest to computer vision and
machine learning researchers. Many applications on the rise need
accurate and efficient segmentation mechanisms: autonomous driving, indoor
navigation, and even virtual or augmented reality systems to name a few. This
demand coincides with the rise of deep learning approaches in almost every
field or application target related to computer vision, including semantic
segmentation or scene understanding. This paper provides a review on deep
learning methods for semantic segmentation applied to various application
areas. Firstly, we describe the terminology of this field as well as mandatory
background concepts. Next, the main datasets and challenges are presented to help
researchers decide which are the ones that best suit their needs and their
targets. Then, existing methods are reviewed, highlighting their contributions
and their significance in the field. Finally, quantitative results are given
for the described methods and the datasets in which they were evaluated,
following up with a discussion of the results. Lastly, we point out a set of
promising future works and draw our own conclusions about the state of the art
of semantic segmentation using deep learning techniques.
Comment: Submitted to TPAMI on Apr. 22, 201
Background Subtraction in Real Applications: Challenges, Current Models and Future Directions
Computer vision applications based on videos often require the detection of
moving objects in their first step. Background subtraction is then applied in
order to separate the background and the foreground. In the literature,
background subtraction is surely among the most investigated fields in
computer vision, with a large number of publications. Most of them concern the
application of mathematical and machine learning models to be more robust to
the challenges met in videos. However, the ultimate goal is for the background
subtraction methods developed in research to be employed in real applications
like traffic surveillance. Looking at the literature, however, we can remark
that there is often a gap between the methods used in real applications and
the methods in fundamental research. In addition, the videos evaluated in
large-scale datasets are not exhaustive, in that they cover only part of the
complete spectrum of the challenges met in real applications. In this context,
we attempt to provide as exhaustive a survey as possible on real applications
that use background subtraction, in order to identify the real challenges met
in practice and the background models currently used, and to provide future
directions. Thus, challenges are investigated in terms of
camera, foreground objects and environments. In addition, we identify the
background models that are effectively used in these applications, in order to
find recent background models that are potentially usable in terms of
robustness, time and memory requirements.
Comment: Submitted to Computer Science Review
Unsupervised Deep Context Prediction for Background Foreground Separation
In many advanced video-based applications, background modeling is a
pre-processing step to eliminate redundant data, for instance in tracking or
video surveillance applications. Over the past years, background subtraction
has usually been based on low-level or hand-crafted features such as raw color
components, gradients, or local binary patterns. The performance of background
subtraction algorithms suffers in the presence of various challenges such as
dynamic backgrounds, photometric variations, camera jitter, and shadows. To
handle these challenges for the purpose of accurate background modeling, we
propose a unified framework based on image inpainting. It is an unsupervised
visual feature learning hybrid generative adversarial algorithm based on
context prediction. We also present a solution for random region inpainting
by fusing center-region inpainting and random-region inpainting with the help
of the Poisson blending technique. Furthermore, we also
evaluated foreground object detection with the fusion of our proposed method
and morphological operations. The comparison of our proposed method with 12
state-of-the-art methods shows its stability in the application of background
estimation and foreground detection.
Comment: 17 pages
Dynamic Matrix Decomposition for Action Recognition
Designing a technique for the automatic analysis of different actions in
videos, in order to detect the presence of activities of interest, is of high
significance nowadays. In this paper, we explore a robust and dynamic
appearance technique for the purpose of identifying different action
activities. We also exploit a low-rank and structured sparse matrix
decomposition (LSMD) method to better model these activities. Our method is
effective in encoding localized spatio-temporal features, which enables the
analysis of local motion taking place in the video. Our proposed model uses
adjacent frame differences as input, thereby forcing it to capture the changes
occurring in the video. The performance of our model is tested on a benchmark
dataset in terms of detection accuracy. Results achieved with our model show
its promising capability in detecting action activities.
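The abstract does not give the LSMD formulation itself; in the same spirit, a generic low-rank plus sparse split (a crude alternating-projection sketch, not the authors' structured optimizer) separates a static rank-limited component from large sparse deviations, with frames stacked as matrix columns:

```python
import numpy as np

def low_rank_sparse_split(M, rank=1, sparse_thresh=0.5, iters=30):
    """Alternate a rank-r SVD truncation (low-rank part L) with
    entrywise hard thresholding of the residual (sparse part S)."""
    S = np.zeros_like(M)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
        s[rank:] = 0.0               # keep only the top-r singular values
        L = (U * s) @ Vt
        R = M - L
        S = np.where(np.abs(R) > sparse_thresh, R, 0.0)
    return L, S

# Toy data: a rank-1 "static appearance" plus one large "motion" entry.
M = np.outer(np.ones(4), np.arange(1.0, 5.0))
M[2, 1] += 10.0
L, S = low_rank_sparse_split(M)
```

By construction, L has rank at most `rank` and every entry of `M - L - S` is at most `sparse_thresh` in magnitude; recovering exactly the planted split is not guaranteed by this crude scheme.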
Audio Surveillance: a Systematic Review
Although surveillance systems are becoming increasingly ubiquitous in our
living environment, automated surveillance, currently based on the video
sensory modality and machine intelligence, often lacks the robustness and
reliability required in several real applications. To tackle this issue, audio
sensory devices have been taken into account, both alone and in combination
with video, giving birth, in the last decade, to a considerable amount of
research. In this paper, audio-based automated surveillance methods are
organized into a comprehensive survey: a general taxonomy, inspired by the
more widespread video surveillance field, is proposed in order to
systematically describe methods covering background subtraction, event
classification, object tracking and situation analysis. For each of these
tasks, all the significant works are reviewed, detailing their pros and cons
and the context for which they have been proposed. Moreover, a specific
section is devoted to audio features, discussing their expressiveness and
their employment in the above described tasks. Differently from other surveys
on audio processing and analysis, the present one is specifically targeted at
automated surveillance, highlighting the target applications of each described
method and providing the reader with tables and schemes useful for retrieving
the most suited algorithms for a specific requirement.
Skin-color based videos categorization
On dedicated websites, people can upload videos and share them with the rest
of the world. Currently, these videos are categorized manually with the help
of the user community. In this paper, we propose a combination of color spaces
with a Bayesian network approach for robust detection of skin color, followed
by automated video categorization. Experimental results show that our method
achieves satisfactory performance in categorizing videos based on skin color.
Comment: International Journal of Computer Science Issues (IJCSI), Volume 9,
Issue 1, No 3, January 201
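The paper's Bayesian-network detector is not specified in the abstract; as a baseline illustration of color-space skin detection, the classic fixed-range rule in the CbCr plane (Chai and Ngan's thresholds, using the BT.601 RGB-to-YCbCr conversion) can be sketched as:

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """ITU-R BT.601 (JPEG-style) conversion for 8-bit RGB."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

def skin_mask(rgb):
    """Fixed-range skin rule: 77 <= Cb <= 127 and 133 <= Cr <= 173."""
    ycbcr = rgb_to_ycbcr(rgb.astype(np.float64))
    cb, cr = ycbcr[..., 1], ycbcr[..., 2]
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)

# A skin-like pixel versus a pure-blue pixel.
pixels = np.array([[[224, 172, 138], [0, 0, 255]]], dtype=np.uint8)
mask = skin_mask(pixels)  # -> [[ True, False]]
```

A learned model such as the paper's Bayesian network replaces these hard-coded box thresholds with per-pixel posterior probabilities, which is what buys robustness across lighting conditions.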
cvpaper.challenge in 2015 - A review of CVPR2015 and DeepSurvey
The "cvpaper.challenge" is a group composed of members from AIST, Tokyo Denki
Univ. (TDU), and Univ. of Tsukuba that aims to systematically summarize papers
on computer vision, pattern recognition, and related fields. For this
particular review, we focused on reading all 602 conference papers presented
at CVPR2015, the premier annual computer vision event held in June 2015, in
order to grasp the trends in the field. Further, we propose "DeepSurvey" as a
mechanism embodying the entire process, from reading all the papers, through
the generation of ideas, to the writing of papers.
Comment: Survey Paper