Comparative study of motion detection methods for video surveillance systems
The objective of this study is to compare several change detection methods
for a single static camera and identify the best method for different complex
environments and backgrounds in indoor and outdoor scenes. To this end, we used
the CDnet video dataset as a benchmark that consists of many challenging
problems, ranging from basic simple scenes to complex scenes affected by bad
weather and dynamic backgrounds. Twelve change detection methods, ranging from
simple temporal differencing to more sophisticated methods, were tested and
several performance metrics were used to precisely evaluate the results.
Because most of the considered methods have not previously been evaluated on
this recent large-scale dataset, this work compares them to fill a gap in the
literature and thus complements previous comparative evaluations. Our
experimental results show that there is no perfect method for all challenging
cases; each method performs well in certain cases and fails in others. However,
this study enables the user to identify the most suitable method for their needs.
Comment: 69 pages, 18 figures, journal paper
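The simplest of the compared methods, temporal differencing, can be sketched in a few lines of NumPy. This is a generic illustration of the technique, not code from the study; the function name and threshold value are chosen for the example:

```python
import numpy as np

def temporal_difference_mask(prev_frame, curr_frame, threshold=25):
    """Flag pixels whose intensity changed by more than `threshold`
    between two consecutive grayscale frames."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold  # boolean foreground mask

# Toy example: a bright 2x2 "object" appears in the second frame.
prev = np.zeros((4, 4), dtype=np.uint8)
curr = prev.copy()
curr[1:3, 1:3] = 200
mask = temporal_difference_mask(prev, curr)
print(int(mask.sum()))  # 4 changed pixels
```

More sophisticated methods in the comparison replace the single previous frame with a maintained background model, but the thresholded-difference core is the same.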
A Survey on Content-Aware Video Analysis for Sports
Sports data analysis is becoming increasingly large-scale, diversified, and
shared, but difficulty persists in rapidly accessing the most crucial
information. Previous surveys have focused on the methodologies of sports video
analysis from the spatiotemporal viewpoint instead of a content-based
viewpoint, and few of these studies have considered semantics. This study
develops a deeper interpretation of content-aware sports video analysis by
examining the insight offered by research into the structure of content under
different scenarios. On the basis of this insight, we provide an overview of
the themes particularly relevant to the research on content-aware systems for
broadcast sports. Specifically, we focus on the video content analysis
techniques applied in sportscasts over the past decade from the perspectives of
fundamentals and general review, a content hierarchical model, and trends and
challenges. Content-aware analysis methods are discussed with respect to
object-, event-, and context-oriented groups. In each group, the gap between
sensation and content excitement must be bridged using proper strategies. In
this regard, a content-aware approach is required to determine user demands.
Finally, the paper summarizes the future trends and challenges for sports video
analysis. We believe that our findings can advance the field of research on
content-aware video analysis for broadcast sports.
Comment: Accepted for publication in IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
Roadmap for Reliable Ensemble Forecasting of the Sun-Earth System
The authors of this report met on 28-30 March 2018 at the New Jersey
Institute of Technology, Newark, New Jersey, for a 3-day workshop that brought
together a group of data providers, expert modelers, and computer and data
scientists in the solar discipline. Their objective was to identify challenges
in the path towards building an effective framework to achieve transformative
advances in the understanding and forecasting of the Sun-Earth system from the
upper convection zone of the Sun to the Earth's magnetosphere. The workshop
aimed to develop a research roadmap that targets the scientific challenge of
coupling observations and modeling with emerging data-science research to
extract knowledge from the large volumes of data (observed and simulated) while
stimulating computer science with new research applications. The desire among
the attendees was to promote future trans-disciplinary collaborations and
identify areas of convergence across disciplines. The workshop combined a set
of plenary sessions featuring invited introductory talks and workshop progress
reports, interleaved with a set of breakout sessions focused on specific topics
of interest. Each breakout group generated short documents, listing the
challenges identified during their discussions in addition to possible ways of
attacking them collectively. These documents were combined into this report,
wherein a list of prioritized activities has been collated, shared, and
endorsed.
Comment: Workshop Report
Background Subtraction in Real Applications: Challenges, Current Models and Future Directions
Computer vision applications based on videos often require the detection of
moving objects in their first step. Background subtraction is then applied in
order to separate the background and the foreground. In the literature,
background subtraction is among the most investigated topics in computer
vision, with a large body of publications. Most of them apply mathematical and
machine learning models to become more robust to the challenges
met in videos. However, the ultimate goal is that the background subtraction
methods developed in research could be employed in real applications like
traffic surveillance. Looking at the literature, however, there is often a gap
between the methods used in real applications and those in fundamental
research. In addition, the videos in large-scale evaluation datasets are not
exhaustive: they cover only part of the full spectrum of challenges met in real
applications. In this context, we attempt to provide as exhaustive a survey as
possible of real applications that use background subtraction, in order to
identify the challenges met in practice and the background models currently
used, and to provide future directions. Thus, challenges are investigated in terms of
camera, foreground objects, and environments. In addition, we identify the
background models that are effectively used in these applications, in order to
find potentially usable recent background models in terms of robustness, time,
and memory requirements.
Comment: Submitted to Computer Science Review
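As an illustration of the simplest family of background models surveyed here, a running-average background with thresholded subtraction can be sketched as follows. Names and parameter values are illustrative, not drawn from any specific application in the survey:

```python
import numpy as np

def update_background(background, frame, alpha=0.05):
    """Exponential running average: the classic simple background model."""
    return (1 - alpha) * background + alpha * frame

def foreground_mask(background, frame, threshold=30):
    """Pixels that differ from the model by more than `threshold`."""
    return np.abs(frame - background) > threshold

# Static background with one bright foreground pixel.
bg = np.full((4, 4), 10.0)
frame = bg.copy()
frame[0, 0] = 250.0
mask = foreground_mask(bg, frame)
print(int(mask.sum()))  # 1 foreground pixel
bg = update_background(bg, frame)  # foreground slowly absorbed over time
```

The learning rate alpha controls the trade-off the survey highlights: a high alpha adapts quickly to environment changes but absorbs slow-moving foreground objects into the background.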
Multisource and Multitemporal Data Fusion in Remote Sensing
The sharp and recent increase in the availability of data captured by
different sensors combined with their considerably heterogeneous natures poses
a serious challenge for the effective and efficient processing of remotely
sensed data. Such an increase in remote sensing and ancillary datasets,
however, opens up the possibility of utilizing multimodal datasets in a joint
manner to further improve the performance of the processing approaches with
respect to the application at hand. Multisource data fusion has, therefore,
received enormous attention from researchers worldwide for a wide variety of
applications. Moreover, thanks to the revisit capability of several spaceborne
sensors, the integration of the temporal information with the spatial and/or
spectral/backscattering information of the remotely sensed data is possible and
helps to move from a representation of 2D/3D data to 4D data structures, where
the time variable adds new information as well as challenges for the
information extraction algorithms. There are a huge number of research works
dedicated to multisource and multitemporal data fusion, but the methods for the
fusion of different modalities have expanded in different paths according to
each research community. This paper brings together the advances of multisource
and multitemporal data fusion approaches with respect to different research
communities and provides a thorough and discipline-specific starting point for
researchers at different levels (i.e., students, researchers, and senior
researchers) willing to conduct novel investigations on this challenging topic
by supplying sufficient detail and references.
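The move from 2D/3D data to 4D structures described above can be illustrated by stacking multitemporal, multiband acquisitions into a single array. The shapes and values here are synthetic placeholders:

```python
import numpy as np

# Three acquisition dates of a 4-band image over a 64x64 tile
# (all values synthetic; shapes are illustrative).
dates = [np.random.default_rng(t).normal(size=(4, 64, 64)) for t in range(3)]

# Stack along a new leading time axis: (time, band, height, width).
cube_4d = np.stack(dates, axis=0)
print(cube_4d.shape)  # (3, 4, 64, 64)

# A per-pixel temporal mean collapses the time axis back to a 3D product,
# one simple way the time variable can feed an extraction algorithm.
temporal_mean = cube_4d.mean(axis=0)
print(temporal_mean.shape)  # (4, 64, 64)
```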
A Systematic Evaluation and Benchmark for Person Re-Identification: Features, Metrics, and Datasets
Person re-identification (re-id) is a critical problem in video analytics
applications such as security and surveillance. The public release of several
datasets and code for vision algorithms has facilitated rapid progress in this
area over the last few years. However, directly comparing re-id algorithms
reported in the literature has become difficult since a wide variety of
features, experimental protocols, and evaluation metrics are employed. In order
to address this need, we present an extensive review and performance evaluation
of single- and multi-shot re-id algorithms. The experimental protocol
incorporates the most recent advances in both feature extraction and metric
learning. To ensure a fair comparison, all of the approaches were implemented
using a unified code library that includes 11 feature extraction algorithms and
22 metric learning and ranking techniques. All approaches were evaluated using
a new large-scale dataset that closely mimics a real-world problem setting, in
addition to 16 other publicly available datasets: VIPeR, GRID, CAVIAR,
DukeMTMC4ReID, 3DPeS, PRID, V47, WARD, SAIVT-SoftBio, CUHK01, CUHK02, CUHK03,
RAiD, iLIDSVID, HDA+ and Market1501. The evaluation codebase and results will
be made publicly available for community use.
Comment: Preliminary work on person Re-Id benchmark. S. Karanam and M. Gou contributed equally. 14 pages, 6 figures, 4 tables. For supplementary material, see http://robustsystems.coe.neu.edu/sites/robustsystems.coe.neu.edu/files/systems/supmat/ReID_benchmark_supp.zi
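A standard re-id evaluation metric of the kind such benchmarks report, rank-k matching accuracy under Euclidean distance, can be sketched as below. This is a generic illustration, not code from the paper's unified library:

```python
import numpy as np

def rank_k_accuracy(query_feats, gallery_feats, query_ids, gallery_ids, k=1):
    """Fraction of queries whose true identity appears among the k
    nearest gallery entries under Euclidean distance."""
    hits = 0
    for q, qid in zip(query_feats, query_ids):
        dists = np.linalg.norm(gallery_feats - q, axis=1)
        topk = np.array(gallery_ids)[np.argsort(dists)[:k]]
        hits += qid in topk
    return hits / len(query_ids)

# Two queries, three gallery entries; 2-D features for illustration only.
gallery = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
gallery_ids = [0, 1, 2]
queries = np.array([[0.1, 0.1], [4.9, 5.1]])
query_ids = [0, 2]
acc = rank_k_accuracy(queries, gallery, query_ids, gallery_ids, k=1)
print(acc)  # 1.0
```

Metric learning methods of the kind benchmarked here replace the plain Euclidean distance with a learned distance, leaving the ranking and scoring unchanged.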
Spatio-Temporal Data Mining: A Survey of Problems and Methods
Large volumes of spatio-temporal data are increasingly collected and studied
in diverse domains including climate science, social sciences, neuroscience,
epidemiology, transportation, mobile health, and Earth sciences.
Spatio-temporal data differs from relational data, for which computational
approaches have been developed in the data mining community over multiple
decades, in that both spatial and temporal attributes are available in addition
to the actual measurements/attributes. The presence of these attributes
introduces additional challenges that need to be dealt with. Approaches for mining
spatio-temporal data have been studied for over a decade in the data mining
community. In this article we present a broad survey of this relatively young
field of spatio-temporal data mining. We discuss different types of
spatio-temporal data and the relevant data mining questions that arise in the
context of analyzing each of these datasets. Based on the nature of the data
mining problem studied, we classify literature on spatio-temporal data mining
into six major categories: clustering, predictive learning, change detection,
frequent pattern mining, anomaly detection, and relationship mining. We discuss
the various forms of spatio-temporal data mining problems in each of these
categories.
Comment: Accepted for publication at ACM Computing Surveys
Deep ConvLSTM with self-attention for human activity decoding using wearables
Decoding human activity accurately from wearable sensors can aid in
applications related to healthcare and context awareness. The present
approaches in this domain use recurrent and/or convolutional models to capture
the spatio-temporal features from time-series data from multiple sensors. We
propose a deep neural network architecture that not only captures the
spatio-temporal features of multiple sensor time-series data but also selects
and learns important time points by utilizing a self-attention mechanism. We show
the validity of the proposed approach across different data sampling strategies
on six public datasets and demonstrate that the self-attention mechanism gave a
significant improvement in performance over deep networks using a combination
of recurrent and convolutional networks. We also show that the proposed approach
gave a statistically significant performance enhancement over previous
state-of-the-art methods for the tested datasets. The proposed methods open
avenues for better decoding of human activity from multiple body sensors over
extended periods of time. The code implementation for the proposed model is
available at https://github.com/isukrit/encodingHumanActivity.
Comment: 8 pages, 2 figures, 3 tables. IEEE Sensors Journal, 202
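The core of a self-attention mechanism over time points can be sketched without the learned query/key/value projections of the full model. This is a minimal illustration of the mechanism, not the paper's architecture:

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over the time axis.
    x: (timesteps, features); queries, keys, and values are all x itself
    to keep the sketch minimal (the real model learns projections)."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                     # (T, T) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax per time step
    return weights @ x                                # re-weighted time points

x = np.random.default_rng(0).normal(size=(5, 3))  # 5 time steps, 3 channels
out = self_attention(x)
print(out.shape)  # (5, 3)
```

Each output time step is a weighted mixture of all time steps, which is how the mechanism lets the network emphasize the most informative time points in a sensor window.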
Monitoring COVID-19 social distancing with person detection and tracking via fine-tuned YOLO v3 and Deepsort techniques
The rampant coronavirus disease 2019 (COVID-19) has brought a global crisis
with its deadly spread to more than 180 countries, with about 3,519,901
confirmed cases and 247,630 deaths globally as of May 4, 2020. The
absence of any active therapeutic agents and the lack of immunity against
COVID-19 increases the vulnerability of the population. Since there are no
vaccines available, social distancing is the only feasible approach to fight
against this pandemic. Motivated by this notion, this article proposes a deep
learning based framework for automating the task of monitoring social
distancing using surveillance video. The proposed framework utilizes the YOLO
v3 object detection model to segregate humans from the background and Deepsort
approach to track the identified people with the help of bounding boxes and
assigned IDs. The results of the YOLO v3 model are further compared with other
popular state-of-the-art models, e.g. faster region-based CNN (convolutional
neural network) and single shot detector (SSD) in terms of mean average
precision (mAP), frames per second (FPS) and loss values defined by object
classification and localization. Later, the pairwise vectorized L2 norm is
computed based on the three-dimensional feature space obtained by using the
centroid coordinates and dimensions of the bounding box. The violation index
term is proposed to quantify the non-adoption of the social distancing protocol.
From the experimental analysis, it is observed that the YOLO v3 with Deepsort
tracking scheme displayed the best results, with a balanced mAP and FPS score,
for monitoring social distancing in real time.
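The pairwise-distance check behind such a violation count can be sketched as below. This is a simplified 2-D illustration using only centroids; the paper's feature space also incorporates bounding-box dimensions to approximate depth:

```python
import numpy as np

def count_violations(centroids, min_distance):
    """Count pairs of detected people whose bounding-box centroids are
    closer than `min_distance` (pairwise L2 norm)."""
    n = len(centroids)
    violations = 0
    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(centroids[i] - centroids[j]) < min_distance:
                violations += 1
    return violations

# Three tracked people; units are arbitrary for the illustration.
people = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]])
print(count_violations(people, min_distance=2.0))  # 1 violating pair
```

A violation index like the one proposed can then be formed by normalizing this count, e.g. by the number of tracked people or pairs in the frame.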
UGC-VQA: Benchmarking Blind Video Quality Assessment for User Generated Content
Recent years have witnessed an explosion of user-generated content (UGC)
videos shared and streamed over the Internet, thanks to the evolution of
affordable and reliable consumer capture devices, and the tremendous popularity
of social media platforms. Accordingly, there is a great need for accurate
video quality assessment (VQA) models for UGC/consumer videos to monitor,
control, and optimize this vast content. Blind quality prediction of
in-the-wild videos is quite challenging, since the quality degradations of UGC
content are unpredictable, complicated, and often commingled. Here we
contribute to advancing the UGC-VQA problem by conducting a comprehensive
evaluation of leading no-reference/blind VQA (BVQA) features and models on a
fixed evaluation architecture, yielding new empirical insights on both
subjective video quality studies and VQA model design. By employing a feature
selection strategy on top of leading VQA model features, we are able to extract
60 of the 763 statistical features used by the leading models to create a new
fusion-based BVQA model, which we dub the \textbf{VID}eo quality
\textbf{EVAL}uator (VIDEVAL), that effectively balances the trade-off between
VQA performance and efficiency. Our experimental results show that VIDEVAL
achieves state-of-the-art performance at considerably lower computational cost
than other leading models. Our study protocol also defines a reliable benchmark
for the UGC-VQA problem, which we believe will facilitate further research on
deep learning-based VQA modeling, as well as perceptually-optimized efficient
UGC video processing, transcoding, and streaming. To promote reproducible
research and public evaluation, an implementation of VIDEVAL has been made
available online: \url{https://github.com/tu184044109/VIDEVAL_release}.
Comment: 13 pages, 11 figures, 11 tables
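The select-then-fuse recipe can be illustrated on synthetic data: rank candidate features by correlation with subjective scores, keep the top few, and fit a simple fusion regressor. The feature count, selection criterion, and regressor here are simplifications for illustration; VIDEVAL's actual pipeline differs:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in: 100 videos, 10 candidate features, subjective scores.
features = rng.normal(size=(100, 10))
mos = 2.0 * features[:, 3] - 1.5 * features[:, 7] + rng.normal(scale=0.1, size=100)

# Rank features by absolute Pearson correlation with the scores; keep top 2.
corrs = [abs(np.corrcoef(features[:, j], mos)[0, 1]) for j in range(10)]
top_k = sorted(int(j) for j in np.argsort(corrs)[-2:])
print(top_k)  # the two informative features dominate by construction

# Fuse the selected features with a least-squares linear regressor.
X = np.column_stack([features[:, top_k], np.ones(100)])
weights, *_ = np.linalg.lstsq(X, mos, rcond=None)
```

The efficiency argument in the abstract follows directly: a model evaluating 60 selected features is far cheaper than one computing all 763, at little cost in prediction accuracy.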