5,943 research outputs found
Selected topics in video coding and computer vision
Video applications ranging from multimedia communication to computer vision have been extensively studied in the past decades. However, the emergence of new applications continues to raise questions that are only partially answered by existing techniques. This thesis studies three selected topics related to video: intra prediction in block-based video coding, pedestrian detection and tracking in infrared imagery, and multi-view video alignment.;In the state-of-art video coding standard H.264/AVC, intra prediction is defined on the hierarchical quad-tree based block partitioning structure which fails to exploit the geometric constraint of edges. We propose a geometry-adaptive block partitioning structure and a new intra prediction algorithm named geometry-adaptive intra prediction (GAIP). A new texture prediction algorithm named geometry-adaptive intra displacement prediction (GAIDP) is also developed by extending the original intra displacement prediction (IDP) algorithm with the geometry-adaptive block partitions. Simulations on various test sequences demonstrate that intra coding performance of H.264/AVC can be significantly improved by incorporating the proposed geometry adaptive algorithms.;In recent years, due to the decreasing cost of thermal sensors, pedestrian detection and tracking in infrared imagery has become a topic of interest for night vision and all weather surveillance applications. We propose a novel approach for detecting and tracking pedestrians in infrared imagery based on a layered representation of infrared images. Pedestrians are detected from the foreground layer by a Principle Component Analysis (PCA) based scheme using the appearance cue. To facilitate the task of pedestrian tracking, we formulate the problem of shot segmentation and present a graph matching-based tracking algorithm. Simulations with both OSU Infrared Image Database and WVU Infrared Video Database are reported to demonstrate the accuracy and robustness of our algorithms.;Multi-view video alignment is a process to facilitate the fusion of non-synchronized multi-view video sequences for various applications including automatic video based surveillance and video metrology. In this thesis, we propose an accurate multi-view video alignment algorithm that iteratively aligns two sequences in space and time. To achieve an accurate sub-frame temporal alignment, we generalize the existing phase-correlation algorithm to 3-D case. We also present a novel method to obtain the ground-truth of the temporal alignment by using supplementary audio signals sampled at a much higher rate. The accuracy of our algorithm is verified by simulations using real-world sequences
Towards automated visual surveillance using gait for identity recognition and tracking across multiple non-intersecting cameras
Despite the fact that personal privacy has become a major concern, surveillance technology is now becoming ubiquitous in modern society. This is mainly due to the increasing number of crimes as well as the essential necessity to provide secure and safer environment. Recent research studies have confirmed now the possibility of recognizing people by the way they walk i.e. gait. The aim of this research study is to investigate the use of gait for people detection as well as identification across different cameras. We present a new approach for people tracking and identification between different non-intersecting un-calibrated stationary cameras based on gait analysis. A vision-based markerless extraction method is being deployed for the derivation of gait kinematics as well as anthropometric measurements in order to produce a gait signature. The novelty of our approach is motivated by the recent research in biometrics and forensic analysis using gait. The experimental results affirmed the robustness of our approach to successfully detect walking people as well as its potency to extract gait features for different camera viewpoints achieving an identity recognition rate of 73.6 % processed for 2270 video sequences. Furthermore, experimental results confirmed the potential of the proposed method for identity tracking in real surveillance systems to recognize walking individuals across different views with an average recognition rate of 92.5 % for cross-camera matching for two different non-overlapping views.<br/
leave a trace - A People Tracking System Meets Anomaly Detection
Video surveillance always had a negative connotation, among others because of
the loss of privacy and because it may not automatically increase public
safety. If it was able to detect atypical (i.e. dangerous) situations in real
time, autonomously and anonymously, this could change. A prerequisite for this
is a reliable automatic detection of possibly dangerous situations from video
data. This is done classically by object extraction and tracking. From the
derived trajectories, we then want to determine dangerous situations by
detecting atypical trajectories. However, due to ethical considerations it is
better to develop such a system on data without people being threatened or even
harmed, plus with having them know that there is such a tracking system
installed. Another important point is that these situations do not occur very
often in real, public CCTV areas and may be captured properly even less. In the
artistic project leave a trace the tracked objects, people in an atrium of a
institutional building, become actor and thus part of the installation.
Visualisation in real-time allows interaction by these actors, which in turn
creates many atypical interaction situations on which we can develop our
situation detection. The data set has evolved over three years and hence, is
huge. In this article we describe the tracking system and several approaches
for the detection of atypical trajectories
Tracking by Prediction: A Deep Generative Model for Mutli-Person localisation and Tracking
Current multi-person localisation and tracking systems have an over reliance
on the use of appearance models for target re-identification and almost no
approaches employ a complete deep learning solution for both objectives. We
present a novel, complete deep learning framework for multi-person localisation
and tracking. In this context we first introduce a light weight sequential
Generative Adversarial Network architecture for person localisation, which
overcomes issues related to occlusions and noisy detections, typically found in
a multi person environment. In the proposed tracking framework we build upon
recent advances in pedestrian trajectory prediction approaches and propose a
novel data association scheme based on predicted trajectories. This removes the
need for computationally expensive person re-identification systems based on
appearance features and generates human like trajectories with minimal
fragmentation. The proposed method is evaluated on multiple public benchmarks
including both static and dynamic cameras and is capable of generating
outstanding performance, especially among other recently proposed deep neural
network based approaches.Comment: To appear in IEEE Winter Conference on Applications of Computer
Vision (WACV), 201
The Visual Social Distancing Problem
One of the main and most effective measures to contain the recent viral
outbreak is the maintenance of the so-called Social Distancing (SD). To comply
with this constraint, workplaces, public institutions, transports and schools
will likely adopt restrictions over the minimum inter-personal distance between
people. Given this actual scenario, it is crucial to massively measure the
compliance to such physical constraint in our life, in order to figure out the
reasons of the possible breaks of such distance limitations, and understand if
this implies a possible threat given the scene context. All of this, complying
with privacy policies and making the measurement acceptable. To this end, we
introduce the Visual Social Distancing (VSD) problem, defined as the automatic
estimation of the inter-personal distance from an image, and the
characterization of the related people aggregations. VSD is pivotal for a
non-invasive analysis to whether people comply with the SD restriction, and to
provide statistics about the level of safety of specific areas whenever this
constraint is violated. We then discuss how VSD relates with previous
literature in Social Signal Processing and indicate which existing Computer
Vision methods can be used to manage such problem. We conclude with future
challenges related to the effectiveness of VSD systems, ethical implications
and future application scenarios.Comment: 9 pages, 5 figures. All the authors equally contributed to this
manuscript and they are listed by alphabetical order. Under submissio
- …