46 research outputs found
Improved robustness and efficiency for automatic visual site monitoring
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (p. 219-228). Knowing who people are, where they are, what they are doing, and how they interact with other people and things is valuable from commercial, security, and space utilization perspectives. Video sensors backed by computer vision algorithms are a natural way to gather this data. Unfortunately, key technical issues persist in extracting features and models that are simultaneously efficient to compute and robust to issues such as adverse lighting conditions, distracting background motions, appearance changes over time, and occlusions. In this thesis, we present a set of techniques and model enhancements to better handle these problems, focusing on contributions in four areas. First, we improve background subtraction so it can better handle temporally irregular dynamic textures. This allows us to achieve a 5.5% drop in false positive rate on the Wallflower waving trees video. Second, we adapt the Dalal and Triggs Histogram of Oriented Gradients pedestrian detector to work on large-scale scenes with dense crowds and harsh lighting conditions, challenges which prevent us from easily using a background subtraction solution. These scenes contain hundreds of simultaneously visible people. To make using the algorithm computationally feasible, we have produced a novel implementation that runs on commodity graphics hardware and is up to 76 times faster than our CPU-only implementation. We demonstrate the utility of this detector by modeling scene-level activities with a Hierarchical Dirichlet Process. Third, we show how one can improve the quality of pedestrian silhouettes for recognizing individual people.
We combine general appearance information from a large population of pedestrians with semi-periodic shape information from individual silhouette sequences. Finally, we show how one can combine a variety of detection and tracking techniques to robustly handle a variety of event detection scenarios such as theft and left-luggage detection. We present the only complete set of results on a standardized collection of very challenging videos. By Gerald Edwin Dalley. Ph.D.
Twofold Structured Features-Based Siamese Network for Infrared Target Tracking
Infrared target tracking has become a critical technology in the
field of computer vision and has many applications, such as motion analysis,
pedestrian surveillance, intelligent detection, and so forth. Unfortunately,
due to the lack of color, texture and other detailed information, tracking
drift often occurs when the tracker encounters infrared targets that vary in
size or shape. To address this issue, we present a twofold structured
features-based Siamese network for infrared target tracking. First of all, in
order to improve the discriminative capacity for infrared targets, a novel
feature fusion network is proposed to fuse both shallow spatial information and
deep semantic information into the extracted features in a comprehensive
manner. Then, a multi-template update module based on a template update mechanism
is designed to deal effectively with interference from target appearance
changes, which are prone to cause early tracking failures. Finally, both
qualitative and quantitative experiments are carried out on the VOT-TIR 2016
dataset, which demonstrate that our method balances promising
tracking performance and real-time tracking speed against other state-of-the-art
trackers. Comment: 13 pages, 9 figures, references added
Discriminative correlation filter with segmentation and context for robust tracking (Diskriminativni korelacijski filter s segmentacijo in uporabo konteksta za robustno sledenje)
Visual object tracking is an area of computer vision that has seen a great increase in popularity due to the wide availability of video data. There are many different tracking tasks, such as multiple object tracking, long-term tracking, and specialized tracking, where a tracker is expected to perform well in a very specific domain. In this work, we focus on online generic short-term single object tracking, which can be considered the base visual tracking task and can be adapted to any of the previously mentioned tasks. We propose a new tracker, based on correlation filtering, augmented with context information and a predicted object segmentation mask. The results on benchmarks fall far behind the current state of the art; however, the proposed method consistently outperforms baseline trackers, which shows the method's potential for future improvements.
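As a rough illustration of the correlation filtering that this tracker builds on, the sketch below trains a single-channel filter in the Fourier domain and uses it to locate a shifted patch. It is a generic baseline in the style of MOSSE-type filters, not the proposed method: the context information and segmentation mask are omitted, and the function names, the Gaussian target shape, and the regularization value `lam` are all assumptions made for the example.

```python
import numpy as np

def gaussian_response(shape, sigma=2.0):
    """Desired filter output: a Gaussian peak at the patch center."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = h // 2, w // 2
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

def train_filter(patch, target, lam=1e-3):
    """Closed-form ridge-regression filter (conjugate form) in the Fourier domain."""
    F = np.fft.fft2(patch)
    G = np.fft.fft2(target)
    # lam (hypothetical value) keeps the division stable at weak frequencies.
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def locate(filter_conj, patch):
    """Correlate the filter with a new patch and return the response peak."""
    response = np.real(np.fft.ifft2(filter_conj * np.fft.fft2(patch)))
    peak = np.unravel_index(np.argmax(response), response.shape)
    return tuple(int(i) for i in peak), response

# Train on one patch, then find the same content after a circular shift.
rng = np.random.default_rng(0)
patch = rng.standard_normal((32, 32))
filt = train_filter(patch, gaussian_response(patch.shape))
moved = np.roll(patch, (3, 5), axis=(0, 1))
peak, _ = locate(filt, moved)
shift = (peak[0] - 16, peak[1] - 16)  # displacement relative to the center
```

The displacement of the response peak from the patch center gives the estimated target motion between frames. A practical tracker would also update the filter over time and handle boundary effects, which this sketch leaves out.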
Object Tracking
Object tracking consists of estimating the trajectory of moving objects in a sequence of images. Automating object tracking by computer is a difficult task. The dynamics of multiple parameters representing the features and motion of the objects, as well as temporary partial or full occlusion of the tracked objects, have to be considered. This monograph presents the development of object tracking algorithms, methods and systems. Both the state of the art of object tracking methods and new trends in research are described in this book. Fourteen chapters are split into two sections. Section 1 presents new theoretical ideas, whereas Section 2 presents real-life applications. Despite the variety of topics contained in this monograph, it constitutes a consistent body of knowledge in the field of computer object tracking. The intention of the editor was to follow up the very quick progress in the development of methods as well as the extension of the application
A Bayesian hierarchy for robust gaze estimation in human–robot interaction
In this text, we present a probabilistic solution for robust gaze estimation in the context of human–robot interaction. Gaze estimation, in the sense of continuously assessing the gaze direction of an interlocutor so as to determine his/her focus of visual attention, is important in several computer vision applications, such as the development of non-intrusive gaze-tracking equipment for psychophysical experiments in neuroscience, specialised telecommunication devices, video surveillance, human–computer interfaces (HCI) and artificial cognitive systems for human–robot interaction (HRI), our application of interest. We have developed a robust solution based on a probabilistic approach that inherently deals with the uncertainty of sensor models, but also and in particular with uncertainty arising from distance, incomplete data and scene dynamics. This solution comprises a hierarchical formulation in the form of a mixture model that loosely follows how geometrical cues provided by facial features are believed to be used by the human perceptual system for gaze estimation. A quantitative analysis of the proposed framework's performance was undertaken through a thorough set of experimental sessions. Results show that the framework meets the demanding requirements of HRI applications, namely by exhibiting correctness, robustness and adaptiveness.
Supervised dictionary learning for action recognition and localization
PhD. Image sequences with humans and human activities are everywhere.
With the amount of produced and distributed data increasing at an
unprecedented rate, there has been a lot of interest in building systems
that can understand and interpret the visual data, and in particular detect
and recognise human actions. Dictionary based approaches learn a
dictionary from descriptors extracted from the videos in the first stage
and a classifier or a detector in the second stage. The major drawback
of such an approach is that the dictionary is learned in an unsupervised
manner without considering the task (classification or detection) that
follows it. In this work we develop task-dependent (supervised) dictionaries
for action recognition and localization, i.e., dictionaries that are
best suited for the subsequent task. In the first part of the work, we
propose a supervised max-margin framework for linear and non-linear
Non-Negative Matrix Factorization (NMF). To achieve this, we impose
max-margin constraints within the formulation of NMF and simultaneously
solve for the classifier and the dictionary. The dictionary (basis
matrix) thus obtained maximizes the margin of the classifier in the low
dimensional space (in the linear case) or in the high dimensional feature
space (in the non-linear case). In the second part of the work, we
develop methodologies for action localization. We first propose a dictionary
weighting approach where we learn local and global weights for
the dictionary by considering the localization information of the training
sequences. We next extend this approach to learn a task-dependent
dictionary for action localization that incorporates the localization information
of the training sequences into dictionary learning. The results
on publicly available datasets show that the performance of the system
is improved by using the supervised information while learning the dictionary. QMUL; EPSRC PhD scholarship program (EP/G033935/1)
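For orientation, the sketch below shows plain unsupervised NMF with the standard multiplicative updates for the Frobenius objective, i.e. the kind of task-agnostic dictionary learning that the thesis sets out to improve. It does not include the max-margin constraints or the joint classifier training described above; the function names, the rank, and the iteration count are assumptions made for the example.

```python
import numpy as np

def nmf(V, rank, iters=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (n x m) as W @ H with W, H >= 0.

    W (n x rank) plays the role of the dictionary (basis matrix) and
    H (rank x m) holds the low-dimensional codes of the m samples.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(iters):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def reconstruction_error(V, W, H):
    """Relative Frobenius-norm error of the factorization."""
    return np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# Toy data: a non-negative matrix with exact rank 3.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 15))
W, H = nmf(V, rank=3)
error = reconstruction_error(V, W, H)
```

In a two-stage pipeline the columns of `H` would then be passed to a separately trained classifier. The supervised formulation in the abstract instead solves for the dictionary and the classifier together.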
Soft computing and non-parametric techniques for effective video surveillance systems
This thesis proposes several interconnected objectives for the design of a video surveillance
system intended to operate under a wide range of conditions.
First, an evaluation technique for the detector and tracking system is proposed, based
on a minimal reference, or ground truth. This technique answers the demand for fast and
easy adjustment of the system as it adapts to different contexts.
The thesis also proposes an optimization technique based on Evolutionary Strategies and
the combination of fitness functions in several steps. The objective is to obtain the tuning parameters of
the detector and tracking system that give the best operation over a wide range of possible situations.
Finally, the thesis proposes building a classifier in which a non-parametric statistical technique
models the distribution of the input data regardless of the source that generated the data. Short-term
detectable activities are chosen that follow a time pattern that can easily be modeled by Hidden
Markov Models (HMMs). The proposal consists of a modification of the Baum-Welch algorithm
that models the emission probabilities of the HMM by means of a non-parametric
technique based on kernel density estimation (KDE).
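One way to picture KDE-based emission probabilities is sketched below: each state's emission density is a Gaussian kernel density estimate over the training observations, weighted by that state's responsibilities, and the result is plugged into a scaled forward pass. This is an illustrative reading of the idea, not the thesis's algorithm: the full Baum-Welch re-estimation loop is omitted, and the function names, the Gaussian kernel, the bandwidth, and the toy data are assumptions made for the example.

```python
import numpy as np

def kde_emission(x, samples, weights, bandwidth):
    """Weighted Gaussian KDE density for one state, evaluated at x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    z = (x[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2 * np.pi))
    return kernel @ (weights / weights.sum())

def emission_matrix(obs, samples, gamma, bandwidth):
    """B[t, j] = density of obs[t] under state j's weighted KDE."""
    return np.column_stack([
        kde_emission(obs, samples, gamma[:, j], bandwidth)
        for j in range(gamma.shape[1])
    ])

def forward_loglik(pi, A, B):
    """Scaled forward pass returning the log-likelihood of the sequence."""
    alpha = pi * B[0]
    scale = alpha.sum()
    loglik = np.log(scale)
    alpha = alpha / scale
    for t in range(1, B.shape[0]):
        alpha = (alpha @ A) * B[t]
        scale = alpha.sum()
        loglik += np.log(scale)
        alpha = alpha / scale
    return loglik

# Toy 1-D sequence with two regimes; gamma holds (hypothetical) state
# responsibilities that an E-step would normally provide.
obs = np.array([-2.1, -1.9, -2.0, 2.0, 2.1, 1.9])
gamma = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B = emission_matrix(obs, obs, gamma, bandwidth=0.3)
loglik = forward_loglik(pi, A, B)
```

Because the emission model is non-parametric, it makes no assumption about the shape of each state's observation distribution. The cost is that every training sample is kept and the bandwidth has to be chosen.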