11 research outputs found

    Developing object detection, tracking and image mosaicing algorithms for visual surveillance

    Visual surveillance systems have become increasingly important in recent decades due to the proliferation of cameras. These systems have been widely used in scientific, commercial and end-user applications, where they can store, extract and infer huge amounts of information automatically, without human help. In this thesis, we focus on developing object detection, tracking and image mosaicing algorithms for a visual surveillance system. First, we review some real-time object detection algorithms that exploit the motion cue, and enhance one of them that is suitable for use in dynamic scenes. This algorithm adopts a nonparametric probabilistic model over the whole image and exploits pixel adjacencies to detect foreground regions even under small baseline motion. Then we develop a multiple-object tracking algorithm that uses this detector as its detection step. The tracker analyzes multiple-object interactions in a probabilistic framework, using virtual shells to track objects through severe occlusions. The final part of the thesis is devoted to an image mosaicing algorithm that stitches ordered images to create a large and visually attractive mosaic from a long image sequence. The proposed mosaicing method eliminates nonlinear optimization techniques and is capable of real-time operation on large datasets. Experimental results show that the developed algorithms work successfully in dynamic and cluttered environments with real-time performance.
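The kernel-density idea behind such nonparametric background models can be sketched as follows. This is a minimal per-pixel NumPy version under assumed parameters (sample count, bandwidth and density threshold are illustrative, not the thesis's values), and it omits the thesis's use of pixel adjacencies:

```python
import numpy as np

def foreground_mask(frame, samples, bandwidth=20.0, min_fraction=0.15):
    # samples: (N, H, W) stack of past grayscale observations per pixel.
    # A pixel is background if enough stored samples fall within
    # `bandwidth` of its current value (a crude kernel density test).
    close = np.abs(samples - frame[None]) < bandwidth   # (N, H, W) bool
    density = close.mean(axis=0)                        # fraction of close samples
    return density < min_fraction                       # True = foreground

# Toy example: static background around 100, one bright 2x2 blob at 200.
rng = np.random.default_rng(0)
N, H, W = 10, 8, 8
samples = 100 + rng.normal(0, 2, size=(N, H, W))
frame = np.full((H, W), 100.0)
frame[2:4, 2:4] = 200.0                                 # foreground blob
mask = foreground_mask(frame, samples)
```

Background pixels match nearly all stored samples and are suppressed, while the blob matches none and is flagged as foreground.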

    Tracking of objects in videos and separation into classes

    Advisor: Clésio Luis Tozzi. Dissertation (Master's), Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
    Abstract: The growing use of video cameras for environment monitoring, such as registering the entrance, exit and transit of people or vehicles in an area, has increased the demand for automated video surveillance systems. Requirements for such systems include handling the entrance and exit of objects in the scene, variations in the shape and motion of the tracked targets, interactions between targets (such as meetings and splits), changes in scene illumination, and video noise. This work analyses and evaluates the main steps of a multiple-target tracking system based on a fixed video camera and proposes a tracking system built on approaches found in the literature. The proposed system comprises three steps: foreground identification through background subtraction techniques; frame-by-frame object association through colour, area and centroid-position matching, using a Kalman filter to estimate each object's position in the next frame; and, lastly, per-frame object classification according to an object management system. To assess the efficiency of the proposed tracking system, tests were performed on videos from the PETS and CAVIAR datasets. The background subtraction step was evaluated by comparing the Eigenbackground model, used in the proposed system, with the Mixture of Gaussians model, the background subtraction model most commonly used in tracking systems. The object management system was evaluated through manual classification and counting of objects in each video frame, and these results were compared with the system's output.
    The results showed that the proposed tracking system was able to recognise and track moving objects in video sequences, handling occlusions and separations, and encouraging future work toward its application in real-time security systems. Master's program: Computer Engineering. Degree: Master in Electrical Engineering.
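The Kalman-filter step used for predicting each object's centroid in the next frame can be sketched as follows. This is a constant-velocity model in NumPy; the noise covariances and initial state are illustrative assumptions, not values from the dissertation:

```python
import numpy as np

# State: [x, y, vx, vy]; constant-velocity motion, centroid measurements.
dt = 1.0
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)   # state transition
Hm = np.array([[1, 0, 0, 0],
               [0, 1, 0, 0]], dtype=float)  # we only observe the centroid
Q = np.eye(4) * 1e-2                        # process noise (assumed)
R = np.eye(2) * 1.0                         # measurement noise (assumed)

def kalman_step(x, P, z):
    # Predict state and covariance one frame ahead.
    x = F @ x
    P = F @ P @ F.T + Q
    # Correct with the measured centroid z = (cx, cy).
    y = z - Hm @ x
    S = Hm @ P @ Hm.T + R
    K = P @ Hm.T @ np.linalg.inv(S)
    x = x + K @ y
    P = (np.eye(4) - K @ Hm) @ P
    return x, P

# Track a centroid moving one pixel per frame along x.
x = np.array([0.0, 0.0, 0.0, 0.0])
P = np.eye(4) * 10.0
for t in range(1, 20):
    x, P = kalman_step(x, P, np.array([float(t), 0.0]))
pred = (F @ x)[:2]   # predicted centroid for the next frame
```

The predicted centroid is what gets matched against the detected blobs in the following frame.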

    Motion-based vision methods and their applications

    Motion detection is a basic video-analytics operation on which many high-level computer vision tasks are built, e.g. pedestrian detection, anomaly detection, scene understanding and object tracking. Even though a large number of motion detection methods have been proposed in recent decades, some important questions remain unanswered, including: (1) how to separate the foreground from the background accurately, even under extremely challenging circumstances such as strong background motion and illumination changes; (2) how to evaluate different motion detection methods; and (3) how to use the motion information extracted by motion detection to improve high-level computer vision tasks. In this thesis, we address four problems related to motion detection. 1. How can we benchmark motion detection methods, and on which videos? Current datasets are either too small, with a limited number of scenarios, or only provide bounding-box ground truth that indicates the rough location of foreground objects. As a solution, we built the largest fully annotated motion detection dataset in the world, with pixel-accurate ground truth, to evaluate and compare motion detection methods, and organised an international competition on it (CVPR 2014). We also explored various evaluation metrics as well as different strategies for combining motion detection methods. 2. Providing pixel-accurate ground truth is a huge challenge when building a motion detection dataset. While automatic labelling methods suffer from a false detection rate too large to be used as ground truth, manual labelling of hundreds of thousands of frames is extremely time-consuming. To solve this problem, we proposed an interactive deep learning method for segmenting moving objects in videos. The proposed method reaches human-level accuracy while lowering the labelling time by a factor of 40. 3. Pedestrian detectors suffer from either false positive or false negative detections depending on the parameter tuning, and manually adjusting parameters for a large number of videos is not feasible in practice. To make pedestrian detectors more robust on a large variety of videos, we combined motion detection with various state-of-the-art pedestrian detectors through a novel motion-based nonlinear filtering process, which improves the detectors by a significant margin. 4. Scene background initialization is the process of recovering the background image of a video without the foreground objects in it. One reason background modelling remains challenging is that there is no good dataset or benchmarking framework for estimating the performance of background modelling methods. To fix this problem, we proposed an extensive survey as well as a novel benchmarking framework for scene background initialization, built the largest dataset in the world for this task, and organised an international competition on it (ICPR 2016).
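Benchmarking against pixel-accurate ground truth, as on changedetection.net-style datasets, reduces to counting true/false positives and negatives between binary masks; a minimal sketch of the standard recall/precision/F-measure computation:

```python
import numpy as np

def cdnet_scores(gt, pred):
    # gt, pred: boolean foreground masks of the same shape.
    # Recall, precision and F-measure are derived from pixel counts.
    tp = np.logical_and(gt, pred).sum()
    fp = np.logical_and(~gt, pred).sum()
    fn = np.logical_and(gt, ~pred).sum()
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    return recall, precision, f_measure

# Toy masks: 4 true foreground pixels, 6 detected (2 false positives).
gt = np.zeros((4, 4), dtype=bool); gt[:2, :2] = True
pred = np.zeros((4, 4), dtype=bool); pred[:2, :3] = True
r, p, f = cdnet_scores(gt, pred)
```

In a full benchmark these counts are accumulated over all frames of a sequence before the ratios are taken, rather than averaged per frame.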

    Real-time Foreground Object Detection Combining the PBAS Background Modelling Algorithm and Feedback from Scene Analysis Module

    The article presents a hardware implementation of the foreground object detection algorithm PBAS (Pixel-Based Adaptive Segmenter) with a scene analysis module. A mechanism for static object detection is proposed, based on consecutive frame differencing. The method makes it possible to distinguish stopped foreground objects (e.g. a car at an intersection, abandoned luggage) from false detections (so-called ghosts) using edge similarity. The improved algorithm was compared with the original version on popular test sequences from the changedetection.net dataset. The obtained results indicate that the proposed approach improves the performance of the method on sequences with stopped objects. The algorithm has been implemented and successfully verified on a hardware platform with a Virtex 7 FPGA device. The PBAS segmentation, consecutive frame differencing, Sobel edge detection and advanced one-pass connected component analysis modules were designed. The system is capable of processing 50 frames per second at a resolution of 720 × 576 pixels.
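The consecutive-frame-differencing idea for flagging stopped foreground objects can be sketched as follows. This is a minimal software version with an illustrative threshold; the article's edge-similarity test for separating stopped objects from ghosts is omitted:

```python
import numpy as np

def static_object_mask(fg_mask, prev_frame, curr_frame, diff_thresh=15):
    # A pixel flagged as foreground by the background model but showing
    # no frame-to-frame change is a candidate static (stopped) object.
    moving = np.abs(curr_frame.astype(int) - prev_frame.astype(int)) > diff_thresh
    return np.logical_and(fg_mask, ~moving)

# Toy frames: a 2x2 object that has stopped (identical in both frames)
# but is still reported as foreground by the background model.
prev = np.zeros((6, 6), dtype=np.uint8)
curr = np.zeros((6, 6), dtype=np.uint8)
prev[1:3, 1:3] = 200
curr[1:3, 1:3] = 200
fg = np.zeros((6, 6), dtype=bool)
fg[1:3, 1:3] = True
static = static_object_mask(fg, prev, curr)
```

Pixels that both belong to the foreground mask and show no inter-frame change are reported as static.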

    A practical vision system for the detection of moving objects

    The main goal of this thesis is to review and offer robust and efficient algorithms for the detection (or segmentation) of foreground objects in indoor and outdoor scenes, using colour image sequences captured by a stationary camera. For this purpose, the block diagram of a simple vision system is presented in Chapter 2. First, this block diagram establishes a precise order of blocks and their tasks, which should be performed to detect moving foreground objects. Second, a check mark on the top-right corner of a block indicates that this thesis contains a review of the most recent algorithms and/or some relevant research about it. In many computer vision applications, segmenting and extracting moving objects in video sequences is an essential task, and background subtraction has been widely used as the first step for this purpose. In this work, a review of the efficiency of a number of important background subtraction and modelling algorithms, along with their major features, is presented. In addition, two background approaches are offered: the first is a pixel-based technique, whereas the second works at the object level. For each approach, three algorithms are presented, called Selective Update Using Non-Foreground Pixels of the Input Image, Selective Update Using Temporal Averaging, and Selective Update Using Temporal Median. The first approach has deficiencies which make it incapable of producing a correct dynamic background. The three methods of the second approach use an invariant colour filter and a suitable motion-tracking technique, which selectively exclude foreground objects (or blobs) from the background frames. The difference between the three algorithms of the second approach lies in the updating process of the background pixels. It is shown that the Selective Update Using Temporal Median method produces the correct background image for each input frame.
    Representing foreground regions by their boundaries is also an important task, so an appropriate RLE contour tracing algorithm has been implemented for this purpose. However, after the thresholding process, the boundaries of foreground regions often have jagged appearances, and foreground regions may not be recognised reliably due to their corrupted boundaries. A very efficient boundary smoothing method based on the RLE data is proposed in Chapter 7. It smooths only the external and internal boundaries of foreground objects and does not distort their silhouettes; as a result, it is very fast and does not blur the image. Finally, the goal of this thesis has been to present simple, practical and efficient algorithms with few constraints which can run in real time.
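The temporal-median idea behind the preferred background update can be sketched as follows. This is a plain, non-selective per-pixel median in NumPy; the thesis's method additionally excludes tracked foreground blobs from the history before updating, which is omitted here:

```python
import numpy as np

def median_background(history):
    # history: (N, H, W) stack of recent frames. The per-pixel temporal
    # median suppresses transient foreground, recovering the background
    # as long as each pixel is background in a majority of the frames.
    return np.median(history, axis=0)

# Background is 50 everywhere; a foreground object covers one pixel in
# a minority (3 of 7) of the frames, so the median ignores it.
history = np.full((7, 4, 4), 50.0)
history[0:3, 1, 1] = 220.0
bg = median_background(history)
```

The recovered background is clean even at the pixel the object passed through, which a temporal average would have contaminated.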

    Towards perceptual intelligence : statistical modeling of human individual and interactive behaviors

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Architecture, 2000. Includes bibliographical references (p. 279-297). This thesis presents a computational framework for the automatic recognition and prediction of different kinds of human behaviors from video cameras and other sensors, via perceptually intelligent systems that automatically sense and correctly classify human behaviors by means of Machine Perception and Machine Learning techniques. In the thesis I develop the statistical machine learning algorithms (dynamic graphical models) necessary for detecting and recognizing individual and interactive behaviors. In the case of interactions, two Hidden Markov Models (HMMs) are coupled in a novel architecture called Coupled Hidden Markov Models (CHMMs) that explicitly captures the interactions between them. The algorithms for learning the parameters from data, as well as for doing inference with those models, are developed and described. Four systems that experimentally evaluate the proposed paradigm are presented: (1) LAFTER, an automatic face detection and tracking system with facial expression recognition; (2) a Tai-Chi gesture recognition system; (3) a pedestrian surveillance system that recognizes typical human-to-human interactions; and (4) a SmartCar for driver maneuver recognition. These systems capture human behaviors of different natures and increasing complexity: first, isolated, single-user facial expressions; then, two-hand gestures and human-to-human interactions; and finally, complex behaviors where human performance is mediated by a machine, more specifically, a car. The metric used for quantifying the quality of the behavior models is their accuracy: how well they are able to recognize the behaviors on testing data. Statistical machine learning usually suffers from a lack of data for estimating all the parameters in the models.
    To alleviate this problem, synthetically generated data are used to bootstrap the models, creating 'prior models' that are further trained using much less real data than would otherwise be required; the Bayesian nature of the approach lets us do so. The predictive power of these models lets us categorize human actions very soon after the beginning of the action. Because of the generic nature of the typical behaviors of each of the implemented systems, there is reason to believe that this approach to modeling human behavior would generalize to other dynamic human-machine systems. This would allow us to recognize people's intended actions automatically, and thus build control systems that dynamically adapt to better suit the human's purposes. by Nuria M. Oliver. Ph.D.

    Object detection in dynamic environments for video surveillance

    Automated video surveillance is a very active research field, driven by the need for security and control. Certain situations, however, hinder the correct operation of existing algorithms. This thesis focuses on motion detection and addresses several of the usual problems, proposing new approaches which, in the vast majority of cases, outperform other state-of-the-art proposals. In particular, we study:
    - the importance of the colour space for motion detection;
    - the effects of noise in the input video;
    - a new background model, called MFBM, that accepts any number and type of input features;
    - a method to mitigate the difficulties caused by illumination changes;
    - a non-panoramic method to detect motion with non-static cameras.
    Throughout the thesis, public repositories widely used in the motion detection field have been employed, and the results obtained have been compared with those of other existing proposals. All the code used has been published openly on the Web. The thesis reaches the following conclusions:
    - The colour space in which the input video is encoded has a notable impact on the performance of detection methods; the RGB model is not always the best option. It has also been verified that weighting the colour channels of the input video improves the performance of the methods.
    - Noise in the input video is a factor to take into account, since it conditions the performance of detection methods. Strikingly, although noise is usually harmful, it can occasionally improve detection.
    - The MFBM model outperforms the competing methods studied, all of them belonging to the state of the art.
    - The problems caused by illumination changes are significantly reduced when the proposed method is used.
    - The proposed method for detecting motion with non-static cameras outperforms other existing proposals in the vast majority of cases.
    Of the 280 bibliographic entries consulted, the following stand out:
    - C. Wren, A. Azarbayejani, T. Darrell, and A. Pentland, "Pfinder: real-time tracking of the human body," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 19, pp. 780–785, 1997.
    - C. Stauffer and W. Grimson, "Adaptive background mixture models for real-time tracking," in Proc. IEEE Intl. Conf. on Computer Vision and Pattern Recognition, 1999.
    - L. Li, W. Huang, I.-H. Gu, and Q. Tian, "Statistical modeling of complex backgrounds for foreground object detection," IEEE Transactions on Image Processing, vol. 13, pp. 1459–1472, 2004.
    - T. Bouwmans, "Traditional and recent approaches in background modeling for foreground detection: An overview," Computer Science Review, vol. 11-12, pp. 31–66, 2014.
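The channel-weighting finding can be illustrated with a minimal sketch: a per-channel weighted distance between the current frame and the background model, so that some colour channels count more than others. The weight values here are hypothetical, not values from the thesis:

```python
import numpy as np

def weighted_difference(frame, background, weights=(0.2, 0.5, 0.3)):
    # Weighted per-channel absolute difference between the frame and the
    # background model; the weights (hypothetical values) let individual
    # colour channels contribute more or less to the detection score.
    w = np.asarray(weights, dtype=float)
    diff = np.abs(frame.astype(float) - background.astype(float))
    return (diff * w).sum(axis=-1)

# Toy 4x4 RGB background and a frame that changes only in channel 2
# at one pixel; the weighting scales that channel's contribution.
bg = np.full((4, 4, 3), 100.0)
frame = bg.copy()
frame[1, 1] = [100.0, 140.0, 100.0]
d = weighted_difference(frame, bg)
```

Thresholding `d` then yields the foreground mask; tuning the weights per colour space is what the thesis reports as beneficial.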

    An Eigenbackground Subtraction Method using Recursive Error Compensation

    Eigenbackground subtraction is a common method for moving-object detection. The method uses the difference between the input image and the reconstructed background image to detect foreground objects, based on eigenvalue decomposition. In the method, foreground regions are represented in the image reconstructed from the eigenbackground, which produces errors that are spread over the entire reconstructed reference image. This also degrades the quality of the reconstructed background, leading to inaccurate moving-object detection. To compensate for these regions, an efficient recursive error compensation method is proposed. The experimental results show that the proposed method constructs a better approximation of the background, so that more accurate foreground objects can be detected from the reconstructed background.
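The baseline eigenbackground reconstruction-and-threshold step (without the paper's recursive error compensation) can be sketched in NumPy as follows; the component count and threshold are illustrative assumptions:

```python
import numpy as np

def eigenbackground(frames, frame, n_components=3, thresh=50.0):
    # frames: (N, H, W) background training frames. Build an eigenspace
    # from them, project a new frame onto it, and threshold the
    # per-pixel reconstruction error to find foreground pixels.
    N, H, W = frames.shape
    X = frames.reshape(N, -1)
    mean = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    B = Vt[:n_components]                  # top eigenbackgrounds
    f = frame.reshape(-1) - mean
    recon = mean + B.T @ (B @ f)           # projection onto the eigenspace
    err = np.abs(frame.reshape(-1) - recon)
    return (err > thresh).reshape(H, W)

# Training frames: background ~100 with mild noise; the test frame
# contains a bright 2x2 foreground blob.
rng = np.random.default_rng(1)
frames = 100 + rng.normal(0, 1, size=(12, 8, 8))
frame = np.full((8, 8), 100.0)
frame[3:5, 3:5] = 250.0
mask = eigenbackground(frames, frame)
```

Because the blob is represented (partially) in the reconstruction, its error leaks across the whole image; this is exactly the effect the paper's recursive error compensation targets.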