6 research outputs found

    Binary object recognition system on FPGA with bSOM

    The Tri-state Self-Organizing Map (bSOM), which takes binary inputs and maintains tri-state weights, is used in this paper for classification rather than clustering. The major contribution is a demonstration of the modified bSOM's potential use in security surveillance, as a recognition system on FPGA.
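
    The abstract does not spell out bSOM's matching rule, but the idea of binary inputs against tri-state weights can be illustrated. Below is a minimal sketch, assuming weights take values in {0, 1, don't-care} and the best-matching unit is found by a Hamming-style distance that ignores don't-care positions; the names and the classification step are illustrative, not the paper's actual implementation.

    ```python
    import numpy as np

    # Illustrative sketch only: the abstract does not give bSOM's exact rules.
    # Assumed here: tri-state weights in {0, 1, DONT_CARE}, where DONT_CARE
    # matches either input bit, and classification by best-matching unit.

    DONT_CARE = 2

    def bsom_distance(x, w):
        """Hamming-style distance between a binary input x and a tri-state
        weight vector w; don't-care positions contribute no cost."""
        care = w != DONT_CARE
        return np.count_nonzero(x[care] != w[care])

    def classify(x, weights, labels):
        """Label x with the class of the nearest weight vector, i.e. use
        the trained map for classification rather than clustering."""
        return labels[int(np.argmin([bsom_distance(x, w) for w in weights]))]

    # Toy usage: three units over 8-bit inputs, one label per unit.
    weights = np.array([[1, 0, 1, 1, DONT_CARE, 0, 0, 1],
                        [0, 0, 0, 1, 1, DONT_CARE, 1, 0],
                        [1, 1, 1, 0, 0, 0, DONT_CARE, 1]])
    labels = ["person", "vehicle", "background"]
    print(classify(np.array([1, 0, 1, 1, 1, 0, 0, 1]), weights, labels))
    ```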

    Patch-Based Experiments with Object Classification in Video Surveillance

    We present a patch-based algorithm for object classification in video surveillance. Within detected regions-of-interest (ROIs) of moving objects in the scene, a feature vector is calculated based on template matching of a large set of image patches. Instead of matching raw image pixels, we use Gabor-filtered versions of the input image at several scales. This approach is adopted from recent experiments in generic object-recognition tasks. We present results for a new, typical video surveillance dataset containing over 9,000 object images, and we compare our system's performance on a smaller existing surveillance dataset. We have found that with 50 or more training samples, our detection rate is above 95% on average. Because of the inherent scalability of the algorithm, an embedded system implementation is well within reach.
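
    As a concrete illustration of the Gabor front end described above, the following sketch builds a small filter bank with OpenCV and scores one stored patch against each response map. The scales, orientations, and filter parameters are assumptions; the paper's actual patch set and matching details are not given in the abstract.

    ```python
    import cv2
    import numpy as np

    # Sketch of a Gabor front end with OpenCV. The paper's actual scales,
    # orientations, and patch set are not given in the abstract; every
    # parameter value below is an assumption.

    def gabor_bank(gray, scales=(7, 11, 15), orientations=4):
        """Filter a grayscale image at several scales and orientations,
        returning one float32 response map per kernel."""
        responses = []
        for ksize in scales:
            for i in range(orientations):
                kernel = cv2.getGaborKernel((ksize, ksize),
                                            sigma=ksize / 3.0,
                                            theta=i * np.pi / orientations,
                                            lambd=ksize / 2.0,
                                            gamma=0.5, psi=0)
                responses.append(cv2.filter2D(gray, cv2.CV_32F, kernel))
        return responses

    def patch_scores(responses, patch):
        """One feature entry per response map: the best normalized
        template-match score of a stored patch (float32, smaller than
        the image) against that map."""
        return [cv2.matchTemplate(r, patch, cv2.TM_CCOEFF_NORMED).max()
                for r in responses]
    ```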

    Person detection: unmanned system and small sensor applications

    Thesis (S.M.), Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2008. Includes bibliographical references (p. 97-99). The ability to quickly and reliably detect people in images and video is highly desired. Several object-recognition algorithms have demonstrated successful detection of multiclass objects with varied scale, position, and orientation. This study examines the effectiveness of these methods when applied to detecting humans in two distinct domains: A) leave-behind sensing and B) aerial surveillance. Using novel image sets that are significantly more realistic and difficult than standard datasets, a variety of tests are conducted to compare the algorithms in terms of classification success rate. Dalal and Triggs' Histogram of Oriented Gradients algorithm, when trained with image samples taken from inside MIT's Stata Center, detects, with no false positives, all but one person in six minutes of video taken inside a separate building. An enhanced version of Riesenhuber and Poggio's cortex-like recognition model, trained to detect people with an independent set of images, correctly classifies 95% of images taken from a small UAV. These results illustrate the potential to accurately and reliably determine the presence of people in video from unmanned aircraft and indoor sensors. By Paul Edward Rosendall, S.M.
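
    The thesis trains Dalal and Triggs' HOG detector on its own MIT image samples; those samples are not available here, so the sketch below uses OpenCV's stock HOG person detector (trained on a public pedestrian dataset) to show how such a detector is applied to video frames. The video filename is hypothetical.

    ```python
    import cv2

    # Stand-in for the thesis's retrained detector: OpenCV's default
    # HOG + linear-SVM person detector in the Dalal-Triggs style.

    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

    cap = cv2.VideoCapture("indoor_video.mp4")  # hypothetical input file
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Sliding-window detection over an image pyramid.
        rects, weights = hog.detectMultiScale(frame, winStride=(8, 8),
                                              padding=(8, 8), scale=1.05)
        for (x, y, w, h) in rects:
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    ```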

    Tracking dynamic regions of texture and shape

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007. Includes bibliographical references (p. 137-142). The tracking of visual phenomena is a problem of fundamental importance in computer vision. Tracks are used in many contexts, including object recognition, classification, camera calibration, and scene understanding. However, the use of such data is limited by the types of objects we are able to track and the environments in which we can track them. Objects whose shape or appearance can change in complex ways are difficult to track, as it is difficult to represent or predict the appearance of such objects. Furthermore, other elements of the scene may interact with the tracked object, changing its appearance or hiding part or all of it from view. In this thesis, we address the problem of tracking deformable, dynamically textured regions under challenging conditions involving visual clutter, distractions, and multiple and prolonged occlusions. We introduce a model of appearance capable of compactly representing regions undergoing nonuniform, nonrepeating changes to both their textured appearance and shape. We describe methods of maintaining such a model and show how it enables efficient and effective occlusion reasoning. By treating the visual appearance as a dynamically changing textured region, we show how such a model enables the tracking of groups of people. By tracking groups of people instead of each individual independently, we are able to track in environments where it would otherwise be difficult or impossible. We demonstrate the utility of the model by tracking many regions under diverse conditions, including indoor and outdoor scenes, near-field and far-field camera positions, through occlusion and through complex interactions with other visual elements, and by tracking such varied phenomena as meteorological data, seismic imagery, and groups of people. By Joshua Migdal, Ph.D.
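
    The thesis's appearance model is not specified in the abstract. As loose context for the ideas above, the sketch below shows one generic way to maintain a per-pixel appearance estimate for a tracked region while flagging strongly deviating pixels as occluded; the update rule and thresholds are assumptions, not the thesis's method.

    ```python
    import numpy as np

    # Generic illustration only: maintain an adaptive per-pixel appearance
    # model for a tracked region and reason about occlusion per pixel.
    # The adaptation rate and occlusion threshold are assumed values.

    class RegionAppearance:
        def __init__(self, patch, alpha=0.05, occ_thr=40.0):
            self.mean = patch.astype(np.float32)  # per-pixel appearance estimate
            self.alpha = alpha                    # adaptation rate for texture change
            self.occ_thr = occ_thr                # residual above this => occluded

        def update(self, patch):
            """Adapt to nonuniform appearance change; skip occluded pixels."""
            obs = patch.astype(np.float32)
            residual = np.abs(obs - self.mean)
            visible = residual < self.occ_thr    # per-pixel occlusion reasoning
            self.mean[visible] += self.alpha * (obs[visible] - self.mean[visible])
            return ~visible                      # mask of pixels judged occluded
    ```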

    Domain adaptation for pedestrian detection

    Object detection is an essential component of many computer vision systems. The increase in the amount of collected digital data and new applications of computer vision have generated a demand for object detectors for many different types of scenes digitally captured in diverse settings. The appearance of objects captured across these different scenarios can vary significantly, causing readily available state-of-the-art object detectors to perform poorly in many of the scenes. One solution is to collect and annotate labelled data for each new scene and train a scene-specific object detector specialised to perform well for that scene, but such a method is labour-intensive and impractical. In this thesis, we propose three novel contributions for learning scene-specific pedestrian detectors with minimal human supervision effort. In the first and second contributions, we formulate the problem as unsupervised domain adaptation, in which a readily available generic pedestrian detector is automatically adapted to specific scenes (without any labelled data from the scenes). In the third contribution, we formulate it as a weakly supervised learning algorithm requiring annotations of only pedestrian centres. The first contribution is a detector adaptation algorithm using joint dataset feature learning. We use state-of-the-art deep learning for detector adaptation, exploiting the assumption that the data lies on a low-dimensional manifold. The algorithm significantly outperforms a state-of-the-art approach that makes use of a similar manifold assumption. The second contribution presents an efficient detector adaptation algorithm that makes effective use of cues (e.g., spatio-temporal constraints) available in video. We show that, for videos, such cues can dramatically help with detector adaptation. We extensively compare our approach with state-of-the-art algorithms and show that it outperforms the competing approaches despite being simpler to implement and apply. In the third contribution, we reduce manual annotation effort by formulating the problem as a weakly supervised learning algorithm that requires annotation of only the approximate centres of pedestrians (instead of the usual precise bounding boxes). Instead of assuming the availability of a generic detector and adapting it to new scenes as in the first two contributions, we collect manual annotations for new scenes but make the annotation task easier and faster. Our algorithm reduces the manual annotation effort by approximately four times while maintaining detection performance similar to standard training methods. We evaluate each of the proposed algorithms on two challenging, publicly available video datasets.
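
    The adaptation algorithms themselves are not detailed in the abstract. The sketch below illustrates the general self-training idea behind such detector adaptation: high-confidence detections from a generic detector become pseudo-labels, filtered here by a simple spatio-temporal persistence cue. The helper names (generic_detector, retrain) are hypothetical stand-ins, and this is not the thesis's actual algorithm.

    ```python
    # Generic self-training illustration (not the thesis's algorithm):
    # confident detections from a generic detector become pseudo-labels for
    # the new scene, kept only if they persist across consecutive frames.
    # `generic_detector` yields (box, score) pairs per frame; `retrain` is
    # a hypothetical stand-in for scene-specific training.

    def iou(a, b):
        """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
        x1, y1 = max(a[0], b[0]), max(a[1], b[1])
        x2, y2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, x2 - x1) * max(0, y2 - y1)
        area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
        return inter / float(area(a) + area(b) - inter)

    def collect_pseudo_labels(frames, generic_detector,
                              score_thr=0.9, iou_thr=0.5):
        """Keep confident detections that reappear nearby in the next
        frame; these become positives for a scene-specific detector."""
        pseudo, prev = [], []
        for frame in frames:
            boxes = [(b, s) for b, s in generic_detector(frame)
                     if s >= score_thr]
            pseudo += [(frame, b) for b, _ in boxes
                       if any(iou(b, p) >= iou_thr for p, _ in prev)]
            prev = boxes
        return pseudo

    # scene_detector = retrain(generic_detector,
    #                          collect_pseudo_labels(frames, generic_detector))
    ```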

    Object detection in dynamic environments for video surveillance

    Automatic video surveillance is a very active research field, driven by the need for security and control. Certain situations hinder the correct operation of existing algorithms. This thesis focuses on motion detection and addresses several of the usual difficulties, proposing new approaches that, in the vast majority of cases, outperform other state-of-the-art proposals. In particular, we study:
    - The importance of the colour space for motion detection.
    - The effects of noise in the input video.
    - A new background model, called MFBM, which accepts any number and type of input features.
    - A method to mitigate the difficulties caused by illumination changes.
    - A non-panoramic method for detecting motion with non-static cameras.
    Throughout the thesis, several public repositories widely used in the motion-detection field have been employed, and the results obtained have been compared with those of other existing proposals. All the code used has been made publicly available on the Web. The thesis reaches the following conclusions:
    - The colour space in which the input video is encoded has a notable impact on the performance of detection methods; the RGB model is not always the best option. Weighting the colour channels of the input video has also been shown to improve the methods' performance.
    - Noise in the input video is a factor to take into account in motion detection, since it conditions the methods' performance. Strikingly, although noise is usually harmful, it can occasionally improve detection.
    - The MFBM model outperforms the other competing methods studied, all of them belonging to the state of the art.
    - The problems caused by illumination changes are significantly reduced by the proposed method.
    - The proposed method for detecting motion with non-static cameras outperforms other existing proposals in the vast majority of cases.
    Of the 280 bibliographic entries consulted, the following stand out:
    - C. Wren, A. Azarbayejani, T. Darrell, and A. Pentland, "Pfinder: real-time tracking of the human body," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 19, pp. 780-785, 1997.
    - C. Stauffer and W. Grimson, "Adaptive background mixture models for real-time tracking," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 1999.
    - L. Li, W. Huang, I.-H. Gu, and Q. Tian, "Statistical modeling of complex backgrounds for foreground object detection," IEEE Trans. on Image Processing, vol. 13, pp. 1459-1472, 2004.
    - T. Bouwmans, "Traditional and recent approaches in background modeling for foreground detection: An overview," Computer Science Review, vol. 11-12, pp. 31-66, 2014.
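
    The MFBM model itself is not reproduced here, but the Stauffer and Grimson adaptive background mixture cited above is the classic baseline for this task; a minimal sketch of it follows, using OpenCV's implementation and a hypothetical input video.

    ```python
    import cv2

    # Stauffer-Grimson-style adaptive Gaussian-mixture background model,
    # as implemented in OpenCV. The video filename is hypothetical.

    subtractor = cv2.createBackgroundSubtractorMOG2(history=500,
                                                    varThreshold=16,
                                                    detectShadows=True)

    cap = cv2.VideoCapture("surveillance.mp4")
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        mask = subtractor.apply(frame)   # per-pixel foreground mask, updated online
        # Remove speckle noise before extracting moving objects.
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    ```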