
    Design, implementation and evaluation of automated surveillance systems

    Pattern recognition has reached a level of sophistication that allows us to recognize different types of events, including dangers, and to act accordingly to minimize the impact of a difficult situation and handle it in the best possible way. However, we believe that more efficient applications can still be achieved with more accurate algorithms. Our application seeks to incorporate the new programming paradigm, neural networks. Our initial idea was to explore the alternative offered by the new convolutional neural networks, whose example videos showed the high detection and identification rate that, for instance, YOLOv2 could achieve. After comparing their characteristics, we found that YOLOv3 offered a good balance between accuracy and speed, as we discuss later. Because of the low detection rate, we use Kalman filters to help with the re-identification of people and objects. In this project we also survey the video surveillance alternatives available from companies in the sector and the kinds of products they offer, and we review the work of research groups at other universities that most closely resembles our objective. We therefore use this neural network to detect events such as abandoned backpacks and to show the transit density at specific locations, and we use a more traditional methodology, optical flow, to detect abnormal behaviour in a crowd.

    Automatic surveillance systems are becoming more and more sophisticated as the computing power available to them increases. The aim of this project is to take advantage of these tools and, with the new classification and detection technology brought by neural networks, develop a surveillance application that can recognize certain behaviours: the detection of lost backpacks and suitcases, the detection of abnormal crowd activity, and a heatmap of occupation density. Python was selected as the programming language, with YOLO and OpenCV forming the spine of the project. Testing showed that, because of the constraints on detecting small objects, the project does not perform as it should for real-world use, but it still shows potential for the detection of lost backpacks in certain videos from the GBA dataset [1] and the PETS2006 dataset [2]. The abnormal activity detection for crowds uses a simple algorithm that seems to perform well, detecting the anomalies in the whole testing dataset used, which was generated by the University of Minnesota [3]. Finally, the heatmap correctly displays the projection of people on the ground for five seconds, as intended. The objective of this software is to form part of the core of a possible future application with more modules, able to perform fully automated surveillance tasks and gather useful data; these advances and future proposals are explained in this report.

    Máster Universitario en Ingeniería Industrial (M141
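
    The five-second ground heatmap mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes that each detected person has already been projected to a ground-plane position (x, y), that timestamps arrive in non-decreasing order, and that a coarse grid is acceptable. The class name `GroundHeatmap` and its parameters are hypothetical.

```python
import math
from collections import deque


class GroundHeatmap:
    """Sliding-window occupancy grid over ground-plane positions (illustrative)."""

    def __init__(self, width, height, cell=1.0, window_s=5.0):
        self.cell = cell                      # side length of one grid cell
        self.window_s = window_s              # how long a position stays visible
        self.cols = int(width / cell)
        self.rows = int(height / cell)
        self.events = deque()                 # (timestamp, row, col), oldest first
        self.grid = [[0] * self.cols for _ in range(self.rows)]

    def _expire(self, now):
        # Drop positions older than the time window and undo their counts.
        while self.events and now - self.events[0][0] > self.window_s:
            _, row, col = self.events.popleft()
            self.grid[row][col] -= 1

    def add(self, t, x, y):
        # Record one ground-plane position observed at time t (seconds).
        self._expire(t)
        row = math.floor(y / self.cell)
        col = math.floor(x / self.cell)
        if 0 <= row < self.rows and 0 <= col < self.cols:
            self.events.append((t, row, col))
            self.grid[row][col] += 1


heatmap = GroundHeatmap(width=4, height=4)
heatmap.add(0.0, 1.5, 2.5)
heatmap.add(1.0, 1.2, 2.9)
```

    After the two calls above, the cell at row 2, column 1 holds a count of 2; once a later observation arrives more than five seconds after one of them, that older contribution is removed again.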

    Similarity learning for person re-identification and semantic video retrieval

    Many computer vision problems boil down to learning a good visual similarity function that scores how likely two instances are to share the same semantic concept. In this thesis, we focus on two problems related to similarity learning: Person Re-Identification and Semantic Video Retrieval. Person Re-Identification aims to maintain the identity of an individual in diverse locations across different non-overlapping camera views. Starting with two cameras, we propose a novel appearance model based on visual word co-occurrence to measure the similarities between pedestrian images. This model naturally accounts for spatial similarities and variations caused by pose, illumination and configuration changes across camera views. As a generalization to multiple camera views, we introduce the Group Membership Prediction (GMP) problem. The GMP problem involves predicting whether a collection of instances shares the same semantic property. In this context, we propose a novel probability model and introduce latent view-specific and view-shared random variables to jointly account for the view-specific appearance and cross-view similarities among data instances. Our method is tested on various benchmarks, demonstrating superior accuracy over the state of the art. Semantic Video Retrieval seeks to match complex activities in a surveillance video to user-described queries. In surveillance scenarios, where noise and clutter are usually present, visual uncertainties introduced by error-prone low-level detectors, classifiers and trackers make up a significant part of the semantic gap between user-defined queries and the archived video. To bridge the gap, we propose a novel probabilistic activity localization formulation that incorporates learning of object attributes, between-object relationships, and object re-identification without activity-level training data. 
Our experiments demonstrate that the introduction of similarity learning components effectively compensates for noise and error in previous stages and results in favorable performance on both aerial and ground surveillance videos. Considering the computational complexity of our similarity learning models, we attempt to develop a way of training complicated models efficiently while retaining good performance. As a proof of concept, we propose training deep neural networks for supervised learning of hash codes. With slight changes to the optimization formulation, we could explore the possibility of incorporating the training framework into Person Re-Identification and related problems.

    2019-07-09T00:00:00