122 research outputs found

    Long Range Automated Persistent Surveillance

    Get PDF
    This dissertation addresses long range automated persistent surveillance with focus on three topics: sensor planning, size preserving tracking, and high magnification imaging. field of view should be reserved so that camera handoff can be executed successfully before the object of interest becomes unidentifiable or untraceable. We design a sensor planning algorithm that not only maximizes coverage but also ensures uniform and sufficient overlapped camera’s field of view for an optimal handoff success rate. This algorithm works for environments with multiple dynamic targets using different types of cameras. Significantly improved handoff success rates are illustrated via experiments using floor plans of various scales. Size preserving tracking automatically adjusts the camera’s zoom for a consistent view of the object of interest. Target scale estimation is carried out based on the paraperspective projection model which compensates for the center offset and considers system latency and tracking errors. A computationally efficient foreground segmentation strategy, 3D affine shapes, is proposed. The 3D affine shapes feature direct and real-time implementation and improved flexibility in accommodating the target’s 3D motion, including off-plane rotations. The effectiveness of the scale estimation and foreground segmentation algorithms is validated via both offline and real-time tracking of pedestrians at various resolution levels. Face image quality assessment and enhancement compensate for the performance degradations in face recognition rates caused by high system magnifications and long observation distances. A class of adaptive sharpness measures is proposed to evaluate and predict this degradation. A wavelet based enhancement algorithm with automated frame selection is developed and proves efficient by a considerably elevated face recognition rate for severely blurred long range face images

    QUIS-CAMPI: Biometric Recognition in Surveillance Scenarios

    Get PDF
    The concerns about individuals security have justified the increasing number of surveillance cameras deployed both in private and public spaces. However, contrary to popular belief, these devices are in most cases used solely for recording, instead of feeding intelligent analysis processes capable of extracting information about the observed individuals. Thus, even though video surveillance has already proved to be essential for solving multiple crimes, obtaining relevant details about the subjects that took part in a crime depends on the manual inspection of recordings. As such, the current goal of the research community is the development of automated surveillance systems capable of monitoring and identifying subjects in surveillance scenarios. Accordingly, the main goal of this thesis is to improve the performance of biometric recognition algorithms in data acquired from surveillance scenarios. In particular, we aim at designing a visual surveillance system capable of acquiring biometric data at a distance (e.g., face, iris or gait) without requiring human intervention in the process, as well as devising biometric recognition methods robust to the degradation factors resulting from the unconstrained acquisition process. Regarding the first goal, the analysis of the data acquired by typical surveillance systems shows that large acquisition distances significantly decrease the resolution of biometric samples, and thus their discriminability is not sufficient for recognition purposes. In the literature, diverse works point out Pan Tilt Zoom (PTZ) cameras as the most practical way for acquiring high-resolution imagery at a distance, particularly when using a master-slave configuration. In the master-slave configuration, the video acquired by a typical surveillance camera is analyzed for obtaining regions of interest (e.g., car, person) and these regions are subsequently imaged at high-resolution by the PTZ camera. Several methods have already shown that this configuration can be used for acquiring biometric data at a distance. Nevertheless, these methods failed at providing effective solutions to the typical challenges of this strategy, restraining its use in surveillance scenarios. Accordingly, this thesis proposes two methods to support the development of a biometric data acquisition system based on the cooperation of a PTZ camera with a typical surveillance camera. The first proposal is a camera calibration method capable of accurately mapping the coordinates of the master camera to the pan/tilt angles of the PTZ camera. The second proposal is a camera scheduling method for determining - in real-time - the sequence of acquisitions that maximizes the number of different targets obtained, while minimizing the cumulative transition time. In order to achieve the first goal of this thesis, both methods were combined with state-of-the-art approaches of the human monitoring field to develop a fully automated surveillance capable of acquiring biometric data at a distance and without human cooperation, designated as QUIS-CAMPI system. The QUIS-CAMPI system is the basis for pursuing the second goal of this thesis. The analysis of the performance of the state-of-the-art biometric recognition approaches shows that these approaches attain almost ideal recognition rates in unconstrained data. However, this performance is incongruous with the recognition rates observed in surveillance scenarios. Taking into account the drawbacks of current biometric datasets, this thesis introduces a novel dataset comprising biometric samples (face images and gait videos) acquired by the QUIS-CAMPI system at a distance ranging from 5 to 40 meters and without human intervention in the acquisition process. This set allows to objectively assess the performance of state-of-the-art biometric recognition methods in data that truly encompass the covariates of surveillance scenarios. As such, this set was exploited for promoting the first international challenge on biometric recognition in the wild. This thesis describes the evaluation protocols adopted, along with the results obtained by the nine methods specially designed for this competition. In addition, the data acquired by the QUIS-CAMPI system were crucial for accomplishing the second goal of this thesis, i.e., the development of methods robust to the covariates of surveillance scenarios. The first proposal regards a method for detecting corrupted features in biometric signatures inferred by a redundancy analysis algorithm. The second proposal is a caricature-based face recognition approach capable of enhancing the recognition performance by automatically generating a caricature from a 2D photo. The experimental evaluation of these methods shows that both approaches contribute to improve the recognition performance in unconstrained data.A crescente preocupação com a segurança dos indivíduos tem justificado o crescimento do número de câmaras de vídeo-vigilância instaladas tanto em espaços privados como públicos. Contudo, ao contrário do que normalmente se pensa, estes dispositivos são, na maior parte dos casos, usados apenas para gravação, não estando ligados a nenhum tipo de software inteligente capaz de inferir em tempo real informações sobre os indivíduos observados. Assim, apesar de a vídeo-vigilância ter provado ser essencial na resolução de diversos crimes, o seu uso está ainda confinado à disponibilização de vídeos que têm que ser manualmente inspecionados para extrair informações relevantes dos sujeitos envolvidos no crime. Como tal, atualmente, o principal desafio da comunidade científica é o desenvolvimento de sistemas automatizados capazes de monitorizar e identificar indivíduos em ambientes de vídeo-vigilância. Esta tese tem como principal objetivo estender a aplicabilidade dos sistemas de reconhecimento biométrico aos ambientes de vídeo-vigilância. De forma mais especifica, pretende-se 1) conceber um sistema de vídeo-vigilância que consiga adquirir dados biométricos a longas distâncias (e.g., imagens da cara, íris, ou vídeos do tipo de passo) sem requerer a cooperação dos indivíduos no processo; e 2) desenvolver métodos de reconhecimento biométrico robustos aos fatores de degradação inerentes aos dados adquiridos por este tipo de sistemas. No que diz respeito ao primeiro objetivo, a análise aos dados adquiridos pelos sistemas típicos de vídeo-vigilância mostra que, devido à distância de captura, os traços biométricos amostrados não são suficientemente discriminativos para garantir taxas de reconhecimento aceitáveis. Na literatura, vários trabalhos advogam o uso de câmaras Pan Tilt Zoom (PTZ) para adquirir imagens de alta resolução à distância, principalmente o uso destes dispositivos no modo masterslave. Na configuração master-slave um módulo de análise inteligente seleciona zonas de interesse (e.g. carros, pessoas) a partir do vídeo adquirido por uma câmara de vídeo-vigilância e a câmara PTZ é orientada para adquirir em alta resolução as regiões de interesse. Diversos métodos já mostraram que esta configuração pode ser usada para adquirir dados biométricos à distância, ainda assim estes não foram capazes de solucionar alguns problemas relacionados com esta estratégia, impedindo assim o seu uso em ambientes de vídeo-vigilância. Deste modo, esta tese propõe dois métodos para permitir a aquisição de dados biométricos em ambientes de vídeo-vigilância usando uma câmara PTZ assistida por uma câmara típica de vídeo-vigilância. O primeiro é um método de calibração capaz de mapear de forma exata as coordenadas da câmara master para o ângulo da câmara PTZ (slave) sem o auxílio de outros dispositivos óticos. O segundo método determina a ordem pela qual um conjunto de sujeitos vai ser observado pela câmara PTZ. O método proposto consegue determinar em tempo-real a sequência de observações que maximiza o número de diferentes sujeitos observados e simultaneamente minimiza o tempo total de transição entre sujeitos. De modo a atingir o primeiro objetivo desta tese, os dois métodos propostos foram combinados com os avanços alcançados na área da monitorização de humanos para assim desenvolver o primeiro sistema de vídeo-vigilância completamente automatizado e capaz de adquirir dados biométricos a longas distâncias sem requerer a cooperação dos indivíduos no processo, designado por sistema QUIS-CAMPI. O sistema QUIS-CAMPI representa o ponto de partida para iniciar a investigação relacionada com o segundo objetivo desta tese. A análise do desempenho dos métodos de reconhecimento biométrico do estado-da-arte mostra que estes conseguem obter taxas de reconhecimento quase perfeitas em dados adquiridos sem restrições (e.g., taxas de reconhecimento maiores do que 99% no conjunto de dados LFW). Contudo, este desempenho não é corroborado pelos resultados observados em ambientes de vídeo-vigilância, o que sugere que os conjuntos de dados atuais não contêm verdadeiramente os fatores de degradação típicos dos ambientes de vídeo-vigilância. Tendo em conta as vulnerabilidades dos conjuntos de dados biométricos atuais, esta tese introduz um novo conjunto de dados biométricos (imagens da face e vídeos do tipo de passo) adquiridos pelo sistema QUIS-CAMPI a uma distância máxima de 40m e sem a cooperação dos sujeitos no processo de aquisição. Este conjunto permite avaliar de forma objetiva o desempenho dos métodos do estado-da-arte no reconhecimento de indivíduos em imagens/vídeos capturados num ambiente real de vídeo-vigilância. Como tal, este conjunto foi utilizado para promover a primeira competição de reconhecimento biométrico em ambientes não controlados. Esta tese descreve os protocolos de avaliação usados, assim como os resultados obtidos por 9 métodos especialmente desenhados para esta competição. Para além disso, os dados adquiridos pelo sistema QUIS-CAMPI foram essenciais para o desenvolvimento de dois métodos para aumentar a robustez aos fatores de degradação observados em ambientes de vídeo-vigilância. O primeiro é um método para detetar características corruptas em assinaturas biométricas através da análise da redundância entre subconjuntos de características. O segundo é um método de reconhecimento facial baseado em caricaturas automaticamente geradas a partir de uma única foto do sujeito. As experiências realizadas mostram que ambos os métodos conseguem reduzir as taxas de erro em dados adquiridos de forma não controlada

    ACCURATE TRACKING OF OBJECTS USING LEVEL SETS

    Get PDF
    Our current work presents an approach to tackle the challenging task of tracking objects in Internet videos taken from large web repositories such as YouTube. Such videos more often than not, are captured by users using their personal hand-held cameras and cellphones and hence suffer from problems such as poor quality, camera jitter and unconstrained lighting and environmental settings. Also, it has been observed that events being recorded by such videos usually contain objects moving in an unconstrained fashion. Hence, tracking objects in Internet videos is a very challenging task in the field of computer vision since there is no a-priori information about the types of objects we might encounter, their velocities while in motion or intrinsic camera parameters to estimate the location of object in each frame. Hence, in this setting it is clearly not possible to model objects as single homogenous distributions in feature space. The feature space itself cannot be fixed since different objects might be discriminable in different sub-spaces. Keeping these challenges in mind, in the current proposed technique, each object is divided into multiple fragments or regions and each fragment is represented in Gaussian Mixture model (GMM) in a joint feature-spatial space. Each fragment is automatically selected from the image data by adapting to image statistics using a segmentation technique. We introduce the concept of strength map which represents a probability distribution of the image statistics and is used to detecting the object. We extend our goal of tracking object to tracking them with accurate boundaries thereby making the current task more challenging. We solve this problem by modeling the object using a level sets framework, which helps in preserving accurate boundaries of the object and as well in modeling the target object and background. These extracted object boundaries are learned dynamically over time, enabling object tracking even during occlusion. Our proposed algorithm performs significantly better than any of the existing object modeling techniques. Experimental results have been shown in support of this claim. Apart from tracking, the present algorithm can also be applied to different scenarios. One such application is contour-based object detection. Also, the idea of strength map was successfully applied to track objects such as vessels and vehicles on a wide range of videos, as a part of the summer internship program

    Low and Variable Frame Rate Face Tracking Using an IP PTZ Camera

    Get PDF
    RÉSUMÉ En vision par ordinateur, le suivi d'objets avec des caméras PTZ a des applications dans divers domaines, tels que la surveillance vidéo, la surveillance du trafic, la surveillance de personnes et la reconnaissance de visage. Toutefois, un suivi plus précis, efficace, et fiable est requis pour une utilisation courante dans ces domaines. Dans cette thèse, le suivi est appliqué au haut du corps d'un humain, en incluant son visage. Le suivi du visage permet de déterminer son emplacement pour chaque trame d'une vidéo. Il peut être utilisé pour obtenir des images du visage d'un humain dans des poses différentes. Dans ce travail, nous proposons de suivre le visage d'un humain à l’aide d'une caméra IP PTZ (caméra réseau orientable). Une caméra IP PTZ répond à une commande via son serveur Web intégré et permet un accès distribué à partir d'Internet. Le suivi avec ce type de caméra inclut un bon nombre de défis, tels que des temps de réponse irrégulier aux commandes de contrôle, des taux de trame faibles et irréguliers, de grand mouvements de la cible entre deux trames, des occlusions, des modifications au champ de vue, des changements d'échelle, etc. Dans notre travail, nous souhaitons solutionner les problèmes des grands mouvements de la cible entre deux trames consécutives, du faible taux de trame, des modifications de l'arrière-plan, et du suivi avec divers changements d'échelle. En outre, l'algorithme de suivi doit prévoir les temps de réponse irréguliers de la caméra. Notre solution se compose d’une phase d’initialisation pour modéliser la cible (haut du corps), d’une adaptation du filtre de particules qui utilise le flux optique pour générer des échantillons à chaque trame (APF-OFS), et du contrôle de la caméra. Chaque composante exige des stratégies différentes. Lors de l'initialisation, on suppose que la caméra est statique. Ainsi, la détection du mouvement par soustraction d’arrière-plan est utilisée pour détecter l'emplacement initial de la personne. Ensuite, pour supprimer les faux positifs, un classificateur Bayesien est appliqué sur la région détectée afin de localiser les régions avec de la peau. Ensuite, une détection du visage basée sur la méthode de Viola et Jones est effectuée sur les régions de la peau. Si un visage est détecté, le suivi est lancé sur le haut du corps de la personne.----------ABSTRACT Object tracking with PTZ cameras has various applications in different computer vision topics such as video surveillance, traffic monitoring, people monitoring and face recognition. Accurate, efficient, and reliable tracking is required for this task. Here, object tracking is applied to human upper body tracking and face tracking. Face tracking determines the location of the human face for each input image of a video. It can be used to get images of the face of a human target under different poses. We propose to track the human face by means of an Internet Protocol (IP) Pan-Tilt-Zoom (PTZ) camera (i.e. a network-based camera that pans, tilts and zooms). An IP PTZ camera responds to command via its integrated web server. It allows a distributed access from Internet (access from everywhere, but with non-defined delay). Tracking with such camera includes many challenges such as irregular response times to camera control commands, low and irregular frame rate, large motions of the target between two frames, target occlusion, changing field of view (FOV), various scale changes, etc. In our work, we want to cope with the problem of large inter-frame motion of targets, low usable frame rate, background changes, and tracking with various scale changes. In addition, the tracking algorithm should handle the camera response time and zooming. Our solution consists of a system initialization phase which is the processing before camera motion and a tracker based on an Adaptive Particle Filter using Optical Flow based Sampling (APF-OFS) tracker, and camera control that are the processing after the motion of the camera. Each part requires different strategies. For initialization, when the camera is stationary, motion detection for a static camera is used to detect the initial location of the person face entering an area. For motion detection in the FOV of the camera, a background subtraction method is applied. Then to remove false positives, Bayesian skin classifier is applied on the detected motion region to discriminate skin regions from non skin regions. Face detection based on Viola and Jones face detector can be performed on the detected skin regions independently of their face size and position within the image

    Biometric fusion methods for adaptive face recognition in computer vision

    Get PDF
    PhD ThesisFace recognition is a biometric method that uses different techniques to identify the individuals based on the facial information received from digital image data. The system of face recognition is widely used for security purposes, which has challenging problems. The solutions to some of the most important challenges are proposed in this study. The aim of this thesis is to investigate face recognition across pose problem based on the image parameters of camera calibration. In this thesis, three novel methods have been derived to address the challenges of face recognition and offer solutions to infer the camera parameters from images using a geomtric approach based on perspective projection. The following techniques were used: camera calibration CMT and Face Quadtree Decomposition (FQD), in order to develop the face camera measurement technique (FCMT) for human facial recognition. Facial information from a feature extraction and identity-matching algorithm has been created. The success and efficacy of the proposed algorithm are analysed in terms of robustness to noise, the accuracy of distance measurement, and face recognition. To overcome the intrinsic and extrinsic parameters of camera calibration parameters, a novel technique has been developed based on perspective projection, which uses different geometrical shapes to calibrate the camera. The parameters used in novel measurement technique CMT that enables the system to infer the real distance for regular and irregular objects from the 2-D images. The proposed system of CMT feeds into FQD to measure the distance between the facial points. Quadtree decomposition enhances the representation of edges and other singularities along curves of the face, and thus improves directional features from face detection across face pose. The proposed FCMT system is the new combination of CMT and FQD to recognise the faces in the various pose. The theoretical foundation of the proposed solutions has been thoroughly developed and discussed in detail. The results show that the proposed algorithms outperform existing algorithms in face recognition, with a 2.5% improvement in main error recognition rate compared with recent studies

    Advanced photonic and electronic systems - WILGA 2017

    Get PDF
    WILGA annual symposium on advanced photonic and electronic systems has been organized by young scientist for young scientists since two decades. It traditionally gathers more than 350 young researchers and their tutors. Ph.D students and graduates present their recent achievements during well attended oral sessions. Wilga is a very good digest of Ph.D. works carried out at technical universities in electronics and photonics, as well as information sciences throughout Poland and some neighboring countries. Publishing patronage over Wilga keep Elektronika technical journal by SEP, IJET by PAN and Proceedings of SPIE. The latter world editorial series publishes annually more than 200 papers from Wilga. Wilga 2017 was the XL edition of this meeting. The following topical tracks were distinguished: photonics, electronics, information technologies and system research. The article is a digest of some chosen works presented during Wilga 2017 symposium. WILGA 2017 works were published in Proc. SPIE vol.10445

    Lightfield Analysis and Its Applications in Adaptive Optics and Surveillance Systems

    Get PDF
    An image can only be as good as the optics of a camera or any other imaging system allows it to be. An imaging system is merely a transformation that takes a 3D world coordinate to a 2D image plane. This can be done through both linear/non-linear transfer functions. Depending on the application at hand it is easier to use some models of imaging systems over the others in certain situations. The most well-known models are the 1) Pinhole model, 2) Thin Lens Model and 3) Thick lens model for optical systems. Using light-field analysis the connection through these different models is described. A novel figure of merit is presented on using one optical model over the other for certain applications. After analyzing these optical systems, their applications in plenoptic cameras for adaptive optics applications are introduced. A new technique to use a plenoptic camera to extract information about a localized distorted planar wave front is described. CODEV simulations conducted in this thesis show that its performance is comparable to those of a Shack-Hartmann sensor and that they can potentially increase the dynamic range of angles that can be extracted assuming a paraxial imaging system. As a final application, a novel dual PTZ-surveillance system to track a target through space is presented. 22X optic zoom lenses on high resolution pan/tilt platforms recalibrate a master-slave relationship based on encoder readouts rather than complicated image processing algorithms for real-time target tracking. As the target moves out of a region of interest in the master camera, it is moved to force the target back into the region of interest. Once the master camera is moved, a precalibrated lookup table is interpolated to compute the relationship between the master/slave cameras. The homography that relates the pixels of the master camera to the pan/tilt settings of the slave camera then continue to follow the planar trajectories of targets as they move through space at high accuracies

    Camera Planning and Fusion in a Heterogeneous Camera Network

    Get PDF
    Wide-area camera networks are becoming more and more common. They have widerange of commercial and military applications from video surveillance to smart home and from traffic monitoring to anti-terrorism. The design of such a camera network is a challenging problem due to the complexity of the environment, self and mutual occlusion of moving objects, diverse sensor properties and a myriad of performance metrics for different applications. In this dissertation, we consider two such challenges: camera planing and camera fusion. Camera planning is to determine the optimal number and placement of cameras for a target cost function. Camera fusion describes the task of combining images collected by heterogenous cameras in the network to extract information pertinent to a target application. I tackle the camera planning problem by developing a new unified framework based on binary integer programming (BIP) to relate the network design parameters and the performance goals of a variety of camera network tasks. Most of the BIP formulations are NP hard problems and various approximate algorithms have been proposed in the literature. In this dissertation, I develop a comprehensive framework in comparing the entire spectrum of approximation algorithms from Greedy, Markov Chain Monte Carlo (MCMC) to various relaxation techniques. The key contribution is to provide not only a generic formulation of the camera planning problem but also novel approaches to adapt the formulation to powerful approximation schemes including Simulated Annealing (SA) and Semi-Definite Program (SDP). The accuracy, efficiency and scalability of each technique are analyzed and compared in depth. Extensive experimental results are provided to illustrate the strength and weakness of each method. The second problem of heterogeneous camera fusion is a very complex problem. Information can be fused at different levels from pixel or voxel to semantic objects, with large variation in accuracy, communication and computation costs. My focus is on the geometric transformation of shapes between objects observed at different camera planes. This so-called the geometric fusion approach usually provides the most reliable fusion approach at the expense of high computation and communication costs. To tackle the complexity, a hierarchy of camera models with different levels of complexity was proposed to balance the effectiveness and efficiency of the camera network operation. Then different calibration and registration methods are proposed for each camera model. At last, I provide two specific examples to demonstrate the effectiveness of the model: 1)a fusion system to improve the segmentation of human body in a camera network consisted of thermal and regular visible light cameras and 2) a view dependent rendering system by combining the information from depth and regular cameras to collecting the scene information and generating new views in real time

    Gaussian mixture model classifiers for detection and tracking in UAV video streams.

    Get PDF
    Masters Degree. University of KwaZulu-Natal, Durban.Manual visual surveillance systems are subject to a high degree of human-error and operator fatigue. The automation of such systems often employs detectors, trackers and classifiers as fundamental building blocks. Detection, tracking and classification are especially useful and challenging in Unmanned Aerial Vehicle (UAV) based surveillance systems. Previous solutions have addressed challenges via complex classification methods. This dissertation proposes less complex Gaussian Mixture Model (GMM) based classifiers that can simplify the process; where data is represented as a reduced set of model parameters, and classification is performed in the low dimensionality parameter-space. The specification and adoption of GMM based classifiers on the UAV visual tracking feature space formed the principal contribution of the work. This methodology can be generalised to other feature spaces. This dissertation presents two main contributions in the form of submissions to ISI accredited journals. In the first paper, objectives are demonstrated with a vehicle detector incorporating a two stage GMM classifier, applied to a single feature space, namely Histogram of Oriented Gradients (HoG). While the second paper demonstrates objectives with a vehicle tracker using colour histograms (in RGB and HSV), with Gaussian Mixture Model (GMM) classifiers and a Kalman filter. The proposed works are comparable to related works with testing performed on benchmark datasets. In the tracking domain for such platforms, tracking alone is insufficient. Adaptive detection and classification can assist in search space reduction, building of knowledge priors and improved target representations. Results show that the proposed approach improves performance and robustness. Findings also indicate potential further enhancements such as a multi-mode tracker with global and local tracking based on a combination of both papers
    • …
    corecore