42 research outputs found

    Proyecciones cónicas de rectas en sistemas catadióptricos para percepción visual en entornos construidos por el hombre

    Omnidirectional vision systems are devices that acquire images with a field of view of 360º about one axis and more than 180º about the other. The need to integrate these cameras into computer vision systems has driven research in this field, deepening the mathematical models and the theoretical basis needed to implement applications. Several technologies exist for obtaining omnidirectional images. Catadioptric systems are those that enlarge the field of view using mirrors. Among these are hyper-catadioptric systems, which combine a perspective camera with a hyperbolic mirror. The hyperbolic geometry of the mirror guarantees that the system is central. In these systems, lines in space are especially relevant, since long lines are completely visible in a single image. The line is a geometric shape that is abundant in man-made environments and that also tends to be arranged along dominant directions. Except in unusual constructions, gravity fixes a vertical direction that can be used as a reference when computing the orientation of the system. However, using lines in catadioptric systems entails the added difficulty of working with a nonlinear projection model in which 3D lines are projected onto conics. This master's thesis (TFM) collects the work presented in the article "Significant Conics on Catadioptric Images for 3D Orientation and Image Rectification", which we intend to submit to "Robotics and Autonomous Systems". It presents a method for computing the orientation of a hyper-catadioptric system using the conics that are projections of 3D lines. The method computes the orientation with respect to the absolute reference frame defined by the set of vanishing points in an environment with dominant directions.
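
The abstract's claim that 3D lines project onto conics can be illustrated with a short derivation. This is only a sketch under an assumption the abstract does not state: the widely used unified sphere model for central catadioptric systems, with mirror parameter \xi (0 < \xi < 1 for a hyperbolic mirror). The symbols \xi, \eta, (u, v) and \mathbf{n} are introduced here for illustration and may differ from the thesis's notation.

```latex
% Assumed model: a point x_s = (x, y, z) on the unit sphere maps to normalized
% image coordinates (u, v) = (x / (z + \xi), y / (z + \xi)), so \eta = z + \xi.
% A 3D line and the effective viewpoint span a plane with normal n = (n_x, n_y, n_z)^T.
\begin{align}
  \mathbf{x}_s &= \bigl(\eta u,\; \eta v,\; \eta - \xi\bigr)^{\top},
  \qquad \|\mathbf{x}_s\| = 1,
  \qquad \mathbf{n}^{\top}\mathbf{x}_s = 0, \\
  0 &= \xi^{2} n_z^{2}\,(u^{2} + v^{2})
     - (1 - \xi^{2})\,(n_x u + n_y v)^{2}
     - 2\,n_z\,(n_x u + n_y v)
     - n_z^{2}.
\end{align}
```

Eliminating \eta from the three conditions in the first line gives the second line, which is quadratic in (u, v) and therefore a conic. As a sanity check, \xi = 0 reduces it to the squared line equation (n_x u + n_y v + n_z)^2 = 0 of a perspective camera, and \xi = 1 (parabolic mirror) gives a circle.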

    Non-parametric Models of Distortion in Imaging Systems.

    Traditional radial lens distortion models are based on the physical construction of lenses. However, manufacturing defects and physical shock often cause the actual observed distortion to be different from what can be modeled by the physically motivated models. In this work, we initially propose a Gaussian process radial distortion model as an alternative to the physically motivated models. The non-parametric nature of this model helps implicitly select the right model complexity, whereas for traditional distortion models one must perform explicit model selection to decide the right parametric complexity. Next, we forego the radial distortion assumption and present a completely non-parametric, mathematically motivated distortion model based on locally-weighted homographies. The separation from an underlying physical model allows this model to capture arbitrary sources of distortion. We then apply this fully non-parametric distortion model to a zoom lens, where the distortion complexity can vary across zoom levels and the lens exhibits noticeable non-radial distortion. Through our experiments and evaluation, we show that the proposed models are as accurate as the traditional parametric models at characterizing radial distortion while flexibly capturing non-radial distortion if present in the imaging system. PhD, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/120690/1/rpradeep_1.pd
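
To make the idea of a Gaussian process radial distortion model concrete, the following is a minimal sketch of plain GP regression from image radius to radial displacement. It is not the thesis's model: the squared-exponential kernel, the hyperparameters, the synthetic polynomial distortion, and the function names `rbf_kernel` and `gp_fit_predict` are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of radii (assumed choice)."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_fit_predict(r_train, d_train, r_test, noise_std=1e-3):
    """Posterior mean and standard deviation of a zero-mean GP at r_test."""
    K = rbf_kernel(r_train, r_train) + noise_std**2 * np.eye(len(r_train))
    K_s = rbf_kernel(r_test, r_train)
    L = np.linalg.cholesky(K)                       # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, d_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = rbf_kernel(r_test, r_test).diagonal() - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

# Synthetic, hypothetical distortion curve: displacement as a function of radius.
rng = np.random.default_rng(0)
true_disp = lambda r: -0.2 * r**3 + 0.05 * r**5
r_train = np.linspace(0.0, 1.0, 25)
d_train = true_disp(r_train) + rng.normal(0.0, 1e-3, r_train.shape)

r_test = np.linspace(0.0, 1.0, 101)
mean, std = gp_fit_predict(r_train, d_train, r_test)
max_abs_error = float(np.max(np.abs(mean - true_disp(r_test))))
```

No polynomial degree is chosen anywhere in the sketch. The kernel and noise level control smoothness, which is the sense in which a non-parametric model avoids explicit selection of parametric complexity.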

    Enhancing 3D Visual Odometry with Single-Camera Stereo Omnidirectional Systems

    We explore low-cost solutions for efficiently improving the 3D pose estimation problem of a single camera moving in an unfamiliar environment. The visual odometry (VO) task -- as it is called when using computer vision to estimate egomotion -- is of particular interest to mobile robots as well as humans with visual impairments. The payload capacity of small robots like micro-aerial vehicles (drones) requires the use of portable perception equipment, which is constrained by size, weight, energy consumption, and processing power. Using a single camera as the passive sensor for the VO task satisfies these requirements, and it motivates the proposed solutions presented in this thesis. To deliver the portability goal with a single off-the-shelf camera, we have taken two approaches: The first one, and the most extensively studied here, revolves around an unorthodox camera-mirrors configuration (catadioptrics) achieving a stereo omnidirectional system (SOS). The second approach relies on expanding the visual features from the scene into higher dimensionalities to track the pose of a conventional camera in a photogrammetric fashion. The first goal has many interdependent challenges, which we address as part of this thesis: SOS design, projection model, adequate calibration procedure, and application to VO. We show several practical advantages for the single-camera SOS due to its complete 360-degree stereo views, that other conventional 3D sensors lack due to their limited field of view. Since our omnidirectional stereo (omnistereo) views are captured by a single camera, a truly instantaneous pair of panoramic images is possible for 3D perception tasks. Finally, we address the VO problem as a direct multichannel tracking approach, which increases the pose estimation accuracy of the baseline method (i.e., using only grayscale or color information) under the photometric error minimization as the heart of the “direct” tracking algorithm. 
Currently, this solution has been tested on standard monocular cameras, but it could also be applied to an SOS. We believe the challenges that we attempted to solve have not been considered previously with the level of detail needed for successfully performing VO with a single camera as the ultimate goal in both real-life and simulated scenes

    Airborne vision-based attitude estimation and localisation

    Vision plays an integral part in a pilot's ability to navigate and control an aircraft. Therefore Visual Flight Rules have been developed around the pilot's ability to see the environment outside of the cockpit in order to control the attitude of the aircraft, to navigate and to avoid obstacles. The automation of these processes using a vision system could greatly increase the reliability and autonomy of unmanned aircraft and flight automation systems. This thesis investigates the development and implementation of a robust vision system which fuses inertial information with visual information in a probabilistic framework with the aim of aircraft navigation. The horizon appearance is a strong visual indicator of the attitude of the aircraft. This leads to the first research area of this thesis, visual horizon attitude determination. An image processing method was developed to provide high performance horizon detection and extraction from camera imagery. A number of horizon models were developed to link the detected horizon to the attitude of the aircraft with varying degrees of accuracy. The second area investigated in this thesis was visual localisation of the aircraft. A terrain-aided horizon model was developed to estimate the position, altitude as well as attitude of the aircraft. This gives rough position estimates with highly accurate attitude information. The visual localisation accuracy was improved by incorporating ground feature-based map-aided navigation. Road intersections were detected using a developed image processing algorithm and then they were matched to a database to provide positional information. The developed vision system shows comparable performance to other non-vision-based systems while removing the dependence on external systems for navigation. The vision system and techniques developed in this thesis help to increase the autonomy of unmanned aircraft and flight automation systems for manned flight.

    Global Shipping Container Monitoring Using Machine Learning with Multi-Sensor Hubs and Catadioptric Imaging

    We describe a framework for global shipping container monitoring using machine learning with multi-sensor hubs and infrared catadioptric imaging. A wireless mesh radio satellite tag architecture provides connectivity anywhere in the world which is a significant improvement to legacy methods. We discuss the design and testing of a low-cost long-wave infrared catadioptric imaging device and multi-sensor hub combination as an intelligent edge computing system that, when equipped with physics-based machine learning algorithms, can interpret the scene inside a shipping container to make efficient use of expensive communications bandwidth. The histogram of oriented gradients and T-channel (HOG+) feature as introduced for human detection on low-resolution infrared catadioptric images is shown to be effective for various mirror shapes designed to give wide volume coverage with controlled distortion. Initial results for through-metal communication with ultrasonic guided waves show promise using the Dynamic Wavelet Fingerprint Technique (DWFT) to identify Lamb waves in a complicated ultrasonic signal

    Optical flow templates for mobile robot environment understanding

    In this work we develop optical flow templates. In doing so, we introduce a practical tool for inferring robot egomotion and semantic superpixel labeling using optical flow in imaging systems with arbitrary optics. In order to do this we develop valuable understanding of geometric relationships and mathematical methods that are useful in interpreting optical flow to the robotics and computer vision communities. This work is motivated by what we perceive as directions for advancing the current state of the art in obstacle detection and scene understanding for mobile robots. Specifically, many existing methods build 3D point clouds, which are not directly useful for autonomous navigation and require further processing. Both the step of building the point clouds and the later processing steps are challenging and computationally intensive. Additionally, many current methods require a calibrated camera, which introduces calibration challenges and places limitations on the types of camera optics that may be used. Wide-angle lenses, systems with mirrors, and multiple cameras all require different calibration models and can be difficult or impossible to calibrate at all. Finally, current pixel and superpixel obstacle labeling algorithms typically rely on image appearance. While image appearance is informative, image motion is a direct effect of the scene structure that determines whether a region of the environment is an obstacle. The egomotion estimation and obstacle labeling methods we develop here based on optical flow templates require very little computation per frame and do not require building point clouds. Additionally, they do not require any specific type of camera optics, nor a calibrated camera. Finally, they label obstacles using optical flow alone without image appearance. In this thesis we start with optical flow subspaces for egomotion estimation and detection of “motion anomalies”. 
We then extend this to multiple subspaces and develop mathematical reasoning to select between them, comprising optical flow templates. Using these we classify environment shapes and label superpixels. Finally, we show how performing all learning and inference directly from image spatio-temporal gradients greatly improves computation time and accuracy. Ph.D.
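
The notion of an optical flow subspace with "motion anomaly" detection can be sketched as follows. This is a generic linear-subspace illustration using an SVD, not the thesis's algorithm: the function names `fit_flow_subspace` and `project_residual`, the image size, the rank, and the synthetic flow fields are all hypothetical.

```python
import numpy as np

def fit_flow_subspace(flows, rank):
    """Orthonormal basis (D x rank) spanning the dominant directions of the
    training flow fields; each row of `flows` is one flattened flow field."""
    _, _, vt = np.linalg.svd(flows, full_matrices=False)
    return vt[:rank].T

def project_residual(basis, flow):
    """Subspace coefficients of a flow field and the part the subspace cannot explain."""
    coeffs = basis.T @ flow
    residual = flow - basis @ coeffs
    return coeffs, residual

# Synthetic training data: flows generated by 3 hypothetical motion modes.
rng = np.random.default_rng(1)
h, w, rank = 8, 10, 3
dim = 2 * h * w                        # (u, v) flow component per pixel, flattened
true_basis = rng.normal(size=(dim, rank))
train = rng.normal(size=(50, rank)) @ true_basis.T

basis = fit_flow_subspace(train, rank)

# A flow consistent with the learned motion modes leaves almost no residual.
clean = true_basis @ np.array([0.5, -1.0, 2.0])
_, res_clean = project_residual(basis, clean)

# Perturbing a few entries mimics an independently moving region.
anomalous = clean.copy()
anomalous[:4] += 5.0
_, res_anom = project_residual(basis, anomalous)

clean_norm = float(np.linalg.norm(res_clean))
anom_norm = float(np.linalg.norm(res_anom))
```

In this toy setting the coefficients play the role of an egomotion estimate and a large residual flags flow that the subspace cannot explain. Note that the sketch works on flattened flow vectors only and uses no camera calibration or 3D point cloud, in line with the abstract's stated motivation.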

    Reconstruction active et passive en vision par ordinateur

    Thesis digitized by the Direction des bibliothèques of the Université de Montréal

    QUIS-CAMPI: Biometric Recognition in Surveillance Scenarios

    Concerns about individuals' security have justified the increasing number of surveillance cameras deployed both in private and public spaces. However, contrary to popular belief, these devices are in most cases used solely for recording, instead of feeding intelligent analysis processes capable of extracting information about the observed individuals. Thus, even though video surveillance has already proved to be essential for solving multiple crimes, obtaining relevant details about the subjects that took part in a crime depends on the manual inspection of recordings. As such, the current goal of the research community is the development of automated surveillance systems capable of monitoring and identifying subjects in surveillance scenarios. Accordingly, the main goal of this thesis is to improve the performance of biometric recognition algorithms in data acquired from surveillance scenarios. In particular, we aim at designing a visual surveillance system capable of acquiring biometric data at a distance (e.g., face, iris or gait) without requiring human intervention in the process, as well as devising biometric recognition methods robust to the degradation factors resulting from the unconstrained acquisition process. Regarding the first goal, the analysis of the data acquired by typical surveillance systems shows that large acquisition distances significantly decrease the resolution of biometric samples, and thus their discriminability is not sufficient for recognition purposes. In the literature, diverse works point out Pan Tilt Zoom (PTZ) cameras as the most practical way for acquiring high-resolution imagery at a distance, particularly when using a master-slave configuration. In the master-slave configuration, the video acquired by a typical surveillance camera is analyzed for obtaining regions of interest (e.g., car, person) and these regions are subsequently imaged at high-resolution by the PTZ camera. 
Several methods have already shown that this configuration can be used for acquiring biometric data at a distance. Nevertheless, these methods failed at providing effective solutions to the typical challenges of this strategy, restraining its use in surveillance scenarios. Accordingly, this thesis proposes two methods to support the development of a biometric data acquisition system based on the cooperation of a PTZ camera with a typical surveillance camera. The first proposal is a camera calibration method capable of accurately mapping the coordinates of the master camera to the pan/tilt angles of the PTZ camera. The second proposal is a camera scheduling method for determining, in real time, the sequence of acquisitions that maximizes the number of different targets obtained, while minimizing the cumulative transition time. In order to achieve the first goal of this thesis, both methods were combined with state-of-the-art approaches of the human monitoring field to develop a fully automated surveillance system capable of acquiring biometric data at a distance and without human cooperation, designated as the QUIS-CAMPI system. The QUIS-CAMPI system is the basis for pursuing the second goal of this thesis. The analysis of the performance of the state-of-the-art biometric recognition approaches shows that these approaches attain almost ideal recognition rates in unconstrained data. However, this performance is incongruous with the recognition rates observed in surveillance scenarios. Taking into account the drawbacks of current biometric datasets, this thesis introduces a novel dataset comprising biometric samples (face images and gait videos) acquired by the QUIS-CAMPI system at a distance ranging from 5 to 40 meters and without human intervention in the acquisition process. This set allows an objective assessment of the performance of state-of-the-art biometric recognition methods in data that truly encompass the covariates of surveillance scenarios. 
As such, this set was exploited for promoting the first international challenge on biometric recognition in the wild. This thesis describes the evaluation protocols adopted, along with the results obtained by the nine methods specially designed for this competition. In addition, the data acquired by the QUIS-CAMPI system were crucial for accomplishing the second goal of this thesis, i.e., the development of methods robust to the covariates of surveillance scenarios. The first proposal regards a method for detecting corrupted features in biometric signatures inferred by a redundancy analysis algorithm. The second proposal is a caricature-based face recognition approach capable of enhancing the recognition performance by automatically generating a caricature from a 2D photo. The experimental evaluation of these methods shows that both approaches contribute to improving the recognition performance in unconstrained data. The growing concern with the security of individuals has justified the growth in the number of video surveillance cameras installed in both private and public spaces. However, contrary to what is commonly thought, these devices are in most cases used only for recording and are not connected to any kind of intelligent software capable of inferring, in real time, information about the observed individuals. Thus, although video surveillance has proved essential in solving various crimes, its use is still confined to providing videos that must be manually inspected to extract relevant information about the subjects involved in the crime. As such, the main challenge currently facing the scientific community is the development of automated systems capable of monitoring and identifying individuals in video surveillance environments. The main goal of this thesis is to extend the applicability of biometric recognition systems to video surveillance environments. 
More specifically, we aim to 1) design a video surveillance system that can acquire biometric data at long distances (e.g., images of the face, iris, or gait videos) without requiring the cooperation of the individuals in the process; and 2) develop biometric recognition methods that are robust to the degradation factors inherent in the data acquired by this type of system. Regarding the first goal, the analysis of the data acquired by typical video surveillance systems shows that, because of the capture distance, the sampled biometric traits are not discriminative enough to guarantee acceptable recognition rates. In the literature, several works advocate the use of Pan Tilt Zoom (PTZ) cameras to acquire high-resolution images at a distance, particularly in master-slave mode. In the master-slave configuration, an intelligent analysis module selects regions of interest (e.g., cars, people) from the video acquired by a video surveillance camera, and the PTZ camera is steered to acquire those regions of interest at high resolution. Several methods have already shown that this configuration can be used to acquire biometric data at a distance; even so, they were unable to solve some problems associated with this strategy, which prevents its use in video surveillance environments. Accordingly, this thesis proposes two methods to enable the acquisition of biometric data in video surveillance environments using a PTZ camera assisted by a typical video surveillance camera. The first is a calibration method capable of accurately mapping the coordinates of the master camera to the angle of the PTZ (slave) camera without the aid of other optical devices. The second method determines the order in which a set of subjects will be observed by the PTZ camera. 
The proposed method can determine in real time the sequence of observations that maximizes the number of different subjects observed while minimizing the total transition time between subjects. To achieve the first goal of this thesis, the two proposed methods were combined with advances in the area of human monitoring to develop the first fully automated video surveillance system capable of acquiring biometric data at long distances without requiring the cooperation of the individuals in the process, designated the QUIS-CAMPI system. The QUIS-CAMPI system is the starting point for the research related to the second goal of this thesis. The analysis of the performance of state-of-the-art biometric recognition methods shows that they achieve almost perfect recognition rates on data acquired without constraints (e.g., recognition rates above 99% on the LFW dataset). However, this performance is not corroborated by the results observed in video surveillance environments, which suggests that current datasets do not truly contain the degradation factors typical of video surveillance environments. Taking into account the weaknesses of current biometric datasets, this thesis introduces a new biometric dataset (face images and gait videos) acquired by the QUIS-CAMPI system at a maximum distance of 40 m and without the cooperation of the subjects in the acquisition process. This set makes it possible to objectively assess the performance of state-of-the-art methods at recognizing individuals in images/videos captured in a real video surveillance environment. As such, this set was used to promote the first competition on biometric recognition in uncontrolled environments. 
This thesis describes the evaluation protocols used, as well as the results obtained by 9 methods specially designed for this competition. In addition, the data acquired by the QUIS-CAMPI system were essential for developing two methods that increase robustness to the degradation factors observed in video surveillance environments. The first is a method for detecting corrupted features in biometric signatures by analyzing the redundancy between subsets of features. The second is a face recognition method based on caricatures automatically generated from a single photo of the subject. The experiments carried out show that both methods reduce error rates on data acquired in an uncontrolled manner.