960 research outputs found

    Real-time Classification of Vehicle Types within Infra-red Imagery

    Get PDF
    Real-time classification of vehicles into sub-category types poses a significant challenge within infra-red imagery due to the high levels of intra-class variation in thermal vehicle signatures caused by aspects of design, current operating duration and ambient thermal conditions. Despite these challenges, infra-red sensing offers significant generalized target object detection advantages in terms of all-weather operation and invariance to visual camouflage techniques. This work investigates the accuracy of a number of real-time object classification approaches for this task within the wider context of an existing initial object detection and tracking framework. Specifically we evaluate the use of traditional feature-driven bag of visual words and histogram of oriented gradient classification approaches against modern convolutional neural network architectures. Furthermore, we use classical photogrammetry, within the context of current target detection and classification techniques, as a means of approximating 3D target position within the scene based on this vehicle type classification. Based on photogrammetric estimation of target position, we then illustrate the use of regular Kalman filter based tracking operating on actual 3D vehicle trajectories. Results are presented using a conventional thermal-band infra-red (IR) sensor arrangement where targets are tracked over a range of evaluation scenarios

    Computationally Efficient Algorithm for Detecting Moving Objects with Moving Background

    Get PDF
    Abstract— The area of moving object detection has been a constant topic of research in more than a decade, where a research community have witnessed various significant contribution in the past that mitigates the problem of real-time and moving object detection. In our prior studies, we have addressed such issues using various sophisticated technique yielding superior results. But, it is felt that some light weight algorithm is required for the purpose of performing moving object detection with complete retention of object detection accuracy. This paper have presented a very simple algorithm that uses visual descriptor for extracting the dynamic features during fast transition of frames. The proposed algorithm is tested with one of the most significant work done recently on same purpose with respect to precision and recall rate along with analysis of processing time of proposed algorithm

    Improving geolocation by combining GPS with image analysis

    Get PDF
    Tese de Mestrado Integrado. Engenharia Informática e Computação. Faculdade de Engenharia. Universidade do Porto. 201

    Real-time classification of vehicle types within infra-red imagery.

    Get PDF
    Real-time classification of vehicles into sub-category types poses a significant challenge within infra-red imagery due to the high levels of intra-class variation in thermal vehicle signatures caused by aspects of design, current operating duration and ambient thermal conditions. Despite these challenges, infra-red sensing offers significant generalized target object detection advantages in terms of all-weather operation and invariance to visual camouflage techniques. This work investigates the accuracy of a number of real-time object classification approaches for this task within the wider context of an existing initial object detection and tracking framework. Specifically we evaluate the use of traditional feature-driven bag of visual words and histogram of oriented gradient classification approaches against modern convolutional neural network architectures. Furthermore, we use classical photogrammetry, within the context of current target detection and classification techniques, as a means of approximating 3D target position within the scene based on this vehicle type classification. Based on photogrammetric estimation of target position, we then illustrate the use of regular Kalman filter based tracking operating on actual 3D vehicle trajectories. Results are presented using a conventional thermal-band infra-red (IR) sensor arrangement where targets are tracked over a range of evaluation scenarios

    Smart environment monitoring through micro unmanned aerial vehicles

    Get PDF
    In recent years, the improvements of small-scale Unmanned Aerial Vehicles (UAVs) in terms of flight time, automatic control, and remote transmission are promoting the development of a wide range of practical applications. In aerial video surveillance, the monitoring of broad areas still has many challenges due to the achievement of different tasks in real-time, including mosaicking, change detection, and object detection. In this thesis work, a small-scale UAV based vision system to maintain regular surveillance over target areas is proposed. The system works in two modes. The first mode allows to monitor an area of interest by performing several flights. During the first flight, it creates an incremental geo-referenced mosaic of an area of interest and classifies all the known elements (e.g., persons) found on the ground by an improved Faster R-CNN architecture previously trained. In subsequent reconnaissance flights, the system searches for any changes (e.g., disappearance of persons) that may occur in the mosaic by a histogram equalization and RGB-Local Binary Pattern (RGB-LBP) based algorithm. If present, the mosaic is updated. The second mode, allows to perform a real-time classification by using, again, our improved Faster R-CNN model, useful for time-critical operations. Thanks to different design features, the system works in real-time and performs mosaicking and change detection tasks at low-altitude, thus allowing the classification even of small objects. The proposed system was tested by using the whole set of challenging video sequences contained in the UAV Mosaicking and Change Detection (UMCD) dataset and other public datasets. The evaluation of the system by well-known performance metrics has shown remarkable results in terms of mosaic creation and updating, as well as in terms of change detection and object detection

    A low cost virtual reality interface for educational games

    Get PDF
    Mobile virtual reality has the potential to improve learning experiences by making them more immersive and engaging for students. This type of virtual reality also aims to be more cost effective by using a smartphone to drive the virtual reality experience. One issue with mobile virtual reality is that the screen (i.e. main interface) of the smartphone is occluded by the virtual reality headset. To investigate solutions to this issue, this project details the development and testing of a computer vision based controller that aims to have a cheaper per unit cost when compared to a conventional electronic controller by making use of 3D printing and the built-in camera of a smartphone. Reducing the cost per unit is useful for educational contexts as solutions would need to scale to classrooms sizes. The research question for this project is thus, “can a computer vision based virtual reality controller provide comparable immersion to a conventional electronic controller”. It was found that a computer vision based controller can provide comparable immersion, though it is more challenging to use. This challenge was found to contribute more towards engagement as it did not diminish the performance of users in terms of question scores

    Deep Learning for 3D Visual Perception

    Get PDF
    La percepción visual 3D se refiere al conjunto de problemas que engloban la reunión de información a través de un sensor visual y la estimación la posición tridimensional y estructura de los objetos y formaciones al rededor del sensor. Algunas funcionalidades como la estimación de la ego moción o construcción de mapas are esenciales para otras tareas de más alto nivel como conducción autónoma o realidad aumentada. En esta tesis se han atacado varios desafíos en la percepción 3D, todos ellos útiles desde la perspectiva de SLAM (Localización y Mapeo Simultáneos) que en si es un problema de percepción 3D.Localización y Mapeo Simultáneos –SLAM– busca realizar el seguimiento de la posición de un dispositivo (por ejemplo de un robot, un teléfono o unas gafas de realidad virtual) con respecto al mapa que está construyendo simultáneamente mientras la plataforma explora el entorno. SLAM es una tecnología muy relevante en distintas aplicaciones como realidad virtual, realidad aumentada o conducción autónoma. SLAM Visual es el termino utilizado para referirse al problema de SLAM resuelto utilizando unicamente sensores visuales. Muchas de las piezas del sistema ideal de SLAM son, hoy en día, bien conocidas, maduras y en muchos casos presentes en aplicaciones. Sin embargo, hay otras piezas que todavía presentan desafíos de investigación significantes. En particular, en los que hemos trabajado en esta tesis son la estimación de la estructura 3D al rededor de una cámara a partir de una sola imagen, reconocimiento de lugares ya visitados bajo cambios de apariencia drásticos, reconstrucción de alto nivel o SLAM en entornos dinámicos; todos ellos utilizando redes neuronales profundas.Estimación de profundidad monocular is la tarea de percibir la distancia a la cámara de cada uno de los pixeles en la imagen, utilizando solo la información que obtenemos de una única imagen. Este es un problema mal condicionado, y por lo tanto es muy difícil de inferir la profundidad exacta de los puntos en una sola imagen. Requiere conocimiento de lo que se ve y del sensor que utilizamos. Por ejemplo, si podemos saber que un modelo de coche tiene cierta altura y también sabemos el tipo de cámara que hemos utilizado (distancia focal, tamaño de pixel...); podemos decir que si ese coche tiene cierta altura en la imagen, por ejemplo 50 pixeles, esta a cierta distancia de la cámara. Para ello nosotros presentamos el primer trabajo capaz de estimar profundidad a partir de una sola vista que es capaz de obtener un funcionamiento razonable con múltiples tipos de cámara; como un teléfono o una cámara de video.También presentamos como estimar, utilizando una sola imagen, la estructura de una habitación o el plan de la habitación. Para este segundo trabajo, aprovechamos imágenes esféricas tomadas por una cámara panorámica utilizando una representación equirectangular. Utilizando estas imágenes recuperamos el plan de la habitación, nuestro objetivo es reconocer las pistas en la imagen que definen la estructura de una habitación. Nos centramos en recuperar la versión más simple, que son las lineas que separan suelo, paredes y techo.Localización y mapeo a largo plazo requiere dar solución a los cambios de apariencia en el entorno; el efecto que puede tener en una imagen tomarla en invierno o verano puede ser muy grande. Introducimos un modelo multivista invariante a cambios de apariencia que resuelve el problema de reconocimiento de lugares de forma robusta. El reconocimiento de lugares visual trata de identificar un lugar que ya hemos visitado asociando pistas visuales que se ven en las imágenes; la tomada en el pasado y la tomada en el presente. Lo preferible es ser invariante a cambios en punto de vista, iluminación, objetos dinámicos y cambios de apariencia a largo plazo como el día y la noche, las estaciones o el clima.Para tener funcionalidad a largo plazo también presentamos DynaSLAM, un sistema de SLAM que distingue las partes estáticas y dinámicas de la escena. Se asegura de estimar su posición unicamente basándose en las partes estáticas y solo reconstruye el mapa de las partes estáticas. De forma que si visitamos una escena de nuevo, nuestro mapa no se ve afectado por la presencia de nuevos objetos dinámicos o la desaparición de los anteriores.En resumen, en esta tesis contribuimos a diferentes problemas de percepción 3D; todos ellos resuelven problemas del SLAM Visual.<br /
    corecore