3,285 research outputs found
Human mobility monitoring in very low resolution visual sensor network
This paper proposes an automated system for monitoring mobility patterns using a network of very low resolution visual sensors (30 × 30 pixels). The use of very low resolution sensors reduces privacy concerns, cost, computation requirements and power consumption. The core of our proposed system is a robust people tracker that uses low resolution videos provided by the visual sensor network. The distributed processing architecture of our tracking system allows all image processing tasks to be done on the digital signal controller in each visual sensor. In this paper, we experimentally show that reliable tracking of people is possible using very low resolution imagery. We also compare the performance of our tracker against a state-of-the-art tracking method and show that our method outperforms it. Moreover, mobility statistics derived from the trajectories, such as total distance traveled and average speed, are compared with those derived from ground truth given by Ultra-Wide Band sensors. The results of this comparison show that the trajectories from our system are accurate enough to obtain useful mobility statistics.
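As a rough illustration of the mobility statistics mentioned in this abstract, the sketch below computes total distance traveled and average speed from a single trajectory. The sample format `(t, x, y)`, the units (seconds and meters) and the function name `mobility_stats` are assumptions made for this example; the paper's own data format and implementation are not described in the abstract.

```python
import math


def mobility_stats(track):
    """Return (total_distance, average_speed) for one trajectory.

    `track` is a hypothetical list of (t_seconds, x_m, y_m) samples
    ordered by time; this layout is an assumption for illustration.
    """
    if len(track) < 2:
        return 0.0, 0.0
    # Sum the straight-line distance between consecutive samples.
    distance = sum(
        math.hypot(x1 - x0, y1 - y0)
        for (_, x0, y0), (_, x1, y1) in zip(track, track[1:])
    )
    duration = track[-1][0] - track[0][0]
    avg_speed = distance / duration if duration > 0 else 0.0
    return distance, avg_speed


# Three samples: a 5 m step followed by a 4 m step over 2 s.
track = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0), (2.0, 3.0, 8.0)]
total, speed = mobility_stats(track)
```

The same function could be applied to both the tracker output and a ground-truth trajectory, so that the two sets of statistics can be compared.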
Rule Of Thumb: Deep derotation for improved fingertip detection
We investigate a novel global orientation regression approach for articulated
objects using a deep convolutional neural network. This is integrated with an
in-plane image derotation scheme, DeROT, to tackle the problem of per-frame
fingertip detection in depth images. The method reduces the complexity of
learning in the space of articulated poses which is demonstrated by using two
distinct state-of-the-art learning based hand pose estimation methods applied
to fingertip detection. Significant classification improvements are shown over
the baseline implementation. Our framework involves no tracking, kinematic
constraints or explicit prior model of the articulated object in hand. To
support our approach we also describe a new pipeline for high accuracy magnetic
annotation and labeling of objects imaged by a depth camera. Comment: To be published in proceedings of BMVC 201
Multimodal perception for autonomous driving
International Mention in the doctoral degree. Autonomous driving is set to play an important role among intelligent
transportation systems in the coming decades. The advantages
of its large-scale implementation –reduced accidents, shorter commuting
times, or higher fuel efficiency– have made its development a priority
for academia and industry. However, there is still a long way to
go to achieve full self-driving vehicles, capable of dealing with any
scenario without human intervention. To this end, advances in control,
navigation and, especially, environment perception technologies
are still required. In particular, the detection of other road users that
may interfere with the vehicle’s trajectory is a key element, since it
allows the current traffic situation to be modeled and, thus, decisions
to be made accordingly.
The objective of this thesis is to provide solutions to some of
the main challenges of on-board perception systems, such as extrinsic
calibration of sensors, object detection, and deployment on
real platforms. First, a calibration method for obtaining the relative
transformation between pairs of sensors is introduced, eliminating
the complex manual adjustment of these parameters. The algorithm
makes use of an original calibration pattern and supports LiDARs,
and monocular and stereo cameras. Second, different deep learning
models for 3D object detection using LiDAR data in its bird’s eye
view projection are presented. Through a novel encoding, the use
of architectures tailored to image detection is proposed to process
the 3D information of point clouds in real time. Furthermore, the
effectiveness of using this projection together with image features is
analyzed. Finally, a method to mitigate the accuracy drop of LiDAR-based
detection networks when deployed in ad-hoc configurations is
introduced. For this purpose, the simulation of virtual signals mimicking
the specifications of the desired real device is used to generate
new annotated datasets that can be used to train the models.
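To make the idea of a bird's eye view projection concrete, the sketch below rasterizes a point cloud into a top-down grid that stores the maximum height per cell. This is a generic, simplified encoding chosen for illustration only; the thesis describes its own novel encoding, which is not specified in the abstract. The function name `bev_height_map`, the axis ranges and the cell size are all assumptions.

```python
import math


def bev_height_map(points, x_range, y_range, cell):
    """Project (x, y, z) points in meters onto a top-down grid.

    Returns a 2D list indexed as grid[row][col], holding the maximum z
    seen in each cell, or None for cells that received no points.
    This max-height channel is a common generic choice, not the
    encoding proposed in the thesis.
    """
    cols = math.ceil((x_range[1] - x_range[0]) / cell)
    rows = math.ceil((y_range[1] - y_range[0]) / cell)
    grid = [[None] * cols for _ in range(rows)]
    for x, y, z in points:
        # Discard points outside the region of interest.
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue
        col = min(int((x - x_range[0]) / cell), cols - 1)
        row = min(int((y - y_range[0]) / cell), rows - 1)
        if grid[row][col] is None or z > grid[row][col]:
            grid[row][col] = z
    return grid


points = [(0.5, -1.5, 0.2), (0.7, -1.2, 1.5), (3.9, 1.9, 0.4), (10.0, 0.0, 1.0)]
grid = bev_height_map(points, x_range=(0.0, 4.0), y_range=(-2.0, 2.0), cell=1.0)
```

Because the result is a regular 2D array, it can be handled like an image, which is what makes detection architectures designed for images applicable to point cloud data.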
The performance of the proposed methods is evaluated against
other existing alternatives using reference benchmarks in the field of
computer vision (KITTI and nuScenes) and through experiments in
open traffic with an automated vehicle. The results obtained demonstrate
the relevance of the presented work and its suitability for commercial
use.
Doctoral Program in Electrical, Electronic and Automatic Engineering, Universidad Carlos III de Madrid. President: Jesús García Herrero. Secretary: Ignacio Parra Alonso. Member: Gustavo Adolfo Peláez Coronad
Aircraft state estimation using cameras and passive radar
Multiple target tracking (MTT) is a fundamental task in many application domains. It is a difficult problem to solve in general, so applications make use of domain-specific and problem-specific knowledge to approach the problem by solving subtasks separately. This work puts forward an MTT framework (MTTF) based on the Bayesian recursive estimator (BRE). The MTTF extends a particle filter (PF) to handle multiple targets and adds a probabilistic graphical model (PGM) data association stage to compute the mapping from detections to trackers. The MTTF was applied to the problem of passively monitoring airspace. Two applications were built: a passive radar MTT module and a comprehensive visual object tracking (VOT) system. Both applications require a solution to the MTT problem, for which the MTTF was utilized. The VOT system performed well on real data recorded at the University of Cape Town (UCT) as part of this investigation. The system was able to detect and track aircraft flying within the region of interest (ROI). The VOT system consisted of a single camera, an image processing module, the MTTF module and an evaluation module. The world coordinate frame target localization was within ±3.2 km and these results are presented on Google Earth. The image plane target localization has an average reprojection error of ±17.3 pixels. The VOT system achieved an average area under the curve value of 0.77 for all receiver operating characteristic curves. These performance figures are typical over the ±1 hr of video recordings taken from the UCT site. The passive radar application was tested on simulated data. The MTTF module was designed to connect to an existing passive radar system developed by Peralex Electronics Pty Ltd. The MTTF module estimated the number of targets in the scene and localized them within a 2D local world Cartesian coordinate system.
The investigations encompass numerous areas of research as well as practical aspects of software engineering and systems design.
Lidar-based Obstacle Detection and Recognition for Autonomous Agricultural Vehicles
Today, agricultural vehicles are available that can drive autonomously and follow exact route plans more precisely than human operators. Combined with advancements in precision agriculture, autonomous agricultural robots can reduce manual labor, improve workflow, and optimize yield. However, as of today, human operators are still required for monitoring the environment and acting upon potential obstacles in front of the vehicle. To eliminate this need, safety must be ensured by accurate and reliable obstacle detection and avoidance systems. In this thesis, lidar-based obstacle detection and recognition in agricultural environments has been investigated. A rotating multi-beam lidar generating 3D point clouds was used for point-wise classification of agricultural scenes, while multi-modal fusion with cameras and radar was used to increase performance and robustness. Two research perception platforms were presented and used for data acquisition. The proposed methods were all evaluated on recorded datasets that represented a wide range of realistic agricultural environments and included both static and dynamic obstacles. For 3D point cloud classification, two methods were proposed for handling density variations during feature extraction. One method outperformed a frequently used generic 3D feature descriptor, whereas the other method showed promising preliminary results using deep learning on 2D range images. For multi-modal fusion, four methods were proposed for combining lidar with color camera, thermal camera, and radar. Gradual improvements in classification accuracy were seen as spatial, temporal, and multi-modal relationships were introduced in the models. Finally, occupancy grid mapping was used to fuse and map detections globally, and runtime obstacle detection was applied on mapped detections along the vehicle path, thus simulating an actual traversal. The proposed methods serve as a first step towards full autonomy for agricultural vehicles.
The study has thus shown that recent advancements in autonomous driving can be transferred to the agricultural domain when accurate distinctions are made between obstacles and processable vegetation. Future research in the domain has further been facilitated with the release of the multi-modal obstacle dataset, FieldSAFE.
Techniques for automatic marker-less detection of people and marker-based detection of robots within RGB-Depth camera networks
OpenPTrack is a state-of-the-art solution for people detection and tracking. In this work we extended some of the functionalities of the software (detection from highly tilted cameras) and introduced new ones (an automatic ground plane equation calculator). We also test the feasibility and behaviour of a mobile camera mounted on a people-following robot and dynamically registered in the OPT network through a fiducial cubic marker.
TractorEYE: Vision-based Real-time Detection for Autonomous Vehicles in Agriculture
Agricultural vehicles such as tractors and harvesters have for decades been able to navigate automatically and more efficiently using commercially available products such as auto-steering and tractor-guidance systems. However, a human operator is still required inside the vehicle to ensure the safety of the vehicle and especially of its surroundings, such as humans and animals. To get fully autonomous vehicles certified for farming, computer vision algorithms and sensor technologies must detect obstacles with performance equivalent to or better than human level. Furthermore, detections must run in real time to allow vehicles to actuate and avoid collision. This thesis proposes a detection system (TractorEYE), a dataset (FieldSAFE), and procedures to fuse information from multiple sensor technologies to improve detection of obstacles and to generate a map. TractorEYE is a multi-sensor detection system for autonomous vehicles in agriculture. The multi-sensor system consists of three hardware-synchronized and registered sensors (stereo camera, thermal camera and multi-beam lidar) mounted on/in a ruggedized and water-resistant casing. Algorithms have been developed to run a total of six detection algorithms (four for the RGB camera, one for the thermal camera and one for the multi-beam lidar) and fuse detection information in a common format using either 3D positions or Inverse Sensor Models. A GPU-powered computational platform is able to run detection algorithms online. For the RGB camera, a deep learning algorithm, DeepAnomaly, is proposed to perform real-time anomaly detection of distant, heavily occluded and unknown obstacles in agriculture. Compared to a state-of-the-art object detector, Faster R-CNN, DeepAnomaly is, for an agricultural use case, able to detect humans better and at longer ranges (45-90m) using a smaller memory footprint and 7.3-times faster processing. A low memory footprint and fast processing make DeepAnomaly suitable for real-time applications running on an embedded GPU.
FieldSAFE is a multi-modal dataset for detection of static and moving obstacles in agriculture. The dataset includes synchronized recordings from an RGB camera, stereo camera, thermal camera, 360-degree camera, lidar and radar. Precise localization and pose are provided using IMU and GPS. Ground truth of static and moving obstacles (humans, mannequin dolls, barrels, buildings, vehicles, and vegetation) is available as an annotated orthophoto and GPS coordinates for moving obstacles. Detection information from multiple detection algorithms and sensors is fused into a map using Inverse Sensor Models and occupancy grid maps. This thesis presents many scientific contributions to the state of the art within perception for autonomous tractors; these include a dataset, a sensor platform, detection algorithms and procedures to perform multi-sensor fusion. Furthermore, important engineering contributions to autonomous farming vehicles are presented, such as easily applicable, open-source software packages and algorithms that have been demonstrated in an end-to-end real-time detection system. The contributions of this thesis have demonstrated, addressed and solved critical issues in utilizing camera-based perception systems, which are essential to make autonomous vehicles in agriculture a reality.
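The fusion of detections into an occupancy grid via inverse sensor models can be sketched with the standard log-odds form of the binary Bayes filter for a single cell. The probabilities below are made-up values for illustration and the function names are hypothetical; the thesis's actual sensor models and parameters are not given in the abstract.

```python
import math


def logit(p):
    """Convert a probability into log-odds."""
    return math.log(p / (1.0 - p))


def update_cell(log_odds, p_occupied_given_measurement, prior=0.5):
    """Fuse one inverse-sensor-model output into a cell's log-odds.

    `p_occupied_given_measurement` is what an inverse sensor model would
    return for this cell; the values used below are illustrative only.
    """
    return log_odds + logit(p_occupied_given_measurement) - logit(prior)


def probability(log_odds):
    """Convert log-odds back into an occupancy probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(log_odds))


# One cell observed by two hypothetical detectors (e.g. lidar, then camera).
cell = 0.0  # log-odds of the 0.5 prior
for p in (0.7, 0.6):
    cell = update_cell(cell, p)
p_occupied = probability(cell)
```

Working in log-odds turns each fusion step into an addition, so evidence from different sensors and time steps can be accumulated per cell in any order.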