A Near-to-Far Learning Framework for Terrain Characterization Using an Aerial/Ground-Vehicle Team
This thesis presents a novel framework for adaptive characterization of untraversed far terrain in a natural outdoor setting. The system learns, in a self-supervised manner, the association between the visual appearance of different terrain and the proprioceptive characteristics of that terrain. The proprioceptive characteristics are acquired by inertial sensors that record one-second traversals; these measurements are mapped into the frequency domain and then grouped by a clustering technique into discrete proprioceptive classes. These class labels are then used as training inputs to the adaptive visual classifier. The visual classifier uses images captured by an aerial vehicle scouting ahead of the ground vehicle and extracts local and global descriptors from image patches. An incremental SVM is trained on the images and training sets as they are acquired sequentially. The proposed framework has been experimentally validated in an outdoor environment. Compared with an offline a priori classification approach, the adaptive approach yields an average 12% increase in accuracy in outdoor settings. The adaptive classifier gradually learns the association between proprioceptive characteristics and visual features of new terrain interactions and modifies the decision boundaries.
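The proprioceptive labelling step described in this abstract (one-second inertial windows, frequency-domain mapping, clustering into discrete classes) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the 32 Hz sample rate, the synthetic sine-wave "vibration" signals, the use of a plain DFT magnitude spectrum as the feature, and k-means as the clustering technique. The abstract does not state which features or clustering method the thesis uses.

```python
import cmath
import math

def magnitude_spectrum(window):
    """DFT magnitudes (bins 0..N//2) of one real-valued sensor window."""
    n = len(window)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(window))
        spectrum.append(abs(coeff) / n)
    return spectrum

def kmeans(points, centroids, iterations=10):
    """Plain k-means with caller-supplied initial centroids."""
    labels = []
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [min(range(len(centroids)),
                      key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Move every non-empty centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids

def synthetic_window(freq_hz, rate_hz=32, amplitude=1.0):
    """Hypothetical one-second vibration signal (rate_hz samples)."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / rate_hz)
            for t in range(rate_hz)]

# Two windows of low-frequency vibration, two of high-frequency vibration.
windows = [synthetic_window(2), synthetic_window(2, amplitude=1.1),
           synthetic_window(10), synthetic_window(10, amplitude=0.9)]
features = [magnitude_spectrum(w) for w in windows]

# Deterministic initialisation from two of the points (a simplification).
labels, centroids = kmeans(features, [list(features[0]), list(features[2])])
```

In a pipeline of the kind the abstract describes, cluster labels such as these would serve as training targets for the visual classifier; that stage is not shown here.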
Smart environment monitoring through micro unmanned aerial vehicles
In recent years, improvements in small-scale Unmanned Aerial Vehicles (UAVs) in terms of flight time, automatic control, and remote transmission have promoted the development of a wide range of practical applications. In aerial video surveillance, the monitoring of broad areas still poses many challenges because several tasks, including mosaicking, change detection, and object detection, must be carried out in real time. This thesis proposes a small-scale UAV based vision system that maintains regular surveillance over target areas. The system works in two modes. The first mode monitors an area of interest over several flights. During the first flight, it creates an incremental geo-referenced mosaic of the area of interest and classifies all the known elements (e.g., persons) found on the ground with a previously trained, improved Faster R-CNN architecture. In subsequent reconnaissance flights, the system searches for any changes (e.g., disappearance of persons) that may have occurred in the mosaic, using an algorithm based on histogram equalization and RGB-Local Binary Pattern (RGB-LBP). If changes are present, the mosaic is updated. The second mode performs real-time classification, again using the improved Faster R-CNN model, which is useful for time-critical operations. Thanks to several design features, the system works in real time and performs mosaicking and change detection at low altitude, thus allowing the classification even of small objects. The proposed system was tested on the whole set of challenging video sequences contained in the UAV Mosaicking and Change Detection (UMCD) dataset and on other public datasets. Evaluation with well-known performance metrics showed remarkable results in terms of mosaic creation and updating, as well as in terms of change detection and object detection.
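To make the RGB-LBP idea in this abstract concrete, the following is a minimal sketch of comparing two image patches by their per-channel Local Binary Pattern histograms. It is not the authors' algorithm: the basic 8-neighbour LBP code, the L1 histogram distance, the patch layout (a tuple of three 2D lists for R, G, B), and the synthetic patches are all assumptions, and the histogram equalization step mentioned in the abstract is omitted.

```python
def lbp_histogram(channel):
    """256-bin histogram of basic 8-neighbour LBP codes for one channel.

    `channel` is a 2D list of intensities, at least 3x3 in size.
    Border pixels are skipped because they lack a full neighbourhood.
    """
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    rows, cols = len(channel), len(channel[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = channel[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright.
                if channel[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    return hist

def rgb_lbp_distance(patch_a, patch_b):
    """Sum over R, G, B of the L1 distance between normalised LBP histograms."""
    total = 0.0
    for ch_a, ch_b in zip(patch_a, patch_b):
        hist_a, hist_b = lbp_histogram(ch_a), lbp_histogram(ch_b)
        n_a, n_b = sum(hist_a), sum(hist_b)
        total += sum(abs(a / n_a - b / n_b) for a, b in zip(hist_a, hist_b))
    return total

# Hypothetical 5x5 patches: a flat texture and a horizontal gradient.
flat = [[10] * 5 for _ in range(5)]
gradient = [[col for col in range(5)] for _ in range(5)]
patch_before = (flat, flat, flat)
patch_after = (gradient, gradient, gradient)

same = rgb_lbp_distance(patch_before, patch_before)
changed = rgb_lbp_distance(patch_before, patch_after)
```

A change detector built along these lines would flag a mosaic region when the distance exceeds some threshold; the threshold value would have to be tuned and is not given in the abstract.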
End-to-End Tracking and Semantic Segmentation Using Recurrent Neural Networks
In this work we present a novel end-to-end framework for tracking and
classifying a robot's surroundings in complex, dynamic and only partially
observable real-world environments. The approach deploys a recurrent neural
network to filter an input stream of raw laser measurements in order to
directly infer object locations, along with their identity in both visible and
occluded areas. To achieve this we first train the network using unsupervised
Deep Tracking, a recently proposed theoretical framework for end-to-end space
occupancy prediction. We show that by learning to track on a large amount of
unsupervised data, the network creates a rich internal representation of its
environment which we in turn exploit through the principle of inductive
transfer of knowledge to perform the task of its semantic classification. As a
result, we show that only a small amount of labelled data suffices to steer the
network towards mastering this additional task. Furthermore we propose a novel
recurrent neural network architecture specifically tailored to tracking and
semantic classification in real-world robotics applications. We demonstrate the
tracking and classification performance of the method on real-world data
collected at a busy road junction. Our evaluation shows that the proposed
end-to-end framework compares favourably to a state-of-the-art, model-free
tracking solution and that it outperforms a conventional one-shot training
scheme for semantic classification.
Learning Depth With Very Sparse Supervision
Motivated by the astonishing capabilities of natural intelligent agents and
inspired by theories from psychology, this paper explores the idea that
perception gets coupled to 3D properties of the world via interaction with the
environment. Existing works for depth estimation require either massive amounts
of annotated training data or some form of hard-coded geometrical constraint.
This paper explores a new approach to learning depth perception requiring
neither of those. Specifically, we train a specialized global-local network
architecture with what would be available to a robot interacting with the
environment: from extremely sparse depth measurements down to even a single
pixel per image. From a pair of consecutive images, our proposed network
outputs a latent representation of the observer's motion between the images and
a dense depth map. Experiments on several datasets show that, when ground truth
is available even for just one of the image pixels, the proposed network can
learn monocular dense depth estimation up to 22.5% more accurately than
state-of-the-art approaches. We believe that this work, beyond its scientific
interest, lays the foundations to learn depth from extremely sparse
supervision, which can be valuable to all robotic systems acting under severe
bandwidth or sensing constraints.
Comment: Accepted for publication at the IEEE Robotics and Automation Letters
(RA-L) 2020, and International Conference on Intelligent Robots and Systems
(IROS) 202
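The notion of training from "extremely sparse depth measurements down to even a single pixel per image" can be pictured as a loss that is evaluated only where a measurement exists. The sketch below assumes a mean absolute error and a `None` marker for pixels without ground truth; both are illustrative choices, and the abstract does not specify the loss the paper uses.

```python
def sparse_depth_loss(predicted, measured):
    """Mean absolute error over the pixels that carry a depth measurement.

    `predicted` is a dense 2D list of depths; `measured` has the same
    shape and uses None wherever no ground-truth depth is available.
    """
    errors = [abs(p - m)
              for pred_row, meas_row in zip(predicted, measured)
              for p, m in zip(pred_row, meas_row)
              if m is not None]
    if not errors:
        raise ValueError("at least one measured pixel is required")
    return sum(errors) / len(errors)

# Hypothetical 2x3 depth map with a single supervised pixel.
predicted_depth = [[1.0, 1.5, 2.0],
                   [1.2, 1.8, 2.5]]
measured_depth = [[None, None, None],
                  [None, None, 2.0]]

loss = sparse_depth_loss(predicted_depth, measured_depth)
```

Only the one measured pixel contributes to the loss here, so all other predicted depths are left unconstrained by this term alone.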