2,689 research outputs found

    Sparsity Invariant CNNs

    Full text link
    In this paper, we consider convolutional neural networks operating on sparse inputs with an application to depth upsampling from sparse laser scan data. First, we show that traditional convolutional networks perform poorly when applied to sparse data even when the location of missing data is provided to the network. To overcome this problem, we propose a simple yet effective sparse convolution layer which explicitly considers the location of missing data during the convolution operation. We demonstrate the benefits of the proposed network architecture in synthetic and real experiments with respect to various baseline approaches. Compared to dense baselines, the proposed sparse convolution network generalizes well to novel datasets and is invariant to the level of sparsity in the data. For our evaluation, we derive a novel dataset from the KITTI benchmark, comprising 93k depth annotated RGB images. Our dataset allows for training and evaluating depth upsampling and depth prediction techniques in challenging real-world settings and will be made available upon publication

    Localization in Unstructured Environments: Towards Autonomous Robots in Forests with Delaunay Triangulation

    Full text link
    Autonomous harvesting and transportation is a long-term goal of the forest industry. One of the main challenges is the accurate localization of both vehicles and trees in a forest. Forests are unstructured environments where it is difficult to find a group of significant landmarks for current fast feature-based place recognition algorithms. This paper proposes a novel approach where local observations are matched to a general tree map using the Delaunay triangularization as the representation format. Instead of point cloud based matching methods, we utilize a topology-based method. First, tree trunk positions are registered at a prior run done by a forest harvester. Second, the resulting map is Delaunay triangularized. Third, a local submap of the autonomous robot is registered, triangularized and matched using triangular similarity maximization to estimate the position of the robot. We test our method on a dataset accumulated from a forestry site at Lieksa, Finland. A total length of 2100\,m of harvester path was recorded by an industrial harvester with a 3D laser scanner and a geolocation unit fixed to the frame. Our experiments show a 12\,cm s.t.d. in the location accuracy and with real-time data processing for speeds not exceeding 0.5\,m/s. The accuracy and speed limit is realistic during forest operations

    Furniture models learned from the WWW: using web catalogs to locate and categorize unknown furniture pieces in 3D laser scans

    Get PDF
    In this article, we investigate how autonomous robots can exploit the high quality information already available from the WWW concerning 3-D models of office furniture. Apart from the hobbyist effort in Google 3-D Warehouse, many companies providing office furnishings already have the models for considerable portions of the objects found in our workplaces and homes. In particular, we present an approach that allows a robot to learn generic models of typical office furniture using examples found in the Web. These generic models are then used by the robot to locate and categorize unknown furniture in real indoor environments

    DROW: Real-Time Deep Learning based Wheelchair Detection in 2D Range Data

    Full text link
    We introduce the DROW detector, a deep learning based detector for 2D range data. Laser scanners are lighting invariant, provide accurate range data, and typically cover a large field of view, making them interesting sensors for robotics applications. So far, research on detection in laser range data has been dominated by hand-crafted features and boosted classifiers, potentially losing performance due to suboptimal design choices. We propose a Convolutional Neural Network (CNN) based detector for this task. We show how to effectively apply CNNs for detection in 2D range data, and propose a depth preprocessing step and voting scheme that significantly improve CNN performance. We demonstrate our approach on wheelchairs and walkers, obtaining state of the art detection results. Apart from the training data, none of our design choices limits the detector to these two classes, though. We provide a ROS node for our detector and release our dataset containing 464k laser scans, out of which 24k were annotated.Comment: Lucas Beyer and Alexander Hermans contributed equall

    Track, then Decide: Category-Agnostic Vision-based Multi-Object Tracking

    Full text link
    The most common paradigm for vision-based multi-object tracking is tracking-by-detection, due to the availability of reliable detectors for several important object categories such as cars and pedestrians. However, future mobile systems will need a capability to cope with rich human-made environments, in which obtaining detectors for every possible object category would be infeasible. In this paper, we propose a model-free multi-object tracking approach that uses a category-agnostic image segmentation method to track objects. We present an efficient segmentation mask-based tracker which associates pixel-precise masks reported by the segmentation. Our approach can utilize semantic information whenever it is available for classifying objects at the track level, while retaining the capability to track generic unknown objects in the absence of such information. We demonstrate experimentally that our approach achieves performance comparable to state-of-the-art tracking-by-detection methods for popular object categories such as cars and pedestrians. Additionally, we show that the proposed method can discover and robustly track a large variety of other objects.Comment: ICRA'18 submissio

    Robot Mapping and Navigation in Real-World Environments

    Get PDF
    Robots can perform various tasks, such as mapping hazardous sites, taking part in search-and-rescue scenarios, or delivering goods and people. Robots operating in the real world face many challenges on the way to the completion of their mission. Essential capabilities required for the operation of such robots are mapping, localization and navigation. Solving all of these tasks robustly presents a substantial difficulty as these components are usually interconnected, i.e., a robot that starts without any knowledge about the environment must simultaneously build a map, localize itself in it, analyze the surroundings and plan a path to efficiently explore an unknown environment. In addition to the interconnections between these tasks, they highly depend on the sensors used by the robot and on the type of the environment in which the robot operates. For example, an RGB camera can be used in an outdoor scene for computing visual odometry, or to detect dynamic objects but becomes less useful in an environment that does not have enough light for cameras to operate. The software that controls the behavior of the robot must seamlessly process all the data coming from different sensors. This often leads to systems that are tailored to a particular robot and a particular set of sensors. In this thesis, we challenge this concept by developing and implementing methods for a typical robot navigation pipeline that can work with different types of the sensors seamlessly both, in indoor and outdoor environments. With the emergence of new range-sensing RGBD and LiDAR sensors, there is an opportunity to build a single system that can operate robustly both in indoor and outdoor environments equally well and, thus, extends the application areas of mobile robots. The techniques presented in this thesis aim to be used with both RGBD and LiDAR sensors without adaptations for individual sensor models by using range image representation and aim to provide methods for navigation and scene interpretation in both static and dynamic environments. For a static world, we present a number of approaches that address the core components of a typical robot navigation pipeline. At the core of building a consistent map of the environment using a mobile robot lies point cloud matching. To this end, we present a method for photometric point cloud matching that treats RGBD and LiDAR sensors in a uniform fashion and is able to accurately register point clouds at the frame rate of the sensor. This method serves as a building block for the further mapping pipeline. In addition to the matching algorithm, we present a method for traversability analysis of the currently observed terrain in order to guide an autonomous robot to the safe parts of the surrounding environment. A source of danger when navigating difficult to access sites is the fact that the robot may fail in building a correct map of the environment. This dramatically impacts the ability of an autonomous robot to navigate towards its goal in a robust way, thus, it is important for the robot to be able to detect these situations and to find its way home not relying on any kind of map. To address this challenge, we present a method for analyzing the quality of the map that the robot has built to date, and safely returning the robot to the starting point in case the map is found to be in an inconsistent state. The scenes in dynamic environments are vastly different from the ones experienced in static ones. In a dynamic setting, objects can be moving, thus making static traversability estimates not enough. With the approaches developed in this thesis, we aim at identifying distinct objects and tracking them to aid navigation and scene understanding. We target these challenges by providing a method for clustering a scene taken with a LiDAR scanner and a measure that can be used to determine if two clustered objects are similar that can aid the tracking performance. All methods presented in this thesis are capable of supporting real-time robot operation, rely on RGBD or LiDAR sensors and have been tested on real robots in real-world environments and on real-world datasets. All approaches have been published in peer-reviewed conference papers and journal articles. In addition to that, most of the presented contributions have been released publicly as open source software

    LiDAR-guided object search and detection in Subterranean Environments

    Full text link
    Detecting objects of interest, such as human survivors, safety equipment, and structure access points, is critical to any search-and-rescue operation. Robots deployed for such time-sensitive efforts rely on their onboard sensors to perform their designated tasks. However, as disaster response operations are predominantly conducted under perceptually degraded conditions, commonly utilized sensors such as visual cameras and LiDARs suffer in terms of performance degradation. In response, this work presents a method that utilizes the complementary nature of vision and depth sensors to leverage multi-modal information to aid object detection at longer distances. In particular, depth and intensity values from sparse LiDAR returns are used to generate proposals for objects present in the environment. These proposals are then utilized by a Pan-Tilt-Zoom (PTZ) camera system to perform a directed search by adjusting its pose and zoom level for performing object detection and classification in difficult environments. The proposed work has been thoroughly verified using an ANYmal quadruped robot in underground settings and on datasets collected during the DARPA Subterranean Challenge finals.Comment: 6 pages, 5 Figures, 2 Tables, conference: IEEE International Symposium on Safety, Security and Rescue Robotics (SSRR-2022), Seville, Spai
    corecore