
    DeepNav: Learning to Navigate Large Cities

    We present DeepNav, a Convolutional Neural Network (CNN) based algorithm for navigating large cities using locally visible street-view images. The DeepNav agent learns to reach its destination quickly by making the correct navigation decisions at intersections. We collect a large-scale dataset of street-view images organized in a graph where nodes are connected by roads. This dataset contains 10 city graphs and more than 1 million street-view images. We propose 3 supervised learning approaches for the navigation task and show how A* search in the city graph can be used to generate supervision for the learning. Our annotation process is fully automated using publicly available mapping services and requires no human input. We evaluate the proposed DeepNav models on 4 held-out cities for navigating to 5 different types of destinations. Our algorithms outperform previous work that uses hand-crafted features and Support Vector Regression (SVR) [19]. Comment: CVPR 2017 camera-ready version.
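
    The supervision scheme described above lends itself to a simple sketch: for every node in the city graph, run A* to the destination and record the first edge of the optimal route as that intersection's ground-truth action. The sketch below uses networkx and assumes a graph with node coordinates and a "length" edge attribute; these names are illustrative, not the paper's actual pipeline.

```python
# Hedged sketch of A*-generated navigation supervision (assumed graph layout).
import networkx as nx

def label_intersections(G, destination, pos):
    """Label every node with the optimal neighbor to move to next,
    i.e., the navigation decision A* would make toward `destination`.
    `pos` maps node -> (x, y) coordinates for the heuristic."""
    def straight_line(u, v):
        # Straight-line distance as an admissible heuristic.
        (x1, y1), (x2, y2) = pos[u], pos[v]
        return ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5

    labels = {}
    for node in G.nodes:
        if node == destination:
            continue
        # A* shortest path in the road graph; "length" is an assumed
        # edge attribute holding road-segment length.
        path = nx.astar_path(G, node, destination,
                             heuristic=straight_line, weight="length")
        labels[node] = path[1]  # first step on the optimal route
    return labels
```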

    Autonomous Driving with a Simulation Trained Convolutional Neural Network

    Autonomous vehicles will help society if they can easily support a broad range of driving environments, conditions, and vehicles. Achieving this requires reducing the complexity of the algorithmic system, easing the collection of training data, and verifying operation using real-world experiments. Our work addresses these issues by utilizing a reflexive neural network that translates images into steering and throttle commands. This network is trained using simulation data from Grand Theft Auto V, which we augment to reduce the number of simulation hours driven. We then validate our work using an RC car system through numerous tests. Our system successfully drives 98 of 100 laps of a track with multiple road types and difficult turns; it also successfully avoids collisions with another vehicle in 90% of the trials.
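
    As a rough illustration of the reflexive image-to-control idea, the PyTorch sketch below maps a single camera frame directly to steering and throttle. The layer sizes and input resolution are assumptions for illustration, not the authors' exact architecture.

```python
# Minimal sketch of an image-to-control "reflexive" CNN (assumed sizes).
import torch
import torch.nn as nn

class ReflexiveDriver(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 24, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(24, 36, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(36, 48, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(48, 64, kernel_size=3), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),      # pool to a fixed-size descriptor
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64, 50), nn.ReLU(),
            nn.Linear(50, 2),             # [steering, throttle]
        )

    def forward(self, image):
        return self.head(self.features(image))

# Usage: commands = ReflexiveDriver()(torch.randn(1, 3, 66, 200))
```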

    Improved Ground-Based Monocular Visual Odometry Estimation using Inertially-Aided Convolutional Neural Networks

    While Convolutional Neural Networks (CNNs) can estimate frame-to-frame (F2F) motion even with monocular images, additional inputs can improve Visual Odometry (VO) predictions. In this thesis, a FlowNetS-based [1] CNN architecture estimates VO using sequential images from the KITTI Odometry dataset [2]. For each of three output types (full six degrees of freedom (6-DoF), Cartesian translation, and translational scale), a baseline network with only image-pair input is compared with a nearly identical architecture that is also given an additional rotation estimate, such as from an Inertial Navigation System (INS). The inertially-aided networks show an order-of-magnitude improvement over the baseline when predicting rotation, but the aided rotation predictions are still worse than the input rotations. Translation predictions are not necessarily helped either. A full-trajectory analysis gives similar results. The INS-aided neural networks are also tested for sensitivity to angular random walk (ARW) and bias errors in the sensor measurements.
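
    The aiding mechanism can be pictured as simple feature concatenation: visual features from the stacked image pair are joined with the external rotation estimate before regression. The PyTorch sketch below assumes a small FlowNetS-style encoder and a 3-parameter rotation input; the dimensions are illustrative, not the thesis's exact network.

```python
# Hedged sketch of an INS-aided VO network (assumed dimensions).
import torch
import torch.nn as nn

class AidedVONet(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder over two stacked RGB frames (6 input channels).
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 64, 7, stride=2, padding=3), nn.LeakyReLU(0.1),
            nn.Conv2d(64, 128, 5, stride=2, padding=2), nn.LeakyReLU(0.1),
            nn.Conv2d(128, 256, 5, stride=2, padding=2), nn.LeakyReLU(0.1),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # 256 visual features + 3 INS rotation parameters -> 6-DoF pose.
        self.regressor = nn.Sequential(
            nn.Linear(256 + 3, 128), nn.ReLU(),
            nn.Linear(128, 6),
        )

    def forward(self, image_pair, ins_rotation):
        feats = self.encoder(image_pair)           # (B, 256)
        x = torch.cat([feats, ins_rotation], 1)    # append the rotation aid
        return self.regressor(x)                   # (B, 6) F2F pose estimate
```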

    END-TO-END LEARNING UTILIZING TEMPORAL INFORMATION FOR VISION-BASED AUTONOMOUS DRIVING

    End-to-End learning models trained with conditional imitation learning (CIL) have demonstrated their capabilities in driving autonomously in dynamic environments. The performance of such models is limited, however, as most of them fail to utilize the temporal information that resides in a sequence of observations. In this work, we explore the use of temporal information with a recurrent network to improve driving performance. We propose a model that combines a pre-trained, deeper convolutional neural network, to better capture image features, with a long short-term memory network, to better exploit temporal information. Experimental results indicate that the proposed model achieves performance gains in several tasks in the CARLA benchmark compared to the state-of-the-art models. In particular, in the most challenging task, navigation in dynamic environments, we achieve a 96% success rate in training conditions while other CIL-based models achieve 82-92%; in the new town and new weather conditions we achieve 88% while other CIL-based models achieve 42-90%. The subsequent ablation study also shows that all the major features of the proposed model are essential for improving performance. We therefore believe that this work contributes significantly towards safe, efficient, clean autonomous driving for future smart cities.
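
    A minimal way to picture the proposed combination is a per-frame CNN backbone feeding an LSTM over the observation sequence. The PyTorch sketch below uses a ResNet-18 backbone as a stand-in for the pre-trained CNN and omits the command-conditioned branching of CIL for brevity; all sizes are assumptions, not the paper's exact model.

```python
# Hedged sketch of a CNN + LSTM driving model (assumed backbone and sizes).
import torch
import torch.nn as nn
from torchvision import models

class TemporalCILNet(nn.Module):
    def __init__(self, hidden=256, n_controls=3):
        super().__init__()
        backbone = models.resnet18(weights="IMAGENET1K_V1")
        backbone.fc = nn.Identity()                 # 512-d features per frame
        self.cnn = backbone
        self.lstm = nn.LSTM(512, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_controls)   # e.g., steer/throttle/brake

    def forward(self, frames):
        # frames: (B, T, 3, H, W) sequence of observations
        B, T = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(B, T, -1)
        out, _ = self.lstm(feats)                   # temporal aggregation
        return self.head(out[:, -1])                # control from last timestep
```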

    Terrain Segmentation and Roughness Estimation using RGB Data: Path Planning Application on the CENTAURO Robot

    Robots operating in real-world environments require a high-level perceptual understanding of the chief physical properties of the terrain they are traversing. In unknown environments, roughness is one such important terrain property that could play a key role in devising robot control/planning strategies. In this paper, we present a fast method for predicting pixel-wise terrain labels (stone, sand, road/sidewalk, wood, grass, metal) and estimating roughness, using a single RGB-based deep neural network. Real-world RGB images are used to experimentally validate the presented approach. Furthermore, we demonstrate an application of our proposed method on the centaur-like wheeled-legged robot CENTAURO, by integrating it with a navigation planner that is capable of re-configuring the leg joints to modify the robot footprint polygon for stability purposes or for safe traversal among obstacles.
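
    The single-network, two-task design can be sketched as a shared encoder-decoder with a segmentation head and a roughness head. The PyTorch code below is a hedged sketch under assumed layer sizes, not the paper's architecture.

```python
# Hedged sketch of a shared encoder-decoder with two heads (assumed sizes).
import torch
import torch.nn as nn

class TerrainNet(nn.Module):
    def __init__(self, n_classes=6):  # stone, sand, road/sidewalk, wood, grass, metal
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(32, 32, 2, stride=2), nn.ReLU(),
        )
        self.seg_head = nn.Conv2d(32, n_classes, 1)   # per-pixel class logits
        self.rough_head = nn.Conv2d(32, 1, 1)         # per-pixel roughness map

    def forward(self, rgb):
        x = self.decoder(self.encoder(rgb))
        return self.seg_head(x), self.rough_head(x)
```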