
    Deep Learning Features at Scale for Visual Place Recognition

    The success of deep learning techniques in the computer vision domain has triggered a range of initial investigations into their utility for visual place recognition, all using generic features from networks that were trained for other types of recognition tasks. In this paper, we train, at large scale, two CNN architectures for the specific place recognition task and employ a multi-scale feature encoding method to generate condition- and viewpoint-invariant features. To enable this training, we have developed a massive Specific PlacEs Dataset (SPED) with hundreds of examples of place appearance change at thousands of different places, as opposed to the semantic place type datasets currently available. This new dataset enables us to set up a training regime that interprets place recognition as a classification problem. We comprehensively evaluate our trained networks on several challenging benchmark place recognition datasets and demonstrate that they achieve an average 10% increase in performance over other place recognition algorithms and pre-trained CNNs. By analyzing the network responses and their differences from pre-trained networks, we provide insights into what a network learns when training for place recognition, and what these results signify for future research in this area.
    Comment: 8 pages, 10 figures. Accepted by International Conference on Robotics and Automation (ICRA) 2017. This is the submitted version; the final published version may be slightly different.
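    As a rough sketch of the encoding idea above, the numpy snippet below pools a CNN feature map over several spatial grid sizes and concatenates the results into a single place descriptor, then matches a query against a database by cosine similarity. The grid sizes, feature-map shape, and random stand-in features are illustrative assumptions, not the paper's actual architecture or encoding.

```python
import numpy as np

def multiscale_encode(feature_map, grids=(1, 2, 4)):
    """Max-pool a CNN feature map (C x H x W) over several spatial grids
    and concatenate the pooled cells into one descriptor (spatial-pyramid
    style pooling; the grid sizes here are illustrative)."""
    C, H, W = feature_map.shape
    parts = []
    for g in grids:
        for i in range(g):
            for j in range(g):
                cell = feature_map[:, i*H//g:(i+1)*H//g, j*W//g:(j+1)*W//g]
                parts.append(cell.max(axis=(1, 2)))
    desc = np.concatenate(parts)
    return desc / (np.linalg.norm(desc) + 1e-9)   # L2-normalise for cosine matching

# Random stand-ins for the feature maps a trained network would produce.
rng = np.random.default_rng(0)
database = np.stack([multiscale_encode(rng.random((64, 16, 16))) for _ in range(100)])
query = multiscale_encode(rng.random((64, 16, 16)))
print("best-matching place id:", int(np.argmax(database @ query)))
```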

    Look No Further: Adapting the Localization Sensory Window to the Temporal Characteristics of the Environment

    Many localization algorithms use a spatiotemporal window of sensory information to recognize spatial locations, and the length of this window is often a sensitive parameter that must be tuned to the specifics of the application. This letter presents a general method for environment-driven variation of the window length based on searching for the most significant localization hypothesis, using as much context as is appropriate but no more. We evaluate this approach on benchmark datasets using visual and Wi-Fi sensor modalities and a variety of sensory comparison front-ends under in-order and out-of-order traversals of the environment. Our results show that the system greatly reduces the maximum distance traveled without localization compared to a fixed-length approach while achieving competitive localization accuracy, and it does so without deployment-time tuning.
    Comment: Pre-print of article appearing in 2017 IEEE Robotics and Automation Letters. v2: incorporated reviewer feedback.
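    A minimal sketch of the window-adaptation idea, assuming a precomputed similarity matrix between recent query frames and database places: the window over recent frames grows until the strongest accumulated hypothesis clearly beats the runner-up. The per-place score accumulation and the ratio-based significance test are simplifying assumptions standing in for the letter's actual hypothesis search.

```python
import numpy as np

def localize_adaptive(sim, max_window=20, min_ratio=1.5):
    """sim[t, j] = similarity of query frame t to database place j.
    Grow the window of most recent frames until the best accumulated
    hypothesis is min_ratio times stronger than the second best."""
    T = sim.shape[0]
    for w in range(1, min(max_window, T) + 1):
        scores = sim[T - w:].sum(axis=0)        # evidence over the window
        order = np.argsort(scores)[::-1]
        if scores[order[0]] >= min_ratio * scores[order[1]]:
            return int(order[0]), w             # significant hypothesis found
    return int(order[0]), w                     # fall back to the longest window

rng = np.random.default_rng(1)
sim = rng.random((30, 50))
sim[-5:, 17] += 0.8                             # make place 17 the true match
place, window = localize_adaptive(sim)
print(f"localized at place {place} using a window of {window} frames")
```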

    Automated Map Reading: Image Based Localisation in 2-D Maps Using Binary Semantic Descriptors

    We describe a novel approach to image-based localisation in urban environments using semantic matching between images and a 2-D map. It contrasts with the vast majority of existing approaches, which match query images against databases of images. We use highly compact binary descriptors to represent semantic features at locations, significantly increasing scalability compared with existing methods and offering the potential for greater invariance to variable imaging conditions. The approach is also more akin to human map reading, making it better suited to human-system interaction. The binary descriptors indicate the presence or absence of semantic features relating to buildings and road junctions in discrete viewing directions. We use CNN classifiers to detect the features in images and match descriptor estimates against a database of location-tagged descriptors derived from the 2-D map. In isolation, the descriptors are not sufficiently discriminative, but when concatenated sequentially along a route, their combination becomes highly distinctive and allows localisation even with imperfect classifiers. Performance is further improved by taking left or right turns along a route into account. Experimental results obtained using Google StreetView and OpenStreetMap data show that the approach has considerable potential, achieving localisation accuracy of around 85% on routes of approximately 200 meters.
    Comment: 8 pages, submitted to IEEE/RSJ International Conference on Intelligent Robots and Systems 2018
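    The descriptor scheme lends itself to a compact sketch. Below, each location contributes presence/absence bits for two semantic feature types across four viewing directions (an assumed 8-bit layout), successive locations along a route are concatenated, and a corrupted query route, simulating imperfect CNN classifiers, is retrieved by Hamming distance. The route length, map size, and noise level are illustrative.

```python
import numpy as np

ROUTE_LEN, BITS = 25, 8     # locations per route; bits per location (assumed)
rng = np.random.default_rng(2)

# Database of route descriptors, as would be derived from the 2-D map.
map_routes = rng.integers(0, 2, size=(1000, ROUTE_LEN * BITS), dtype=np.uint8)

# Query: one map route with 10% of its bits flipped by classifier errors.
true_id = 321
query = map_routes[true_id].copy()
flips = rng.choice(query.size, size=20, replace=False)
query[flips] ^= 1

hamming = (map_routes != query).sum(axis=1)     # Hamming distance to every route
print("retrieved route:", int(np.argmin(hamming)), "| true route:", true_id)
```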

    Multi-Session Visual SLAM for Illumination Invariant Localization in Indoor Environments

    For robots navigating using only a camera, illumination changes in indoor environments can cause localization failures during autonomous navigation. In this paper, we present a multi-session visual SLAM approach that creates a map comprising multiple variations of the same locations under different illumination conditions. The multi-session map can then be used at any hour of the day for improved localization capability. The approach is independent of the visual features used, which we demonstrate by comparing localization performance across multi-session maps created with the RTAB-Map library using SURF, SIFT, BRIEF, FREAK, BRISK, KAZE, DAISY and SuperPoint visual features. The approach is tested on six mapping and six localization sessions recorded at 30-minute intervals during sunset using a Google Tango phone in a real apartment.
    Comment: 6 pages, 5 figures
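    A toy sketch of the multi-session idea: binary descriptors from several mapping sessions are pooled into one database, and a query is matched to its nearest neighbour by Hamming distance regardless of which illumination session produced it. The random 256-bit descriptors stand in for the BRIEF/SuperPoint-style features compared in the paper; the six sessions mirror the experiment, and everything else is assumed.

```python
import numpy as np

def build_multisession_db(sessions):
    """Stack descriptors from all mapping sessions into one database,
    remembering which session each descriptor came from."""
    db = np.vstack(sessions)
    owner = np.concatenate([np.full(len(s), i) for i, s in enumerate(sessions)])
    return db, owner

rng = np.random.default_rng(3)
# Six mapping sessions, each with 200 random 256-bit binary descriptors.
sessions = [rng.integers(0, 2, (200, 256), dtype=np.uint8) for _ in range(6)]
db, owner = build_multisession_db(sessions)

# Query: a descriptor seen in session 4, with 5% of its bits flipped.
query = sessions[4][50] ^ (rng.random(256) < 0.05).astype(np.uint8)
dists = (db != query).sum(axis=1)               # Hamming distance to the database
best = int(np.argmin(dists))
print(f"matched descriptor {best} from session {owner[best]}")
```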
