Search CORE

295 research outputs found

PlaNet - Photo Geolocation with Convolutional Neural Networks

Author: A Babenko
A Graves
A Mikulík
AR Zamir
AR Zamir
G Baatz
H Jegou
H Jégou
J Duchi
J Elman
J Hays
J Knopp
MD Zeiler
S Cao
S Hochreiter
T Sattler
Y Li
Y Li
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016
Field of study

Is it possible to build a system to determine the location where a photo was taken using just its pixels? In general, the problem seems exceptionally difficult: it is trivial to construct situations where no location can be inferred. Yet images often contain informative cues such as landmarks, weather patterns, vegetation, road markings, and architectural details, which in combination may allow one to determine an approximate location and occasionally an exact location. Websites such as GeoGuessr and View from your Window suggest that humans are relatively good at integrating these cues to geolocate images, especially en-masse. In computer vision, the photo geolocation problem is usually approached using image retrieval methods. In contrast, we pose the problem as one of classification by subdividing the surface of the earth into thousands of multi-scale geographic cells, and train a deep network using millions of geotagged images. While previous approaches only recognize landmarks or perform approximate matching using global image descriptors, our model is able to use and integrate multiple visible cues. We show that the resulting model, called PlaNet, outperforms previous approaches and even attains superhuman levels of accuracy in some cases. Moreover, we extend our model to photo albums by combining it with a long short-term memory (LSTM) architecture. By learning to exploit temporal coherence to geolocate uncertain photos, we demonstrate that this model achieves a 50% performance improvement over the single-image model

arXiv.org e-Print Archive

Crossref

Publikationsserver der RWTH Aachen University

Visual Landmark Recognition from Internet Photo Collections: A Large-Scale Evaluation

Author: Leibe Bastian
Weyand Tobias
Publication venue: 'Elsevier BV'
Publication date: 18/09/2014
Field of study

The task of a visual landmark recognition system is to identify photographed buildings or objects in query photos and to provide the user with relevant information on them. With their increasing coverage of the world's landmark buildings and objects, Internet photo collections are now being used as a source for building such systems in a fully automatic fashion. This process typically consists of three steps: clustering large amounts of images by the objects they depict; determining object names from user-provided tags; and building a robust, compact, and efficient recognition index. To this date, however, there is little empirical information on how well current approaches for those steps perform in a large-scale open-set mining and recognition task. Furthermore, there is little empirical information on how recognition performance varies for different types of landmark objects and where there is still potential for improvement. With this paper, we intend to fill these gaps. Using a dataset of 500k images from Paris, we analyze each component of the landmark recognition pipeline in order to answer the following questions: How many and what kinds of objects can be discovered automatically? How can we best use the resulting image clusters to recognize the object in a query? How can the object be efficiently represented in memory for recognition? How reliably can semantic information be extracted? And finally: What are the limiting factors in the resulting pipeline from query to semantics? We evaluate how different choices of methods and parameters for the individual pipeline steps affect overall system performance and examine their effects for different query categories such as buildings, paintings or sculptures

arXiv.org e-Print Archive

Publikationsserver der RWTH Aachen University

Place recognition: An Overview of Vision Perspective

Author: Chen Yuming
Wang Xiaodong
Zeng Zhiqiang
Zhang Jian
Zhu Chaoyang
Publication venue: 'MDPI AG'
Publication date: 01/11/2018
Field of study

Place recognition is one of the most fundamental topics in computer vision and robotics communities, where the task is to accurately and efficiently recognize the location of a given query image. Despite years of wisdom accumulated in this field, place recognition still remains an open problem due to the various ways in which the appearance of real-world places may differ. This paper presents an overview of the place recognition literature. Since condition invariant and viewpoint invariant features are essential factors to long-term robust visual place recognition system, We start with traditional image description methodology developed in the past, which exploit techniques from image retrieval field. Recently, the rapid advances of related fields such as object detection and image classification have inspired a new technique to improve visual place recognition system, i.e., convolutional neural networks (CNNs). Thus we then introduce recent progress of visual place recognition system based on CNNs to automatically learn better image representations for places. Eventually, we close with discussions and future work of place recognition.Comment: Applied Sciences (2018

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

Visual location awareness for mobile robots using feature-based vision

Author: Kazeka Alexander
Publication venue: Colorado State University. Libraries
Publication date: 01/01/2010
Field of study

Department Head: L. Darrell Whitley.2010 Spring.Includes bibliographical references (pages 48-50).This thesis presents an evaluation of feature-based visual recognition paradigm for the task of mobile robot localization. Although many works describe feature-based visual robot localization, they often do so using complex methods for map-building and position estimation which obscure the underlying vision systems' performance. One of the main contributions of this work is the development of an evaluation algorithm employing simple models for location awareness with focus on evaluating the underlying vision system. While SeeAsYou is used as a prototypical vision system for evaluation, the algorithm is designed to allow it to be used with other feature-based vision systems as well. The main result is that feature-based recognition with SeeAsYou provides some information but is not strong enough to reliably achieve location awareness without the temporal context. Adding a simple temporal model, however, suggests a more reliable localization performance

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Topological place recognition for life-long visual localization

Author: Arroyo Contera Roberto
Publication venue
Publication date: 01/01/2017
Field of study

Premio Extraordinario de Doctorado de la UAH en el año académico 2016-2017La navegación de vehículos inteligentes o robots móviles en períodos largos de tiempo ha experimentado un gran interés por parte de la comunidad investigadora en los últimos años. Los sistemas basados en cámaras se han extendido ampliamente en el pasado reciente gracias a las mejoras en sus características, precio y reducción de tamaño, añadidos a los progresos en técnicas de visión artificial. Por ello, la localización basada en visión es una aspecto clave para desarrollar una navegación autónoma robusta en situaciones a largo plazo. Teniendo en cuenta esto, la identificación de localizaciones por medio de técnicas de reconocimiento de lugar topológicas puede ser complementaria a otros enfoques como son las soluciones basadas en el Global Positioning System (GPS), o incluso suplementaria cuando la señal GPS no está disponible.El estado del arte en reconocimiento de lugar topológico ha mostrado un funcionamiento satisfactorio en el corto plazo. Sin embargo, la localización visual a largo plazo es problemática debido a los grandes cambios de apariencia que un lugar sufre como consecuencia de elementos dinámicos, la iluminación o la climatología, entre otros. El objetivo de esta tesis es enfrentarse a las dificultades de llevar a cabo una localización topológica eficiente y robusta a lo largo del tiempo. En consecuencia, se van a contribuir dos nuevos enfoques basados en reconocimiento visual de lugar para resolver los diferentes problemas asociados a una localización visual a largo plazo. Por un lado, un método de reconocimiento de lugar visual basado en descriptores binarios es propuesto. La innovación de este enfoque reside en la descripción global de secuencias de imágenes como códigos binarios, que son extraídos mediante un descriptor basado en la técnica denominada Local Difference Binary (LDB). Los descriptores son eficientemente asociados usando la distancia de Hamming y un método de búsqueda conocido como Approximate Nearest Neighbors (ANN). Además, una técnica de iluminación invariante es aplicada para mejorar el funcionamiento en condiciones luminosas cambiantes. El empleo de la descripción binaria previamente introducida proporciona una reducción de los costes computacionales y de memoria.Por otro lado, también se presenta un método de reconocimiento de lugar visual basado en deep learning, en el cual los descriptores aplicados son procesados por una Convolutional Neural Network (CNN). Este es un concepto recientemente popularizado en visión artificial que ha obtenido resultados impresionantes en problemas de clasificación de imagen. La novedad de nuestro enfoque reside en la fusión de la información de imagen de múltiples capas convolucionales a varios niveles y granularidades. Además, los datos redundantes de los descriptores basados en CNNs son comprimidos en un número reducido de bits para una localización más eficiente. El descriptor final es condensado aplicando técnicas de compresión y binarización para realizar una asociación usando de nuevo la distancia de Hamming. En términos generales, los métodos centrados en CNNs mejoran la precisión generando representaciones visuales de las localizaciones más detalladas, pero son más costosos en términos de computación.Ambos enfoques de reconocimiento de lugar visual son extensamente evaluados sobre varios datasets públicos. Estas pruebas arrojan una precisión satisfactoria en situaciones a largo plazo, como es corroborado por los resultados mostrados, que comparan nuestros métodos contra los principales algoritmos del estado del arte, mostrando mejores resultados para todos los casos.Además, también se ha analizado la aplicabilidad de nuestro reconocimiento de lugar topológico en diferentes problemas de localización. Estas aplicaciones incluyen la detección de cierres de lazo basada en los lugares reconocidos o la corrección de la deriva acumulada en odometría visual usando la información proporcionada por los cierres de lazo. Asimismo, también se consideran las aplicaciones de la detección de cambios geométricos a lo largo de las estaciones del año, que son esenciales para las actualizaciones de los mapas en sistemas de conducción autónomos centrados en una operación a largo plazo. Todas estas contribuciones son discutidas al final de la tesis, incluyendo varias conclusiones sobre el trabajo presentado y líneas de investigación futuras

e_Buah - Biblioteca Digital de la Universidad de Alcalá

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Panoramic Annular Localizer: Tackling the Variation Challenges of Outdoor Localization Using Panoramic Annular Images and Active Deep Descriptors

Author: Bai Jian
Cheng Ruiqi
Hu Weijian
Huang Xiao
Li Huabing
Lin Shufei
Sun Dongming
Wang Kaiwei
Yang Kailun
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 15/07/2019
Field of study

Visual localization is an attractive problem that estimates the camera localization from database images based on the query image. It is a crucial task for various applications, such as autonomous vehicles, assistive navigation and augmented reality. The challenging issues of the task lie in various appearance variations between query and database images, including illumination variations, dynamic object variations and viewpoint variations. In order to tackle those challenges, Panoramic Annular Localizer into which panoramic annular lens and robust deep image descriptors are incorporated is proposed in this paper. The panoramic annular images captured by the single camera are processed and fed into the NetVLAD network to form the active deep descriptor, and sequential matching is utilized to generate the localization result. The experiments carried on the public datasets and in the field illustrate the validation of the proposed system.Comment: Accepted by ITSC 201

arXiv.org e-Print Archive

Crossref

Place Recognition for Mobile Robot in Changing Environments

Author: Cao Juan
Publication venue
Publication date: 01/01/2016
Field of study

Aberystwyth Research Portal

A panoramic localizer based on coarse-to-fine descriptors for navigation assistance

Author: Cheng R.
Fang Y.
Sun L.
Wang K.
Yang K.
Publication venue: MDPI
Publication date: 01/09/2020
Field of study

KITopen