PlaNet - Photo Geolocation with Convolutional Neural Networks
Is it possible to build a system to determine the location where a photo was
taken using just its pixels? In general, the problem seems exceptionally
difficult: it is trivial to construct situations where no location can be
inferred. Yet images often contain informative cues such as landmarks, weather
patterns, vegetation, road markings, and architectural details, which in
combination may allow one to determine an approximate location and occasionally
an exact location. Websites such as GeoGuessr and View from your Window suggest
that humans are relatively good at integrating these cues to geolocate images,
especially en masse. In computer vision, the photo geolocation problem is
usually approached using image retrieval methods. In contrast, we pose the
problem as one of classification by subdividing the surface of the earth into
thousands of multi-scale geographic cells, and train a deep network using
millions of geotagged images. While previous approaches only recognize
landmarks or perform approximate matching using global image descriptors, our
model is able to use and integrate multiple visible cues. We show that the
resulting model, called PlaNet, outperforms previous approaches and even
attains superhuman levels of accuracy in some cases. Moreover, we extend our
model to photo albums by combining it with a long short-term memory (LSTM)
architecture. By learning to exploit temporal coherence to geolocate uncertain
photos, we demonstrate that this model achieves a 50% performance improvement
over the single-image model.
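The idea of posing geolocation as classification over multi-scale cells can be illustrated with a minimal sketch. It assumes a plain latitude/longitude quadtree that keeps splitting a cell while it holds more than `max_points` training photos; the abstract does not specify the partitioning scheme, so the names `Cell`, `subdivide`, `label_leaves`, and `cell_label` and all thresholds here are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees


@dataclass
class Cell:
    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float
    children: List["Cell"] = field(default_factory=list)
    label: int = -1  # class index, assigned only to leaf cells

    def contains(self, lat: float, lon: float) -> bool:
        # Half-open intervals so sibling cells never overlap.
        return (self.lat_min <= lat < self.lat_max
                and self.lon_min <= lon < self.lon_max)


def make_root() -> Cell:
    # Upper bounds are padded slightly so lat=90 and lon=180 fall inside.
    return Cell(-90.0, 90.0 + 1e-9, -180.0, 180.0 + 1e-9)


def subdivide(cell: Cell, points: List[Point],
              max_points: int, max_depth: int, depth: int = 0) -> None:
    """Split a cell into four while it holds too many photos (assumed rule)."""
    if len(points) <= max_points or depth >= max_depth:
        return
    lat_mid = (cell.lat_min + cell.lat_max) / 2
    lon_mid = (cell.lon_min + cell.lon_max) / 2
    for lat_lo, lat_hi in ((cell.lat_min, lat_mid), (lat_mid, cell.lat_max)):
        for lon_lo, lon_hi in ((cell.lon_min, lon_mid), (lon_mid, cell.lon_max)):
            child = Cell(lat_lo, lat_hi, lon_lo, lon_hi)
            inside = [p for p in points if child.contains(*p)]
            subdivide(child, inside, max_points, max_depth, depth + 1)
            cell.children.append(child)


def label_leaves(root: Cell) -> List[Cell]:
    """Give every leaf cell a consecutive class index and return the leaves."""
    leaves: List[Cell] = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.children:
            stack.extend(reversed(node.children))
        else:
            node.label = len(leaves)
            leaves.append(node)
    return leaves


def cell_label(root: Cell, lat: float, lon: float) -> int:
    """Return the class index of the leaf cell containing (lat, lon)."""
    node = root
    while node.children:
        node = next(c for c in node.children if c.contains(lat, lon))
    return node.label
```

Under these assumptions, densely photographed areas end up with many small cells and sparse areas with a few large ones, and `cell_label` gives the target class a classifier would be trained to predict for each geotagged image.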
Large-Scale Mapping of Human Activity using Geo-Tagged Videos
This paper is the first work to perform spatio-temporal mapping of human
activity using the visual content of geo-tagged videos. We utilize a recent
deep-learning based video analysis framework, termed hidden two-stream
networks, to recognize a range of activities in YouTube videos. This framework
is efficient and can run in real time or faster, which is important for
recognizing events as they occur in streaming video or for reducing latency in
analyzing already captured video. This is, in turn, important for using video
in smart-city applications. We perform a series of experiments to show our
approach is able to accurately map activities both spatially and temporally. We
also demonstrate the advantages of using the visual content over the
tags/titles.
Comment: Accepted at ACM SIGSPATIAL 201
Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition
Aerial scene recognition is a fundamental task in remote sensing and has
recently received increased interest. While the visual information from
overhead images with powerful models and efficient algorithms yields
considerable performance on scene recognition, it still suffers from the
variation of ground objects, lighting conditions, etc. Inspired by the
multi-channel perception theory in cognitive science, in this paper we explore
a novel audiovisual aerial scene recognition task that uses both images and
sounds as input, with the aim of improving aerial scene recognition
performance. Based on an observation that some specific sound events are more likely
input. Based on an observation that some specific sound events are more likely
to be heard at a given geographic location, we propose to exploit the knowledge
from the sound events to improve the performance on the aerial scene
recognition. For this purpose, we have constructed a new dataset named AuDio
Visual Aerial sceNe reCognition datasEt (ADVANCE). With the help of this
dataset, we evaluate three proposed approaches for transferring the sound event
knowledge to the aerial scene recognition task in a multimodal learning
framework, and show the benefit of exploiting the audio information for the
aerial scene recognition. The source code is publicly available for
reproducibility purposes.
Comment: ECCV 202
Leveraging Overhead Imagery for Localization, Mapping, and Understanding
Ground-level and overhead images provide complementary viewpoints of the world. This thesis proposes methods that leverage dense overhead imagery, in addition to sparsely distributed ground-level imagery, to advance traditional computer vision problems, such as ground-level image localization and fine-grained urban mapping. Our work focuses on three primary research areas: learning a joint feature representation between ground-level and overhead imagery to enable direct comparison for the task of image geolocalization, incorporating unlabeled overhead images by inferring labels from nearby ground-level images to improve image-driven mapping, and fusing ground-level imagery with overhead imagery to enhance understanding. The ultimate contribution of this thesis is a general framework for estimating geospatial functions, such as land cover or land use, which integrates visual evidence from both ground-level and overhead image viewpoints.
Describing and Understanding Neighborhood Characteristics through Online Social Media
Geotagged data can be used to describe regions in the world and discover
local themes. However, not all data produced within a region is necessarily
specifically descriptive of that area. To surface the content that is
characteristic for a region, we present the geographical hierarchy model (GHM),
a probabilistic model based on the assumption that data observed in a region is
a random mixture of content that pertains to different levels of a hierarchy.
We apply the GHM to a dataset of 8 million Flickr photos in order to
discriminate between content (i.e., tags) that specifically characterizes a
region (e.g., neighborhood) and content that characterizes surrounding areas or
more general themes. Knowledge of the discriminative and non-discriminative
terms used throughout the hierarchy enables us to quantify the uniqueness of a
given region and to compare similar but distant regions. Our evaluation
demonstrates that our model improves upon traditional Naive Bayes
classification by 47% and hierarchical TF-IDF by 27%. We further highlight the
differences and commonalities with human reasoning about what is locally
characteristic for a neighborhood, distilled from ten interviews and a survey
that covered themes such as time, events, and prior regional knowledge.
Comment: Accepted in WWW 2015, 2015, Florence, Italy
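The mixture assumption behind the GHM can be written out as a short sketch. The notation is an assumption made for illustration, not taken from the paper: $r$ is a region, $a_\ell(r)$ is its ancestor at hierarchy level $\ell$ (with $a_L(r) = r$), $\phi_n$ is the tag distribution of hierarchy node $n$, and $\theta_{r,\ell}$ is the mixing weight of level $\ell$ for region $r$.

```latex
p(w \mid r) \;=\; \sum_{\ell=0}^{L} \theta_{r,\ell}\, p\bigl(w \mid \phi_{a_\ell(r)}\bigr),
\qquad \theta_{r,\ell} \ge 0,
\qquad \sum_{\ell=0}^{L} \theta_{r,\ell} = 1 .
```

Under this reading, a tag is specifically characteristic of a region when most of its probability mass comes from the leaf-level term rather than from ancestor levels.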
Find the one you like! Profiling Swiss parks with user generated content
The establishment of national parks originated from the desire to preserve scenic landscape areas of national or regional importance. With more recent diversification of protected area types and goals, obtaining knowledge on how parks are recreationally used has become more challenging for (local) policy makers and park managements, as there is a general lack of systematic and publicly available visitor monitoring data. We analyze recreational park use for 20 Swiss parks of national importance and develop park profiles, using user-generated content from Flickr. The 20 Swiss parks are described by 111,437 unique images taken by 6,468 unique users between 2007 and 2020. We fill an existing research gap by defining park use across three dimensions (space, time and users) and combining these in our analyses. The park profiles provide information on diversity of recreational use and serve as a starting point for analyzing how the three dimensions contribute to this diversity. Our results show diverging park uses for the three dimensions, indicating that park location matters, especially in terms of peri-urbanity and geographic region. Our method can be translated into European-scale analyses, provided that different languages are considered. Park profiles are easy-to-communicate and easy-to-interpret tools for (local) policy-makers and park managers to segment the tourism market and develop new park marketing strategies to, e.g., streamline visitation flows and reduce the negative impacts of outdoor recreation. In broader terms, our study serves as input for future recreation policy to protect, restore and promote sustainable use of protected areas.
Management implications
• We offer insights into recreational park use in terms of users, space and time, particularly valuable for parks where monitoring visitation is vital (i.e. protected areas).
• Park profiles provide easy to communicate and interpret extensible tools for (local) policy-makers and park managers.
• Park profiles can be used for, among other things, the development of new park marketing strategies catering to the preferences of differentiated user groups.
• Since different recreational user groups differ in their recreational behavior and in turn their environmental impact, park profiles can help in streamlining visitors as a potentially effective management measure for increased park sustainability (e.g. biodiversity conservation).