Recognizing City Identity via Attribute Analysis of Geo-tagged Images
After hundreds of years of human settlement, each city has formed a distinct identity, distinguishing itself from other cities. In this work, we propose to characterize the identity of a city via an attribute analysis of 2 million geo-tagged images from 21 cities over 3 continents. First, we estimate the scene attributes of these images and use this representation to build a higher-level set of 7 city attributes, tailored to the form and function of cities. Then, we conduct city identity recognition experiments on the geo-tagged images and identify images with salient city identity on each city attribute. Based on the misclassification rate of the city identity recognition, we analyze the visual similarity among different cities. Finally, we discuss the potential application of computer vision to urban planning. National Science Foundation (U.S.) (Grant 1016862); Google (Firm) (Research Award)
Loud and Trendy: Crowdsourcing Impressions of Social Ambiance in Popular Indoor Urban Places
New research cutting across architecture, urban studies, and psychology is
contextualizing the understanding of urban spaces according to the perceptions
of their inhabitants. One fundamental construct that relates place and
experience is ambiance, which is defined as "the mood or feeling associated
with a particular place". We posit that the systematic study of ambiance
dimensions in cities is a new domain for which multimedia research can make
pivotal contributions. We present a study to examine how images collected from
social media can be used for the crowdsourced characterization of indoor
ambiance impressions in popular urban places. We design a crowdsourcing
framework to understand the suitability of social images as a data source to convey
place ambiance, to examine what types of images are most suitable to describe
ambiance, and to assess how people perceive places socially from the
perspective of ambiance along 13 dimensions. Our study is based on 50,000
Foursquare images collected from 300 popular places across six cities
worldwide. The results show that reliable estimates of ambiance can be obtained
for several of the dimensions. Furthermore, we found that most aggregate
impressions of ambiance are similar across popular places in all studied
cities. We conclude by presenting a multidisciplinary research agenda for
future research in this domain.
Predicting Geo-informative Attributes in Large-Scale Image Collections Using Convolutional Neural Networks
Geographic location is a powerful property for organizing large-scale photo collections, but only a small fraction of online photos are geo-tagged. Most work in automatically estimating geo-tags from image content is based on comparison against models of buildings or landmarks, or on matching to large reference collections of geo-tagged images. These approaches work well for frequently photographed places like major cities and tourist destinations, but fail for photos taken in sparsely photographed places where few reference photos exist. Here we consider how to recognize general geo-informative attributes of a photo, e.g. the elevation gradient, population density, demographics, etc. of where it was taken, instead of trying to estimate a precise geo-tag. We learn models for these attributes using a large (noisy) set of geo-tagged images from Flickr by training deep convolutional neural networks (CNNs). We evaluate on over a dozen attributes, showing that while automatically recognizing some attributes is very difficult, others can be automatically estimated with about the same accuracy as a human.
Hotels-50K: A Global Hotel Recognition Dataset
Recognizing a hotel from an image of a hotel room is important for human
trafficking investigations. Images directly link victims to places and can help
verify where victims have been trafficked, and where their traffickers might
move them or others in the future. Recognizing the hotel from images is
challenging because of low image quality, uncommon camera perspectives, large
occlusions (often the victim), and the similarity of objects (e.g., furniture,
art, bedding) across different hotel rooms.
To support efforts towards this hotel recognition task, we have curated a
dataset of over 1 million annotated hotel room images from 50,000 hotels. These
images include professionally captured photographs from travel websites and
crowd-sourced images from a mobile application, which are more similar to the
types of images analyzed in real-world investigations. We present a baseline
approach based on a standard network architecture and a collection of
data-augmentation approaches tuned to this problem domain.
Improving Image Classification with Location Context
With the widespread availability of cellphones and cameras that have GPS
capabilities, it is common for images uploaded to the Internet today to
have GPS coordinates associated with them. In addition to research that tries
to predict GPS coordinates from visual features, this also opens up the door to
problems that are conditioned on the availability of GPS coordinates. In this
work, we tackle the problem of performing image classification with location
context, in which we are given the GPS coordinates for images in both the train
and test phases. We explore different ways of encoding and extracting features
from the GPS coordinates, and show how to naturally incorporate these features
into a Convolutional Neural Network (CNN), the current state-of-the-art for
most image classification and recognition problems. We also show how it is
possible to simultaneously learn the optimal pooling radii for a subset of our
features within the CNN framework. To evaluate our model and to help promote
research in this area, we identify a set of location-sensitive concepts and
annotate a subset of the Yahoo Flickr Creative Commons 100M dataset that has
GPS coordinates with these concepts, which we make publicly available. By
leveraging location context, we are able to achieve almost a 7% gain in mean
average precision.
Cultural Diffusion and Trends in Facebook Photographs
Online social media is a social vehicle in which people share various moments
of their lives with their friends, such as playing sports, cooking dinner or
just taking a selfie for fun, via visual means, that is, photographs. Our study
takes a closer look at the popular visual concepts illustrating various
cultural lifestyles from aggregated, de-identified photographs. We perform
analysis both at macroscopic and microscopic levels, to gain novel insights
about global and local visual trends as well as the dynamics of interpersonal
cultural exchange and diffusion among Facebook friends. We processed images by
automatically classifying the visual content with a convolutional neural network
(CNN). Through various statistical tests, we find that socially tied
individuals are more likely to post images showing similar cultural lifestyles. To
further identify the main cause of the observed social correlation, we use the
Shuffle test and the Preference-based Matched Estimation (PME) test to
distinguish the effects of influence and homophily. The results indicate that
the visual content of each user's photographs is temporally, although not
necessarily causally, correlated with the photographs of their friends, which
may suggest the effect of influence. Our paper demonstrates that Facebook
photographs exhibit diverse cultural lifestyles and preferences and that the
social interaction mediated through the visual channel in social media can be
an effective mechanism for cultural diffusion. Comment: 10 pages, to appear in ICWSM 2017 (Full Paper)