The widespread adoption of GPS-enabled tagging of social media content via
smartphones and social media services (e.g., Facebook, Twitter, Foursquare) opens
a new window into the spatio-temporal activities of hundreds of millions of people.
These "footprints" open new possibilities for understanding how people can organize
for societal impact and lay the foundation for new crowd-powered geo-social systems.
However, there are key challenges to delivering on this promise: the slow adoption
of location sharing, the inherent bias in the users who do share location, imbalanced
location granularity, and the need to respect location privacy, among many others. With these
challenges in mind, this dissertation aims to develop the framework, algorithms, and
methods for a new class of geo-social information systems. The dissertation is structured
in two main parts: the first focuses on understanding the capacity of existing
footprints; the second demonstrates the potential of new geo-social information systems
through two concrete prototypes.
First, we investigate the capacity of using these geo-social footprints to build new
geo-social information systems. (i) We propose and evaluate a probabilistic framework
for estimating a microblog user's location based purely on the content of the
user's posts. With the help of a classification component for automatically identifying
words in tweets with a strong local geo-scope, the location estimator places 51%
of Twitter users within 100 miles of their actual location. (ii) We investigate a set of
22 million check-ins across 220,000 users and report a quantitative assessment of human
mobility patterns by analyzing the spatial, temporal, social, and textual aspects
associated with these footprints. Concretely, we observe that users follow simple, reproducible
mobility patterns. (iii) We compare a set of 35 million publicly shared check-ins
with a set of over 400 million private query logs recorded by a commercial
hotel search engine. Although generated by users with fundamentally different intentions,
we find that common conclusions may be drawn from both data sources, indicating
the viability of publicly shared location information to complement (and, in
some cases, replace) privately held location information.
Second, we introduce two prototypes of new geo-social information systems
that utilize the collective intelligence from the emerging geo-social footprints.
Concretely, we propose an activity-driven search system and a local expert finding
system, both of which take advantage of this collective intelligence. Specifically, we study
location-based activity patterns revealed through location sharing services and find
that these activity patterns can identify semantically related locations and help with
both unsupervised location clustering and supervised location categorization with
high confidence. Based on these results, we show how activity-driven semantic organization
of locations may be naturally incorporated into location-based web search.
In addition, we propose a local expert finding system that identifies top local experts
for a topic in a location. Concretely, the system utilizes the semantic labels that people
apply to each other and people's locations in current location-based social networks, and can
identify top local experts with high precision. We also observe that the proposed
local authority metrics, which utilize collective intelligence from expert candidates' core
audience (list labelers), significantly improve the performance of local expert finding
over the more intuitive approach that considers only candidates' locations.