88 research outputs found

    Human dynamics in the age of big data: a theory-data-driven approach

    The revolution in information and communication technology (ICT) over the past two decades has transformed the world and people’s lives, along with the ways that knowledge is produced. With advances in location-aware technologies, a large volume of data, so-called “big data”, is now available through various sources to explore the world. This dissertation examines the potential use of such data in understanding human dynamics by focusing on both theory- and data-driven approaches. Specifically, human dynamics, represented by communication and activities, is linked to the geographic concepts of space and place through social media data to set up a research platform for the effective use of social media as an information system. Three case studies covering these conceptual linkages are presented to (1) identify communication patterns on social media; (2) identify spatial patterns of activities in urban areas and detect events; and (3) explore urban mobility patterns. The first case study examines the use of, and communication dynamics on, Twitter during Hurricane Sandy using survey and data analytics techniques. Twitter was identified as a valuable source of disaster-related information. Additionally, the results shed light on the most significant information that can be derived from Twitter during disasters and on the need to establish bi-directional communication during such events to communicate effectively. The second case study examines the potential of Twitter for identifying activities and events and exploring movements during Hurricane Sandy, using both time-geographic information and qualitative social media text data. The study provides insights for enhancing situational awareness during natural disasters. The third case study examines the potential of Twitter for modeling commuting trip distribution in New York City.
By integrating both traditional and social media data and utilizing machine learning techniques, the study identified Twitter as a valuable source for transportation modeling. Despite the limitations of social media such as the accuracy issue, there is tremendous opportunity for geographers to enrich their understanding of human dynamics in the world. However, we will need new research frameworks, which integrate geographic concepts with information systems theories to theorize the process. Furthermore, integrating various data sources is the key to future research and will need new computational approaches. Addressing these computational challenges, therefore, will be a crucial step to extend the frontier of big data knowledge from a geographic perspective. KEYWORDS: Big data, social media, Twitter, human dynamics, VGI, natural disasters, Hurricane Sandy, transportation modeling, machine learning, situational awareness, NYC, GI

    A statistical approach for studying urban human dynamics

    A thesis submitted in partial fulfillment of the requirements for the degree of Doctor in Information Management, specialization in Geographic Information Systems. This doctoral dissertation proposed several statistical approaches to analyse urban dynamics, aiming to provide tools for decision-making processes and urban studies. It assumed that human activity and human mobility compose urban dynamics. Initially, it studied geolocated social media data and treated them as a proxy for where and when people carry out what is defined as human activity. It employed techniques associated with generalised linear models, functional data analysis, hierarchical clustering, and epidemic data to explain the spatio-temporal distribution of the places where people interact with their social networks. Afterwards, to understand mobility in urban environments, data from an underground railway system were used. The information was treated as repeated daily measurements to capture the regularity of human behaviour. By implementing methods from functional principal components analysis and hierarchical clustering, it was possible to describe the system and identify human mobility patterns.

    Understanding the Socio-infrastructure Systems During Disaster from Social Media Data

    Our socio-infrastructure systems are becoming more and more vulnerable due to the increasing severity and frequency of extreme events every year. Effective disaster management can minimize the damaging impacts of a disaster to a large extent. The ubiquitous use of social media platforms on GPS-enabled smartphones offers a unique opportunity to observe, model, and predict human behavior during a disaster. This dissertation explores the opportunity of using social media data and different modeling techniques to understand and manage disasters more dynamically. In this dissertation, we focus on four objectives. First, we develop a method to infer individual evacuation behaviors (e.g., evacuation decision, timing, destination) from social media data. We develop an input-output hidden Markov model to infer evacuation decisions from user tweets. Our findings show that, using geo-tagged posts and text data, a hidden Markov model can be developed to capture the dynamics of hurricane evacuation decisions. Second, we develop an evacuation demand prediction model using social media and traffic data. We find that a deep learning model trained on social media and traffic data can predict evacuation traffic demand well up to 24 hours ahead. Third, we present a multi-label classification approach to identify the co-occurrence of multiple types of infrastructure disruptions considering the sentiment towards a disruption—whether a post is reporting an actual disruption (negative), or a disruption in general (neutral), or not affected by a disruption (positive). We validate our approach on data collected during multiple hurricanes. Fourth, we develop an agent-based model to understand the influence of multiple information sources on risk perception dynamics and evacuation decisions.
In this study, we explore the effects of socio-demographic factors and information sources such as social connectivity, neighborhood observation, and weather information and its credibility in forming risk perception dynamics and evacuation decisions

    Measuring Collective Attention in Online Content: Sampling, Engagement, and Network Effects

    The production and consumption of online content have been increasing rapidly, whereas human attention is a scarce resource. Understanding how content captures collective attention has become a challenge of growing importance. In this thesis, we tackle this challenge on three fronts -- quantifying sampling effects of social media data; measuring engagement behaviors towards online content; and estimating network effects induced by recommender systems. Data sampling is a fundamental problem. To obtain a list of items, one common method is sampling based on item prevalence in social media streams. However, social data is often noisy and incomplete, which may affect the subsequent observations. For each item, user behaviors can be conceptualized as two steps -- the first step is relevant to the content appeal, measured by the number of clicks; the second step is relevant to the content quality, measured by post-click metrics, e.g., dwell time, likes, or comments. We categorize online attention (behaviors) into two classes: popularity (clicking) and engagement (watching, liking, or commenting). Moreover, modern platforms use recommender systems to present users with a tailored content display to maximize satisfaction. The recommendation alters the appeal of an item by changing its ranking, and consequently impacts its popularity. Our research is enabled by the data available from the largest video hosting site, YouTube. We use YouTube URLs shared on Twitter as a sampling protocol to obtain a collection of videos, and we track their prevalence from 2015 to 2019. This method creates a longitudinal dataset consisting of more than 5 billion tweets. Although the volume is substantial, we find that Twitter still subsamples the data. Our dataset covers about 80% of all tweets with YouTube URLs. We present a comprehensive measurement study of the Twitter sampling effects across different timescales and different subjects.
We find that the volume of missing tweets can be estimated from Twitter rate limit messages, true entity ranking can be inferred from sampled observations, and sampling compromises the quality of network and diffusion models. Next, we present the first large-scale measurement study of how users collectively engage with YouTube videos. We study the time and percentage of each video being watched. We propose a duration-calibrated metric, called relative engagement, which is correlated with recognized notions of content quality, stable over time, and predictable even before a video's upload. Lastly, we examine the network effects induced by the YouTube recommender system. We construct the recommendation network for 60,740 music videos from 4,435 professional artists. An edge indicates that the target video is recommended on the webpage of the source video. We discover a popularity bias -- videos are disproportionately recommended towards more popular videos. We use the bow-tie structure to characterize the network and find that the largest strongly connected component consists of 23.1% of videos while occupying 82.6% of attention. We also build models to estimate the latent influence between videos and artists. By taking into account the network structure, we can predict video popularity 9.7% better than other baselines. Altogether, we explore the collective consumption patterns of human attention towards online content. Methods and findings from this thesis can be used by content producers, hosting sites, and online users alike to improve content production, advertising strategies, and recommender systems. We expect that our new metrics, methods, and observations can generalize to other multimedia platforms such as the music streaming service Spotify.

    Public participation in the Geoweb era: Geosocial media use in local government

    Advances in spatially enabled information and communication technologies (ICTs) have provided governments with the potential to enhance public participation and to collaborate with citizens. This dissertation critically assesses this potential and identifies the opportunities and challenges for local governments to embark on emerging geo-enabled practices. This dissertation first proposes a new typology for classifying geo-enabled practices related to public participation (termed here as geo-participation) and demonstrates the emerging opportunities presented by geo-participation to improve government-citizen collaboration and government operations. This dissertation then provides in-depth examinations of geosocial media as an exemplar geo-participation practice. The first empirical study assesses the potential of repurposing geosocial media data to gauge public opinions. The study suggests that geosocial media can help identify geographies of public perceptions concerning public facilities and services and have the potential to complement other methods of gauging public sentiment. The second empirical study assesses the usefulness of geosocial media for sharing non-emergency issues and identifies an important opportunity of enabling citizen collaboration for reporting and sharing non-emergency issues. Altogether, this dissertation makes several conceptual, empirical, and practical contributions to local government adoption of geo-participation. Conceptually, the proposed typology lays the foundation for researching and implementing geo-participation practices. Empirically, this dissertation tells a story of opportunities and challenges that sheds light on how local governments may adopt geosocial media to solicit citizen input and enable new forms of government-citizen interaction. 
Practically, this dissertation develops a tool for processing text-based citizen input and models of implementing geosocial media reporting that can help local government develop proper strategies of adopting geosocial media

    Marketplace Theory in the Age of AI Communicators


    Helve'tweet: exploration d'un million de tweets géolocalisés en Suisse, février-août 2017

    A social network actively used by 8% of the Swiss population, Twitter lets its users geolocate their messages. This quantitative exploratory study, based on messages geolocated in Switzerland and written between 18 February and 31 August 2017, follows on from the GEoTweet project, which was devoted to tweets from Geneva in 2014-2015. It sets out to answer three research questions in order to assess the possibilities and limits of using the data provided by the Twitter API for research on Switzerland, in the fields of the sociology of data and information science. The focus is more specifically on the exploitation of geolocation data, on the problem of language identification, and on the criteria that define a Swiss tweet from an archiving perspective. After the introduction and the literature review, the report presents the methodology used, the biases identified, and the tools created to measure them, avoid them, or at least minimize them. A concordance was thus created between Twitter's place.id values and the official list of Swiss municipalities to compensate for the unverified nature (partly obsolete, partly erroneous) of the geographic data provided by Twitter. Three series of tests were also carried out to check the reliability of Twitter's language recognition algorithm on the sample. They show a margin of error of 4.25% for the major European languages, which can rise to as much as 92% for an “exotic” language such as Indonesian. The analyses of the tweets and of the Twitter users yielded important results. On the one hand, they show strong variations in their number and linguistic diversity across space and time (e.g. more active accounts in German-speaking Switzerland, but more tweets in French overall; more tweets during holiday periods, but a drop in the proportion of tweets and users in the national languages and in English). On the other hand, the duration and geographic extent of their activity vary widely (e.g. 82% of accounts with fewer than 10 tweets, 68% active during a single month, and 71% in a single canton). Hypotheses were formulated and tested to explain these results, which stem from the high propensity of German speakers to tweet in English and from the positive effect of leisure on the desire and opportunity to tweet with geolocation. In the final part, the study suggests avenues for establishing criteria to recognize a Swiss tweet, based on the preceding analyses as well as on experiences in other countries. The international and Swiss context of tweet archiving is discussed, without any claim to propose a method, given the complexity of the sociological, technical, and legal issues.

    Visual Analytics Methods for Exploring Geographically Networked Phenomena

    The connections between different entities define different kinds of networks, and many such networked phenomena are influenced by their underlying geographical relationships. By integrating network and geospatial analysis, the goal is to extract information about interaction topologies and their relationships to related geographical constructs. In recent decades, much work has been done analyzing the dynamics of spatial networks; however, many challenges remain in this field. First, the development of social media and transportation technologies has greatly reshaped the typologies of communications between different geographical regions. Second, the distance metrics used in spatial analysis should also be enriched with the underlying network information to develop accurate models. Visual analytics provides methods for data exploration, pattern recognition, and knowledge discovery. However, despite the long history of geovisualizations and network visual analytics, little work has been done to develop visual analytics tools that focus specifically on geographically networked phenomena. This thesis develops a variety of visualization methods to present data values and geospatial network relationships, which enable users to interactively explore the data. Users can investigate the connections in both virtual networks and geospatial networks, and the underlying geographical context can be used to improve knowledge discovery. The focus of this thesis is on social media analysis and geographical hotspot optimization. A framework is proposed for social network analysis to unveil the links between social media interactions and their underlying networked geospatial phenomena. This will be combined with a novel hotspot approach to improve hotspot identification and boundary detection with the networks extracted from urban infrastructure. Several real-world problems have been analyzed using the proposed visual analytics frameworks.
The primary studies and experiments show that visual analytics methods can help analysts explore such data from multiple perspectives and help the knowledge discovery process. Dissertation/Thesis. Doctoral Dissertation, Computer Science, 201