
    On the Accuracy of Hyper-local Geotagging of Social Media Content

    Social media users share billions of items per year, only a small fraction of which is geotagged. We present a data-driven approach for identifying non-geotagged content items that can be associated with a hyper-local geographic area by modeling the location distributions of hyper-local n-grams that appear in the text. We explore the trade-off between accuracy, precision and coverage of this method. Further, we explore differences across content received from multiple platforms and devices, and show, for example, that content shared via different sources and applications produces significantly different geographic distributions, and that it is best to model and predict location for items according to their source. Our findings show the potential and the bounds of a data-driven approach to geotag short social media texts, and offer implications for all applications that use data-driven approaches to locate content. Comment: 10 pages

    Large-Scale Geo-Facial Image Analysis

    While face analysis from images is a well-studied area, little work has explored the dependence of facial appearance on the geographic location from which the image was captured. To fill this gap, we constructed GeoFaces, a large dataset of geotagged face images, and used it to examine the geo-dependence of facial features and attributes, such as ethnicity, gender, or the presence of facial hair. Our analysis illuminates the relationship between raw facial appearance, facial attributes, and geographic location, both globally and in selected major urban areas. Some of our experiments, and the resulting visualizations, confirm prior expectations, such as the predominance of ethnically Asian faces in Asia, while others highlight novel information that can be obtained with this type of analysis, such as the major city with the highest percentage of people with a mustache.

    Online indexing and clustering of social media data for emergency management

    Social media has become a vital part of our daily communication practice, creating a huge amount of data and covering different real-world situations. Currently, there is a growing tendency to make use of social media during emergency management and response. Most of this effort is performed by a huge number of volunteers browsing through social media data and preparing maps that can be used by professional first responders. Automatic analysis approaches are needed to directly support the response teams in monitoring and understanding the evolution of facts in social media during an emergency situation. In this paper, we investigate the problem of real-time sub-event identification in social media data (i.e., Twitter, Flickr and YouTube) during emergencies. A processing framework is presented that generates situational reports and summaries from social media data. This framework relies in particular on online indexing and online clustering of media data streams. Online indexing aims at tracking the relevant vocabulary to capture the evolution of sub-events over time. Online clustering, on the other hand, is used to detect and update the set of sub-events using the indices built during online indexing. To evaluate the framework, social media data related to Hurricane Sandy (2012) was collected and used in a series of experiments. In particular, some online indexing methods have been tested against a proposed method to show their suitability. Moreover, the quality of online clustering has been studied using standard clustering indices. Overall, the framework provides a great opportunity for supporting emergency responders, as demonstrated in real-world emergency exercises.

    On the use of multi-sensor digital traces to discover spatio-temporal human behavioral patterns

    134 p. Technology is already part of our lives, and every time we interact with it, whether through a phone call, a credit card payment, or our activity on social networks, digital traces are stored. In this thesis we are interested in those digital traces that also record people's geolocation as they carry out their daily activities. This information reveals how people interact with the city, which is very valuable for urban planning, traffic management, public policy, and even for taking preventive action against natural disasters. This thesis aims to study human behavioral patterns from digital traces. To that end, it uses three massive datasets that record the activity of anonymized users in terms of phone calls, credit card purchases, and social network activity (check-ins, images, comments, and tweets). A methodology is proposed for extracting human behavioral patterns using latent semantic models, namely Latent Dirichlet Allocation and Dynamic Topic Models. The former is used to detect spatial patterns and the latter to detect spatio-temporal patterns. Additionally, a set of metrics is proposed to provide an objective method for evaluating the patterns obtained.