6 research outputs found

    Authenticity of Geo-Location and Place Name in Tweets

    The place name and geo-coordinates attached to a tweet are supposed to represent the user's location at the time of posting. However, our analysis of a large collection of tweets indicates that these fields may not give the user's correct location at that time. Our investigation reveals that tweets posted through third-party applications such as Instagram or Swarmapp contain the geo-coordinates of a user-specified location, not the user's current location. In addition, a user can enter any place name to be displayed on a tweet, and it need not match his or her exact location. Our analysis revealed that around 12% of tweets contain place names that differ from the user's real location. The findings of this research can serve as a caution when designing location-based services using social media.

    Spatial And Temporal Patterns Of Geo-Tagged Tweets

    With over 500 million current registered users and over 500 million tweets per day, Twitter has caught the attention of scientists in various disciplines. As Twitter allows users to send messages with location tags, a massive amount of valuable geo-social knowledge is embedded in tweets, with useful implications for human geography, urban science, location-based services, targeted advertising, and social network studies. This thesis aims to determine the lifestyle patterns of college students by analyzing the spatial and temporal dynamics in their tweets. Geo-tagged tweets were collected over a period of six months for four US Midwestern college cities: 1) West Lafayette, Indiana (Purdue University); 2) Bloomington, Indiana (Indiana University); 3) Ann Arbor, Michigan (University of Michigan); 4) Columbus, Ohio (The Ohio State University). The overall distribution of the tweets was determined for each city, and the spatial patterns of representative individuals were examined as well. Grouping the tweets in time domains, the temporal patterns on an hourly, daily, and monthly basis were analyzed. Detailed land use data for each city gave further insight into the thematic properties of the tweeting locations, leading to a deeper understanding of the life, mobility, and flow patterns of Twitter users. Finally, space-time clusters and anomalies within the tweets, which were considered events, were found with space-time statistics. The results generally reflected everyday human activity patterns, including the mobile population in each city as well as the commute behaviors of the representative users. The tweets also consistently revealed the occurrence of anomalies or events. The results of this thesis therefore confirmed the feasibility and promise of using geo-tagged micro-blogging services such as Twitter to understand human behavior patterns and in other geo-social studies.

    Um ambiente para anotação de localização e eventos em coleções de fotografias [An environment for annotating location and events in photograph collections]

    Given the large number of photographs generated today, techniques to organize, search, and retrieve such images are essential. Organizing a collection of thousands of photographs is not a simple job, and associating data from social networks with the photographs is even more laborious. Personal photographs are commonly annotated by reference to the following questions: "Who? Where? When?". Considering the information that matters for photograph retrieval, this work focuses on the questions "Where?" and "When?". With these two questions in mind, the focus is on the location ("Where?") and the events ("Where?" and "When?") of a photograph. The overall objective is to propose an environment for annotating location and social events. This annotation is aided by the location propagation and social event detection techniques proposed in this work. The experiments with location propagation techniques indicate that the technique should be chosen according to the behavior of each user of the system. Therefore, in addition to the propagation techniques, this work proposes an automatic selection of the location propagation technique. The experiments performed to validate the social event detection technique showed good results; besides detecting events, the technique can also be used to group photographs belonging to the same event. Finally, this work presents a prototype web tool that unites location annotation with event annotation. Funding: CNPq, Capes.

    Towards an Extensible Expert-Sourcing Platform

    University of Minnesota Ph.D. dissertation. May 2019. Major: Computer Science. Advisor: Mohamed Mokbel. 1 computer file (PDF); viii, 106 pages. In recent years, general-purpose crowdsourcing platforms, e.g., Amazon Mechanical Turk, Figure Eight, and ChinaCrowds, have gained popularity because they can handle tasks that are still difficult for machines to solve, e.g., labeling data, sorting images, computing skylines over noisy data, and sentiment analysis. Unfortunately, current crowdsourcing platforms lack an important feature desired by many recent crowdsourcing applications, namely, recruiting workers who are experts at a given task. Being able to recruit expert workers allows those applications to achieve results that are both more accurate and of higher quality than recruiting the general crowd. We call this crowdsourcing process expert-sourcing, i.e., outsourcing tasks to experts. Without any platform to support them, the developers of each expert-sourcing application need to build the whole crowdsourcing system stack from scratch, even though those systems share many common components. This thesis proposes Luna, the first extensible expert-sourcing platform. To instantiate a new expert-sourcing application out of Luna, one only needs to provide a few simple plug-ins, which are integrated with the core components of Luna to provide the expert-sourcing platform for the new application. This is possible because Luna identifies which components can be shared among many expert-sourcing applications and which need to be tailored to a specific application. In this thesis, we show the extensibility of Luna by instantiating six different expert-sourcing applications that are currently not well supported by general-purpose crowdsourcing platforms. Experimental evaluation with a real crowdsourcing deployment as well as a real dataset shows that Luna achieves not only more accurate but also better-quality results than existing general-purpose crowdsourcing platforms in supporting expert-sourcing applications. Lastly, we also provide a more specialized expert-sourcing platform for the image geotagging application, which was initially deemed unfit to be solved by crowdsourcing.

    Automated Assessment of the Aftermath of Typhoons Using Social Media Texts

    Disasters are one of the major threats to economies and human societies, causing substantial losses of human life, property, and infrastructure. Understanding, preventing, and reducing such disasters has been a persistent endeavor, and the popularization of social media offers new opportunities to enhance disaster management through a crowd-sourcing approach. However, social media data is also characterized by extreme brevity, intense noise, and informal language. The existing literature either has not completely addressed these disadvantages or devotes vast manual effort to tackling them. The major focus of this research is on constructing a holistic framework to exploit social media data in typhoon damage assessment. The scope of this research covers data collection, relevance classification, location extraction, and damage assessment, with assorted approaches used to overcome the disadvantages of social media data. Moreover, semi-supervised or unsupervised approaches are prioritized in forming the framework to minimize manual intervention. In data collection, a query expansion strategy is adopted to optimize the search recall of typhoon-relevant information retrieval. Multiple filtering strategies are developed to screen the keywords and keep the keyword updates relevant to the search topics. A classifier based on a convolutional neural network is presented for relevance classification, with hashtags and word clusters as extra input channels to augment the information. In location extraction, a model is constructed by integrating Bidirectional Long Short-Term Memory and Conditional Random Fields. Feature noise correction layers and label smoothing are leveraged to handle the noisy training data. Finally, a multi-instance multi-label classifier identifies the damage relations in four categories, and the damage categories of a message are combined with the damage description score to obtain a damage severity score for the message. A case study is conducted to verify the effectiveness of the framework. The outcomes indicate that the approaches and models developed in this study significantly improve the classification of social media texts, especially under semi-supervised or unsupervised learning. Moreover, the results of damage assessment from social media data are remarkably consistent with the official statistics, which demonstrates the practicality of the proposed damage scoring scheme.