    LORE: a model for the detection of fine-grained locative references in tweets

    Extracting geospatially rich knowledge from tweets is of utmost importance for location-based systems in emergency services to raise situational awareness about a given crisis-related incident, such as earthquakes, floods, car accidents, terrorist attacks or shooting attacks. The problem is that the majority of tweets are not geotagged, so we must search the message text itself for geospatial evidence. In this context, we present LORE, a location-detection system for tweets that leverages the geographic database GeoNames together with linguistic knowledge through NLP techniques. One of the main contributions of this model is that it captures fine-grained complex locative references, ranging from geopolitical entities and natural geographic references to points of interest and traffic ways. LORE outperforms state-of-the-art open-source location-extraction systems (i.e. Stanford NER, spaCy, NLTK and OpenNLP), achieving an unprecedented trade-off between precision and recall. Therefore, our model provides not only a quantitative advantage over other well-known systems in terms of performance but also a qualitative advantage in terms of the diversity and semantic granularity of the locative references extracted from the tweets.

    Financial support for this research has been provided by the Spanish Ministry of Science, Innovation and Universities [grant number RTC 2017-6389-5] and the European Union's Horizon 2020 research and innovation program [grant number 101017861: project SMARTLAGOON]. We also thank the Universidad de Granada for its financial support to the first author through the Becas de Iniciacion para estudiantes de Master 2018 del Plan Propio de la UGR.

    Fernández-Martínez, N. J.; Periñán-Pascual, C. (2021). LORE: a model for the detection of fine-grained locative references in tweets. Onomázein (52): 195-225. https://doi.org/10.7764/onomazein.52.11
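    The division of labour described here, gazetteer lookup plus linguistic rules over the tweet text, can be illustrated with a small sketch. This is not LORE's actual code: the in-memory GAZETTEER, its category labels and the extract_locations function are invented stand-ins for the real GeoNames-backed machinery.

```python
# Minimal sketch of gazetteer-based detection of fine-grained locative
# references; entries and categories are toy stand-ins for GeoNames data.
import re

GAZETTEER = {
    "valencia": "geopolitical entity",
    "lake tahoe": "natural geographic reference",
    "central station": "point of interest",
    "interstate 5": "traffic way",
}

def extract_locations(tweet: str) -> list[tuple[str, str]]:
    """Return (surface form, category) pairs for gazetteer matches."""
    text = tweet.lower()
    hits = []
    for name, category in GAZETTEER.items():
        # Word boundaries so "valencia" does not fire inside "valencian".
        if re.search(rf"\b{re.escape(name)}\b", text):
            hits.append((name, category))
    return hits

print(extract_locations("Flooding reported near Central Station in Valencia"))
# [('valencia', 'geopolitical entity'), ('central station', 'point of interest')]
```

    A real system would also disambiguate and rank candidate matches; the point here is only the gazetteer-plus-rules structure and the fine-grained category labels.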

    A Survey of Location Prediction on Twitter

    Locations, e.g., countries, states, cities, and points of interest, are central to news, emergency events, and people's daily lives. Automatic identification of locations associated with or mentioned in documents has been explored for decades. As one of the most popular online social network platforms, Twitter has attracted a large number of users who send millions of tweets on a daily basis. Due to the worldwide coverage of its users and the real-time freshness of tweets, location prediction on Twitter has gained significant attention in recent years. Research efforts have been devoted to the new challenges and opportunities brought by the noisy, short, and context-rich nature of tweets. In this survey, we aim at offering an overall picture of location prediction on Twitter. Specifically, we concentrate on the prediction of user home locations, tweet locations, and mentioned locations. We first define the three tasks and review the evaluation metrics. By summarizing Twitter network, tweet content, and tweet context as potential inputs, we then structurally highlight how the problems depend on these inputs. Each dependency is illustrated by a comprehensive review of the corresponding strategies adopted in state-of-the-art approaches. In addition, we also briefly review two related problems, i.e., semantic location prediction and point-of-interest recommendation. Finally, we list future research directions.

    Comment: Accepted to TKDE. 30 pages, 1 figure.
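    To make the "tweet content as input" dependency concrete, here is a minimal sketch of one of the three tasks, home-location prediction, framed as text classification over a user's concatenated tweets. The two-user training set and city labels are invented for illustration and merely stand in for the statistical content-based approaches the survey reviews.

```python
# Toy content-based home-location prediction: a user's tweets form one
# document, which is classified into a home city.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_docs = [
    "cable car fog golden gate sourdough",    # concatenated tweets of user 1
    "bagel subway yankees brooklyn bodega",   # concatenated tweets of user 2
]
train_cities = ["San Francisco", "New York"]

model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(train_docs, train_cities)

print(model.predict(["stuck on the subway again, grabbing a bagel"]))
# ['New York']
```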

    Location Reference Recognition from Texts: A Survey and Comparison

    A vast amount of location information exists in unstructured texts, such as social media posts, news stories, scientific articles, web pages, travel blogs, and historical archives. Geoparsing refers to recognizing location references in texts and identifying their geospatial representations. While geoparsing can benefit many domains, a summary of its specific applications is still missing. Further, there is a lack of a comprehensive review and comparison of existing approaches for location reference recognition, which is the first and a core step of geoparsing. To fill these research gaps, this review first summarizes seven typical application domains of geoparsing: geographic information retrieval, disaster management, disease surveillance, traffic management, spatial humanities, tourism management, and crime management. We then review existing approaches for location reference recognition by categorizing them into four groups based on their underlying functional principle: rule-based, gazetteer matching-based, statistical learning-based, and hybrid approaches. Next, we thoroughly evaluate the correctness and computational efficiency of the 27 most widely used approaches for location reference recognition on 26 public datasets with different types of texts (e.g., social media posts and news stories) containing 39,736 location references worldwide. Results from this thorough evaluation can help inform future methodological developments for location reference recognition and can help guide the selection of proper approaches based on application needs.
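    The correctness-and-efficiency comparison described above reduces, per approach, to scoring predicted location spans against gold annotations and timing the run. Below is a minimal sketch of such a harness; the recognize callable and the (text, gold_spans) dataset format are assumptions for illustration, not the paper's evaluation code.

```python
import time

def evaluate(recognize, dataset):
    """Score a recognizer on (text, gold_spans) pairs; spans are (start, end) offsets."""
    tp = fp = fn = 0
    start = time.perf_counter()
    for text, gold_spans in dataset:
        predicted, gold = set(recognize(text)), set(gold_spans)
        tp += len(predicted & gold)   # exact-match true positives
        fp += len(predicted - gold)   # spurious spans
        fn += len(gold - predicted)   # missed spans
    seconds = time.perf_counter() - start
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "seconds": seconds}
```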

    Applications of deep learning in extracting actionable information from crisis-related social media content

    Doctor of Philosophy, Department of Computer Science. Advisor: Doina Caragea.

    We have witnessed a large number of crisis situations in recent years, from natural disasters to man-made disasters and deadly animal and human health crises, culminating in the ongoing Covid-19 public health crisis. Disasters can have devastating health and socio-economic impacts. Emergency response and critical resource management during crises are pivotal tasks in mitigating the impacts of such events. These tasks require time-critical and reliable information to be carried out effectively. During emergent crises, there is a huge influx of information from various sources, which makes the task of collecting and managing reliable information harder. Identifying the key information relevant to emergency responders and policy makers from huge streams of data is an infeasible task for humans to attempt. There is a clear need for a pipeline of systems that can monitor, identify and collect actionable and relevant information from incomplete and noisy data sources. Social media has evolved into a platform where people share their concerns, report information as eyewitnesses of events, and call for help, especially during crisis situations. However, due to the unstructured nature of the data shared on these digital media, inherent noise and potential misinformation, extracting actionable information is a challenging task.

    Considering the challenges associated with modern data-driven emergency response and crisis management, deep learning is a natural choice for making use of the large volume of unstructured data. However, deep-learning models typically require a large amount of annotated or labelled data, which may not always be available for an emergent crisis. This dissertation aims to address some of these issues by exploring multi-task and multimodal deep-learning approaches, combined with self-supervised representation learning. From an application point of view, this dissertation tackles two specific tasks surrounding crisis information management: first, the time-critical task of identifying actionable information for an emergent crisis, and second, the task of analyzing public response to crisis events and the policies surrounding those events through social media.
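    A minimal sketch of the multi-task idea explored in the dissertation: one shared text encoder feeds separate heads for two crisis-related tasks, so the tasks can regularize each other when labelled data is scarce. The layer sizes, task names and bag-of-words encoder are invented; a real system would start from a pretrained transformer encoder.

```python
# Toy multi-task model: shared encoder, one head per crisis task.
import torch
import torch.nn as nn

class MultiTaskCrisisModel(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=128, hidden_dim=64):
        super().__init__()
        # Shared representation (stand-in for a pretrained encoder).
        self.embed = nn.EmbeddingBag(vocab_size, embed_dim)
        self.encoder = nn.Sequential(nn.Linear(embed_dim, hidden_dim), nn.ReLU())
        # Task-specific heads share the encoder's features.
        self.actionable_head = nn.Linear(hidden_dim, 2)  # actionable vs. not
        self.category_head = nn.Linear(hidden_dim, 5)    # e.g. 5 information types

    def forward(self, token_ids):
        shared = self.encoder(self.embed(token_ids))
        return self.actionable_head(shared), self.category_head(shared)

model = MultiTaskCrisisModel()
batch = torch.randint(0, 10_000, (4, 20))  # 4 tweets, 20 token ids each
actionable_logits, category_logits = model(batch)
print(actionable_logits.shape, category_logits.shape)
# torch.Size([4, 2]) torch.Size([4, 5])
```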

    Location mention detection in tweets and microblogs

    The automatic identification of location expressions in social media text is an actively researched task. We present a novel approach to detecting mentions of locations in the texts of microblogs and social media. We propose an approach based on noun phrase extraction and n-gram matching instead of the traditional methods using Named Entity Recognition (NER) or Conditional Random Fields (CRF), arguing that our method is better suited to noisy microblog text. Our proposed system comprises several individual modules to detect addresses; points of interest (e.g. hospitals or universities); distance and direction markers; and location names (e.g. suburbs or countries). Our system won the ALTA 2014 Twitter Location Detection shared task with an F-score of 0.792 for detecting location expressions in a test set of 1,000 tweets, demonstrating its efficacy for this task. A number of directions for future work are discussed.

    12 page(s)
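    The noun-phrase-plus-matching recipe can be sketched in a few lines with NLTK; the chunk grammar and the three-entry gazetteer below are illustrative placeholders, not the winning system's actual modules.

```python
# Sketch: chunk noun phrases, then match their n-grams against a gazetteer.
import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

GAZETTEER = {"royal melbourne hospital", "melbourne", "australia"}
CHUNKER = nltk.RegexpParser("NP: {<DT>?<JJ>*<NNP|NN>+}")  # simple NP grammar

def candidate_locations(text):
    tagged = nltk.pos_tag(nltk.word_tokenize(text))
    hits = []
    for subtree in CHUNKER.parse(tagged).subtrees(filter=lambda t: t.label() == "NP"):
        words = [w.lower() for w, _tag in subtree.leaves()]
        # Check every contiguous n-gram of the noun phrase against the gazetteer.
        for i in range(len(words)):
            for j in range(i + 1, len(words) + 1):
                if " ".join(words[i:j]) in GAZETTEER:
                    hits.append(" ".join(words[i:j]))
    return hits

print(candidate_locations("Ambulances heading to Royal Melbourne Hospital now"))
# ['royal melbourne hospital', 'melbourne']
```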

    Sentiment analysis as a service

    This research focuses on the design and development of a service-composition-based framework that enables the execution of services for social media based sentiment analysis. Our research develops novel analytical models, composition techniques and algorithms which use services as a means for sentiment abstraction, processing and analysis over large-scale social media data. Current sentiment analysis techniques require the specialized skills of data science and machine learning. Moreover, traditional approaches rely on laborious and time-consuming activities such as manual dataset labelling, data model training and validation. This makes the overall sentiment analysis process a challenging task. In comparison, services are 'ready-made' software solutions that can be composed on demand to develop complex applications without delving into domain-specific details. This thesis investigates a novel approach that transforms the traditional social media based sentiment analysis process into a service-composition-driven solution.

    In this thesis, we begin by developing a novel service framework that replaces the traditional sentiment analysis tasks with online services. Our framework includes a new service model to represent the services required for sentiment analysis. We develop a semantic service composition model and algorithm that dynamically composes various services for data collection, noise filtering and sentiment extraction. In particular, we focus on abstracting sentiment based on location and time.

    We then focus on enhancing the flexibility of our proposed service framework to compose appropriate sentiment analysis services for the highly dynamic and changing features of social media platforms, and to efficiently process and analyze large-scale social media data. To this end, we propose a novel approach that formalizes social media platforms as cloud-enabled services. We develop a functional and quality-of-service (QoS) model that captures various dynamic features of social media platforms, and we devise a cloud-based service model to access social media platforms as services using the Web Ontology Language for Services (OWL-S). We then integrate the QoS model into our existing composition framework, enabling it to dynamically assess the QoS of multiple social media platforms and simultaneously compose appropriate services to extract, process, analyze and integrate sentiment results from large-scale data.

    Finally, we concentrate on the efficient utilization of the sentiment analysis extracted from large-scale data. We formulate a meta-information composition model that transforms sentiment obtained from large streams of data into re-usable information, which is then integrated on demand and delivered to end users.

    To demonstrate the performance and test the effectiveness of our proposed models, we develop prototypes to evaluate our composition framework. We design several scenarios and conduct a series of experiments using real-world social media datasets. We present the results and discuss the outcomes, which validate the performance of our research.
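    A minimal sketch of the composition idea, under invented service names, QoS fields and scoring rule: candidate services are described by simple QoS records, the best candidate is selected per stage, and the chosen stages are chained into a sentiment pipeline.

```python
# Toy QoS-aware service composition for a sentiment-analysis pipeline.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Service:
    name: str
    fn: Callable            # the service's callable behaviour
    latency_ms: float       # toy QoS attributes
    accuracy: float

def qos_score(s: Service) -> float:
    # Invented scoring rule: favour accuracy, penalise latency.
    return s.accuracy - 0.001 * s.latency_ms

def compose(stages: dict[str, list[Service]]) -> Callable:
    chosen = [max(candidates, key=qos_score) for candidates in stages.values()]
    def pipeline(query):
        data = query
        for service in chosen:  # run stages in declaration order
            data = service.fn(data)
        return data
    return pipeline

stages = {
    "collect": [Service("collector", lambda q: [q + " is great", q + " is awful"], 50, 0.90)],
    "filter": [Service("dedup", lambda posts: list(dict.fromkeys(posts)), 10, 0.80)],
    "sentiment": [
        Service("fast_model", lambda posts: ["pos" if "great" in p else "neg" for p in posts], 20, 0.70),
        Service("slow_model", lambda posts: ["pos" if "great" in p else "neg" for p in posts], 200, 0.95),
    ],
}

run = compose(stages)
print(run("the new park"))  # ['pos', 'neg'] -- slow_model wins on QoS score
```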