
    A Survey of Location Prediction on Twitter

    Locations, e.g., countries, states, cities, and points of interest, are central to news, emergency events, and people's daily lives. Automatic identification of locations associated with or mentioned in documents has been explored for decades. As one of the most popular online social network platforms, Twitter attracts a large number of users who send millions of tweets on a daily basis. Due to the worldwide coverage of its users and the real-time freshness of tweets, location prediction on Twitter has gained significant attention in recent years. Research efforts have been devoted to the new challenges and opportunities brought by the noisy, short, and context-rich nature of tweets. In this survey, we aim to offer an overall picture of location prediction on Twitter. Specifically, we concentrate on the prediction of user home locations, tweet locations, and mentioned locations. We first define the three tasks and review the evaluation metrics. By summarizing Twitter network, tweet content, and tweet context as potential inputs, we then structurally highlight how the problems depend on these inputs. Each dependency is illustrated by a comprehensive review of the corresponding strategies adopted in state-of-the-art approaches. In addition, we briefly review two related problems, i.e., semantic location prediction and point-of-interest recommendation. Finally, we list future research directions.
    Comment: Accepted to TKDE. 30 pages, 1 figure

    Identifying the Geographic Location of an Image with a Multimodal Probability Density Function

    There is a wide array of online photographic content that is not geotagged; efficient and accurate algorithms for estimating an image's geographic location are needed to geolocate these photos. This paper presents a general model that uses both textual metadata and visual features of photos to automatically place them on a world map.
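    A minimal sketch of the multimodal idea above, assuming a grid-based fusion: textual and visual evidence are each expressed as a probability grid over the world and combined cell-wise. The grid resolution, the product-then-renormalize fusion rule, and all names are illustrative assumptions, not details from the paper.

    ```python
    import numpy as np

    # Hypothetical fusion of textual and visual evidence as probability
    # grids over a coarse 1-degree world raster. The fusion rule
    # (cell-wise product, then renormalize) is an assumption.
    LAT_BINS, LON_BINS = 180, 360

    def normalize(grid):
        """Scale a non-negative grid so its cells sum to 1."""
        total = grid.sum()
        return grid / total if total > 0 else np.full_like(grid, 1.0 / grid.size)

    def fuse(text_density, visual_density):
        """Combine the two evidence grids cell-wise and renormalize."""
        return normalize(text_density * visual_density)

    def most_likely_cell(density):
        """Return the (lat, lon) center of the highest-probability cell."""
        i, j = np.unravel_index(np.argmax(density), density.shape)
        return i - 89.5, j - 179.5  # centers of 1-degree cells

    # Toy usage with random stand-in densities:
    rng = np.random.default_rng(0)
    text_d = normalize(rng.random((LAT_BINS, LON_BINS)))
    vis_d = normalize(rng.random((LAT_BINS, LON_BINS)))
    print(most_likely_cell(fuse(text_d, vis_d)))
    ```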

    Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks

    We propose a method for embedding two-dimensional locations in a continuous vector space using a neural network-based model that incorporates mixtures of Gaussian distributions, and present two model variants for text-based geolocation and lexical dialectology. Evaluated over Twitter data, the proposed model outperforms conventional regression-based geolocation and provides a better estimate of uncertainty. We also show the effectiveness of the representation for predicting words from location in lexical dialectology, evaluating it on the DARE dataset.
    Comment: Conference on Empirical Methods in Natural Language Processing (EMNLP 2017), September 2017, Copenhagen, Denmark
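    To make the mixture-of-Gaussians output concrete, here is a hedged sketch of the loss such a mixture density network minimizes: per input, the network emits K mixture weights, K two-dimensional means, and K scales, and training minimizes the negative log-likelihood of the true coordinates. The spherical-covariance form and all shapes are assumptions for illustration, not the paper's exact parameterization.

    ```python
    import numpy as np

    def mdn_neg_log_likelihood(logits, means, log_scales, target):
        """NLL of target (2,) under a K-component spherical Gaussian mixture.

        logits:     (K,) unnormalized mixture weights from the network
        means:      (K, 2) component centers in (lat, lon)
        log_scales: (K,) log standard deviations, shared across both dims
        """
        log_w = logits - np.logaddexp.reduce(logits)      # log softmax
        scales = np.exp(log_scales)
        sq_dist = np.sum((target - means) ** 2, axis=1)   # (K,)
        # log N(target; mu_k, sigma_k^2 I) in two dimensions
        log_comp = -sq_dist / (2 * scales**2) - 2 * log_scales - np.log(2 * np.pi)
        return -np.logaddexp.reduce(log_w + log_comp)

    # Toy usage: three components, target near the first component's mean.
    logits = np.array([0.5, 0.1, -0.2])
    means = np.array([[48.1, 11.6], [52.5, 13.4], [40.7, -74.0]])
    log_scales = np.log(np.array([1.0, 2.0, 5.0]))
    print(mdn_neg_log_likelihood(logits, means, log_scales, np.array([48.0, 11.5])))
    ```

    A mixture output of this kind is also where the better uncertainty estimates come from: the predicted density can be broad or multimodal instead of a single regressed point.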

    Developing a Kalman filter approach to home range estimation: Applied to the Atlantic bluefin tuna (Thunnus thynnus)

    Accurate estimation of an animal's home range, or utilization distribution, is of great importance to understanding the animal's role in the ecosystem and for effective population management. Current methods for home range estimation often do not incorporate uncertainty in the observations of monitored animals, and given days without observations, they can omit migration corridors when describing important habitat. Here the Extended Kalman filter is modified to return daily predicted geolocations, creating a most probable estimate of the true path the observed animal followed. Markov chain Monte Carlo methods were used to map the uncertainty in this path into a probability-of-use distribution representing the animal's utilization distribution. The modified method was applied to Atlantic bluefin tuna (Thunnus thynnus) observed using pop-off satellite archival tags with light-based geolocation. The home range estimation technique developed can be used for any animal with a time series of locations.
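    As a rough illustration of the daily-prediction step, the sketch below runs a plain linear Kalman filter with a two-dimensional random-walk motion model over daily observations, predicting through observation gaps. The paper modifies an Extended Kalman filter; this linear simplification and the noise values are assumptions.

    ```python
    import numpy as np

    def kalman_daily_track(obs, q=0.05, r=1.0):
        """obs: one (lat, lon) or None per day; returns daily filtered means."""
        x = np.array(next(o for o in obs if o is not None), dtype=float)
        P = np.eye(2) * r                     # state covariance
        Q = np.eye(2) * q                     # process (movement) noise
        R = np.eye(2) * r                     # observation (geolocation) noise
        track = []
        for z in obs:
            P = P + Q                         # predict: random walk, mean unchanged
            if z is not None:                 # update only on observation days
                K = P @ np.linalg.inv(P + R)  # Kalman gain
                x = x + K @ (np.asarray(z) - x)
                P = (np.eye(2) - K) @ P
            track.append(x.copy())
        return track

    # Toy usage: a two-day observation gap is bridged by prediction alone.
    days = [(35.0, -70.0), (35.2, -69.5), None, None, (36.1, -68.0)]
    for day, pos in enumerate(kalman_daily_track(days)):
        print(day, np.round(pos, 2))
    ```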

    Hidden Markov modelling of movement data from fish


    Contingent Kernel Density Estimation

    Kernel density estimation is a widely used method for estimating a distribution based on a sample of points drawn from that distribution. In practice, some form of error generally contaminates the sample of observed points; such error can result from imprecise measurements or observation bias. Often this error is negligible and may be disregarded in analysis. In cases where the error is non-negligible, estimation methods should be adjusted to reduce the resulting bias. Several modifications of kernel density estimation have been developed to address specific forms of error. One form not yet addressed is the case where observations are nominally placed at the centers of areas of varying sizes from which the points are assumed to have been drawn. In this scenario, bias arises because the size of the error can vary among points: some subset of points may be known to have smaller error than another, or the form of the error may change from point to point. This paper proposes a “contingent kernel density estimation” technique to address this form of error. The new technique adjusts the standard kernel on a point-by-point basis in adaptive response to the changing structure and magnitude of the error. We derive the equations for the contingent kernel technique, validate it with numerical simulations, and demonstrate its utility with a worked example using the geographic locations of social networking users.
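    A minimal sketch of the point-by-point adjustment the abstract describes, under the assumption that it can be approximated by a per-point bandwidth that widens with the size of the area each observation was snapped to; the paper's exact contingent-kernel equations may differ.

    ```python
    import numpy as np

    def contingent_kde(points, radii, grid, base_bw=0.5):
        """Evaluate a 1-D variable-bandwidth Gaussian KDE on grid.

        points: (n,) observed values (e.g. one coordinate of a location)
        radii:  (n,) half-width of the area each point nominally represents
        """
        # Widen each point's kernel by the uncertainty of its placement.
        bw = np.sqrt(base_bw**2 + np.asarray(radii) ** 2)
        diffs = (grid[:, None] - np.asarray(points)[None, :]) / bw
        kernels = np.exp(-0.5 * diffs**2) / (bw * np.sqrt(2 * np.pi))
        return kernels.mean(axis=1)

    # Toy usage: two precise points and one snapped to a large area.
    pts = np.array([0.0, 1.0, 5.0])
    rad = np.array([0.1, 0.1, 3.0])   # the third point is far less certain
    xs = np.linspace(-5, 12, 400)
    density = contingent_kde(pts, rad, xs)
    print(round(float(xs[np.argmax(density)]), 2))  # mode near the precise points
    ```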

    Utilizing Volunteered Geographic Information for Real-Time Analysis of Fire Hazards: Investigating the Potential of Twitter Data in Assessing the Impacted Areas

    Natural hazards such as wildfires have become more frequent in recent years; to minimize losses and activate emergency response, it is necessary to estimate their impact quickly and identify the most affected areas. Volunteered geographic information (VGI) data, particularly from the social media platform Twitter (now X), are emerging as an accessible and near-real-time source of geoinformation about natural hazards. Our study analyzes and evaluates the feasibility and limitations of using tweets in our proposed method for near-real-time fire area assessment. The methodology involves calculating a weighted barycenter from tweet locations and estimating the affected area through several approaches based on information within tweet texts, including the viewing angle to the fire, road segment blocking information, and distance-to-fire information. Examined case study scenarios reveal that the estimated areas align closely with fire hazard areas when compared with remote sensing (RS) estimated fire areas used as pseudo-references. The approach demonstrates reasonable accuracy, with estimated areas differing by distances of 2 to 6 km between VGI and pseudo-reference centers and barycenters differing by 5 km on average from pseudo-reference centers. Geospatial analysis on VGI, mainly from Twitter, thus allows a rapid, approximate assessment of affected areas, enabling emergency responders to coordinate operations and allocate resources efficiently during natural hazards.
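    A minimal sketch of the weighted-barycenter step, assuming per-tweet weights such as relevance or recency scores and flat lat/lon averaging (adequate over the few-kilometre extents involved, but not near the poles). The weighting scheme is the study's idea; the scores and the distance helper here are illustrative assumptions.

    ```python
    import math

    def weighted_barycenter(tweets):
        """tweets: iterable of (lat, lon, weight); returns the weighted mean (lat, lon)."""
        tweets = list(tweets)
        sw = sum(w for _, _, w in tweets)
        lat = sum(la * w for la, _, w in tweets) / sw
        lon = sum(lo * w for _, lo, w in tweets) / sw
        return lat, lon

    def km_between(a, b):
        """Rough equirectangular distance in km, adequate at city scale."""
        klat = 110.574                                          # km per degree latitude
        klon = 111.320 * math.cos(math.radians((a[0] + b[0]) / 2))
        return math.hypot((a[0] - b[0]) * klat, (a[1] - b[1]) * klon)

    # Toy usage: three tweets around a fire, the closest weighted highest.
    tweets = [(34.05, -118.25, 3.0), (34.08, -118.22, 1.0), (34.02, -118.30, 1.0)]
    center = weighted_barycenter(tweets)
    print(center, round(km_between(center, (34.05, -118.25)), 2), "km offset")
    ```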