
    Spatial Statistics of Term Co-occurrences for Location Prediction of Tweets

    Predicting the locations of non-geotagged tweets is an active research area in geographical information retrieval. In this work, we propose a method to detect term co-occurrences in tweets that exhibit spatial clustering or dispersion tendency with significant deviation from the underlying single-term patterns, and use these co-occurrences to extend the feature space in probabilistic language models. We observe that using term pairs that spatially attract or repel each other yields a significant increase in the accuracy of predicted locations. The method we propose relies purely on statistical approaches and spatial point patterns without using external data sources or gazetteers. Evaluations conducted on a large set of multilingual tweets indicate higher accuracy than the existing state-of-the-art methods.

    Accepted version. This is a post-peer-review, pre-copyedit version of an article published in [Lecture Notes in Computer Science]. Locked until 1.3.2019 due to copyright restrictions. The final authenticated version is available online at: https://doi.org/10.1007/978-3-319-76941-7_3
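The idea of extending a probabilistic language model's feature space with term pairs can be illustrated with a small sketch. This is not the authors' implementation. It assumes a hypothetical setup in which the map is divided into named cells, the model is a naive Bayes classifier with add-one smoothing and a uniform prior, and the set of spatially significant pairs (`pairs`) has already been selected by some statistical test. The abstract does not give the details of that test, so it is not reproduced here. All function names and the toy data are invented for illustration.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def features(tokens, pairs):
    """Return unigram features plus one feature per selected term pair.

    `pairs` is assumed to hold alphabetically sorted 2-tuples of terms.
    """
    terms = sorted(set(tokens))
    feats = list(terms)
    for a, b in combinations(terms, 2):
        if (a, b) in pairs:
            feats.append(a + "+" + b)
    return feats


def train(tweets, pairs):
    """Count features per cell. `tweets` is an iterable of (tokens, cell)."""
    counts = defaultdict(Counter)
    for tokens, cell in tweets:
        counts[cell].update(features(tokens, pairs))
    return counts


def predict(tokens, counts, pairs):
    """Return the cell with the highest smoothed log-likelihood."""
    vocab_size = len({f for c in counts.values() for f in c})
    feats = features(tokens, pairs)
    best_cell, best_score = None, -math.inf
    for cell, c in counts.items():
        total = sum(c.values())
        # Add-one smoothing keeps unseen features from zeroing the score.
        score = sum(math.log((c[f] + 1) / (total + vocab_size)) for f in feats)
        if score > best_score:
            best_cell, best_score = cell, score
    return best_cell


# Toy data (hypothetical): two cells and two pre-selected term pairs.
tweets = [
    (["beach", "surf"], "coast"),
    (["beach", "party"], "coast"),
    (["snow", "ski"], "alps"),
    (["snow", "party"], "alps"),
]
pairs = {("beach", "surf"), ("ski", "snow")}
counts = train(tweets, pairs)
print(predict(["surf", "beach"], counts, pairs))
```

In this sketch a pair feature is simply one more entry in the vocabulary, so a pair that concentrates in a single cell adds evidence beyond what its two terms contribute on their own.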
