8 research outputs found

Overture POI data for the United Kingdom: a comprehensive, queryable open data product

Author: Ballantyne Patrick
Berragan Cillian
Publication date: 27/10/2023

Point of Interest data that is comprehensive, globally available and open access is sparse, despite being an important input for research in a number of application areas. New data from the Overture Maps Foundation offers significant potential in this arena, but accessing the data relies on computational resources beyond the skillset and capacity of the average researcher. In this article, we provide a processed version of the Overture places (POI) dataset for the UK, in a fully-queryable format, and provide accompanying code through which to explore the data and generate other national subsets. In the article, we describe the construction and characteristics of the dataset, before considering how reliable it is (locational accuracy, attribute comprehensiveness) through direct comparison with Geolytix supermarket data. This dataset can support new and important research projects in a variety of different thematic areas, and foster a network of researchers to further evaluate its advantages and limitations.
Comment: Main document: 6 pages, 1 figure, 2 tables. Supplementary: 2 pages, 2 figures, 1 table.

arXiv.org e-Print Archive

Mapping Great Britain's semantic footprints through a large language model analysis of Reddit comments

Author: Berragan Cillian
Calafiore Alessia
Morley Jeremy
Singleton Alex
Publication date: 01/06/2024

Observed regional variation in geotagged social media text is often attributed to dialects, where features in language are assumed to exhibit region-specific properties. While dialects are seen as a key component in defining the identity of regions, there are a multitude of other geographic properties that may be captured within natural language text. In our work, we consider locational mentions that are directly embedded within comments on the social media website Reddit, providing a range of associated semantic information, and enabling deeper representations between locations to be captured. Using a large corpus of geoparsed Reddit comments from UK-related local discussion subreddits, we first extract embedded semantic information using a large language model, aggregated into local authority districts, representing the semantic footprint of these regions. These footprints broadly exhibit spatial autocorrelation, with clusters that conform to the national borders of Wales and Scotland. London, Wales, and Scotland also demonstrate notably different semantic footprints compared with the rest of Great Britain.

University of Liverpool Repository

Edinburgh Research Explorer

Evaluating the similarity of location-based corpora identified in Reddit comments

Author: Berragan Cillian
Calafiore Alessia
Morley Jeremy
Singleton Alex
Publication date: 02/04/2023

Edinburgh Research Explorer

Transformer based named entity recognition for place name extraction from unstructured text

Author: Berragan Cillian
Calafiore Alessia
Morley Jeremy
Singleton Alex
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2022

Place names embedded in online natural language text present a useful source of geographic information. Despite this, many methods for the extraction of place names from text use pre-trained models that were not explicitly designed for this task. Our paper builds five custom Named Entity Recognition (NER) models and evaluates them against three popular pre-built models for place name extraction. The models are evaluated using a set of manually annotated Wikipedia articles with reference to the F1 score metric. Our best performing model achieves an F1 score of 0.939 compared with 0.730 for the best performing pre-built model. Our model is then used to extract all place names from Wikipedia articles in Great Britain, demonstrating the ability to more accurately capture unknown place names from volunteered sources of online geographic information.

University of Liverpool Repository

Edinburgh Research Explorer
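
The abstract above compares models by F1 score. As a reminder of what that metric measures, here is a minimal Python sketch that computes precision, recall and F1 for place-name extraction. It assumes exact-match scoring on (start, end, text) spans, which the abstract does not specify. The function name `precision_recall_f1` and the sample annotations are hypothetical and do not come from the paper.

```python
def precision_recall_f1(gold: set, predicted: set) -> tuple[float, float, float]:
    """Score predicted entity spans against gold spans using exact matching."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations as (start_char, end_char, text) tuples.
gold = {(0, 9, "Liverpool"), (25, 34, "Edinburgh"), (50, 56, "Bootle")}
predicted = {(0, 9, "Liverpool"), (25, 34, "Edinburgh"), (60, 66, "Mersey")}

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Exact span matching is the strictest common choice; a partial-overlap scheme would give different scores for the same predictions.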

Geoparsing comments from Reddit to extract mental place connectivity within the United Kingdom

Author: Berragan Cillian
Publication date: 24/08/2022

Ezid

Mapping Cognitive Place Associations within the United Kingdom through Online Discussion on Reddit

Author: Cillian Berragan (9871736)
Publication date: 10/02/2024

This repository contains the data and code required to replicate the analysis of our paper. Data may be found within the data/ directory in the uploaded zip file. Code is found within the scripts/ directory. Replicate processing: install project dependencies into a Python virtual environment using `poetry install`, then replicate our pipeline using `dvc repro -f`. NOTE: Consult the dvc.yaml file to see the processing pipeline. Several stages have been frozen as they require external data.

FigShare

Geoparsing comments from Reddit to extract mental place connectivity within the United Kingdom

Author: Berragan Cillian
Calafiore Alessia
Morley Jeremy
Singleton Alex
Publication venue: eScholarship, University of California
Publication date: 09/09/2022

Place connectivity is explored between geographic locations extracted from comments on Reddit. Unlike formally structured geographic data, this corpus of unstructured text provides connections derived from co-occurring locations, capturing subconscious links between them, alongside inherent biases. Our work demonstrates the ability to link locations mentioned by unique users, building ‘mental’ place connections for over 50,000 unique locations in the United Kingdom. Sentiment regarding locations is compared against their levels of connectivity, demonstrating that user opinions regarding locations are likely drivers in mental place connectivity.

eScholarship - University of California
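
The abstract above describes connections derived from co-occurring locations mentioned by unique users. A minimal Python sketch of one way to count such connections follows. It assumes that every place a user mentions is pooled per user and that each unordered pair counts once per user; the abstract does not confirm either choice. The function name `place_connections` and the sample comments are hypothetical.

```python
from collections import Counter
from itertools import combinations


def place_connections(comments):
    """Count unordered place pairs that co-occur in each user's comments.

    `comments` is an iterable of (user, [place, ...]) records. Pooling by user
    is an assumption made for this sketch, not the paper's stated method.
    """
    places_by_user = {}
    for user, places in comments:
        places_by_user.setdefault(user, set()).update(places)

    connections = Counter()
    for places in places_by_user.values():
        # Sorting gives each pair a single canonical key, e.g. ("Leeds", "Liverpool").
        for pair in combinations(sorted(places), 2):
            connections[pair] += 1
    return connections


# Hypothetical geoparsed comments.
comments = [
    ("user_1", ["Liverpool", "Manchester"]),
    ("user_1", ["Leeds"]),
    ("user_2", ["Liverpool", "Manchester"]),
]

for pair, count in place_connections(comments).most_common():
    print(pair, count)
```

Counting each pair once per user stops a single prolific commenter from dominating the connection weights.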

Mapping cognitive place associations within the United Kingdom through online discussion on Reddit

Author: Berragan Cillian
Calafiore Alessia
Morley Jeremy
Singleton Alex
Publication venue: Wiley
Publication date: 01/01/2024

This paper explores cognitive place associations, conceptualised as a place‐based mental model that derives subconscious links between geographic locations. Utilising a large corpus of online discussion data from the social media website Reddit, we experiment on the extraction of such geographic knowledge from unstructured text. First, we construct a system to identify place names found in Reddit comments, disambiguating each to a set of coordinates where possible. Following this, we build a collective picture of cognitive place associations in the United Kingdom, linking locations that co‐occur in user comments and evaluating the effect of distance on the strength of these associations. Exploring these geographies nationally, associations were shown to be typically weaker over greater distances. This distance decay is also highly regional: rural areas typically have greater levels of distance decay, particularly in Wales and Scotland. When comparing major cities across the UK, we observe distinct distance decay patterns, influenced primarily by proximity to other cities.

University of Liverpool Repository

Edinburgh Research Explorer
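
The last abstract above evaluates how distance affects the strength of place associations. A minimal Python sketch of that kind of summary follows. It assumes great-circle (haversine) distance and a mean association strength per fixed-width distance band; the abstract does not say how distance or decay was measured. The function names, the approximate city coordinates and the strength values are all hypothetical.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_strength_by_band(pairs, coords, band_km=100.0):
    """Average association strength per distance band (band 0 is 0 to band_km, and so on)."""
    strengths = defaultdict(list)
    for (place_a, place_b), strength in pairs.items():
        distance = haversine_km(*coords[place_a], *coords[place_b])
        strengths[int(distance // band_km)].append(strength)
    return {band: sum(values) / len(values) for band, values in sorted(strengths.items())}


# Approximate city-centre coordinates and made-up association strengths.
coords = {
    "Liverpool": (53.4084, -2.9916),
    "Manchester": (53.4808, -2.2426),
    "London": (51.5074, -0.1278),
}
pairs = {
    ("Liverpool", "Manchester"): 120,
    ("Liverpool", "London"): 30,
    ("London", "Manchester"): 50,
}

print(mean_strength_by_band(pairs, coords))
```

With real data, a falling mean strength across successive bands would be consistent with the distance decay the abstract reports.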