Multi-Scale Representation Learning for Spatial Feature Distributions using Grid Cells
Unsupervised text encoding models have recently fueled substantial progress
in NLP. The key idea is to use neural networks to convert words in texts to
vector space representations based on word positions in a sentence and their
contexts, which are suitable for end-to-end training of downstream tasks. We
see a strikingly similar situation in spatial analysis, which focuses on
incorporating both absolute positions and spatial contexts of geographic
objects such as POIs into models. A general-purpose representation model for
space is valuable for a multitude of tasks. However, no such general model
exists to date beyond simply applying discretization or feed-forward nets to
coordinates, and little effort has been put into jointly modeling distributions
with vastly different characteristics, which commonly emerge from GIS data.
Meanwhile, Nobel Prize-winning neuroscience research shows that grid cells in
mammals provide a multi-scale periodic representation that functions as a
metric for location encoding and is critical for recognizing places and for
path-integration. Therefore, we propose a representation learning model called
Space2Vec to encode the absolute positions and spatial relationships of places.
We conduct experiments on two real-world geographic datasets for two different
tasks: 1) predicting types of POIs given their positions and context, 2) image
classification leveraging image geo-locations. Results show that because of its
multi-scale representations, Space2Vec outperforms well-established ML
approaches such as RBF kernels, multi-layer feed-forward nets, and tile
embedding approaches for location modeling and image classification tasks.
Detailed analysis shows that each baseline handles distributions well at one
scale at most and performs poorly at other scales. In contrast,
Space2Vec's multi-scale representation can handle distributions at different
scales.
Comment: 15 pages; accepted to ICLR 2020 as a spotlight paper
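The idea of a multi-scale periodic representation of location can be illustrated with a minimal sketch. This is not the Space2Vec implementation; it only assumes a plain sinusoidal encoding of 2-D coordinates at geometrically spaced wavelengths. The function name `multi_scale_encode` and the parameters `num_scales`, `min_wavelength`, and `max_wavelength` are hypothetical and chosen for illustration.

```python
import numpy as np

def multi_scale_encode(coords, num_scales=4, min_wavelength=1.0, max_wavelength=1000.0):
    """Encode 2-D coordinates with sine/cosine features at several scales.

    Illustrative assumption: wavelengths grow geometrically from
    min_wavelength to max_wavelength. Returns shape (n, 4 * num_scales).
    """
    coords = np.asarray(coords, dtype=float)            # shape (n, 2)
    wavelengths = np.geomspace(min_wavelength, max_wavelength, num_scales)
    # Broadcast (n, 2, 1) against (num_scales,) to get (n, 2, num_scales).
    phases = 2.0 * np.pi * coords[:, :, None] / wavelengths
    # Sine and cosine at each scale, for each coordinate axis.
    features = np.concatenate([np.sin(phases), np.cos(phases)], axis=-1)
    return features.reshape(coords.shape[0], -1)

encoded = multi_scale_encode([[0.0, 0.0], [10.0, 5.0], [250.0, 800.0]])
print(encoded.shape)
```

Short wavelengths separate nearby points, while long wavelengths vary slowly and capture coarse position, so a downstream model sees each location at several resolutions at once.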
Spatial Data Analysis Utilizing Grid Dbscan Algorithm in Clustering Techniques for Partial Object Classification Issues
This research explores the usefulness of clustering algorithms for solving partial object categorization problems in spatial data analysis. To this end, the Grid-DBSCAN method is proposed as an effective clustering tool for partial object categorization. Grid-DBSCAN is derived from the DBSCAN algorithm and incorporates a grid-based technique to improve its performance. The method is evaluated on several real-world datasets and compared with existing clustering techniques. The experimental results indicate that Grid-DBSCAN is more accurate and more robust than the other clustering algorithms and that it is able to find the most effective solution for partial object categorization tasks. Grid-DBSCAN can also be extended to handle other kinds of complex datasets. The study aims to give an understanding of the efficiency of the proposed method and its potential for solving partial object categorization problems in spatial data analysis.
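How a grid can speed up DBSCAN is sketched below. This is a generic grid-accelerated DBSCAN written for illustration, not the Grid-DBSCAN algorithm evaluated in the paper, whose details are not given in the abstract. It assumes 2-D points and square cells of side `eps`, so every neighbor of a point lies in its own cell or one of the eight adjacent cells. The function name `grid_dbscan` is hypothetical.

```python
from collections import defaultdict, deque
import math

def grid_dbscan(points, eps, min_pts):
    """Label 2-D points with cluster ids (0, 1, ...) or -1 for noise."""
    # Bin every point into a square cell of side eps.
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(math.floor(x / eps), math.floor(y / eps))].append(idx)

    def neighbors(idx):
        # Only the 3x3 block of cells around the point can hold neighbors.
        x, y = points[idx]
        cx, cy = math.floor(x / eps), math.floor(y / eps)
        found = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy), ()):
                    if math.hypot(points[j][0] - x, points[j][1] - y) <= eps:
                        found.append(j)
        return found  # includes idx itself

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue             # already assigned, do not expand again
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)   # j is a core point, keep expanding
    return labels

pts = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1), (20, 20)]
print(grid_dbscan(pts, eps=0.5, min_pts=3))
```

The grid replaces the all-pairs distance scan of naive DBSCAN with a lookup over at most nine cells per query, which is where the performance gain of grid-based variants typically comes from.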
Learning Large-scale Location Embedding From Human Mobility Trajectories with Graphs
An increasing amount of location-based service (LBS) data is being
accumulated and helps to study urban dynamics and human mobility. GPS
coordinates and other location indicators are normally low-dimensional and
represent only spatial proximity, which makes them difficult for machine
learning models to use effectively in geo-aware applications. Existing location
embedding methods are mostly tailored to specific problems that take place within
areas of interest. When it comes to the scale of a city or even a country,
existing approaches always suffer from extensive computational cost and
significant data sparsity. Different from existing studies, we propose to learn
representations through a GCN-aided skip-gram model named GCN-L2V by
considering both spatial connection and human mobility. With a flow graph and a
spatial graph, it embeds context information into vector representations.
GCN-L2V is able to capture relationships among locations and provide a better
notion of similarity in a spatial environment. Across quantitative experiments
and case studies, we empirically demonstrate that representations learned by
GCN-L2V are effective. As far as we know, this is the first study that provides
a fine-grained location embedding at the city level using only LBS records.
GCN-L2V is a general-purpose embedding model with high flexibility and can be
applied in downstream geo-aware applications.
Semantically-Enriched Search Engine for Geoportals: A Case Study with ArcGIS Online
Many geoportals such as ArcGIS Online are established with the goal of
improving geospatial data reusability and achieving intelligent knowledge
discovery. However, according to previous research, most of the existing
geoportals adopt Lucene-based techniques to achieve their core search
functionality, which has a limited ability to capture the user's search
intentions. To better understand a user's search intention, query expansion can
be used to enrich the user's query by adding semantically similar terms. In the
context of geoportals and geographic information retrieval, we advocate the
idea of semantically enriching a user's query from both geospatial and thematic
perspectives. In the geospatial aspect, we propose to enrich a query by using
both place partonomy and distance decay. In terms of the thematic aspect,
concept expansion and embedding-based document similarity are used to infer the
implicit information hidden in a user's query. This semantic query expansion
framework is implemented as a semantically-enriched search
engine using ArcGIS Online as a case study. A benchmark dataset is constructed
to evaluate the proposed framework. Our evaluation results show that the
proposed semantic query expansion framework is very effective in capturing a
user's search intention and significantly outperforms a well-established
baseline, Lucene's practical scoring function, with an increase of more than 3.0
in DCG@K (K=3,5,10).
Comment: 18 pages; accepted to AGILE 2020 as a full paper. GitHub code
repository: https://github.com/gengchenmai/arcgis-online-search-engin
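For readers unfamiliar with the DCG@K metric reported in this abstract, a minimal sketch follows. It assumes the common formulation in which the relevance grade at rank i is discounted by log2(i + 1); the abstract does not say which DCG variant the authors use, and the function name `dcg_at_k` and the example relevance grades are hypothetical.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k results.

    relevances: graded relevance of each returned item, in ranked order.
    Assumed form: sum of rel_i / log2(i + 1) for ranks i = 1..k.
    """
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )

# Hypothetical relevance grades for one ranked result list.
ranked = [3, 2, 3, 0, 1, 2]
for k in (3, 5, 10):
    print(k, dcg_at_k(ranked, k))
```

Because the discount grows with rank, moving a relevant item higher in the list raises the score more than adding the same item near the bottom.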
SE-KGE: A Location-Aware Knowledge Graph Embedding Model for Geographic Question Answering and Spatial Semantic Lifting
Learning knowledge graph (KG) embeddings is an emerging technique for a
variety of downstream tasks such as summarization, link prediction, information
retrieval, and question answering. However, most existing KG embedding models
neglect space and, therefore, do not perform well when applied to (geo)spatial
data and tasks. Most models that do consider space rely primarily on
some notion of distance. These models suffer from higher computational
complexity during training while still losing information beyond the relative
distance between entities. In this work, we propose a location-aware KG
embedding model called SE-KGE. It directly encodes spatial information such as
point coordinates or bounding boxes of geographic entities into the KG
embedding space. The resulting model is capable of handling different types of
spatial reasoning. We also construct a geographic knowledge graph as well as a
set of geographic query-answer pairs called DBGeo to evaluate the performance
of SE-KGE in comparison to multiple baselines. Evaluation results show that
SE-KGE outperforms these baselines on the DBGeo dataset for the geographic
logic query answering task. This demonstrates the effectiveness of our
spatially-explicit model and the importance of considering the scale of
different geographic entities. Finally, we introduce a novel downstream task
called spatial semantic lifting which links an arbitrary location in the study
area to entities in the KG via some relations. Evaluation on DBGeo shows that
our model outperforms the baseline by a substantial margin.
Comment: Accepted to Transactions in GI
Geospatial Query Answering Using Knowledge Graph Embeddings
Geospatial knowledge graphs suffer from incompleteness, which stems from data
sources that are not always reliable. This dramatically affects the results of
geospatial query answering with traditional techniques that use standard query
languages such as stSPARQL or GeoSPARQL. An alternative method for query
answering is to use KG embeddings. Embedding-based models project the entities
and relations of the posed query onto a continuous vector space and, in this
way, predict the answers to the query. Hence, they can handle queries for which
the data required to answer them is not explicitly stated in the knowledge
graph. In this research work, we have developed an embedding-based geospatial
query answering model, SQABo, which encodes geospatial queries as boxes in the
embedding space and returns the answers inside the box. We show that this
approach performs better than existing work in the literature, which encodes
queries as points in the vector space. Additionally, we make a query-answering
dataset for YAGO2geo, one of the richest and most precise geospatial knowledge
graphs, freely available to the research community for future research.
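The difference between a box query and a point query can be sketched in a few lines. This is not the SQABo model; it only assumes that a query box is an axis-aligned region given by a center vector and a non-negative offset vector, and that an entity embedding answers the query when it falls inside that region. The function names `inside_box` and `distance_outside_box` and the toy embeddings are hypothetical.

```python
import numpy as np

def inside_box(entities, center, offset):
    """Return a boolean mask of the entity embeddings that lie inside the box."""
    entities = np.asarray(entities, dtype=float)
    center = np.asarray(center, dtype=float)
    offset = np.asarray(offset, dtype=float)
    return np.all(np.abs(entities - center) <= offset, axis=-1)

def distance_outside_box(entities, center, offset):
    """L1 distance from each entity to the box surface (0 when inside)."""
    entities = np.asarray(entities, dtype=float)
    center = np.asarray(center, dtype=float)
    offset = np.asarray(offset, dtype=float)
    gap = np.maximum(np.abs(entities - center) - offset, 0.0)
    return gap.sum(axis=-1)

# Toy 2-D embeddings; real embedding spaces have many more dimensions.
entities = [[0.5, 1.5], [2.0, 0.0], [0.0, -2.0]]
print(inside_box(entities, center=[0.0, 0.0], offset=[1.0, 2.0]))
print(distance_outside_box(entities, center=[0.0, 0.0], offset=[1.0, 2.0]))
```

A box can cover a whole set of answer entities at once, whereas a point query can only rank entities by how close they are to a single location in the embedding space.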