Content Differences in Syntactic and Semantic Representations
Syntactic analysis plays an important role in semantic parsing, but the
nature of this role remains a topic of ongoing debate. The debate has been
constrained by the scarcity of empirical comparative studies between syntactic
and semantic schemes, which hinders the development of parsing methods informed
by the details of target schemes and constructions. We target this gap, and
take Universal Dependencies (UD) and UCCA as a test case. After abstracting
away from differences of convention or formalism, we find that most content
divergences can be ascribed to: (1) UCCA's distinction between a Scene and a
non-Scene; (2) UCCA's distinction between primary relations, secondary ones and
participants; (3) different treatment of multi-word expressions, and (4)
different treatment of inter-clause linkage. We further discuss the long tail
of cases where the two schemes take markedly different approaches. Finally, we
show that the proposed comparison methodology can be used for fine-grained
evaluation of UCCA parsing, highlighting both challenges and potential sources
for improvement. The substantial differences between the schemes suggest that
semantic parsers are likely to benefit downstream text understanding
applications beyond their syntactic counterparts.
Comment: NAACL-HLT 2019 camera read
Deep learning for extracting protein-protein interactions from biomedical literature
State-of-the-art methods for protein-protein interaction (PPI) extraction are
primarily feature-based or kernel-based, leveraging lexical and syntactic
information. How to incorporate such knowledge into recent deep learning
methods, however, remains an open question. In this paper, we propose a multichannel
dependency-based convolutional neural network model (McDepCNN). It applies one
channel to the embedding vector of each word in the sentence, and another
channel to the embedding vector of the head of the corresponding word.
Therefore, the model can use richer information obtained from different
channels. Experiments on two public benchmarking datasets, AIMed and BioInfer,
demonstrate that McDepCNN compares favorably to the state-of-the-art
rich-feature and single-kernel based methods. In addition, McDepCNN achieves
24.4% relative improvement in F1-score over the state-of-the-art methods on
cross-corpus evaluation and 12% improvement in F1-score over kernel-based
methods on "difficult" instances. These results suggest that McDepCNN
generalizes more easily over different corpora, and is capable of capturing
long distance features in the sentences.
Comment: Accepted for publication in Proceedings of the 2017 Workshop on
Biomedical Natural Language Processing, 10 pages, 2 figures, 6 table
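The two-channel input described in the abstract above (one channel for each word's embedding, one for the embedding of its syntactic head) can be sketched roughly as follows. This is only an illustration of the idea, not the authors' implementation: the function name `build_channels`, the toy sentence, the random embeddings, and the choice to zero-fill the head channel for the root token are all assumptions.

```python
import numpy as np

def build_channels(tokens, heads, embeddings, dim):
    """Stack word and head-word embeddings into a (2, n_tokens, dim) array.

    heads[i] is the index of token i's dependency head, or -1 for the root.
    Zero-filling the root's head channel is an assumption made for this sketch.
    """
    n = len(tokens)
    x = np.zeros((2, n, dim), dtype=np.float32)
    for i, (tok, h) in enumerate(zip(tokens, heads)):
        x[0, i] = embeddings[tok]            # channel 0: the word itself
        if h >= 0:
            x[1, i] = embeddings[tokens[h]]  # channel 1: the word's head
    return x

# Hypothetical toy sentence with a hand-written dependency parse.
tokens = ["PROT1", "binds", "PROT2"]
heads = [1, -1, 1]  # "binds" is the root; both proteins attach to it
dim = 4
rng = np.random.default_rng(0)
embeddings = {t: rng.normal(size=dim).astype(np.float32) for t in tokens}

x = build_channels(tokens, heads, embeddings, dim)
print(x.shape)
```

A convolutional layer would then slide over the token axis of `x`, seeing both channels at every position.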
Depicting urban boundaries from a mobility network of spatial interactions: A case study of Great Britain with geo-located Twitter data
Existing urban boundaries are usually defined by government agencies for
administrative, economic, and political purposes. Defining urban boundaries
that consider socio-economic relationships and citizen commute patterns is
important for many aspects of urban and regional planning. In this paper, we
describe a method to delineate urban boundaries based upon human interactions
with physical space inferred from social media. Specifically, we depicted the
urban boundaries of Great Britain using a mobility network of Twitter user
spatial interactions, which was inferred from over 69 million geo-located
tweets. We define the non-administrative anthropographic boundaries in a
hierarchical fashion based on different physical movement ranges of users
derived from the collective mobility patterns of Twitter users in Great
Britain. The results of strongly connected urban regions in the form of
communities in the network space yield geographically cohesive, non-overlapping
urban areas, which provide a clear delineation of the non-administrative
anthropographic urban boundaries of Great Britain. The method was applied to
both national (Great Britain) and municipal scales (the London metropolis).
While our results corresponded well with the administrative boundaries, many
unexpected and interesting boundaries were identified. Importantly, as the
depicted urban boundaries exhibited a strong instance of spatial proximity, we
employed a gravity model to understand the distance decay effects in shaping
the delineated urban boundaries. The model explains how geographical distances
found in the mobility patterns affect the interaction intensity among different
non-administrative anthropographic urban areas, which provides new insights
into human spatial interactions with urban space.
Comment: 32 pages, 7 figures, International Journal of Geographic Information
Scienc
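The abstract above does not state which gravity model was fitted. As a rough illustration of how a distance decay effect can be estimated, the sketch below assumes the common textbook form flow = k * mass_i * mass_j / dist**beta and recovers k and beta by least squares in log space. The function name `fit_distance_decay` and the synthetic data are assumptions, not taken from the paper.

```python
import numpy as np

def fit_distance_decay(flow, mass_i, mass_j, dist):
    """Estimate k and beta in flow = k * mass_i * mass_j / dist**beta.

    Taking logs gives a linear model:
        log(flow) - log(mass_i) - log(mass_j) = log(k) - beta * log(dist)
    which is solved here by ordinary least squares.
    """
    y = np.log(flow) - np.log(mass_i) - np.log(mass_j)
    A = np.column_stack([np.ones_like(dist), -np.log(dist)])
    (log_k, beta), *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.exp(log_k)), float(beta)

# Synthetic, noise-free interactions generated with k = 0.5 and beta = 2.0.
rng = np.random.default_rng(0)
mass_i = rng.uniform(1e3, 1e5, size=200)
mass_j = rng.uniform(1e3, 1e5, size=200)
dist = rng.uniform(1.0, 300.0, size=200)
flow = 0.5 * mass_i * mass_j / dist**2.0

k_hat, beta_hat = fit_distance_decay(flow, mass_i, mass_j, dist)
print(k_hat, beta_hat)
```

A larger fitted beta means interaction intensity falls off faster with distance; with real mobility counts, zero flows would need separate handling before taking logs.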
A Survey of Location Prediction on Twitter
Locations, e.g., countries, states, cities, and points of interest, are
central to news, emergency events, and people's daily lives. Automatic
identification of locations associated with or mentioned in documents has been
explored for decades. As one of the most popular online social network
platforms, Twitter has attracted a large number of users who send millions of
tweets on a daily basis. Due to the worldwide coverage of its users and
real-time freshness of tweets, location prediction on Twitter has gained
significant attention in recent years. Research efforts are spent on dealing
with new challenges and opportunities brought by the noisy, short, and
context-rich nature of tweets. In this survey, we aim at offering an overall
picture of location prediction on Twitter. Specifically, we concentrate on the
prediction of user home locations, tweet locations, and mentioned locations. We
first define the three tasks and review the evaluation metrics. By summarizing
Twitter network, tweet content, and tweet context as potential inputs, we then
structurally highlight how the problems depend on these inputs. Each dependency
is illustrated by a comprehensive review of the corresponding strategies
adopted in state-of-the-art approaches. In addition, we briefly review two
related problems, i.e., semantic location prediction and point-of-interest
recommendation. Finally, we list future research directions.
Comment: Accepted to TKDE. 30 pages, 1 figur
SE-KGE: A Location-Aware Knowledge Graph Embedding Model for Geographic Question Answering and Spatial Semantic Lifting
Learning knowledge graph (KG) embeddings is an emerging technique for a
variety of downstream tasks such as summarization, link prediction, information
retrieval, and question answering. However, most existing KG embedding models
neglect space and, therefore, do not perform well when applied to (geo)spatial
data and tasks. Most of the models that do consider space rely primarily on
some notion of distance. These models suffer from higher computational
complexity during training while still losing information beyond the relative
distance between entities. In this work, we propose a location-aware KG
embedding model called SE-KGE. It directly encodes spatial information such as
point coordinates or bounding boxes of geographic entities into the KG
embedding space. The resulting model is capable of handling different types of
spatial reasoning. We also construct a geographic knowledge graph as well as a
set of geographic query-answer pairs called DBGeo to evaluate the performance
of SE-KGE in comparison to multiple baselines. Evaluation results show that
SE-KGE outperforms these baselines on the DBGeo dataset for the geographic
logic query answering task. This demonstrates the effectiveness of our
spatially-explicit model and the importance of considering the scale of
different geographic entities. Finally, we introduce a novel downstream task
called spatial semantic lifting which links an arbitrary location in the study
area to entities in the KG via some relations. Evaluation on DBGeo shows that
our model outperforms the baseline by a substantial margin.
Comment: Accepted to Transactions in GI