Pair-Linking for Collective Entity Disambiguation: Two Could Be Better Than All
Collective entity disambiguation aims to jointly resolve multiple mentions by
linking them to their associated entities in a knowledge base. Previous works
are primarily based on the underlying assumption that entities within the same
document are highly related. However, the extent to which these mentioned
entities are actually connected in reality is rarely studied and therefore
raises interesting research questions. For the first time, we show that the
semantic relationships between the mentioned entities are in fact less dense
than expected. This could be attributed to several reasons such as noise, data
sparsity and knowledge base incompleteness. As a remedy, we introduce MINTREE,
a new tree-based objective for the entity disambiguation problem. The key
intuition behind MINTREE is the concept of coherence relaxation which utilizes
the weight of a minimum spanning tree to measure the coherence between
entities. Based on this new objective, we design a novel entity disambiguation
algorithm, which we call Pair-Linking. Instead of considering all the given
mentions, Pair-Linking iteratively selects a pair with the highest confidence
at each step for decision making. Via extensive experiments, we show that our
approach is not only more accurate but also surprisingly faster than many
state-of-the-art collective linking algorithms.
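
The pair-selection idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `pair_linking`, the `candidates` dictionary and the `pair_score` callback are hypothetical, and the sketch simply assumes that some confidence score is available for every pair of candidate entities belonging to two different mentions. Pairs are visited from highest to lowest confidence, in the spirit of Kruskal's minimum spanning tree algorithm, and a pair is accepted only if it does not contradict a decision already made.

```python
from itertools import combinations


def pair_linking(candidates, pair_score):
    """Greedy pair-based linking (illustrative sketch, not the paper's code).

    candidates: dict mapping each mention to a list of candidate entities.
    pair_score: hypothetical callback (mi, ei, mj, ej) -> float confidence,
                where a higher value means a more confident pair.
    Returns a dict mapping mentions to the chosen entity.
    """
    # Score every candidate pair across two different mentions once.
    scored = []
    for mi, mj in combinations(candidates, 2):
        for ei in candidates[mi]:
            for ej in candidates[mj]:
                scored.append((pair_score(mi, ei, mj, ej), mi, ei, mj, ej))
    # Visit the most confident pairs first.
    scored.sort(key=lambda item: item[0], reverse=True)

    assignment = {}
    for _, mi, ei, mj, ej in scored:
        if len(assignment) == len(candidates):
            break  # every mention is resolved
        # Skip pairs that contradict an earlier, more confident decision.
        if assignment.get(mi, ei) != ei or assignment.get(mj, ej) != ej:
            continue
        assignment[mi] = ei
        assignment[mj] = ej
    return assignment
```

Under this sketch a document with a single mention yields an empty assignment, because no pair exists; a real system would fall back to a local score in that case.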
Semi-supervised Embedding in Attributed Networks with Outliers
In this paper, we propose a novel framework, called Semi-supervised Embedding
in Attributed Networks with Outliers (SEANO), to learn a low-dimensional vector
representation that systematically captures the topological proximity,
attribute affinity and label similarity of vertices in a partially labeled
attributed network (PLAN). Our method is designed to work in both transductive
and inductive settings while explicitly alleviating noise effects from
outliers. Experimental results on various datasets drawn from the web, text and
image domains demonstrate the advantages of SEANO over state-of-the-art methods
in semi-supervised classification under transductive as well as inductive
settings. We also show that a subset of parameters in SEANO is interpretable as
outlier score and can significantly outperform baseline methods when applied
for detecting network outliers. Finally, we present the use of SEANO in a
challenging real-world setting -- flood mapping of satellite images and show
that it is able to outperform modern remote sensing algorithms for this task.
Comment: in Proceedings of SIAM International Conference on Data Mining (SDM'18)
Iterative graph cuts for image segmentation with a nonlinear statistical shape prior
Shape-based regularization has proven to be a useful method for delineating
objects within noisy images where one has prior knowledge of the shape of the
targeted object. When a collection of possible shapes is available, the
specification of a shape prior using kernel density estimation is a natural
technique. Unfortunately, energy functionals arising from kernel density
estimation are of a form that makes them impossible to directly minimize using
efficient optimization algorithms such as graph cuts. Our main contribution is
to show how one may recast the energy functional into a form that is
minimizable iteratively and efficiently using graph cuts.
Comment: Revision submitted to JMIV (02/24/13)
Neural Collective Entity Linking
Entity Linking aims to link entity mentions in texts to knowledge bases, and
neural models have achieved recent success in this task. However, most existing
methods rely on local contexts to resolve entities independently, which may
fail due to the data sparsity of local information. To address this
issue, we propose a novel neural model for collective entity linking, named
NCEL. NCEL applies a Graph Convolutional Network to integrate both local
contextual features and global coherence information for entity linking. To
improve the computation efficiency, we approximately perform graph convolution
on a subgraph of adjacent entity mentions instead of those in the entire text.
We further introduce an attention scheme to improve the robustness of NCEL to
data noise and train the model on Wikipedia hyperlinks to avoid overfitting and
domain bias. In experiments, we evaluate NCEL on five publicly available
datasets to verify the linking performance as well as generalization ability.
We also conduct an extensive analysis of time complexity, the impact of key
modules, and qualitative results, which demonstrate the effectiveness and
efficiency of our proposed method.
Comment: 12 pages, 3 figures, COLING201
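
The idea of convolving over a subgraph of adjacent mentions can be sketched in a few lines of plain Python. This is an assumption-laden illustration, not NCEL itself: it uses the widely known normalized propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W) for a single graph-convolution layer, and the helper names `window_adjacency`, `gcn_layer` and the `window` parameter are hypothetical. Nodes stand for candidate entities, and two nodes are connected only when their mentions lie within `window` positions of each other; leaving candidates of the same mention unconnected is also an assumption of the sketch.

```python
import math


def window_adjacency(mention_of_node, window):
    """Connect candidate nodes whose mentions are at most `window` apart.

    mention_of_node[i] is the position of the mention that node i belongs to.
    Candidates of the same mention are left unconnected (an assumption).
    """
    n = len(mention_of_node)
    return [
        [
            1.0 if 0 < abs(mention_of_node[i] - mention_of_node[j]) <= window else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, feats, weight):
    """One graph-convolution layer: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adj)
    # Add self-loops so each node keeps its own (local) features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalization.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]
```

Restricting the adjacency to a window keeps the number of edges roughly proportional to the number of mentions rather than to its square, which is the kind of saving the abstract alludes to.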