Knowledge Base Population using Semantic Label Propagation
A crucial aspect of a knowledge base population system that extracts new
facts from text corpora is the generation of training data for its relation
extractors. In this paper, we present a method that maximizes the effectiveness
of newly trained relation extractors at a minimal annotation cost. Manual
labeling can be significantly reduced by Distant Supervision, which is a method
to construct training data automatically by aligning a large text corpus with
an existing knowledge base of known facts. For example, all sentences
mentioning both 'Barack Obama' and 'US' may serve as positive training
instances for the relation born_in(subject,object). However, distant
supervision typically results in a highly noisy training set: many training
sentences do not really express the intended relation. We propose to combine
distant supervision with minimal manual supervision in a technique called
feature labeling, to eliminate noise from the large and noisy initial training
set, resulting in a significant increase of precision. We further improve on
this approach by introducing the Semantic Label Propagation method, which uses
the similarity between low-dimensional representations of candidate training
instances, to extend the training set in order to increase recall while
maintaining high precision. Our proposed strategy for generating training data
is studied and evaluated on an established test collection designed for
knowledge base population tasks. The experimental results show that the
Semantic Label Propagation strategy leads to substantial performance gains when
compared to existing approaches, while requiring an almost negligible manual
annotation effort.
Comment: Submitted to Knowledge Based Systems, special issue on Knowledge
Bases for Natural Language Processing
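As a rough illustration of the distant supervision step described in this abstract, the Python sketch below aligns a toy corpus with a toy knowledge base; the fact set, sentences, and function name are hypothetical and not taken from the paper's actual system.

```python
# Minimal sketch of distant supervision: any sentence mentioning both
# arguments of a known fact is taken as a (noisy) positive training
# instance for that relation. KB contents and corpus are illustrative.

from typing import Dict, List, Set, Tuple

# Hypothetical knowledge base: relation name -> set of (subject, object) facts.
KB: Dict[str, Set[Tuple[str, str]]] = {
    "born_in": {("Barack Obama", "US")},
}

def distant_supervision(sentences: List[str],
                        kb: Dict[str, Set[Tuple[str, str]]]
                        ) -> List[Tuple[str, str, str, str]]:
    """Return (sentence, relation, subject, object) training instances."""
    training = []
    for sent in sentences:
        for relation, facts in kb.items():
            for subj, obj in facts:
                # Noisy heuristic: co-occurrence of both arguments is enough.
                if subj in sent and obj in sent:
                    training.append((sent, relation, subj, obj))
    return training

corpus = [
    "Barack Obama was born in the US.",                  # truly expresses born_in
    "Barack Obama gave a speech about the US economy.",  # noisy false positive
]
print(distant_supervision(corpus, KB))
```

The second sentence shows the noise problem the abstract describes: it matches the fact's arguments without expressing the relation, which is what the feature-labeling and Semantic Label Propagation steps are meant to filter out.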
Simple and Effective Multi-Paragraph Reading Comprehension
We consider the problem of adapting neural paragraph-level question answering
models to the case where entire documents are given as input. Our proposed
solution trains models to produce well-calibrated confidence scores for their
results on individual paragraphs. We sample multiple paragraphs from the
documents during training, and use a shared-normalization training objective
that encourages the model to produce globally correct output. We combine this
method with a state-of-the-art pipeline for training models on document QA
data. Experiments demonstrate strong performance on several document QA
datasets. Overall, we are able to achieve a score of 71.3 F1 on the web portion
of TriviaQA, a large improvement from the 56.7 F1 of the previous best system.
Comment: 11 pages, updated a reference
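A minimal PyTorch sketch of a shared-normalization objective as described above, assuming a reader model has already produced unnormalized start and end span scores for each sampled paragraph; the function name, tensor shapes, and toy inputs are illustrative rather than the authors' implementation.

```python
# Shared normalization: softmax over the concatenation of all sampled
# paragraphs' span scores, so confidences are comparable across paragraphs.

import torch
import torch.nn.functional as F

def shared_norm_loss(start_scores, end_scores, gold_start, gold_end):
    """start_scores / end_scores: list of 1-D tensors, one per paragraph.
    gold_start / gold_end: (paragraph_index, token_index) of the answer span."""
    # Offsets of each paragraph inside the concatenated score vector.
    offsets = torch.cumsum(
        torch.tensor([0] + [s.numel() for s in start_scores[:-1]]), dim=0)
    all_start = torch.cat(start_scores)
    all_end = torch.cat(end_scores)

    start_target = offsets[gold_start[0]] + gold_start[1]
    end_target = offsets[gold_end[0]] + gold_end[1]

    # Cross-entropy over the globally normalized distribution.
    return (F.cross_entropy(all_start.unsqueeze(0), start_target.unsqueeze(0))
            + F.cross_entropy(all_end.unsqueeze(0), end_target.unsqueeze(0)))

# Toy example: two paragraphs of lengths 4 and 6, answer span in paragraph 1.
starts = [torch.randn(4), torch.randn(6)]
ends = [torch.randn(4), torch.randn(6)]
print(shared_norm_loss(starts, ends, gold_start=(1, 2), gold_end=(1, 3)))
```

Because the softmax runs over every sampled paragraph rather than each one in isolation, the model is penalized for assigning high confidence to spans in paragraphs that do not contain the answer, which is the calibration effect the abstract refers to.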
Knowledge-refined Denoising Network for Robust Recommendation
A knowledge graph (KG), which contains rich side information, has become an
essential component for boosting recommendation performance and improving its
explainability. However, existing knowledge-aware recommendation methods
directly perform information propagation on the KG and the user-item bipartite
graph, ignoring the impact of task-irrelevant knowledge propagation and their
vulnerability to interaction noise, which limits their performance. To
solve these issues, we propose a robust knowledge-aware recommendation
framework, called Knowledge-refined Denoising Network (KRDN), to prune
the task-irrelevant knowledge associations and noisy implicit feedback
simultaneously. KRDN consists of an adaptive knowledge refining strategy and a
contrastive denoising mechanism, which are able to automatically distill
high-quality KG triplets for aggregation and prune noisy implicit feedback
respectively. In addition, we design a self-adapted loss function and a
gradient estimator for model optimization. Experimental results on three
benchmark datasets demonstrate the effectiveness and robustness of KRDN over
state-of-the-art knowledge-aware methods such as KGIN, MCCLK, and KGCL, and
show that it also outperforms robust recommendation models such as SGL and
SimGCL.
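A rough sketch of the two pruning steps the abstract describes: filter task-irrelevant KG triplets and noisy implicit-feedback interactions before propagation. The relevance scores below are placeholder cosine similarities of random embeddings, not the paper's adaptive knowledge refining strategy or contrastive denoising mechanism.

```python
# Prune KG triplets and user-item interactions by a relevance score before
# any message passing / aggregation. All names and scores are illustrative.

import numpy as np

rng = np.random.default_rng(0)
dim = 8
emb = {name: rng.normal(size=dim) for name in
       ["user_1", "item_1", "item_2", "brand_A", "genre_X"]}

def relevance(a: str, b: str) -> float:
    """Placeholder relevance score between two graph nodes."""
    va, vb = emb[a], emb[b]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

kg_triplets = [("item_1", "has_brand", "brand_A"),
               ("item_1", "has_genre", "genre_X")]
interactions = [("user_1", "item_1"), ("user_1", "item_2")]

# Keep only edges whose relevance clears a threshold before aggregation.
kept_triplets = [t for t in kg_triplets if relevance(t[0], t[2]) > 0.0]
kept_interactions = [e for e in interactions if relevance(e[0], e[1]) > 0.0]
print(kept_triplets, kept_interactions)
```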
ALEX: Towards Effective Graph Transfer Learning with Noisy Labels
Graph Neural Networks (GNNs) have garnered considerable interest due to their
exceptional performance in a wide range of graph machine learning tasks.
Nevertheless, the majority of GNN-based approaches have been examined using
well-annotated benchmark datasets, leading to suboptimal performance in
real-world graph learning scenarios. To bridge this gap, the present paper
investigates the problem of graph transfer learning in the presence of label
noise, which transfers knowledge from a noisy source graph to an unlabeled
target graph. We introduce a novel technique termed Balance Alignment and
Information-aware Examination (ALEX) to address this challenge. ALEX first
employs singular value decomposition to generate different views with crucial
structural semantics, which help provide robust node representations using
graph contrastive learning. To mitigate both label shift and domain shift, we
estimate a prior distribution to build subgraphs with balanced label
distributions. Building on this foundation, an adversarial domain discriminator
is incorporated for the implicit domain alignment of complex multi-modal
distributions. Furthermore, we project node representations into a different
space, optimizing the mutual information between the projected features and
labels. Subsequently, the inconsistency of similarity structures is evaluated
to identify noisy samples prone to overfitting. Comprehensive experiments
on various benchmark datasets demonstrate the superiority of the proposed
ALEX across different settings.
Comment: Accepted by the ACM International Conference on Multimedia (MM) 202
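The SVD-based view generation mentioned in the abstract can be sketched in a few lines of Python: a truncated SVD of the adjacency matrix yields a low-rank reconstruction that preserves the dominant structural semantics and can serve as an augmented view for graph contrastive learning. The toy graph and rank are illustrative, and the rest of the ALEX pipeline (contrastive learning, domain discriminator, mutual-information objective) is not reproduced here.

```python
# Low-rank structural view of a graph via truncated SVD of its adjacency matrix.

import numpy as np

# Toy undirected graph with 5 nodes.
A = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

def svd_view(adj: np.ndarray, rank: int) -> np.ndarray:
    """Reconstruct the adjacency matrix from its top-`rank` singular components."""
    u, s, vt = np.linalg.svd(adj)
    return u[:, :rank] @ np.diag(s[:rank]) @ vt[:rank, :]

view = svd_view(A, rank=2)  # augmented structural view for contrastive learning
print(np.round(view, 2))
```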
Explainable Reasoning over Knowledge Graphs for Recommendation
Incorporating knowledge graph into recommender systems has attracted
increasing attention in recent years. By exploring the interlinks within a
knowledge graph, the connectivity between users and items can be discovered as
paths, which provide rich and complementary information to user-item
interactions. Such connectivity not only reveals the semantics of entities and
relations, but also helps to comprehend a user's interest. However, existing
efforts have not fully explored this connectivity to infer user preferences,
especially in terms of modeling the sequential dependencies within and holistic
semantics of a path. In this paper, we contribute a new model named
Knowledge-aware Path Recurrent Network (KPRN) to exploit knowledge graph for
recommendation. KPRN can generate path representations by composing the
semantics of both entities and relations. By leveraging the sequential
dependencies within a path, we allow effective reasoning on paths to infer the
underlying rationale of a user-item interaction. Furthermore, we design a new
weighted pooling operation to discriminate the strengths of different paths in
connecting a user with an item, endowing our model with a certain level of
explainability. We conduct extensive experiments on two datasets about movie
and music, demonstrating significant improvements over state-of-the-art
solutions, Collaborative Knowledge Base Embedding and Neural Factorization
Machine.
Comment: 8 pages, 5 figures, AAAI-201
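A minimal PyTorch sketch of the path-scoring idea in the abstract: encode each user-to-item path as a sequence of entity and relation embeddings with an LSTM, then combine per-path scores with a log-sum-exp weighted pooling. The embedding sizes, pooling temperature, and toy paths are illustrative assumptions rather than the paper's exact architecture.

```python
# Path-based scoring: LSTM over (entity, relation) embeddings per path,
# then weighted pooling over all paths connecting a user-item pair.

import torch
import torch.nn as nn

class PathScorer(nn.Module):
    def __init__(self, n_entities, n_relations, dim=16):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)
        self.rel = nn.Embedding(n_relations, dim)
        self.lstm = nn.LSTM(input_size=2 * dim, hidden_size=dim, batch_first=True)
        self.out = nn.Linear(dim, 1)

    def path_score(self, entities, relations):
        # Compose entity and relation semantics at every step of the path.
        x = torch.cat([self.ent(entities), self.rel(relations)], dim=-1)
        _, (h, _) = self.lstm(x.unsqueeze(0))  # final hidden state encodes the path
        return self.out(h[-1]).squeeze()

    def forward(self, paths, gamma=1.0):
        # Weighted pooling (log-sum-exp with temperature gamma) discriminates
        # the strengths of the different paths connecting the user-item pair.
        scores = torch.stack([self.path_score(e, r) for e, r in paths])
        return torch.sigmoid(torch.logsumexp(scores / gamma, dim=0))

model = PathScorer(n_entities=10, n_relations=4)
# Two toy user->item paths, each given as (entity_ids, relation_ids) tensors.
paths = [
    (torch.tensor([0, 3, 7]), torch.tensor([1, 2, 0])),
    (torch.tensor([0, 5, 7]), torch.tensor([3, 2, 0])),
]
print(model(paths))
```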