Convolutional Spatial Attention Model for Reading Comprehension with Multiple-Choice Questions
Machine Reading Comprehension (MRC) with multiple-choice questions requires the machine to read a given passage and select the correct answer from several candidates. In this paper, we propose a novel approach called the Convolutional Spatial Attention (CSA) model, which better handles MRC with multiple-choice questions. The proposed model fully extracts the mutual information among the passage, the question, and the candidates to form enriched representations. Furthermore, to merge the various attention results, we propose using convolutional operations to dynamically summarize the attention values within regions of different sizes. Experimental results show that the proposed model yields substantial improvements over various state-of-the-art systems on both the RACE and SemEval-2018 Task 11 datasets.

Comment: 8 pages. Accepted as a conference paper at AAAI-19 Technical Track
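The core mechanic described in the abstract, summarizing attention values over regions of different sizes with convolutions, can be illustrated in a few lines. The PyTorch sketch below is only an illustration of that idea under our own assumptions (the class name, layer sizes, and the interpolation step used to align the two attention maps are not from the paper): it stacks two pairwise attention maps as channels of an "attention image" and convolves them with 1x1, 3x3, and 5x5 kernels.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvSpatialAttentionSketch(nn.Module):
    """Illustrative sketch (not the authors' exact architecture): stack
    pairwise attention maps as channels and summarize them with
    convolutions of several kernel sizes."""

    def __init__(self, kernel_sizes=(1, 3, 5), channels=8):
        super().__init__()
        # One 2-D conv per region size; padding preserves the spatial shape.
        self.convs = nn.ModuleList(
            nn.Conv2d(2, channels, k, padding=k // 2) for k in kernel_sizes
        )
        self.score = nn.Linear(channels * len(kernel_sizes), 1)

    def forward(self, passage, question, candidate):
        # passage: (B, Lp, H), question: (B, Lq, H), candidate: (B, Lc, H)
        att_pq = torch.matmul(passage, question.transpose(1, 2))   # (B, Lp, Lq)
        att_pc = torch.matmul(passage, candidate.transpose(1, 2))  # (B, Lp, Lc)
        # Resample the candidate map so both maps share the same width
        # and can be stacked as channels (an assumption for simplicity).
        att_pc = F.interpolate(att_pc, size=att_pq.size(-1),
                               mode="linear", align_corners=False)
        maps = torch.stack([att_pq, att_pc], dim=1)                # (B, 2, Lp, Lq)
        # Convolve at each region size, then max-pool to a fixed vector.
        feats = [F.adaptive_max_pool2d(torch.relu(conv(maps)), 1).flatten(1)
                 for conv in self.convs]
        return self.score(torch.cat(feats, dim=-1)).squeeze(-1)   # (B,)

# Usage: score one candidate answer for a batch of 4 passage/question pairs.
model = ConvSpatialAttentionSketch()
B, Lp, Lq, Lc, H = 4, 50, 12, 6, 64
scores = model(torch.randn(B, Lp, H), torch.randn(B, Lq, H),
               torch.randn(B, Lc, H))
print(scores.shape)  # torch.Size([4])
```

Running the model once per candidate and applying a softmax over the resulting scores would give a distribution over the answer choices.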
Commonsense Knowledge Base Completion with Structural and Semantic Context
Automatic KB completion for commonsense knowledge graphs (e.g., ATOMIC and ConceptNet) poses unique challenges compared to the much-studied conventional knowledge bases (e.g., Freebase). Commonsense knowledge graphs use free-form text to represent nodes, resulting in orders of magnitude more nodes than conventional KBs (18x more nodes in ATOMIC than in Freebase (FB15K-237)). Importantly, this implies significantly sparser graph structures: a major challenge for existing KB completion methods, which assume densely connected graphs over a relatively small set of nodes. In this paper, we present novel KB completion models that address these challenges by exploiting the structural and semantic context of nodes. Specifically, we investigate two key ideas: (1) learning from local graph structure, using graph convolutional networks and automatic graph densification, and (2) transfer learning from pre-trained language models to knowledge graphs for enhanced contextual representation of knowledge. We describe our method for incorporating information from both these sources in a joint model and provide the first empirical results for KB completion on ATOMIC and the first evaluation with ranking metrics on ConceptNet. Our results demonstrate the effectiveness of language model representations in boosting link prediction performance and the advantages of learning from local graph structure (+1.5 points in MRR on ConceptNet) when training on subgraphs for computational efficiency. Further analysis of model predictions sheds light on the types of commonsense knowledge that language models capture well.

Comment: AAAI 2020
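The two ingredients of the joint model, structural context from the graph and semantic context from node text, can be sketched as follows. This PyTorch sketch is a simplified stand-in, not the paper's implementation: the names GCNLayer and JointKBCSketch, all dimensions, and the DistMult decoder are our own assumptions (the paper's decoder and text encoder differ), and the language model embeddings are assumed to be precomputed per node. The mrr helper shows the ranking metric the abstract reports.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph-convolution step: aggregate neighbour features through a
    row-normalized adjacency matrix `adj` (self-loops included), then
    apply a learned linear transform."""
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, x, adj):
        return torch.relu(self.lin(adj @ x))

class JointKBCSketch(nn.Module):
    """Hypothetical joint model: fuse GCN-refined structural embeddings
    with precomputed language-model embeddings of each node's free-form
    text, then score (head, relation, tail) triples bilinearly (DistMult
    substituted here for brevity)."""
    def __init__(self, num_nodes, num_rels, dim, text_dim):
        super().__init__()
        self.node_emb = nn.Embedding(num_nodes, dim)
        self.rel_emb = nn.Embedding(num_rels, dim)
        self.gcn = GCNLayer(dim)
        self.text_proj = nn.Linear(text_dim, dim)  # map LM space -> graph space

    def forward(self, heads, rels, tails, adj, text_emb):
        # Structural context (graph) + semantic context (node text).
        x = self.gcn(self.node_emb.weight, adj) + self.text_proj(text_emb)
        h, r, t = x[heads], self.rel_emb(rels), x[tails]
        return (h * r * t).sum(-1)  # higher score = more plausible triple

def mrr(scores, gold):
    """Mean reciprocal rank: average of 1/rank of the gold tail when all
    candidate tails are sorted by score. scores: (Q, num_nodes)."""
    ranks = (scores > scores.gather(1, gold.view(-1, 1))).sum(1) + 1
    return (1.0 / ranks.float()).mean()
```

Adding the projected text embedding to every node's structural embedding is what lets the model score nodes whose graph neighbourhood is sparse, which is exactly the regime the abstract identifies as the hard case for conventional KB completion methods.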