21 research outputs found

    Convolutional Spatial Attention Model for Reading Comprehension with Multiple-Choice Questions

    Machine Reading Comprehension (MRC) with multiple-choice questions requires the machine to read a given passage and select the correct answer among several candidates. In this paper, we propose a novel approach called the Convolutional Spatial Attention (CSA) model, which can better handle MRC with multiple-choice questions. The proposed model fully extracts the mutual information among the passage, the question, and the candidates to form enriched representations. Furthermore, to merge the various attention results, we propose using a convolutional operation to dynamically summarize the attention values within regions of different sizes. Experimental results show that the proposed model gives substantial improvements over various state-of-the-art systems on both the RACE and SemEval-2018 Task 11 datasets.
    Comment: 8 pages. Accepted as a conference paper at the AAAI-19 Technical Track.
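
    A minimal sketch of the convolutional-attention idea in PyTorch (all names, shapes, and kernel sizes here are illustrative assumptions, not the authors' released code): stack pairwise attention maps as channels and apply 2D convolutions with several kernel sizes, so each convolution summarizes attention values over a region of a different size.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvSpatialAttention(nn.Module):
    """Sketch: convolutional summarization of stacked attention maps.

    Assumes pairwise attention between a passage (length P) and a
    candidate answer (length C) has already produced `in_maps`
    attention maps, stacked as a (batch, in_maps, P, C) tensor.
    """

    def __init__(self, in_maps: int, out_channels: int = 8,
                 kernel_sizes=(1, 3, 5)):
        super().__init__()
        # One conv per region size; padding preserves the spatial shape.
        self.convs = nn.ModuleList(
            [nn.Conv2d(in_maps, out_channels, k, padding=k // 2)
             for k in kernel_sizes]
        )

    def forward(self, attn_maps: torch.Tensor) -> torch.Tensor:
        # attn_maps: (batch, in_maps, P, C) stacked attention values.
        # Each conv summarizes attention within a different-sized region.
        feats = [F.relu(conv(attn_maps)) for conv in self.convs]
        # Max-pool each summary over the spatial grid, then concatenate.
        pooled = [f.amax(dim=(2, 3)) for f in feats]
        return torch.cat(pooled, dim=-1)  # (batch, out_channels * num_ks)

# Usage: 3 attention maps over a 120-token passage and a 20-token option.
maps = torch.softmax(torch.randn(4, 3, 120, 20), dim=-1)
features = ConvSpatialAttention(in_maps=3)(maps)
print(features.shape)  # torch.Size([4, 24])
```

    The pooled features from all region sizes would then feed a candidate-scoring layer; the multi-kernel design is what lets the model weigh both token-level and phrase-level attention evidence.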

    Commonsense Knowledge Base Completion with Structural and Semantic Context

    Automatic KB completion for commonsense knowledge graphs (e.g., ATOMIC and ConceptNet) poses unique challenges compared to the much-studied conventional knowledge bases (e.g., Freebase). Commonsense knowledge graphs use free-form text to represent nodes, resulting in orders of magnitude more nodes than in conventional KBs (18x more nodes in ATOMIC compared to Freebase (FB15K-237)). Importantly, this implies significantly sparser graph structures - a major challenge for existing KB completion methods that assume densely connected graphs over a relatively small set of nodes. In this paper, we present novel KB completion models that address these challenges by exploiting the structural and semantic context of nodes. Specifically, we investigate two key ideas: (1) learning from local graph structure, using graph convolutional networks and automatic graph densification, and (2) transfer learning from pre-trained language models to knowledge graphs for enhanced contextual representation of knowledge. We describe our method for incorporating information from both of these sources in a joint model, and we provide the first empirical results for KB completion on ATOMIC and evaluation with ranking metrics on ConceptNet. Our results demonstrate the effectiveness of language model representations in boosting link prediction performance and the advantages of learning from local graph structure (+1.5 points in MRR for ConceptNet) when training on subgraphs for computational efficiency. Further analysis of model predictions sheds light on the types of commonsense knowledge that language models capture well.
    Comment: AAAI 2020.
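
    A minimal sketch of the joint idea in PyTorch (hypothetical names and dimensions, not the released implementation): encode a node's free-form text with a pre-trained language model, fuse it with a graph-structure embedding, and score (head, relation, tail) triples for link prediction. A plain learned embedding table stands in for the paper's graph convolutional encoder, and a DistMult-style bilinear scorer stands in for whatever decoder the authors use.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class JointKBCompletion(nn.Module):
    """Sketch: fuse LM text encodings with structure embeddings,
    then score triples with a DistMult-style function."""

    def __init__(self, num_nodes: int, num_rels: int,
                 graph_dim: int = 200, lm_name: str = "bert-base-uncased"):
        super().__init__()
        self.lm = AutoModel.from_pretrained(lm_name)
        lm_dim = self.lm.config.hidden_size
        # Stand-in for the GCN output: a learned per-node embedding.
        self.graph_emb = nn.Embedding(num_nodes, graph_dim)
        self.rel_emb = nn.Embedding(num_rels, graph_dim)
        self.proj = nn.Linear(lm_dim + graph_dim, graph_dim)

    def encode(self, node_ids, input_ids, attention_mask):
        # [CLS] vector as the semantic representation of the node text.
        text = self.lm(input_ids=input_ids,
                       attention_mask=attention_mask).last_hidden_state[:, 0]
        structure = self.graph_emb(node_ids)
        return self.proj(torch.cat([text, structure], dim=-1))

    def score(self, head, rel_ids, tail):
        # DistMult-style triple score: sum_i h_i * r_i * t_i.
        return (head * self.rel_emb(rel_ids) * tail).sum(-1)

# Usage with two free-form nodes and one relation (illustrative IDs).
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = JointKBCompletion(num_nodes=1000, num_rels=34)
enc = tok(["person x goes to the store", "person x buys food"],
          return_tensors="pt", padding=True)
nodes = model.encode(torch.tensor([3, 17]),
                     enc["input_ids"], enc["attention_mask"])
print(model.score(nodes[0:1], torch.tensor([0]), nodes[1:2]))
```

    The text encoder is what lets the model generalize across near-duplicate free-form nodes, while the structure embedding captures the (sparse) local graph signal that the text alone misses.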