20 research outputs found

    Neural Natural Language Inference Models Enhanced with External Knowledge

    Full text link
    Modeling natural language inference is a very challenging task. With the availability of large annotated data, it has recently become feasible to train complex models such as neural-network-based inference models, which have shown to achieve the state-of-the-art performance. Although there exist relatively large annotated data, can machines learn all knowledge needed to perform natural language inference (NLI) from these data? If not, how can neural-network-based NLI models benefit from external knowledge and how to build NLI models to leverage it? In this paper, we enrich the state-of-the-art neural natural language inference models with external knowledge. We demonstrate that the proposed models improve neural NLI models to achieve the state-of-the-art performance on the SNLI and MultiNLI datasets.Comment: Accepted by ACL 201

    Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate Label Spaces

    Get PDF
    We combine multi-task learning and semi-supervised learning by inducing a joint embedding space between disparate label spaces and learning transfer functions between label embeddings, enabling us to jointly leverage unlabelled data and auxiliary, annotated datasets. We evaluate our approach on a variety of sequence classification tasks with disparate label spaces. We outperform strong single and multi-task baselines and achieve a new state-of-the-art for topic-based sentiment analysis.Comment: To appear at NAACL 2018 (long

    LCT-MALTAs submission to RepEval 2017 shared task

    Get PDF
    We present in this paper our team LCTMALTA’s submission to the RepEval 2017 Shared Task on natural language inference. Our system is a simple system based on a standard BiLSTM architecture, using as input GloVe word embeddings augmented with further linguistic information. We use max pooling on the BiLSTM outputs to obtain embeddings for sentences. On both the matched and the mismatched test sets, our system clearly beats the shared task’s BiLSTM baseline model.peer-reviewe

    Compare, Compress and Propagate: Enhancing Neural Architectures with Alignment Factorization for Natural Language Inference

    Full text link
    This paper presents a new deep learning architecture for Natural Language Inference (NLI). Firstly, we introduce a new architecture where alignment pairs are compared, compressed and then propagated to upper layers for enhanced representation learning. Secondly, we adopt factorization layers for efficient and expressive compression of alignment vectors into scalar features, which are then used to augment the base word representations. The design of our approach is aimed to be conceptually simple, compact and yet powerful. We conduct experiments on three popular benchmarks, SNLI, MultiNLI and SciTail, achieving competitive performance on all. A lightweight parameterization of our model also enjoys a ≈3\approx 3 times reduction in parameter size compared to the existing state-of-the-art models, e.g., ESIM and DIIN, while maintaining competitive performance. Additionally, visual analysis shows that our propagated features are highly interpretable.Comment: EMNLP 2018 CRC and Update CAFE + ELMo result on SNL

    On the Evaluation of Semantic Phenomena in Neural Machine Translation Using Natural Language Inference

    Full text link
    We propose a process for investigating the extent to which sentence representations arising from neural machine translation (NMT) systems encode distinct semantic phenomena. We use these representations as features to train a natural language inference (NLI) classifier based on datasets recast from existing semantic annotations. In applying this process to a representative NMT system, we find its encoder appears most suited to supporting inferences at the syntax-semantics interface, as compared to anaphora resolution requiring world-knowledge. We conclude with a discussion on the merits and potential deficiencies of the existing process, and how it may be improved and extended as a broader framework for evaluating semantic coverage.Comment: To be presented at NAACL 2018 - 11 page
    corecore