21,754 research outputs found

    Analysis of Bag-of-n-grams Representation's Properties Based on Textual Reconstruction

    Full text link
    Despite its simplicity, bag-of-n-grams sen- tence representation has been found to excel in some NLP tasks. However, it has not re- ceived much attention in recent years and fur- ther analysis on its properties is necessary. We propose a framework to investigate the amount and type of information captured in a general- purposed bag-of-n-grams sentence represen- tation. We first use sentence reconstruction as a tool to obtain bag-of-n-grams representa- tion that contains general information of the sentence. We then run prediction tasks (sen- tence length, word content, phrase content and word order) using the obtained representation to look into the specific type of information captured in the representation. Our analysis demonstrates that bag-of-n-grams representa- tion does contain sentence structure level in- formation. However, incorporating n-grams with higher order n empirically helps little with encoding more information in general, except for phrase content information

    Deep Learning for Sentiment Analysis : A Survey

    Full text link
    Deep learning has emerged as a powerful machine learning technique that learns multiple layers of representations or features of the data and produces state-of-the-art prediction results. Along with the success of deep learning in many other application domains, deep learning is also popularly used in sentiment analysis in recent years. This paper first gives an overview of deep learning and then provides a comprehensive survey of its current applications in sentiment analysis.Comment: 34 pages, 9 figures, 2 table

    Analysis Methods in Neural Language Processing: A Survey

    Full text link
    The field of natural language processing has seen impressive progress in recent years, with neural network models replacing many of the traditional systems. A plethora of new models have been proposed, many of which are thought to be opaque compared to their feature-rich counterparts. This has led researchers to analyze, interpret, and evaluate neural networks in novel and more fine-grained ways. In this survey paper, we review analysis methods in neural language processing, categorize them according to prominent research trends, highlight existing limitations, and point to potential directions for future work.Comment: Version including the supplementary materials (3 tables), also available at https://boknilev.github.io/nlp-analysis-method

    DisSent: Sentence Representation Learning from Explicit Discourse Relations

    Full text link
    Learning effective representations of sentences is one of the core missions of natural language understanding. Existing models either train on a vast amount of text, or require costly, manually curated sentence relation datasets. We show that with dependency parsing and rule-based rubrics, we can curate a high quality sentence relation task by leveraging explicit discourse relations. We show that our curated dataset provides an excellent signal for learning vector representations of sentence meaning, representing relations that can only be determined when the meanings of two sentences are combined. We demonstrate that the automatically curated corpus allows a bidirectional LSTM sentence encoder to yield high quality sentence embeddings and can serve as a supervised fine-tuning dataset for larger models such as BERT. Our fixed sentence embeddings achieve high performance on a variety of transfer tasks, including SentEval, and we achieve state-of-the-art results on Penn Discourse Treebank's implicit relation prediction task.Comment: 13 pages, 4 figures. ACL 201

    Understanding Cross-Lingual Syntactic Transfer in Multilingual Recurrent Neural Networks

    Full text link
    It is now established that modern neural language models can be successfully trained on multiple languages simultaneously without changes to the underlying architecture, providing an easy way to adapt a variety of NLP models to low-resource languages. But what kind of knowledge is really shared among languages within these models? Does multilingual training mostly lead to an alignment of the lexical representation spaces or does it also enable the sharing of purely grammatical knowledge? In this paper we dissect different forms of cross-lingual transfer and look for its most determining factors, using a variety of models and probing tasks. We find that exposing our language models to a related language does not always increase grammatical knowledge in the target language, and that optimal conditions for lexical-semantic transfer may not be optimal for syntactic transfer.Comment: v2: Added acknowledgements, 9 pages single column with 6 figure

    Novel Word Embedding and Translation-based Language Modeling for Extractive Speech Summarization

    Full text link
    Word embedding methods revolve around learning continuous distributed vector representations of words with neural networks, which can capture semantic and/or syntactic cues, and in turn be used to induce similarity measures among words, sentences and documents in context. Celebrated methods can be categorized as prediction-based and count-based methods according to the training objectives and model architectures. Their pros and cons have been extensively analyzed and evaluated in recent studies, but there is relatively less work continuing the line of research to develop an enhanced learning method that brings together the advantages of the two model families. In addition, the interpretation of the learned word representations still remains somewhat opaque. Motivated by the observations and considering the pressing need, this paper presents a novel method for learning the word representations, which not only inherits the advantages of classic word embedding methods but also offers a clearer and more rigorous interpretation of the learned word representations. Built upon the proposed word embedding method, we further formulate a translation-based language modeling framework for the extractive speech summarization task. A series of empirical evaluations demonstrate the effectiveness of the proposed word representation learning and language modeling techniques in extractive speech summarization

    ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission

    Full text link
    Clinical notes contain information about patients that goes beyond structured data like lab values and medications. However, clinical notes have been underused relative to structured data, because notes are high-dimensional and sparse. This work develops and evaluates representations of clinical notes using bidirectional transformers (ClinicalBERT). ClinicalBERT uncovers high-quality relationships between medical concepts as judged by humans. ClinicalBert outperforms baselines on 30-day hospital readmission prediction using both discharge summaries and the first few days of notes in the intensive care unit. Code and model parameters are available.Comment: CHIL 2020 Worksho

    Supervised Fine Tuning for Word Embedding with Integrated Knowledge

    Full text link
    Learning vector representation for words is an important research field which may benefit many natural language processing tasks. Two limitations exist in nearly all available models, which are the bias caused by the context definition and the lack of knowledge utilization. They are difficult to tackle because these algorithms are essentially unsupervised learning approaches. Inspired by deep learning, the authors propose a supervised framework for learning vector representation of words to provide additional supervised fine tuning after unsupervised learning. The framework is knowledge rich approacher and compatible with any numerical vectors word representation. The authors perform both intrinsic evaluation like attributional and relational similarity prediction and extrinsic evaluations like the sentence completion and sentiment analysis. Experiments results on 6 embeddings and 4 tasks with 10 datasets show that the proposed fine tuning framework may significantly improve the quality of the vector representation of words

    BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

    Full text link
    We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE score to 80.5% (7.7% point absolute improvement), MultiNLI accuracy to 86.7% (4.6% absolute improvement), SQuAD v1.1 question answering Test F1 to 93.2 (1.5 point absolute improvement) and SQuAD v2.0 Test F1 to 83.1 (5.1 point absolute improvement)

    Word Usage Similarity Estimation with Sentence Representations and Automatic Substitutes

    Full text link
    Usage similarity estimation addresses the semantic proximity of word instances in different contexts. We apply contextualized (ELMo and BERT) word and sentence embeddings to this task, and propose supervised models that leverage these representations for prediction. Our models are further assisted by lexical substitute annotations automatically assigned to word instances by context2vec, a neural model that relies on a bidirectional LSTM. We perform an extensive comparison of existing word and sentence representations on benchmark datasets addressing both graded and binary similarity. The best performing models outperform previous methods in both settings.Comment: *SEM 201
    • …
    corecore