5,606 research outputs found

    Improving Aspect Term Extraction with Bidirectional Dependency Tree Representation

    Full text link
    Aspect term extraction is one of the important subtasks in aspect-based sentiment analysis. Previous studies have shown that using dependency tree structure representation is promising for this task. However, most dependency tree structures involve only one directional propagation on the dependency tree. In this paper, we first propose a novel bidirectional dependency tree network to extract dependency structure features from the given sentences. The key idea is to explicitly incorporate both representations gained separately from the bottom-up and top-down propagation on the given dependency syntactic tree. An end-to-end framework is then developed to integrate the embedded representations and BiLSTM plus CRF to learn both tree-structured and sequential features to solve the aspect term extraction problem. Experimental results demonstrate that the proposed model outperforms state-of-the-art baseline models on four benchmark SemEval datasets.Comment: Accepted by TASL

    Deep Learning for Sentiment Analysis : A Survey

    Full text link
    Deep learning has emerged as a powerful machine learning technique that learns multiple layers of representations or features of the data and produces state-of-the-art prediction results. Along with the success of deep learning in many other application domains, deep learning is also popularly used in sentiment analysis in recent years. This paper first gives an overview of deep learning and then provides a comprehensive survey of its current applications in sentiment analysis.Comment: 34 pages, 9 figures, 2 table

    Deep Learning applied to NLP

    Full text link
    Convolutional Neural Network (CNNs) are typically associated with Computer Vision. CNNs are responsible for major breakthroughs in Image Classification and are the core of most Computer Vision systems today. More recently CNNs have been applied to problems in Natural Language Processing and gotten some interesting results. In this paper, we will try to explain the basics of CNNs, its different variations and how they have been applied to NLP

    Structure Regularized Bidirectional Recurrent Convolutional Neural Network for Relation Classification

    Full text link
    Relation classification is an important semantic processing task in the field of natural language processing (NLP). In this paper, we present a novel model, Structure Regularized Bidirectional Recurrent Convolutional Neural Network(SR-BRCNN), to classify the relation of two entities in a sentence, and the new dataset of Chinese Sanwen for named entity recognition and relation classification. Some state-of-the-art systems concentrate on modeling the shortest dependency path (SDP) between two entities leveraging convolutional or recurrent neural networks. We further explore how to make full use of the dependency relations information in the SDP and how to improve the model by the method of structure regularization. We propose a structure regularized model to learn relation representations along the SDP extracted from the forest formed by the structure regularized dependency tree, which benefits reducing the complexity of the whole model and helps improve the F1F_{1} score by 10.3. Experimental results show that our method outperforms the state-of-the-art approaches on the Chinese Sanwen task and performs as well on the SemEval-2010 Task 8 dataset\footnote{The Chinese Sanwen corpus this paper developed and used will be released in the further.Comment: arXiv admin note: text overlap with arXiv:1411.6243 by other author

    Reasoning with Latent Structure Refinement for Document-Level Relation Extraction

    Full text link
    Document-level relation extraction requires integrating information within and across multiple sentences of a document and capturing complex interactions between inter-sentence entities. However, effective aggregation of relevant information in the document remains a challenging research question. Existing approaches construct static document-level graphs based on syntactic trees, co-references or heuristics from the unstructured text to model the dependencies. Unlike previous methods that may not be able to capture rich non-local interactions for inference, we propose a novel model that empowers the relational reasoning across sentences by automatically inducing the latent document-level graph. We further develop a refinement strategy, which enables the model to incrementally aggregate relevant information for multi-hop reasoning. Specifically, our model achieves an F1 score of 59.05 on a large-scale document-level dataset (DocRED), significantly improving over the previous results, and also yields new state-of-the-art results on the CDR and GDA dataset. Furthermore, extensive analyses show that the model is able to discover more accurate inter-sentence relations.Comment: Appeared in the proceedings of ACL 2020 (Long paper

    A Sequential Model for Classifying Temporal Relations between Intra-Sentence Events

    Full text link
    We present a sequential model for temporal relation classification between intra-sentence events. The key observation is that the overall syntactic structure and compositional meanings of the multi-word context between events are important for distinguishing among fine-grained temporal relations. Specifically, our approach first extracts a sequence of context words that indicates the temporal relation between two events, which well align with the dependency path between two event mentions. The context word sequence, together with a parts-of-speech tag sequence and a dependency relation sequence that are generated corresponding to the word sequence, are then provided as input to bidirectional recurrent neural network (LSTM) models. The neural nets learn compositional syntactic and semantic representations of contexts surrounding the two events and predict the temporal relation between them. Evaluation of the proposed approach on TimeBank corpus shows that sequential modeling is capable of accurately recognizing temporal relations between events, which outperforms a neural net model using various discrete features as input that imitates previous feature based models.Comment: EMNLP 201

    More Data, More Relations, More Context and More Openness: A Review and Outlook for Relation Extraction

    Full text link
    Relational facts are an important component of human knowledge, which are hidden in vast amounts of text. In order to extract these facts from text, people have been working on relation extraction (RE) for years. From early pattern matching to current neural networks, existing RE methods have achieved significant progress. Yet with explosion of Web text and emergence of new relations, human knowledge is increasing drastically, and we thus require "more" from RE: a more powerful RE system that can robustly utilize more data, efficiently learn more relations, easily handle more complicated context, and flexibly generalize to more open domains. In this paper, we look back at existing RE methods, analyze key challenges we are facing nowadays, and show promising directions towards more powerful RE. We hope our view can advance this field and inspire more efforts in the community

    Neural Metric Learning for Fast End-to-End Relation Extraction

    Full text link
    Relation extraction (RE) is an indispensable information extraction task in several disciplines. RE models typically assume that named entity recognition (NER) is already performed in a previous step by another independent model. Several recent efforts, under the theme of end-to-end RE, seek to exploit inter-task correlations by modeling both NER and RE tasks jointly. Earlier work in this area commonly reduces the task to a table-filling problem wherein an additional expensive decoding step involving beam search is applied to obtain globally consistent cell labels. In efforts that do not employ table-filling, global optimization in the form of CRFs with Viterbi decoding for the NER component is still necessary for competitive performance. We introduce a novel neural architecture utilizing the table structure, based on repeated applications of 2D convolutions for pooling local dependency and metric-based features, that improves on the state-of-the-art without the need for global optimization. We validate our model on the ADE and CoNLL04 datasets for end-to-end RE and demonstrate 1%\approx 1\% gain (in F-score) over prior best results with training and testing times that are seven to ten times faster --- the latter highly advantageous for time-sensitive end user applications

    Aspect Detection using Word and Char Embeddings with (Bi)LSTM and CRF

    Full text link
    We proposed a~new accurate aspect extraction method that makes use of both word and character-based embeddings. We have conducted experiments of various models of aspect extraction using LSTM and BiLSTM including CRF enhancement on five different pre-trained word embeddings extended with character embeddings. The results revealed that BiLSTM outperforms regular LSTM, but also word embedding coverage in train and test sets profoundly impacted aspect detection performance. Moreover, the additional CRF layer consistently improves the results across different models and text embeddings. Summing up, we obtained state-of-the-art F-score results for SemEval Restaurants (85%) and Laptops (80%).Comment: IEEE AIK

    Comprehensive Analysis of Aspect Term Extraction Methods using Various Text Embeddings

    Full text link
    Recently, a variety of model designs and methods have blossomed in the context of the sentiment analysis domain. However, there is still a lack of wide and comprehensive studies of aspect-based sentiment analysis (ABSA). We want to fill this gap and propose a comparison with ablation analysis of aspect term extraction using various text embedding methods. We particularly focused on architectures based on long short-term memory (LSTM) with optional conditional random field (CRF) enhancement using different pre-trained word embeddings. Moreover, we analyzed the influence on the performance of extending the word vectorization step with character embedding. The experimental results on SemEval datasets revealed that not only does bi-directional long short-term memory (BiLSTM) outperform regular LSTM, but also word embedding coverage and its source highly affect aspect detection performance. An additional CRF layer consistently improves the results as well
    corecore