19 research outputs found

    Method for Aspect-Based Sentiment Annotation Using Rhetorical Analysis

    Full text link
    This paper fills a gap in aspect-based sentiment analysis and aims to present a new method for preparing and analysing texts concerning opinion and generating user-friendly descriptive reports in natural language. We present a comprehensive set of techniques derived from Rhetorical Structure Theory and sentiment analysis to extract aspects from textual opinions and then build an abstractive summary of a set of opinions. Moreover, we propose aspect-aspect graphs to evaluate the importance of aspects and to filter out unimportant ones from the summary. Additionally, the paper presents a prototype solution of data flow with interesting and valuable results. The proposed method's results proved the high accuracy of aspect detection when applied to the gold standard dataset

    Discourse Indicators for Content Selection in Summaization

    Get PDF
    We present analyses aimed at eliciting which specific aspects of discourse provide the strongest indication for text importance. In the context of content selection for single document summarization of news, we examine the benefits of both the graph structure of text provided by discourse relations and the semantic sense of these relations. We find that structure information is the most robust indicator of importance. Semantic sense only provides constraints on content selection but is not indicative of important content by itself. However, sense features complement structure information and lead to improved performance. Further, both types of discourse information prove complementary to non-discourse features. While our results establish the usefulness of discourse features, we also find that lexical overlap provides a simple and cheap alternative to discourse for computing text structure with comparable performance for the task of content selection

    Jointly Modeling Topics and Intents with Global Order Structure

    Full text link
    Modeling document structure is of great importance for discourse analysis and related applications. The goal of this research is to capture the document intent structure by modeling documents as a mixture of topic words and rhetorical words. While the topics are relatively unchanged through one document, the rhetorical functions of sentences usually change following certain orders in discourse. We propose GMM-LDA, a topic modeling based Bayesian unsupervised model, to analyze the document intent structure cooperated with order information. Our model is flexible that has the ability to combine the annotations and do supervised learning. Additionally, entropic regularization can be introduced to model the significant divergence between topics and intents. We perform experiments in both unsupervised and supervised settings, results show the superiority of our model over several state-of-the-art baselines.Comment: Accepted by AAAI 201

    Identifying Relationships Among Sentences in Court Case Transcripts Using Discourse Relations

    Full text link
    Case Law has a significant impact on the proceedings of legal cases. Therefore, the information that can be obtained from previous court cases is valuable to lawyers and other legal officials when performing their duties. This paper describes a methodology of applying discourse relations between sentences when processing text documents related to the legal domain. In this study, we developed a mechanism to classify the relationships that can be observed among sentences in transcripts of United States court cases. First, we defined relationship types that can be observed between sentences in court case transcripts. Then we classified pairs of sentences according to the relationship type by combining a machine learning model and a rule-based approach. The results obtained through our system were evaluated using human judges. To the best of our knowledge, this is the first study where discourse relationships between sentences have been used to determine relationships among sentences in legal court case transcripts.Comment: Conference: 2018 International Conference on Advances in ICT for Emerging Regions (ICTer

    Rhetorical relations for information retrieval

    Full text link
    Typically, every part in most coherent text has some plausible reason for its presence, some function that it performs to the overall semantics of the text. Rhetorical relations, e.g. contrast, cause, explanation, describe how the parts of a text are linked to each other. Knowledge about this socalled discourse structure has been applied successfully to several natural language processing tasks. This work studies the use of rhetorical relations for Information Retrieval (IR): Is there a correlation between certain rhetorical relations and retrieval performance? Can knowledge about a document's rhetorical relations be useful to IR? We present a language model modification that considers rhetorical relations when estimating the relevance of a document to a query. Empirical evaluation of different versions of our model on TREC settings shows that certain rhetorical relations can benefit retrieval effectiveness notably (> 10% in mean average precision over a state-of-the-art baseline)

    Cross-lingual RST Discourse Parsing

    Get PDF
    Discourse parsing is an integral part of understanding information flow and argumentative structure in documents. Most previous research has focused on inducing and evaluating models from the English RST Discourse Treebank. However, discourse treebanks for other languages exist, including Spanish, German, Basque, Dutch and Brazilian Portuguese. The treebanks share the same underlying linguistic theory, but differ slightly in the way documents are annotated. In this paper, we present (a) a new discourse parser which is simpler, yet competitive (significantly better on 2/3 metrics) to state of the art for English, (b) a harmonization of discourse treebanks across languages, enabling us to present (c) what to the best of our knowledge are the first experiments on cross-lingual discourse parsing.Comment: To be published in EACL 2017, 13 page
    corecore