373 research outputs found

    A Multilingual Study of Compressive Cross-Language Text Summarization

    Cross-Language Text Summarization (CLTS) generates summaries in a language different from that of the source documents. Recent methods use information from both languages to generate summaries containing the most informative sentences. However, the performance of these methods varies across languages, which can reduce summary quality. In this paper, we propose a compressive framework to generate cross-language summaries. To analyze performance, and especially stability, we tested our system and extractive baselines on a dataset available in four languages (English, French, Portuguese, and Spanish) to generate English and French summaries. An automatic evaluation showed that our method outperformed extractive state-of-the-art CLTS methods, with better and more stable ROUGE scores for all languages.

    A Novel ILP Framework for Summarizing Content with High Lexical Variety

    Summarizing content contributed by individuals can be challenging, because people make different lexical choices even when describing the same events. However, there remains a significant need to summarize such content. Examples include student responses to post-class reflective questions, product reviews, and news articles published by different news agencies about the same events. The high lexical diversity of these documents hinders a system's ability to identify salient content and reduce summary redundancy. In this paper, we overcome this issue by introducing an integer linear programming-based summarization framework. It incorporates a low-rank approximation to the sentence-word co-occurrence matrix to intrinsically group semantically similar lexical items. We conduct extensive experiments on datasets of student responses, product reviews, and news documents. Our approach compares favorably to a number of extractive baselines as well as a neural abstractive summarization system. Finally, the paper sheds light on when and why the proposed framework is effective at summarizing content with high lexical variety. Comment: Accepted for publication in the journal of Natural Language Engineering, 201
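As a rough illustration of how a low-rank approximation can group related lexical items, the sketch below applies a truncated SVD to a toy sentence-word co-occurrence matrix. This is an assumption-laden stand-in and not the paper's method: the abstract does not say how the approximation is computed, and the vocabulary, matrix, rank, and the function name `low_rank_approximation` are all hypothetical.

```python
import numpy as np


def low_rank_approximation(matrix, rank):
    """Return the closest rank-`rank` matrix (in Frobenius norm) via truncated SVD."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Keep only the `rank` largest singular values and their vectors.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


# Hypothetical toy vocabulary; columns of A follow this order.
vocab = ["bike", "bicycle", "wheel", "exam", "test"]

# Toy sentence-word co-occurrence matrix (rows: sentences, columns: words).
A = np.array([
    [1, 0, 1, 0, 0],  # "bike wheel"
    [0, 1, 1, 0, 0],  # "bicycle wheel"
    [0, 0, 0, 1, 0],  # "exam"
    [0, 0, 0, 1, 1],  # "exam test"
], dtype=float)

# Rank 2 is an arbitrary choice for this toy example.
A_hat = low_rank_approximation(A, rank=2)

# Weight that the first sentence ("bike wheel") now assigns to "bicycle".
bicycle_weight = A_hat[0, vocab.index("bicycle")]
```

In this toy matrix the first sentence never mentions "bicycle", yet its smoothed row gives that word a weight of about 0.5, because "bike" and "bicycle" both co-occur with "wheel". A summarizer that scores content on the smoothed matrix can then treat the two sentences as overlapping despite their different word choices.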

    Using tweets to help sentence compression for news highlights generation

    We explore using relevant tweets of a given news article to help sentence compression for generating compressive news highlights. We extend an unsupervised dependency-tree based sentence compression approach by incorporating tweet information to weight the tree edge in terms of informativeness and syntactic importance. The experimental results on a public corpus that contains both news articles and relevant tweets show that our proposed tweets guided sentence compression method can improve the summarization performance significantly compared to the baseline generic sentence compression method.