Search CORE

373 research outputs found

A Multilingual Study of Compressive Cross-Language Text Summarization

Author: E Linhares Pontes
F Boudin
HM de Caseli
J Zhang
Publication venue
Publication date: 01/01/2018
Field of study

Cross-Language Text Summarization (CLTS) generates summaries in a language different from the language of the source documents. Recent methods use information from both languages to generate summaries with the most informative sentences. However, these methods have performance that can vary according to languages, which can reduce the quality of summaries. In this paper, we propose a compressive framework to generate cross-language summaries. In order to analyze performance and especially stability, we tested our system and extractive baselines on a dataset available in four languages (English, French, Portuguese, and Spanish) to generate English and French summaries. An automatic evaluation showed that our method outperformed extractive state-of-art CLTS methods with better and more stable ROUGE scores for all languages

arXiv.org e-Print Archive

Crossref

PolyPublie

A Novel ILP Framework for Summarizing Content with High Lexical Variety

Author: Almeida
Celikyilmaz
Conroy
DIANE LITMAN
FEI LIU
Goodfellow
Li
Li
Luo
Luo
Luo
Martins
Mazumder
Mosteller
Narayan
Qian
Ren
Tarnpradab
Wang
WENCAN LUO
Wilson
Xiong
ZITAO LIU
Publication venue
Publication date: 25/07/2018
Field of study

Summarizing content contributed by individuals can be challenging, because people make different lexical choices even when describing the same events. However, there remains a significant need to summarize such content. Examples include the student responses to post-class reflective questions, product reviews, and news articles published by different news agencies related to the same events. High lexical diversity of these documents hinders the system's ability to effectively identify salient content and reduce summary redundancy. In this paper, we overcome this issue by introducing an integer linear programming-based summarization framework. It incorporates a low-rank approximation to the sentence-word co-occurrence matrix to intrinsically group semantically-similar lexical items. We conduct extensive experiments on datasets of student responses, product reviews, and news documents. Our approach compares favorably to a number of extractive baselines as well as a neural abstractive summarization system. The paper finally sheds light on when and why the proposed framework is effective at summarizing content with high lexical variety.Comment: Accepted for publication in the journal of Natural Language Engineering, 201

arXiv.org e-Print Archive

Crossref

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Using tweets to help sentence compression for news highlights generation

Author: GAO Wei
LI Chen
LIU Yang
WEI Zhongyu
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2015
Field of study

We explore using relevant tweets of a given news article to help sentence com-pression for generating compressive news highlights. We extend an unsupervised dependency-tree based sentence compres-sion approach by incorporating tweet in-formation to weight the tree edge in terms of informativeness and syntactic impor-tance. The experimental results on a pub-lic corpus that contains both news arti-cles and relevant tweets show that our pro-posed tweets guided sentence compres-sion method can improve the summariza-tion performance significantly compared to the baseline generic sentence compres-sion method.

CiteSeerX

Crossref

Institutional Knowledge at Singapore Management University