4,232 research outputs found

Controlling Output Length in Neural Encoder-Decoders

Author: Kikuchi Yuta
Neubig Graham
Okumura Manabu
Sasano Ryohei
Takamura Hiroya
Publication date: 01/01/2016

Neural encoder-decoder models have shown great success in many sequence generation tasks. However, previous work has not investigated situations in which we would like to control the length of encoder-decoder outputs. This capability is crucial for applications such as text summarization, in which we have to generate concise summaries with a desired length. In this paper, we propose methods for controlling the output sequence length for neural encoder-decoder models: two decoding-based methods and two learning-based methods. Results show that our learning-based methods have the capability to control length without degrading summary quality in a summarization task. Comment: 11 pages. To appear in EMNLP 201

arXiv.org e-Print Archive

Crossref

Towards automatic tweet generation: A comparative study from the text summarization perspective in the journalism genre

Author: Lloret Elena
Palomar Manuel
Publication venue: 'Elsevier BV'
Publication date: 01/01/2013

In recent years, Twitter has become one of the most important microblogging services of the Web 2.0. Among the possible uses it allows, it can be employed for communicating and broadcasting information in real time. The goal of this research is to analyze the task of automatic tweet generation from a text summarization perspective in the context of the journalism genre. To achieve this, different state-of-the-art summarizers are selected and employed for producing multi-lingual tweets in two languages (English and Spanish). A wide experimental framework is proposed, comprising the creation of a new corpus, the generation of the automatic tweets, and their assessment through a quantitative and a qualitative evaluation, where informativeness, indicativeness and interest are key criteria that should be ensured in the proposed context. From the results obtained, it was observed that although the original tweets were considered as model tweets with respect to their informativeness, they were not among the most interesting ones from a human viewpoint. Therefore, relying only on these tweets may not be the ideal way to communicate news through Twitter, especially if news is to be reported in a more personalized and catchy way. In contrast, we showed that recent text summarization techniques may be more appropriate, reflecting a balance between indicativeness and interest, even if their content was different from the tweets delivered by the news providers. This research work has been partially funded by the Spanish Government (Ministerio de Economía y competitividad) through the project “Técnicas de Deconstrucción en la Tecnologías del Lenguaje Humano” (TIN2012–31224), and by the Valencian Government through projects PROMETEO (PROMETEO/2009/199) and ACOMP/2011/001

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Paraphrasing Headlines by Machine Translation Sentential Paraphrase Acquisition and Generation using Google News

Author: Krahmer E.J.
van den Bosch A.
Wubben S.
Publication venue: 'The Korean Society of Clothing and Textiles'
Publication date: 01/01/2011

Tilburg University Repository