
    SEQ^3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression

    Neural sequence-to-sequence models are currently the dominant approach in several natural language processing tasks, but require large parallel corpora. We present a sequence-to-sequence-to-sequence autoencoder (SEQ^3), consisting of two chained encoder-decoder pairs, with words used as a sequence of discrete latent variables. We apply the proposed model to unsupervised abstractive sentence compression, where the first and last sequences are the input and reconstructed sentences, respectively, while the middle sequence is the compressed sentence. Constraining the length of the latent word sequences forces the model to distill important information from the input. A pretrained language model, acting as a prior over the latent sequences, encourages the compressed sentences to be human-readable. Continuous relaxations enable us to sample from categorical distributions, allowing gradient-based optimization, unlike alternatives that rely on reinforcement learning. The proposed model does not require parallel text-summary pairs, achieving promising results in unsupervised sentence compression on benchmark datasets. Comment: Accepted to NAACL 2019.
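
    The "continuous relaxations" mentioned above are, in this family of models, typically instances of the Gumbel-softmax trick. Below is a minimal sketch of how such a relaxation keeps discrete word choices differentiable; the temperature and the toy vocabulary size are illustrative assumptions, not values from the paper.

        import torch
        import torch.nn.functional as F

        def gumbel_softmax_sample(logits, temperature=0.5):
            """Differentiable approximation of sampling one word from a
            categorical distribution: add Gumbel noise, then take a
            low-temperature softmax over the vocabulary."""
            gumbel = -torch.log(-torch.log(torch.rand_like(logits) + 1e-20) + 1e-20)
            return F.softmax((logits + gumbel) / temperature, dim=-1)

        # Toy usage: logits over a 10-word vocabulary for one latent (compressed) word.
        logits = torch.randn(1, 10, requires_grad=True)
        soft_word = gumbel_softmax_sample(logits)  # relaxed one-hot "word"
        loss = soft_word.pow(2).sum()              # stand-in for the reconstruction loss
        loss.backward()                            # gradients reach the logits directly,
        print(logits.grad is not None)             # so no reinforcement learning is needed

    PyTorch also ships a ready-made version of this relaxation as torch.nn.functional.gumbel_softmax.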

    Deep Recurrent Generative Decoder for Abstractive Text Summarization

    We propose a new framework for abstractive text summarization based on a sequence-to-sequence encoder-decoder model equipped with a deep recurrent generative decoder (DRGN). Latent structure information implied in the target summaries is learned with a recurrent latent random model to improve summarization quality. Neural variational inference is employed to address the intractable posterior inference for the recurrent latent variables. Abstractive summaries are generated from both the generative latent variables and the discriminative deterministic states. Extensive experiments on benchmark datasets in different languages show that DRGN improves over state-of-the-art methods. Comment: 10 pages, EMNLP 2017.
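
    In practice, "neural variational inference" usually means an inference network predicts a Gaussian over each latent variable, the reparameterization trick keeps sampling differentiable, and a KL term enters the training objective. Here is a minimal sketch of one such latent step; the module name, sizes, and the standard-normal prior are illustrative assumptions.

        import torch
        import torch.nn as nn

        class RecurrentLatentStep(nn.Module):
            """One decoding step with a latent variable z_t: an inference
            network maps the decoder state to the mean and log-variance
            of z_t, sampled via the reparameterization trick."""
            def __init__(self, hidden_size=256, latent_size=64):
                super().__init__()
                self.mu = nn.Linear(hidden_size, latent_size)
                self.logvar = nn.Linear(hidden_size, latent_size)

            def forward(self, h_t):
                mu, logvar = self.mu(h_t), self.logvar(h_t)
                z_t = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
                # KL divergence to a standard-normal prior, part of the ELBO.
                kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
                return z_t, kl

        step = RecurrentLatentStep()
        z_t, kl = step(torch.randn(4, 256))  # a batch of 4 decoder states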

    Abstractive Multi-Document Summarization via Phrase Selection and Merging

    We propose an abstraction-based multi-document summarization framework that can construct new sentences by exploring syntactic units more fine-grained than sentences, namely noun and verb phrases. Unlike existing abstraction-based approaches, our method first constructs a pool of concepts and facts represented by phrases from the input documents. New sentences are then generated by selecting and merging informative phrases so as to maximize phrase salience while satisfying sentence-construction constraints. We employ integer linear optimization to conduct phrase selection and merging simultaneously, achieving a globally optimal solution for a summary. Experimental results on the benchmark dataset TAC 2011 show that our framework outperforms state-of-the-art models under the automated pyramid evaluation metric and achieves reasonably good results in manual linguistic quality evaluation. Comment: 11 pages, 1 figure, accepted as a full paper at ACL 2015.
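
    The phrase selection step can be pictured as a small integer linear program. The sketch below, using the PuLP library, maximizes total salience under a single word budget; the phrase pool, scores, and budget are toy stand-ins for the paper's full salience model and sentence-construction constraints.

        from pulp import LpBinary, LpMaximize, LpProblem, LpVariable, lpSum

        # Toy phrase pool: (phrase, salience score, length in words).
        phrases = [("the new policy", 3.0, 3), ("cut emissions", 2.5, 2),
                   ("by 2030", 1.0, 2), ("officials said", 0.2, 2)]
        MAX_WORDS = 6  # illustrative length budget

        prob = LpProblem("phrase_selection", LpMaximize)
        x = [LpVariable(f"x{i}", cat=LpBinary) for i in range(len(phrases))]

        # Objective: maximize the total salience of the selected phrases.
        prob += lpSum(xi * p[1] for xi, p in zip(x, phrases))
        # Constraint: the merged output must fit the word budget.
        prob += lpSum(xi * p[2] for xi, p in zip(x, phrases)) <= MAX_WORDS

        prob.solve()
        print([p[0] for xi, p in zip(x, phrases) if xi.value() == 1])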

    A Multi-task Learning Approach for Improving Product Title Compression with User Search Log Data

    Obtaining effective compression of lengthy product titles is a challenging and practical research problem in E-commerce. It matters increasingly as more users browse mobile E-commerce apps while merchants make the original product titles redundant and lengthy for search engine optimization. Traditional text summarization approaches often incur high preprocessing costs and do not capture conversion rate, an important issue in E-commerce. This paper proposes a novel multi-task learning approach for improving product title compression with user search log data. In particular, a pointer-network-based sequence-to-sequence model with an attention mechanism is used for title compression as an extractive method, and an attentive encoder-decoder model is used to generate user search queries. The encoding parameters (i.e., the semantic embedding of original titles) are shared between the two tasks, and the attention distributions are jointly optimized. An extensive set of experiments with both human-annotated data and online deployment demonstrates the advantage of the proposed approach for both compression quality and online business value. Comment: 8 pages, accepted at AAAI 2018.
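
    The key design point is that one encoder serves both tasks, so the query-generation loss regularizes the title representation used for compression. A rough sketch of that parameter sharing follows; the pointer and attention machinery is replaced with simple stand-in heads, and all names and sizes are illustrative assumptions.

        import torch
        import torch.nn as nn

        class SharedEncoderMultiTask(nn.Module):
            """One title encoder feeding (a) a word-keeping head that stands
            in for the extractive pointer network and (b) a decoder that
            generates user search queries; both losses update the encoder."""
            def __init__(self, vocab=10000, dim=128):
                super().__init__()
                self.embed = nn.Embedding(vocab, dim)              # shared semantic embedding
                self.encoder = nn.GRU(dim, dim, batch_first=True)  # shared between tasks
                self.keep_head = nn.Linear(dim, 1)                 # keep/drop score per title word
                self.query_decoder = nn.GRU(dim, dim, batch_first=True)
                self.query_out = nn.Linear(dim, vocab)

            def forward(self, title_ids, query_inputs):
                enc, h = self.encoder(self.embed(title_ids))
                keep_logits = self.keep_head(enc).squeeze(-1)      # task 1: title compression
                dec, _ = self.query_decoder(self.embed(query_inputs), h)
                return keep_logits, self.query_out(dec)            # task 2: query generation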

    Adapting End-to-End Speech Recognition for Readable Subtitles

    Automatic speech recognition (ASR) systems are primarily evaluated on transcription accuracy. However, in some use cases such as subtitling, verbatim transcription would reduce output readability given limited screen size and reading time. This work therefore focuses on ASR with output compression, a task that is challenging for supervised approaches due to the scarcity of training data. We first investigate a cascaded system, in which an unsupervised compression model post-edits the transcribed speech. We then compare several methods of end-to-end speech recognition under output length constraints. The experiments show that, with limited data far smaller than what is needed to train a model from scratch, we can adapt a Transformer-based ASR model to incorporate both transcription and compression capabilities. Furthermore, the best performance in terms of WER and ROUGE scores is achieved by explicitly modeling the length constraints within the end-to-end ASR system. Comment: IWSLT 2020.
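
    One generic way to impose an output length constraint at decoding time is to suppress the end-of-sequence token until a minimum length is reached and force it at a maximum length. The sketch below illustrates that idea only, not the paper's specific method; step_fn is a hypothetical interface returning next-token logits for a given prefix.

        import torch

        def length_constrained_greedy(step_fn, bos_id, eos_id, max_len, min_len=1):
            """Greedy decoding under a [min_len, max_len] output budget."""
            prefix = [bos_id]
            for t in range(max_len):
                logits = step_fn(prefix)
                if t + 1 < min_len:
                    logits[eos_id] = float("-inf")  # too short to stop yet
                token = eos_id if t + 1 == max_len else int(torch.argmax(logits))
                prefix.append(token)
                if token == eos_id:
                    break
            return prefix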

    Controlling Output Length in Neural Encoder-Decoders

    Neural encoder-decoder models have shown great success in many sequence generation tasks. However, previous work has not investigated situations in which we would like to control the length of encoder-decoder outputs. This capability is crucial for applications such as text summarization, where we must generate concise summaries of a desired length. In this paper, we propose methods for controlling the output sequence length of neural encoder-decoder models: two decoding-based methods and two learning-based methods. Results show that our learning-based methods can control length without degrading summary quality in a summarization task. Comment: 11 pages. To appear in EMNLP 2016.
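
    Among the learning-based methods, one family (often called LenEmb in this line of work) feeds the remaining length budget into the decoder at every step, so the model learns to plan an ending. A toy sketch of such a decoder cell follows; all names and sizes are illustrative assumptions.

        import torch
        import torch.nn as nn

        class LengthAwareDecoderCell(nn.Module):
            """GRU decoder step conditioned on the remaining output length."""
            def __init__(self, vocab=10000, dim=128, max_len=100):
                super().__init__()
                self.word_embed = nn.Embedding(vocab, dim)
                self.len_embed = nn.Embedding(max_len + 1, dim)  # remaining-length embedding
                self.cell = nn.GRUCell(2 * dim, dim)
                self.out = nn.Linear(dim, vocab)

            def forward(self, word_id, remaining, h):
                x = torch.cat([self.word_embed(word_id), self.len_embed(remaining)], dim=-1)
                h = self.cell(x, h)  # the state now tracks the length budget
                return self.out(h), h

        cell = LengthAwareDecoderCell()
        h = torch.zeros(2, 128)
        logits, h = cell(torch.tensor([5, 7]), torch.tensor([10, 3]), h)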