10,913 research outputs found

A Novel Method of Sentence Ordering Based on Support Vector Machine

Author: He Yanxiang, Peng Gongfu, Tian Ye, Tian Yingsheng, Wen Weidong
Publication venue: City University of Hong Kong
Publication date: 01/01/2009

PACLIC 23 / City University of Hong Kong / 3-5 December 200

Waseda University Repository

Order-Preserving Abstractive Summarization for Spoken Content Based on Connectionist Temporal Classification

Author: Chen Yun-Nung, Lee Hung-Yi, Lee Lin-shan, Lu Bo-Ru, Shyu Frank
Publication date: 15/11/2017

Connectionist temporal classification (CTC) is a powerful approach for sequence-to-sequence learning and is widely used in speech recognition. A central idea of CTC is adding a "blank" label during training. With this mechanism, CTC eliminates the need for segment alignment, and hence has been applied to various sequence-to-sequence learning problems. In this work, we applied CTC to abstractive summarization for spoken content. The "blank" in this case implies that the corresponding input data are less important or noisy and can thus be ignored. This approach was shown to outperform the existing methods in terms of ROUGE scores over the Chinese Gigaword and MATBN corpora. This approach also has the nice property that the ordering of words or characters in the input documents can be better preserved in the generated summaries. Comment: Accepted by Interspeech 201
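As a rough illustration of the "blank" mechanism described in this abstract, the standard CTC collapsing rule can be sketched as follows: merge consecutive repeated labels, then drop blanks. The blank symbol `_` and the function name `ctc_collapse` are assumptions made for this example and do not come from the paper.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol chosen for this sketch


def ctc_collapse(path, blank=BLANK):
    """Apply the standard CTC collapsing rule to a per-step label path.

    Consecutive repeats are merged first, then blank labels are removed.
    The surviving labels keep the order in which they appeared in the input.
    """
    return [label for label, _ in groupby(path) if label != blank]


# One label per input step; a blank between the two "a" runs keeps both.
print(ctc_collapse(list("__aa_ab__b")))
```

Because labels are only merged or dropped, never reordered, the output follows the input order, which is consistent with the order-preserving property the abstract mentions.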

arXiv.org e-Print Archive

Top-Rank Enhanced Listwise Optimization for Statistical Machine Translation

Author: Chen Huadong, Chen Jiajun, Chiang David, Dai Xinyu, Huang Shujian
Publication date: 01/01/2017

Pairwise ranking methods are the basis of many widely used discriminative training approaches for structure prediction problems in natural language processing (NLP). Decomposing the problem of ranking hypotheses into pairwise comparisons enables simple and efficient solutions. However, neglecting the global ordering of the hypothesis list may hinder learning. We propose a listwise learning framework for structure prediction problems such as machine translation. Our framework directly models the ordering of the entire translation list to learn parameters that may better fit the given listwise samples. Furthermore, we propose top-rank enhanced loss functions, which are more sensitive to ranking errors at higher positions. Experiments on a large-scale Chinese-English translation task show that both our listwise learning framework and top-rank enhanced listwise losses lead to significant improvements in translation quality. Comment: Accepted to CONLL 201
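To make the contrast with pairwise ranking concrete, here is a minimal sketch of one common listwise objective, the Plackett-Luce negative log-likelihood of a reference ordering. It is not claimed to be the loss used in this paper; the optional `position_weights` argument is a hypothetical stand-in for the idea of penalizing errors near the top of the list more heavily.

```python
import math


def plackett_luce_nll(scores_in_gold_order, position_weights=None):
    """Negative log-likelihood of the gold ordering under Plackett-Luce.

    scores_in_gold_order: model scores, listed from best to worst hypothesis
    according to the reference ranking.
    position_weights: optional per-position weights (an assumption of this
    sketch); larger weights at early positions emphasize top-rank errors.
    """
    n = len(scores_in_gold_order)
    if position_weights is None:
        position_weights = [1.0] * n
    loss = 0.0
    for i in range(n):
        tail = scores_in_gold_order[i:]
        m = max(tail)  # subtract the max for a stable log-sum-exp
        log_z = m + math.log(sum(math.exp(s - m) for s in tail))
        loss += position_weights[i] * (log_z - scores_in_gold_order[i])
    return loss


well_ordered = plackett_luce_nll([2.0, 1.0, 0.0])
badly_ordered = plackett_luce_nll([0.0, 1.0, 2.0])
print(well_ordered < badly_ordered)
```

The loss depends on the whole list at once: each position is scored against every hypothesis ranked below it, rather than against one competitor at a time.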

arXiv.org e-Print Archive

Multilingual Models for Compositional Distributed Semantics

Author: Blunsom Phil, Hermann Karl Moritz
Publication date: 01/01/2014

We present a novel technique for learning semantic representations, which extends the distributional hypothesis to multilingual data and joint-space embeddings. Our models leverage parallel data and learn to strongly align the embeddings of semantically equivalent sentences, while maintaining sufficient distance between those of dissimilar sentences. The models do not rely on word alignments or any syntactic information and are successfully applied to a number of diverse languages. We also extend our approach to learn semantic representations at the document level. We evaluate these models on two cross-lingual document classification tasks, outperforming the prior state of the art. Through qualitative analysis and the study of pivoting effects, we demonstrate that our representations are semantically plausible and can capture semantic relationships across languages without parallel data. Comment: Proceedings of ACL 2014 (Long papers
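The idea of pulling equivalent sentences together while keeping dissimilar ones apart can be sketched as a margin-based hinge loss over sentence embeddings. This is an illustrative sketch only: the squared Euclidean distance, the default margin of 1.0, and the function names are assumptions of the example, not details taken from the abstract.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def margin_loss(source, translation, noise, margin=1.0):
    """Hinge loss over one (source, translation, noise) triple.

    The loss is zero once the source embedding is closer to its translation
    than to the unrelated noise sentence by at least `margin`.
    """
    aligned = squared_distance(source, translation)
    unrelated = squared_distance(source, noise)
    return max(0.0, margin + aligned - unrelated)


# Toy 2-d embeddings: the noise sentence is already far enough away here.
print(margin_loss([0.0, 0.0], [0.0, 0.0], [3.0, 4.0]))
```

Note that the loss uses only sentence-level vectors from each language, so nothing in it requires word alignments.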

arXiv.org e-Print Archive
