Induction of Word and Phrase Alignments for Automatic Document Summarization
Current research in automatic single document summarization is dominated by
two effective, yet naive approaches: summarization by sentence extraction, and
headline generation via bag-of-words models. While successful in some tasks,
neither of these models is able to adequately capture the large set of
linguistic devices utilized by humans when they produce summaries. One possible
explanation for the widespread use of these models is that good techniques have
been developed to extract appropriate training data for them from existing
document/abstract and document/headline corpora. We believe that future
progress in automatic summarization will be driven both by the development of
more sophisticated, linguistically informed models and by more effective
leveraging of document/abstract corpora. In order to open the doors to
simultaneously achieving both of these goals, we have developed techniques for
automatically producing word-to-word and phrase-to-phrase alignments between
documents and their human-written abstracts. These alignments make explicit the
correspondences that exist in such document/abstract pairs, and create a
potentially rich data source from which complex summarization algorithms may
learn. This paper describes experiments we have carried out to analyze the
ability of humans to perform such alignments, and based on these analyses, we
describe experiments for creating them automatically. Our model for the
alignment task is based on an extension of the standard hidden Markov model,
and learns to create alignments in a completely unsupervised fashion. We
describe our model in detail and present experimental results that show that
our model is able to learn to reliably identify word- and phrase-level
alignments in a corpus of pairs
Align and Copy: UZH at SIGMORPHON 2017 Shared Task for Morphological Reinflection
This paper presents the submissions by the University of Zurich to the
SIGMORPHON 2017 shared task on morphological reinflection. The task is to
predict the inflected form given a lemma and a set of morpho-syntactic
features. We focus on neural network approaches that can tackle the task in a
limited-resource setting. As the transduction of the lemma into the inflected
form is dominated by copying over lemma characters, we propose two recurrent
neural network architectures with hard monotonic attention that are strong at
copying and, yet, substantially different in how they achieve this. The first
approach is an encoder-decoder model with a copy mechanism. The second approach
is a neural state-transition system over a set of explicit edit actions,
including a designated COPY action. We experiment with character alignment and
find that naive, greedy alignment consistently produces strong results for some
languages. Our best system combination is the overall winner of the SIGMORPHON
2017 Shared Task 1 without external resources. At a setting with 100 training
samples, both our approaches, as ensembles of models, outperform the next best
competitor.
Comment: To appear in Proceedings of the 15th Annual SIGMORPHON Workshop on
Computational Research in Phonetics, Phonology, and Morphology at CoNLL 201
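
To make the idea of "explicit edit actions, including a designated COPY action" concrete, here is a minimal sketch that turns a lemma into an inflected form through COPY, DELETE, and INSERT actions. It is not the UZH system: the abstract describes neural models with hard monotonic attention, whereas this sketch derives the actions from a plain longest-common-subsequence table. The function names `edit_actions` and `apply_actions`, and the exact action inventory, are assumptions made for illustration.

```python
from typing import List, Tuple

# (action name, character); the character is "" for COPY and DELETE.
Action = Tuple[str, str]


def edit_actions(lemma: str, target: str) -> List[Action]:
    """Derive COPY / DELETE / INSERT actions from an LCS table.

    Illustrative oracle only (an assumption, not the paper's aligner).
    """
    n, m = len(lemma), len(target)
    # lcs[i][j] = length of the longest common subsequence of
    # lemma[i:] and target[j:].
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if lemma[i] == target[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])

    actions: List[Action] = []
    i = j = 0
    while i < n or j < m:
        if i < n and j < m and lemma[i] == target[j]:
            # Characters agree: copy the lemma character to the output.
            actions.append(("COPY", ""))
            i += 1
            j += 1
        elif j < m and (i == n or lcs[i][j + 1] >= lcs[i + 1][j]):
            # Emit a target character that has no source in the lemma.
            actions.append(("INSERT", target[j]))
            j += 1
        else:
            # Skip a lemma character that does not appear in the output.
            actions.append(("DELETE", ""))
            i += 1
    return actions


def apply_actions(lemma: str, actions: List[Action]) -> str:
    """Replay an action sequence over the lemma, left to right."""
    out: List[str] = []
    pos = 0  # current read position in the lemma
    for name, char in actions:
        if name == "COPY":
            out.append(lemma[pos])
            pos += 1
        elif name == "DELETE":
            pos += 1
        elif name == "INSERT":
            out.append(char)
        else:
            raise ValueError(f"unknown action: {name}")
    return "".join(out)
```

For a pair such as "walk" and "walked", most of the sequence is COPY, which illustrates the abstract's observation that the transduction is dominated by copying lemma characters. In a learned system, a model would predict these actions one at a time instead of reading them off a table.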