When Are Tree Structures Necessary for Deep Learning of Representations?
Recursive neural models, which use syntactic parse trees to recursively
generate representations bottom-up, are a popular architecture. But there have
not been rigorous evaluations showing for exactly which tasks this syntax-based
method is appropriate. In this paper we benchmark recursive neural models
against sequential recurrent neural models (simple recurrent and LSTM
models), enforcing apples-to-apples comparison as much as possible. We
investigate 4 tasks: (1) sentiment classification at the sentence level and
phrase level; (2) matching questions to answer-phrases; (3) discourse parsing;
(4) semantic relation extraction (e.g., component-whole between nouns).
Our goal is to understand better when, and why, recursive models can
outperform simpler models. We find that recursive models help mainly on tasks
(like semantic relation extraction) that require associating headwords across a
long distance, particularly on very long sequences. We then introduce a method
for allowing recurrent models to achieve similar performance: breaking long
sentences into clause-like units at punctuation and processing them separately
before combining. Our results thus help understand the limitations of both
classes of models, and suggest directions for improving recurrent models.
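The clause-splitting idea in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say which punctuation marks are used, how each unit is encoded, or how the units are combined. Here the punctuation set, the toy recurrent update in `encode_unit`, and the averaging in `encode_sentence` are all hypothetical stand-ins, not the authors' method.

```python
import re


def split_into_clauses(sentence):
    """Split a sentence into clause-like units at punctuation.

    The punctuation set is an assumption; the abstract only says "at punctuation".
    """
    parts = re.split(r"[,;:.!?]+", sentence)
    return [p.strip() for p in parts if p.strip()]


def embed(token, dim=4):
    """Hypothetical deterministic toy embedding, standing in for learned vectors."""
    code = sum(ord(c) for c in token.lower())
    return [((code * (i + 1)) % 7) / 7.0 for i in range(dim)]


def encode_unit(tokens, dim=4):
    """Toy stand-in for a recurrent encoder: h_t = 0.5 * h_{t-1} + 0.5 * x_t."""
    h = [0.0] * dim
    for tok in tokens:
        x = embed(tok, dim)
        h = [0.5 * h_i + 0.5 * x_i for h_i, x_i in zip(h, x)]
    return h


def encode_sentence(sentence, dim=4):
    """Encode each clause-like unit separately, then combine by averaging.

    Averaging is an assumed combination step, chosen only for simplicity.
    """
    units = split_into_clauses(sentence)
    if not units:
        return [0.0] * dim
    vectors = [encode_unit(u.split(), dim) for u in units]
    return [sum(col) / len(vectors) for col in zip(*vectors)]


if __name__ == "__main__":
    text = "The engine, which was old, failed; we walked home."
    print(split_into_clauses(text))
    print(encode_sentence(text))
```

Because each unit is encoded from a fresh state, the recurrent update never has to carry information across more than one clause; any cross-clause interaction happens only in the combination step.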
An investigation of speaker independent phrase break models in End-to-End TTS systems
This paper presents our work on phrase break prediction in the context of
end-to-end TTS systems, motivated by the following questions: (i) Is there any
utility in incorporating an explicit phrasing model in an end-to-end TTS
system?, and (ii) How do you evaluate the effectiveness of a phrasing model in
an end-to-end TTS system? In particular, the utility and effectiveness of
phrase break prediction models are evaluated in the context of children's
story synthesis, using listener comprehension. We show by means of perceptual
listening evaluations that there is a clear preference for stories synthesized
after predicting the location of phrase breaks using a trained phrasing model,
over stories directly synthesized without predicting the location of phrase
breaks.
Comment: Submitted for review to IEEE Acces
Compositional Morphology for Word Representations and Language Modelling
This paper presents a scalable method for integrating compositional
morphological representations into a vector-based probabilistic language model.
Our approach is evaluated in the context of log-bilinear language models,
rendered suitably efficient for implementation inside a machine translation
decoder by factoring the vocabulary. We perform both intrinsic and extrinsic
evaluations, presenting results on a range of languages which demonstrate that
our model learns morphological representations that both perform well on word
similarity tasks and lead to substantial reductions in perplexity. When used
for translation into morphologically rich languages with large vocabularies,
our models obtain improvements of up to 1.2 BLEU points relative to a baseline
system using back-off n-gram models.
Comment: Proceedings of the 31st International Conference on Machine Learning
(ICML
Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation
We introduce the multiresolution recurrent neural network, which extends the
sequence-to-sequence framework to model natural language generation as two
parallel discrete stochastic processes: a sequence of high-level coarse tokens,
and a sequence of natural language tokens. There are many ways to estimate or
learn the high-level coarse tokens, but we argue that a simple extraction
procedure is sufficient to capture a wealth of high-level discourse semantics.
Such a procedure allows training the multiresolution recurrent neural network by
maximizing the exact joint log-likelihood over both sequences. In contrast to
the standard log-likelihood objective w.r.t. natural language tokens (word
perplexity), optimizing the joint log-likelihood biases the model towards
modeling high-level abstractions. We apply the proposed model to the task of
dialogue response generation in two challenging domains: the Ubuntu technical
support domain, and Twitter conversations. On Ubuntu, the model outperforms
competing approaches by a substantial margin, achieving state-of-the-art
results according to both automatic evaluation metrics and a human evaluation
study. On Twitter, the model appears to generate more relevant and on-topic
responses according to automatic evaluation metrics. Finally, our experiments
demonstrate that the proposed model is more adept at overcoming the sparsity of
natural language and is better able to capture long-term structure.
Comment: 21 pages, 2 figures, 10 table