Future word contexts in neural network language models
Recently, bidirectional recurrent network language models (bi-RNNLMs) have
been shown to outperform standard, unidirectional, recurrent neural network
language models (uni-RNNLMs) on a range of speech recognition tasks. This
indicates that future word context information beyond the word history can be
useful. However, bi-RNNLMs pose a number of challenges as they make use of the
complete previous and future word context information. This impacts both
training efficiency and their use within a lattice rescoring framework. In this
paper these issues are addressed by proposing a novel neural network structure,
succeeding word RNNLMs (su-RNNLMs). Instead of using a recurrent unit to
capture the complete future word contexts, a feedforward unit is used to model
a finite number of succeeding (future) words. This model can be trained much
more efficiently than bi-RNNLMs and can also be used for lattice rescoring.
Experimental results on a meeting transcription task (AMI) show that the proposed
model consistently outperforms uni-RNNLMs and yields only a slight degradation
compared to bi-RNNLMs in N-best rescoring. Additionally, performance
improvements can be obtained using lattice rescoring and subsequent confusion
network decoding.
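To make the structure concrete, the following is a minimal sketch of how a succeeding-word RNNLM could combine a recurrent encoding of the history with a feed-forward encoding of a fixed number of future words. The class name, layer sizes, and the choice of a GRU and PyTorch are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SuRNNLM(nn.Module):
    """Succeeding-word RNNLM sketch: a recurrent unit encodes the full
    word history while a feed-forward unit encodes a fixed window of
    k succeeding words; both are combined to predict the current word."""

    def __init__(self, vocab_size, emb_dim=128, hidden_dim=256, k_future=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.history_rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        # Feed-forward unit over the concatenated embeddings of the k future words.
        self.future_ff = nn.Sequential(
            nn.Linear(k_future * emb_dim, hidden_dim),
            nn.Tanh(),
        )
        self.out = nn.Linear(2 * hidden_dim, vocab_size)

    def forward(self, history_ids, future_ids):
        # history_ids: (batch, t)  word ids preceding the current position
        # future_ids:  (batch, k)  word ids following the current position
        _, h = self.history_rnn(self.embed(history_ids))        # h: (1, batch, hidden)
        f = self.future_ff(self.embed(future_ids).flatten(1))   # f: (batch, hidden)
        return self.out(torch.cat([h.squeeze(0), f], dim=-1))   # scores over the vocabulary
```

Because the future context has a fixed width, each training example can be assembled without waiting for the end of the sentence, which is the source of the training-efficiency advantage over fully bi-directional models described above.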
Exploiting Future Word Contexts in Neural Network Language Models for Speech Recognition
Language modelling is a crucial component in a wide range of applications, including speech recognition. Language models (LMs) are usually constructed by splitting a sentence into words and computing the probability of each word given its word history. This sentence probability calculation, based on conditional probability distributions, assumes that the approximations used in the LMs, including the word history representation and the handling of finite training data, have little impact. This motivates examining models that make use of additional information from the sentence. In this work, future word information, in addition to the history, is used to predict the probability of the current word. For recurrent neural network LMs (RNNLMs), this information can be encapsulated in a bi-directional model. However, if used directly, this form
of model is computationally expensive when training on large quantities of data, and can be problematic when used with word lattices. This paper proposes a novel neural network language model structure, the succeeding-word RNNLM (su-RNNLM), to address these issues. Instead of using a recurrent unit to capture the complete future word context, a feed-forward unit is used to model a fixed, finite number of succeeding words. This is more efficient to train than bi-directional models and can be applied to lattice rescoring. The generated lattices can be used for downstream applications, such as confusion network decoding and keyword search. Experimental results on speech recognition and keyword spotting tasks illustrate the empirical usefulness of future word information, and the flexibility of the proposed model in representing this information.
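As a rough illustration of the rescoring step mentioned above, an N-best list can be re-ranked by combining the score already attached to each hypothesis with a score from the new language model. The function below is a minimal sketch under that assumption; the names (lm_logprob, lm_weight) and default weights are placeholders, not the paper's actual setup.

```python
def rescore_nbest(hypotheses, lm_logprob, lm_weight=0.5, length_penalty=0.0):
    """Re-rank an N-best list by adding an interpolated LM score to the
    score each hypothesis already carries from first-pass decoding.

    hypotheses: list of (word_list, decoder_score) pairs
    lm_logprob: callable returning the new LM's log-probability of a word list
    """
    rescored = []
    for words, decoder_score in hypotheses:
        total = (decoder_score
                 + lm_weight * lm_logprob(words)
                 + length_penalty * len(words))
        rescored.append((total, words))
    rescored.sort(key=lambda item: item[0], reverse=True)
    return [words for _, words in rescored]
```

Lattice rescoring follows the same idea but applies the LM scores to arcs in a word lattice rather than to whole hypotheses, which is what makes the fixed-width future context of the su-RNNLM convenient.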
Word Sense Determination from Wikipedia Data Using Neural Networks
Many words have multiple meanings. For example, “plant” can mean a type of living organism or a factory. Being able to determine the sense of such words is very useful in natural language processing tasks such as speech synthesis, question answering, and machine translation. For the project described in this report, we used a modular model to classify the sense of words to be disambiguated. The model consisted of two parts: the first was a neural-network-based language model that computed continuous vector representations of words from data sets created from Wikipedia pages; the second classified the meaning of the given word without explicitly knowing what that meaning is. In this unsupervised word sense determination task, we did not need human-tagged training data or a dictionary of senses for each word. We tested the model on several naturally ambiguous words and compared our experimental results with the related work by Schütze (1998). Our model achieved accuracy similar to Schütze’s for some words.
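A minimal sketch of this kind of unsupervised sense determination is given below, assuming pre-trained word embeddings and k-means clustering in the spirit of Schütze's context-group discrimination; the function and parameter names and the use of scikit-learn are illustrative, not the report's actual pipeline.

```python
import numpy as np
from sklearn.cluster import KMeans

def induce_senses(contexts, embed, n_senses=2):
    """Represent each occurrence of an ambiguous word by the average
    embedding of its context words, then cluster the occurrences into
    sense groups (no sense labels or sense dictionary needed).

    contexts: list of lists of context words, one list per occurrence
    embed:    callable mapping a word to a fixed-size vector
    """
    occ_vectors = np.stack(
        [np.mean([embed(w) for w in ctx], axis=0) for ctx in contexts]
    )
    clusters = KMeans(n_clusters=n_senses, n_init=10).fit(occ_vectors)
    return clusters.labels_  # cluster index = induced sense for each occurrence
```

A new occurrence can then be assigned to the nearest cluster centroid, giving a sense decision without ever naming the senses explicitly.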
Effective Spoken Language Labeling with Deep Recurrent Neural Networks
Understanding spoken language is a highly complex problem, which can be
decomposed into several simpler tasks. In this paper, we focus on Spoken
Language Understanding (SLU), the module of spoken dialog systems responsible
for extracting a semantic interpretation from the user utterance. The task is
treated as a labeling problem. In the past, SLU has been performed with a wide
variety of probabilistic models. The rise of neural networks in the last
couple of years has opened interesting new research directions in this domain.
Recurrent Neural Networks (RNNs) in particular are able not only to represent
several pieces of information as embeddings but also, thanks to their recurrent
architecture, to encode as embeddings relatively long contexts. Such long
contexts are in general out of reach for models previously used for SLU. In
this paper we propose novel RNN architectures for SLU which outperform
previous ones. Starting from a published idea as a base block, we design new deep
RNNs achieving state-of-the-art results on two widely used corpora for SLU:
ATIS (Air Travel Information System), in English, and MEDIA (Hotel
information and reservation in France), in French.
Comment: 8 pages. Rejected from IJCAI 2017; good remarks overall, but slightly
off-topic as judged from the global meta-reviews. Recommendations: 8, 6, 6, 4. arXiv
admin note: text overlap with arXiv:1706.0174
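For orientation, the following is a minimal sketch of an RNN sequence labeller of the kind used for SLU slot filling on corpora such as ATIS and MEDIA. The single bidirectional LSTM layer, the dimensions, and the class name are illustrative assumptions and do not reproduce the deep architectures proposed in the paper.

```python
import torch.nn as nn

class SLUTagger(nn.Module):
    """Per-token labeller for SLU: word embeddings, a recurrent encoder
    over the utterance, and a linear layer producing a score for each
    concept/slot label at every word position."""

    def __init__(self, vocab_size, n_labels, emb_dim=100, hidden_dim=200):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden_dim,
                               batch_first=True, bidirectional=True)
        self.classify = nn.Linear(2 * hidden_dim, n_labels)

    def forward(self, word_ids):
        # word_ids: (batch, seq_len) -> label scores (batch, seq_len, n_labels)
        states, _ = self.encoder(self.embed(word_ids))
        return self.classify(states)
```

The recurrent encoder is what lets such models carry relatively long contexts into each labelling decision, which, as the abstract notes, was out of reach for the probabilistic models previously used for SLU.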