640 research outputs found
Improving large vocabulary continuous speech recognition by combining GMM-based and reservoir-based acoustic modeling
In earlier work we have shown that good phoneme recognition is possible with a so-called reservoir, a special type of recurrent neural network. In this paper, different architectures based on Reservoir Computing (RC) for large vocabulary continuous speech recognition are investigated. Besides experiments with HMM hybrids, it is shown that a RC-HMM tandem can achieve the same recognition accuracy as a classical HMM, which is a promising result for such a fairly new paradigm. It is also demonstrated that a state-level combination of the scores of the tandem and the baseline HMM leads to a significant improvement over the baseline. A word error rate reduction of the order of 20\% relative is possible
Knowledge Graph-Augmented Language Models for Knowledge-Grounded Dialogue Generation
Language models have achieved impressive performances on dialogue generation
tasks. However, when generating responses for a conversation that requires
factual knowledge, they are far from perfect, due to an absence of mechanisms
to retrieve, encode, and reflect the knowledge in the generated responses. Some
knowledge-grounded dialogue generation methods tackle this problem by
leveraging facts from Knowledge Graphs (KGs); however, they do not guarantee
that the model utilizes a relevant piece of knowledge from the KG. To overcome
this limitation, we propose SUbgraph Retrieval-augmented GEneration (SURGE), a
framework for generating context-relevant and knowledge-grounded dialogues with
the KG. Specifically, our SURGE framework first retrieves the relevant subgraph
from the KG, and then enforces consistency across facts by perturbing their
word embeddings conditioned by the retrieved subgraph. Then, we utilize
contrastive learning to ensure that the generated texts have high similarity to
the retrieved subgraphs. We validate our SURGE framework on OpendialKG and
KOMODIS datasets, showing that it generates high-quality dialogues that
faithfully reflect the knowledge from KG.Comment: Preprint. Under revie
Bounding the search space for global optimization of neural networks learning error: an interval analysis approach
Training a multilayer perceptron (MLP) with algorithms employing global search strategies has been an important research direction in the field of neural networks. Despite a number of significant results, an important matter concerning the bounds of the search region---typically defined as a box---where a global optimization method has to search for a potential global minimizer seems to be unresolved. The approach presented in this paper builds on interval analysis and attempts to define guaranteed bounds in the search space prior to applying a global search algorithm for training an MLP. These bounds depend on the machine precision and the term guaranteed denotes that the region defined surely encloses weight sets that are global minimizers of the neural network's error function. Although the solution set to the bounding problem for an MLP is in general non-convex, the paper presents the theoretical results that help deriving a box which is a convex set. This box is an outer approximation of the algebraic solutions to the interval equations resulting from the function implemented by the network nodes. An experimental study using well known benchmarks is presented in accordance with the theoretical results
Recommended from our members
A language space representation for speech recognition
© 2015 IEEE. The number of languages for which speech recognition systems have become available is growing each year. This paper proposes to view languages as points in some rich space, termed language space, where bases are eigen-languages and a particular selection of the projection determines points. Such an approach could not only reduce development costs for each new language but also provide automatic means for language analysis. For the initial proof of the concept, this paper adopts cluster adaptive training (CAT) known for inducing similar spaces for speaker adaptation needs. The CAT approach used in this paper builds on the previous work for language adaptation in speech synthesis and extends it to Gaussian mixture modelling more appropriate for speech recognition. Experiments conducted on IARPA Babel program languages show that such language space representations can outperform language independent models and discover closely related languages in an automatic way
- …