292 research outputs found
English Broadcast News Speech Recognition by Humans and Machines
With recent advances in deep learning, considerable attention has been given
to achieving automatic speech recognition performance close to human
performance on tasks like conversational telephone speech (CTS) recognition. In
this paper we evaluate the usefulness of these proposed techniques on broadcast
news (BN), a similar challenging task. We also perform a set of recognition
measurements to understand how close the achieved automatic speech recognition
results are to human performance on this task. On two publicly available BN
test sets, DEV04F and RT04, our speech recognition system using LSTM and
residual network based acoustic models with a combination of n-gram and neural
network language models performs at 6.5% and 5.9% word error rate. By achieving
new performance milestones on these test sets, our experiments show that
techniques developed on other related tasks, like CTS, can be transferred to
achieve similar performance. In contrast, the best measured human recognition
performance on these test sets is much lower, at 3.6% and 2.8% respectively,
indicating that there is still room for new techniques and improvements in this
space, to reach human performance levels.Comment: \copyright 2019 IEEE. Personal use of this material is permitted.
Permission from IEEE must be obtained for all other uses, in any current or
future media, including reprinting/republishing this material for advertising
or promotional purposes, creating new collective works, for resale or
redistribution to servers or lists, or reuse of any copyrighted component of
this work in other work
Getting Past the Language Gap: Innovations in Machine Translation
In this chapter, we will be reviewing state of the art machine translation systems, and will discuss innovative methods for machine translation, highlighting the most promising techniques and applications. Machine translation (MT) has benefited from a revitalization in the last 10 years or so, after a period of relatively slow activity. In 2005 the field received a jumpstart when a powerful complete experimental package for building MT systems from scratch became freely available as a result of the unified efforts of the MOSES international consortium. Around the same time, hierarchical methods had been introduced by Chinese researchers, which allowed the introduction and use of syntactic information in translation modeling. Furthermore, the advances in the related field of computational linguistics, making off-the-shelf taggers and parsers readily available, helped give MT an additional boost. Yet there is still more progress to be made. For example, MT will be enhanced greatly when both syntax and semantics are on board: this still presents a major challenge though many advanced research groups are currently pursuing ways to meet this challenge head-on. The next generation of MT will consist of a collection of hybrid systems. It also augurs well for the mobile environment, as we look forward to more advanced and improved technologies that enable the working of Speech-To-Speech machine translation on hand-held devices, i.e. speech recognition and speech synthesis. We review all of these developments and point out in the final section some of the most promising research avenues for the future of MT
De l’arabe standard vers l’arabe dialectal :projection de corpus et ressourceslinguistiques en vue du traitementautomatique de l’oral dans les médiastunisiens
International audienceRÉSUMÉ. Dans ce travail, nous nous intéressons aux problèmes liés au traitement automatique de l'oral parlé dans les médias tunisiens. Cet oral se caractérise par l'emploi de l'alternance codique entre l'arabe standard moderne (MSA) et le dialecte tunisien (DT). L'objectif consiste à construire des ressources utiles pour apprendre des modèles de langage dédiés à des applications de reconnaissance automatique de la parole. Comme il s'agit d'une variante du MSA, nous décrivons dans cet article une démarche d'adaptation des ressources MSA vers le DT. Une première évaluation en termes de couverture lexicale et de perplexité est présentée. ABSTRACT. In this work, we focus on the problems of the automatic treatment of oral spoken in the Tunisian media. This oral is marked by the use of code-switching between the Modern Standard Arabic (MSA) and the Tunisian dialect (TD). Our goal is to build useful resources to learn language models that can be used in automatic speech recognition applications. As it is a variant of MSA, we describe in this paper an adjustment process of the MSA resources to the TD. A first evaluation in terms of lexical coverage and perplexity is presented
- …