12,801 research outputs found

Low-resource speech recognition and keyword-spotting

Author: Gales MJF
Knill KM
Ragni A
Publication venue: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publication date: 01/01/2017

© Springer International Publishing AG 2017. The IARPA Babel program ran from March 2012 to November 2016. The aim of the program was to develop agile and robust speech technology that can be rapidly applied to any human language in order to provide effective search capability on large quantities of real-world data. This paper will describe some of the developments in speech recognition and keyword spotting during the lifetime of the project. Two technical areas will be briefly discussed, with a focus on techniques developed at Cambridge University: the application of deep learning to low-resource speech recognition, and efficient approaches to keyword spotting. Finally, a brief analysis of the Babel speech language characteristics and language performance will be presented.

Crossref

Apollo (Cambridge)

White Rose Research Online

Distant speech recognition for home automation: Preliminary experimental results in a smart home

Author: Lecouteux Benjamin
Portet François
Vacher Michel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 18/05/2011

This paper presents a study that is part of the Sweet-Home project, which aims at developing a new home automation system based on voice command. The study focused on two tasks: distant speech recognition and sentence spotting (e.g., recognition of domotic orders). For the first task, different combinations of ASR systems, language models and acoustic models were tested. Fusion of ASR outputs by consensus and with a triggered language model (using a priori knowledge) was investigated. For the sentence spotting task, an algorithm based on distance evaluation between the current ASR hypotheses and the predefined set of keyword patterns was introduced in order to retrieve the correct sentences in spite of the ASR errors. The techniques were assessed on real daily living data collected in a 4-room smart home that was fully equipped with standard tactile commands and with 7 wireless microphones set in the ceiling. Thanks to Driven Decoding Algorithm techniques, a classical ASR system reached 7.9% WER, against 35% WER in the standard configuration and 15% with MLLR adaptation only. The best keyword pattern classification result obtained in distant speech conditions was 7.5% CER.

Crossref

Hal - Université Grenoble Alpes

Distant Speech Recognition for Home Automation: Preliminary Experimental Results in a Smart Home

Author: Lecouteux Benjamin
Portet François
Vacher Michel
Publication venue: HAL CCSD
Publication date: 18/05/2011

Hal - Université Grenoble Alpes

On Distant Speech Recognition for Home Automation

Author: A Baba
B Lecouteux
B Vlasenko
D Istrate
F Mäyrä
F Portet
G Filho
J Barker
J Fozard
JM Valin
K McCoy
K Reidel
L Baeckman
L Lines
M Chan
M Hamill
M Vacher
M Wölfel
MK Wolters
N Takeda
P Chahuara
P Mueller
P Nocera
R López-Cózar
RC Vipperla
S Bouakaz
S Katz
T Koskela
T Pellegrini
W Edwards
W Ryan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 02/02/2015

The official version of this draft is available at Springer via http://dx.doi.org/10.1007/978-3-319-16226-3_7. In the framework of Ambient Assisted Living, home automation may be a solution for helping elderly people living alone at home. This study is part of the Sweet-Home project, which aims at developing a new home automation system based on voice command to improve support and well-being of people in loss of autonomy. The goal of the study is vocal order recognition, with a focus on two aspects: distant speech recognition and sentence spotting. Several ASR techniques were evaluated on a realistic corpus acquired in a 4-room flat equipped with microphones set in the ceiling. This distant speech French corpus was recorded with 21 speakers who acted out scenarios of activities of daily living. Techniques acting at the decoding stage, such as our novel approach called Driven Decoding Algorithm (DDA), gave better speech recognition results than the baseline and other approaches. This solution, which uses the two best SNR channels and a priori knowledge (voice commands and distress sentences), has demonstrated an increase in recognition rate without introducing false alarms.

Crossref

Hal - Université Grenoble Alpes

Phonetic Searching

Publication date: 12/11/2006

An improved method and apparatus is disclosed which uses probabilistic techniques to map an input search string to a prestored audio file, and to recognize certain portions of a search string phonetically. An improved interface is disclosed which permits users to input search strings, linguistics, phonetics, or a combination of both, and also allows logic functions to be specified by indicating how far apart specific phonemes are in time. Georgia Tech Research Corporatio

Scholarly Materials And Research @ Georgia Tech

Integrated speech and morphological processing in a connectionist continuous speech understanding for Korean

Author: Lee Geunbae
Lee Jong-Hyeok
Publication date: 01/01/1996

A new tightly coupled speech and natural language integration model is presented for a TDNN-based continuous, possibly large-vocabulary speech recognition system for Korean. Unlike popular n-best techniques developed mainly for integrating HMM-based speech recognition and natural language processing at the word level, which is inadequate for morphologically complex agglutinative languages, our model constructs a spoken language system based on morpheme-level speech and language integration. With this integration scheme, the spoken Korean processing engine (SKOPE) is designed and implemented using a TDNN-based diphone recognition module integrated with Viterbi-based lexical decoding and symbolic phonological/morphological co-analysis. Our experimental results show that speaker-dependent continuous eojeol (Korean word) recognition and integrated morphological analysis can be achieved with a success rate of over 80.6% directly from speech inputs for middle-level vocabularies. Comment: latex source with a4 style, 15 pages, to be published in computer processing of oriental language journa

arXiv.org e-Print Archive

Pohang University of Science and Technology (포항공과대학교)