Search CORE

15,426 research outputs found

Zero-shot keyword spotting for visual speech recognition in-the-wild

Author: Fei Tao
JS Chung
K Audhkhasi
K He
M Cooke
S Fernández
S Hochreiter
S Watanabe
Z Akata
Publication date: 25/07/2018

Visual keyword spotting (KWS) is the problem of estimating whether a text query occurs in a given recording using only video information. This paper focuses on visual KWS for words unseen during training, a real-world, practical setting which has so far received no attention from the community. To this end, we devise an end-to-end architecture comprising (a) a state-of-the-art visual feature extractor based on spatiotemporal Residual Networks, (b) a grapheme-to-phoneme model based on sequence-to-sequence neural networks, and (c) a stack of recurrent neural networks which learn how to correlate visual features with the keyword representation. Unlike prior works on KWS, which try to learn word representations merely from sequences of graphemes (i.e. letters), we propose the use of a grapheme-to-phoneme encoder-decoder model which learns how to map words to their pronunciation. We demonstrate that our system obtains very promising visual-only KWS results on the challenging LRS2 database, for keywords unseen during training. We also show that our system outperforms a baseline which addresses KWS via automatic speech recognition (ASR), while it drastically improves over other recently proposed ASR-free KWS methods. Comment: Accepted at ECCV-201

arXiv.org e-Print Archive

Crossref

The effect of automatic speech recognition EyeSpeak software on Iraqi students’ English pronunciation: a pilot study

Author: Shaari Ahmad Jelani
Sidgi Lina Fathi Sidig
Publication venue: 'Australian International Academic Centre'
Publication date: 01/01/2017

Technology such as computer-assisted language learning (CALL) is used in teaching and learning in foreign language classrooms, where it is most needed. One promising emerging technology that supports language learning is automatic speech recognition (ASR). Integrating such technology, especially in the instruction of pronunciation in the classroom, is important in helping students to achieve correct pronunciation. In Iraq, English is a foreign language, and it is not surprising that learners commit many pronunciation mistakes. One factor contributing to these mistakes is the difference between the Arabic and English phonetic systems. Thus, the sound transformation from the mother tongue (Arabic) to the target language (English) is one barrier for Arab learners. The purpose of this study is to investigate the effectiveness of using the ASR software EyeSpeak in improving the pronunciation of Iraqi learners of English. An experimental research project with a pretest-posttest design is conducted over a one-month period in the Department of English at Al-Turath University College in Baghdad, Iraq. The ten participants are randomly selected first-year college students enrolled in a pronunciation class that uses traditional teaching methods and ASR EyeSpeak software. The findings show that using EyeSpeak software leads to a significant improvement in the students’ English pronunciation, evident from the test scores they achieve after using EyeSpeak software.

UUM Repository

Australian International Academic Centre: AIAC Journals

Directory of Open Access Journals

Effects of two teaching methods of connected speech in a Polish EFL classroom

Author: Bell
Boersma
Bussmann
Bybee
Carr
Cook
Cruttenden
Ellis
Ernestus
Gonet
Gómez Lacabex
Huber
Jaworski
Labov
Lindblom
Lombardo
Lujan
Lyster
Małgorzata Kul
Morley
Munro
Newman
Pica
Roach
Rojczyk
Saito
Sawicka
Schwartz
Shockey
Silva
Spada
Thomson
Trask
Waniek
Wells
Wierzchowska
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2016

The results demonstrate that, in general, NF proved more effective than NNF. With regard to individual processes of connected speech, NF was more effective in production, whereas no such effect was found for perception.

Crossref

Biblioteka Nauki - repozytorium artykułów

Repozytorium Uniwersytetu Łódzkiego (University of Lodz Repository)

Fast and Accurate OOV Decoder on High-Level Features

Author: Khokhlov Yuri
Medennikov Ivan
Romanenko Alexei
Tomashenko Natalia
Publication date: 19/07/2017

This work proposes a novel approach to the out-of-vocabulary (OOV) keyword search (KWS) task. The proposed approach is based on using high-level features from an automatic speech recognition (ASR) system, so-called phoneme posterior based (PPB) features, for decoding. These features are obtained by calculating time-dependent phoneme posterior probabilities from word lattices, followed by their smoothing. For the PPB features we developed a novel OOV decoder that is very fast, simple, and efficient. Experimental results are presented on the Georgian language from the IARPA Babel Program, which was the test language in the OpenKWS 2016 evaluation campaign. The results show that in terms of the maximum term weighted value (MTWV) metric and computational speed, for single ASR systems, the proposed approach significantly outperforms the state-of-the-art approach based on using in-vocabulary proxies for OOV keywords in the indexed database. The comparison of the two OOV KWS approaches on the fusion results of the nine different ASR systems demonstrates that the proposed OOV decoder outperforms the proxy-based approach in terms of the MTWV metric at comparable processing speed. Other important advantages of the OOV decoder include extremely low memory consumption and simplicity of its implementation and parameter optimization. Comment: Interspeech 2017, August 2017, Stockholm, Sweden. 201

arXiv.org e-Print Archive

Crossref

The Effectiveness of Integrated Teaching Methods in English as a Foreign Language Classrooms

Author: Orii-Akita Mamiko
Publication venue: 早稲田大学教育・総合科学学術院教育会
Publication date: 25/03/2016

Waseda University Repository

Institutional Repositories DataBase (IRDB)