7 research outputs found

Semi-supervised discriminative language modeling for Turkish ASR

Author: Bikel Daniel M.
Callison-Burch Chris
Cao Yuan
Dikici Erinç
Glenn Nathan
Hall Keith B.
Hasler Eva
Karakos Damianos
Khudanpur Sanjeev
Koehn Philipp
Lehr Maider
Lopez Adam
Post Matt
Prud'hommeaux Emily Tucker
Riley Darcey
Roark Brian
Sagae Kenji
Sak Hasim
Saraclar Murat
Shafran Izhak
Xu Puyang
Çelebi Arda
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/03/2012

Crossref

Edinburgh Research Explorer

Multilingual Convolutional, Long Short-Term Memory, Deep Neural Networks for Low Resource Speech Recognition

Author: Dimitri Palaz
Dong Wang
Hasim Sak
Heigold
Lu
Zhang
Publication venue: Elsevier BV
Publication date: 01/01/2017

Crossref

Contrastive Siamese Network for Semi-supervised Speech Recognition

Author: Khorram Soheil
Kim Jaeyoung
Lu Han
Sak Hasim
Tripathi Anshuman
Zhang Qian
Publication date: 27/05/2022

This paper introduces the contrastive siamese (c-siam) network, an architecture for leveraging unlabeled acoustic data in speech recognition. c-siam is the first network that extracts high-level linguistic information from speech by matching the outputs of two identical transformer encoders. It contains an augmented branch and a target branch, which are trained by: (1) masking inputs and matching outputs with a contrastive loss, (2) incorporating a stop-gradient operation on the target branch, (3) using an extra learnable transformation on the augmented branch, and (4) introducing new temporal augmentation functions to prevent the shortcut learning problem. We use the Libri-light 60k unsupervised data and the LibriSpeech 100 h/960 h supervised data to compare c-siam with other best-performing systems. Our experiments show that c-siam provides a 20% relative word error rate improvement over wav2vec baselines. A c-siam network with 450M parameters achieves competitive results compared to state-of-the-art networks with 600M parameters.
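
The abstract does not spell out the loss, so the following is only a minimal sketch of a generic InfoNCE-style contrastive loss between two branch outputs, not the authors' formulation. The names `cosine` and `contrastive_loss`, the use of cosine similarity, and the `temperature` value are all assumptions made for illustration. Only the forward computation is shown: in an autodiff framework the target branch would be wrapped in a stop-gradient, and the masking, learnable transformation, and temporal augmentation steps are not modeled here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(augmented, target, temperature=0.1):
    """Hypothetical InfoNCE-style loss averaged over frames.

    augmented[t] and target[t] are the two branch outputs for frame t.
    The target output at frame t is the positive for augmented[t];
    the target outputs at all other frames act as negatives.
    The target values are treated as constants (the place where a
    stop-gradient would be applied during training).
    """
    total = 0.0
    for t, query in enumerate(augmented):
        logits = [cosine(query, key) / temperature for key in target]
        # Numerically stable log-sum-exp over all candidate frames.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability of picking the matching frame.
        total += log_denominator - logits[t]
    return total / len(augmented)
```

Under these assumptions, the loss is small when each augmented-branch frame is most similar to the target-branch frame at the same position, and grows when frames match the wrong positions.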

arXiv.org e-Print Archive