10 research outputs found

Improving Voice Trigger Detection with Metric Learning

Author: Adya Saurabh
Cho Minsik
Dhir Chandra
Gupta Anmol
Higuchi Takuya
Lakshminarasimhan Varun
Marchi Erik
Nayak Prateeth
Ranjan Shivesh
Shum Stephen
Sigtia Siddharth
Tewfik Ahmed
Publication date: 05/04/2022

Voice trigger detection is an important task that enables activating a voice assistant when a target user speaks a keyword phrase. A detector is typically trained on speech data independent of speaker information and used for the voice trigger detection task. However, such a speaker-independent voice trigger detector typically suffers from performance degradation on speech from underrepresented groups, such as accented speakers. In this work, we propose a novel voice trigger detector that can use a small number of utterances from a target speaker to improve detection accuracy. Our proposed model employs an encoder-decoder architecture. While the encoder performs speaker-independent voice trigger detection, similar to the conventional detector, the decoder predicts a personalized embedding for each utterance. A personalized voice trigger score is then obtained as a similarity score between the embeddings of enrollment utterances and a test utterance. The personalized embedding allows adapting to the target speaker's speech when computing the voice trigger score, hence improving voice trigger detection accuracy. Experimental results show that the proposed approach achieves a 38% relative reduction in false rejection rate (FRR) compared to a baseline speaker-independent voice trigger model. Comment: Submitted to InterSpeech 202
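
The scoring step described in this abstract, a similarity between enrollment embeddings and a test embedding, can be illustrated with a minimal sketch. The abstract does not say which similarity measure or pooling the authors use, so cosine similarity and averaging over enrollment utterances are assumptions made here for illustration, and the function names `cosine_similarity` and `personalized_score` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def personalized_score(enrollment_embeddings, test_embedding):
    """Average similarity of a test embedding to each enrollment embedding.

    Assumption: mean cosine similarity stands in for the paper's
    unspecified similarity score.
    """
    if not enrollment_embeddings:
        raise ValueError("at least one enrollment embedding is required")
    scores = [cosine_similarity(e, test_embedding) for e in enrollment_embeddings]
    return sum(scores) / len(scores)


# Toy vectors, not real model outputs.
enrollment = [[1.0, 0.0], [0.0, 1.0]]
test = [1.0, 0.0]
score = personalized_score(enrollment, test)
```

A higher score would indicate that the test utterance is closer to the enrolled speaker's utterances; how that score is thresholded or combined with the encoder output is not covered by this sketch.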

arXiv.org e-Print Archive

I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences

The I4U consortium was established to facilitate a joint entry to the NIST speaker recognition evaluations (SRE). The latest such joint submission was to SRE 2018, in which the I4U submission was among the best-performing systems. SRE'18 also marks the 10-year anniversary of the I4U consortium's participation in the NIST SRE series of evaluations. The primary objective of the current paper is to summarize the results and lessons learned based on the twelve sub-systems and their fusion submitted to SRE'18. It is also our intention to present a shared view on the advancements, progress, and major paradigm shifts that we have witnessed as an SRE participant in the past decade from SRE'08 to SRE'18. In this regard, we have seen, among others, a paradigm shift from supervector representation to deep speaker embedding, and a switch of research challenge from channel compensation to domain adaptation. Comment: 5 page

arXiv.org e-Print Archive

HAL AMU

INRIA a CCSD electronic archive server

Hal-Diderot

Dual-tree complex wavelet transform-based image enhancement for accurate long-term change assessment in coal mining areas

Author: Shivesh Kishore Karan
Sukha Ranjan Samadder
Publication venue: Taylor & Francis Group
Publication date: 01/10/2018

The main objective of this study was to improve long-term land use change detection by improving the classification accuracy of previous-generation satellite images using a recent super-resolution technique. The study also analysed the change in land cover over a period of 41 years in a coal mining area. A dual-tree complex wavelet transform-based image super-resolution technique was used to enhance Landsat images of 1975 and 2016. Separating pixels with similar spectral response is a difficult task, especially when those pixels represent different ground features. Therefore, an advanced neural net supervised classifier was used to minimize classification errors. The accuracy of the classified images (both super-resolved and original) was measured using confusion matrices and kappa coefficients. A significant improvement of more than 10% was observed in the overall classification accuracy for the image of 1975, highlighting that the classification accuracy of earlier-generation satellite data can be improved substantially.
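
The accuracy measures named in this abstract, overall accuracy and the kappa coefficient derived from a confusion matrix, can be sketched as follows. This assumes the standard definition of Cohen's kappa; the function names and the example matrix are hypothetical and are not taken from the study.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohen_kappa(cm):
    """Cohen's kappa: agreement beyond what chance alone would produce.

    cm[i][j] counts samples of reference class i assigned to class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = overall_accuracy(cm)
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    # Chance agreement from the marginal class proportions.
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Made-up two-class matrix for illustration only.
example_cm = [[45, 5], [10, 40]]
accuracy = overall_accuracy(example_cm)
kappa = cohen_kappa(example_cm)
```

Kappa is usually reported alongside overall accuracy because it discounts the agreement that the class proportions alone would produce.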

Crossref

Directory of Open Access Journals

Improving accuracy of long-term land-use change in coal mining areas using wavelets and Support Vector Machines

Author: Boser B. E.
Canters F.
Hermes L.
Koukoulas S.
Liu J. G.
Piao Y.
Richards J. A.
Rosenfield G. H.
Shivesh Kishore Karan
Strang G.
Sukha Ranjan Samadder
Vapnik V.
Publication venue: Informa UK Limited

Crossref

Author: A Bsaibes
A Ghosh
A Ramoelo
A Stumpf
A Wang
AYM Lin
BW Szuster
C Cortes
CC Liu
CJ Sande Van der
D Mishra
DD Yavaşlı
F Santoro
F Yamazaki
FA Kruse
FM Coillie Van
G Dyke
G Mallinis
G Wang
GA Licciardi
GM Foody
GP Petropoulos
HJ Lynch
HW Yates
I Ozdemir
J Campbell
J Yang
JA Richards
JG Liu
JR Otukei
L Wang
M Bagnardi
M Belgiu
M Bolognesi
M Sonka
MA Aguilar
N Ni
O Hagolle
O Mutanga
P Hyde
PC Mahalanobis
PF Fisher
PJ Mumby
PM Mather
R Momeni
R Welch
RA Schowengerdt
RG Congalton
S Andréfouët
S Bhaskaran
SD Jawak
Shivesh Kishore Karan
SK Karan
SK Karan
SK Singh
Sukha Ranjan Samadder
SW Myint
T Wang
T Yang
U Shankar
Victor Mesev
VN Vapnik
W Li
Y Shao
Z Wu
Publication venue: Springer Science and Business Media LLC

Crossref