A speaker rediarization scheme for improving diarization in large two-speaker telephone datasets

Author: Dean David
Ghaemmaghami Houman
Sridharan Sridha
Publication venue: European Association for Signal Processing
Publication date: 01/01/2014

In this paper we propose a novel scheme for carrying out speaker diarization in an iterative manner. We aim to show that the information obtained through the first pass of speaker diarization can be reused to refine and improve the original diarization results. We call this technique speaker rediarization and demonstrate the practical application of our rediarization algorithm using a large archive of two-speaker telephone conversation recordings. We evaluate our speaker rediarization system on the NIST 2008 SRE summed telephone corpus. This corpus contains recurring speaker identities across independent recording sessions that need to be linked across the entire corpus. We show that our speaker rediarization scheme can take advantage of inter-session speaker information, linked in the initial diarization pass, to achieve a 30% relative improvement over the original diarization error rate (DER) after only two iterations of rediarization.
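As a brief illustration of the metric quoted above, the sketch below computes DER using its standard definition (missed speech, false alarm, and speaker confusion time divided by total reference speech time) and the relative improvement between two DER values. The function names and all numbers are hypothetical and chosen only to make the arithmetic visible; they are not results from the paper.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """Return DER as a fraction: error time divided by total reference speech time."""
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


def relative_improvement(baseline_der, new_der):
    """Return the fraction by which new_der improves on baseline_der."""
    if baseline_der <= 0:
        raise ValueError("baseline_der must be positive")
    return (baseline_der - new_der) / baseline_der


# Hypothetical durations in seconds (assumptions, not data from the paper).
baseline = diarization_error_rate(
    missed=30.0, false_alarm=20.0, confusion=50.0, total_speech=1000.0
)
rediarized = diarization_error_rate(
    missed=30.0, false_alarm=20.0, confusion=20.0, total_speech=1000.0
)
gain = relative_improvement(baseline, rediarized)

print(f"baseline DER:   {baseline:.1%}")
print(f"rediarized DER: {rediarized:.1%}")
print(f"relative gain:  {gain:.1%}")
```

With these made-up inputs, only the speaker confusion term changes between passes, which is the component that cross-session speaker linking would be expected to affect.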

CiteSeerX

Queensland University of Technology ePrints Archive

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Speaker Diarization with Lexical Information

Author: Georgiou Panayiotis
Han Kyu J.
He Xiaodong
Huang Jing
Narayanan Shrikanth
Park Tae Jin
Zhou Bowen
Publication venue: International Speech Communication Association
Publication date: 13/04/2020

This work presents a novel approach to speaker diarization that leverages lexical information provided by automatic speech recognition. We propose a speaker diarization system that can incorporate word-level speaker turn probabilities with speaker embeddings into a speaker clustering process to improve the overall diarization accuracy. To integrate lexical and acoustic information in a comprehensive way during clustering, we introduce an adjacency matrix integration for spectral clustering. Since words and word boundary information for word-level speaker turn probability estimation are provided by a speech recognition system, our proposed method works without any human intervention for manual transcriptions. We show that the proposed method improves diarization performance on various evaluation datasets compared to the baseline diarization system using acoustic information only in speaker embeddings.
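The abstract does not say how the two adjacency matrices are combined, so the following is only a minimal sketch of one possible integration step: a weighted element-wise average of an acoustic affinity matrix and a lexical adjacency matrix, followed by symmetrization. The function name, the `weight` parameter, and the example matrices are assumptions made for illustration, not the authors' method. The resulting matrix is the kind of input a spectral clustering step would consume.

```python
def integrate_adjacency(acoustic, lexical, weight=0.5):
    """Blend two square affinity matrices (hypothetical integration rule).

    acoustic: affinities derived from speaker embeddings.
    lexical:  affinities derived from word-level speaker turn probabilities.
    weight:   share given to the lexical matrix, in [0, 1] (an assumption).
    """
    n = len(acoustic)
    if len(lexical) != n:
        raise ValueError("matrices must have the same size")
    if any(len(row) != n for row in acoustic) or any(len(row) != n for row in lexical):
        raise ValueError("matrices must be square")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")

    blended = [
        [(1.0 - weight) * acoustic[i][j] + weight * lexical[i][j] for j in range(n)]
        for i in range(n)
    ]
    # Spectral clustering expects a symmetric affinity matrix.
    return [
        [0.5 * (blended[i][j] + blended[j][i]) for j in range(n)]
        for i in range(n)
    ]


# Made-up affinities for three speech segments.
acoustic = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.3],
    [0.2, 0.3, 1.0],
]
lexical = [
    [1.0, 0.4, 0.0],
    [0.4, 1.0, 0.1],
    [0.0, 0.1, 1.0],
]
combined = integrate_adjacency(acoustic, lexical, weight=0.5)
```

Setting `weight=0.0` in this sketch recovers an acoustic-only affinity matrix, which corresponds to the kind of baseline the abstract compares against.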

arXiv.org e-Print Archive

Crossref

Improving Speaker Diarization using Semantic Information: Joint Pairwise Constraints Propagation

Author: Chen Qian
Chen Yafeng
Cheng Luyao
Wang Hui
Zhang Qinglin
Zhang Shiliang
Zheng Siqi
Publication date: 19/09/2023

Speaker diarization has gained considerable attention within the speech processing research community. Mainstream speaker diarization systems rely primarily on speakers' voice characteristics extracted from acoustic signals and often overlook the potential of semantic information. Since speech signals also convey the content of what is said, we aim to fully exploit these semantic cues using language models. In this work we propose a novel approach to effectively leverage semantic information in clustering-based speaker diarization systems. First, we introduce spoken language understanding modules to extract speaker-related semantic information and use this information to construct pairwise constraints. Second, we present a novel framework to integrate these constraints into the speaker diarization pipeline, enhancing the performance of the entire system. Extensive experiments conducted on the public dataset demonstrate the consistent superiority of our proposed approach over acoustic-only speaker diarization systems. Comment: Submitted to ICASSP 202
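To make the idea of pairwise constraints concrete, here is a minimal sketch of how must-link and cannot-link pairs might be imposed on a segment affinity matrix before clustering. It overwrites only the directly constrained entries and does not implement the constraint propagation named in the title. The function name, the 1.0 and 0.0 override values, and the example data are assumptions for illustration, not the authors' framework.

```python
def apply_pairwise_constraints(affinity, must_link, cannot_link):
    """Return a copy of affinity with constrained pairs overridden.

    must_link:   (i, j) pairs assumed to be the same speaker -> affinity 1.0
    cannot_link: (i, j) pairs assumed to be different speakers -> affinity 0.0
    """
    constrained = [row[:] for row in affinity]  # leave the input untouched
    for i, j in must_link:
        constrained[i][j] = constrained[j][i] = 1.0
    for i, j in cannot_link:
        constrained[i][j] = constrained[j][i] = 0.0
    return constrained


# Made-up acoustic affinities for four speech segments.
affinity = [
    [1.0, 0.6, 0.5, 0.2],
    [0.6, 1.0, 0.4, 0.3],
    [0.5, 0.4, 1.0, 0.7],
    [0.2, 0.3, 0.7, 1.0],
]
# Hypothetical constraints, e.g. as a semantic module might suggest.
must_link = [(0, 1)]
cannot_link = [(0, 2)]

constrained = apply_pairwise_constraints(affinity, must_link, cannot_link)
```

In this sketch the ambiguous acoustic score between segments 0 and 2 (0.5) is replaced by a hard 0.0, which is the sort of correction semantic cues could supply when voices alone are inconclusive.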

arXiv.org e-Print Archive