Examining the contributions of automatic speech transcriptions and metadata sources for searching spontaneous conversational speech

Jones, Gareth J.F.; Lam-Adesina, Adenike M.; Newman, Eamonn; Zhang, Ke

Authors: Gareth J.F. Jones
Adenike M. Lam-Adesina
Eamonn Newman
Ke Zhang
Publication date: 1 July 2007
Publisher: Centre for Telematics and Information Technology, Enschede, The Netherlands

Abstract

Searching spontaneous speech can be enhanced by combining automatic speech transcriptions with semantically related metadata. An important question is what retrieval effectiveness can be expected from searching such transcriptions and the different sources of related metadata. The Cross-Language Speech Retrieval (CL-SR) track at recent CLEF workshops provides a spontaneous speech test collection with manually and automatically derived metadata fields. Using this collection, we investigate the comparative search effectiveness of the individual fields, comprising automated transcriptions and the available metadata. A further important question is how transcriptions and metadata should be combined for the greatest benefit to search accuracy. We compare simple merging of the individual fields with the extended BM25 model for weighted field combination (BM25F). Results indicate that BM25F can produce improved search accuracy, but that its parameters currently need to be tuned on a suitable training set.
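
The two combination strategies compared in the abstract can be illustrated with a minimal sketch. It assumes one common formulation of BM25F, in which each field's term frequency is length-normalized, multiplied by a field weight, and summed before BM25 saturation is applied; this is not necessarily the exact variant or parameter setting used in the paper. The field names ("asr", "keywords"), the toy documents, and the helper names bm25f_score and merge_fields are hypothetical and chosen only for illustration.

```python
import math

def bm25f_score(query_terms, doc, collection, weights, b, k1=1.2):
    """Score one document with a simple BM25F variant (illustrative only).

    doc and every item of collection map a field name to a list of tokens.
    weights and b map each field name to its weight and length-normalization
    parameter; these are the values that would need tuning on training data.
    """
    n_docs = len(collection)
    # Average length of each field across the collection.
    avg_len = {f: sum(len(d.get(f, [])) for d in collection) / n_docs
               for f in weights}
    score = 0.0
    for term in query_terms:
        # Document frequency: documents containing the term in any scored field.
        df = sum(1 for d in collection
                 if any(term in d.get(f, []) for f in weights))
        if df == 0:
            continue
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        # Weighted, length-normalized pseudo term frequency summed over fields.
        tf = 0.0
        for f, w in weights.items():
            tokens = doc.get(f, [])
            if not tokens or avg_len[f] == 0:
                continue
            norm = 1 - b[f] + b[f] * len(tokens) / avg_len[f]
            tf += w * tokens.count(term) / norm
        # Saturation is applied once, after the fields have been combined.
        score += idf * tf / (k1 + tf)
    return score

def merge_fields(doc, fields):
    """Simple field merging: concatenate the fields into one field named 'all'."""
    return {"all": [tok for f in fields for tok in doc.get(f, [])]}

# Hypothetical toy collection: an ASR transcript field plus a keyword field.
collection = [
    {"asr": ["we", "left", "the", "camp", "in", "winter"],
     "keywords": ["liberation", "camp"]},
    {"asr": ["my", "family", "moved", "to", "the", "city"],
     "keywords": ["family", "migration"]},
    {"asr": ["the", "camp", "was", "cold"],
     "keywords": ["weather"]},
]
query = ["camp"]
b = {"asr": 0.75, "keywords": 0.75}

# BM25F with equal weights, and with the keyword field boosted.
flat = bm25f_score(query, collection[0], collection,
                   {"asr": 1.0, "keywords": 1.0}, b)
boosted = bm25f_score(query, collection[0], collection,
                      {"asr": 1.0, "keywords": 5.0}, b)

# Baseline: merge all fields, then score the single merged field.
merged = [merge_fields(d, ["asr", "keywords"]) for d in collection]
merged_score = bm25f_score(query, merged[0], merged,
                           {"all": 1.0}, {"all": 0.75})
```

In this sketch the merged baseline is the same scoring function applied to a single concatenated field, so the only difference between the two strategies is whether fields keep their own weights and length normalization. The weights and b values shown are arbitrary; in practice they would be chosen on a training set, as the abstract notes.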

Available version: DCU Online Research Access Service, oai:doras.dcu.ie:383