
Multimedia retrieval in MultiMatch: The impact of speech transcript errors on search behaviour

Abstract

This paper reports the findings of an evaluation of a multimedia multimodal information access sub-system (MIAS) that incorporates automatic speech recognition (ASR) technology to transcribe the speech content of video soundtracks. The results indicate that users prefer an information-rich but minimalist graphical interface. Users also tend to have misplaced confidence in the accuracy of ASR-generated speech transcripts: if a query term does not appear in the transcript, they are not inclined to carry out the systematic auditory inspection of the video’s soundtrack that is otherwise their usual search behaviour. To alert the user to the possibility that a search term may have been incorrectly recognised as some other word, a matching algorithm is proposed that searches for word sequences of similar phonemic structure to the query term.
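
The abstract does not specify how the proposed matching algorithm works, so the following is only a minimal sketch of one way the idea could be realised, not the authors' method. It assumes a small hand-written pronunciation lexicon (`LEXICON`, hypothetical, using ARPAbet-like symbols), and it scores similarity with a normalised edit distance over phoneme sequences, which is a swapped-in technique chosen for illustration. The function names and the `max_words` and `threshold` parameters are likewise assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon (ARPAbet-like symbols). A real system would need a
# full pronunciation dictionary or a grapheme-to-phoneme model instead.
LEXICON: Dict[str, List[str]] = {
    "recognise": ["R", "EH", "K", "AH", "G", "N", "AY", "Z"],
    "speech": ["S", "P", "IY", "CH"],
    "wreck": ["R", "EH", "K"],
    "a": ["AH"],
    "nice": ["N", "AY", "S"],
    "beach": ["B", "IY", "CH"],
    "the": ["DH", "AH"],
    "on": ["AA", "N"],
}


def phonemes(words: List[str]) -> List[str]:
    """Concatenate the phonemes of each word (unknown words raise KeyError)."""
    out: List[str] = []
    for word in words:
        out.extend(LEXICON[word.lower()])
    return out


def edit_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def similarity(a: List[str], b: List[str]) -> float:
    """Map edit distance to a score in [0, 1]; 1.0 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def find_phonemic_matches(query: str, transcript: str,
                          max_words: int = 4,
                          threshold: float = 0.7) -> List[Tuple[str, int, float]]:
    """Return (word sequence, start index, score) for transcript windows of up
    to max_words words whose phonemes resemble those of the query."""
    query_phonemes = phonemes(query.split())
    words = transcript.split()
    hits: List[Tuple[str, int, float]] = []
    for start in range(len(words)):
        for length in range(1, max_words + 1):
            if start + length > len(words):
                break
            window = words[start:start + length]
            score = similarity(query_phonemes, phonemes(window))
            if score >= threshold:
                hits.append((" ".join(window), start, score))
    return sorted(hits, key=lambda hit: -hit[2])
```

Under these assumptions, searching for "recognise speech" in the transcript "wreck a nice beach on the beach" flags the word sequence "wreck a nice beach" with a score of 0.75, even though neither query word appears in the transcript. A flag of this kind is what could prompt the user to listen to that part of the soundtrack rather than trust the transcript.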
