27 research outputs found

Integrating lexical and prosodic features for automatic paragraph segmentation

Author: Farrús Mireia
Lai Catherine
Moore Johanna
Publication venue: 'Elsevier BV'
Publication date: 01/01/2020

Spoken documents, such as podcasts or lectures, are a growing presence in everyday life. Being able to automatically identify their discourse structure is an important step to understanding what a spoken document is about. Moreover, finer-grained units, such as paragraphs, are highly desirable for presenting and analyzing spoken content. However, little work has been done on discourse-based speech segmentation below the level of broad topics. In order to examine how discourse transitions are cued in speech, we investigate automatic paragraph segmentation of TED talks using lexical and prosodic features. Experiments using Support Vector Machines, AdaBoost, and Neural Networks show that models using supra-sentential prosodic features and induced cue words perform better than those based on the type of lexical cohesion measures often used in broad topic segmentation. Moreover, combining a wide range of individually weak lexical and prosodic predictors improves performance, and modelling contextual information using recurrent neural networks outperforms other approaches by a large margin. Our best results come from using late fusion methods that integrate representations generated by separate lexical and prosodic models while allowing interactions between these feature streams rather than treating them as independent information sources. Application to ASR outputs shows that adding prosodic features, particularly using late fusion, can significantly ameliorate decreases in performance due to transcription errors. The second author was funded from the EU’s Horizon 2020 Research and Innovation Programme under the GA H2020-RIA-645012 and the Spanish Ministry of Economy and Competitivity Juan de la Cierva program. The other authors were funded by the University of Edinburgh.

Edinburgh Research Explorer

UPF Digital Repository

Diposit Digital de la Universitat de Barcelona

Hierarchical Recurrent Neural Network for Story Segmentation

Author: Bell Peter
Renals Steve
Tsunoo Emiru
Publication venue: 'International Speech Communication Association'
Publication date: 24/08/2017

Crossref

Edinburgh Research Explorer

Hierarchical recurrent neural network for story segmentation using fusion of lexical and acoustic features

Author: Bell Peter
Klejch Ondrej
Renals Steve
Tsunoo Emiru
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 25/01/2018

Edinburgh Research Explorer

Searching Spontaneous Conversational Speech: Proceedings of ACM SIGIR Workshop (SSCS2008)

Author: Kraaij W.
Larson M.
Publication venue: Centre for Telematics and Information Technology (CTIT)
Publication date: 24/07/2008

University of Twente Research Information

Broadcast News Segmentation Using Automatic Speech Recognition System Combination With Rescoring And Noun Unification

Author: Ali Khalaf Zainab
Publication date: 01/07/2015

Broadcast news keeps viewers informed about the latest developments, events and issues occurring in the world. Nowadays, broadcast news can be easily accessed online.

Repository@USM

Advances in automatic meeting record creation and access

Author: Bett Michael
Metze Florian
Ries Klaus
Schaaf Thomas
Schultz Tanja
Soltau Hagen
Waibel Alex
Yu Hua
Zechner Klaus
Publication date: 16/01/2008

KITopen

Spoken content retrieval: A survey of techniques and technologies

Author: Ani Nenkova
C A. Nenkova
K. Mckeown
Kathleen Mckeown
Publication venue: 'Now Publishers'
Publication date: 01/01/2012

Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR.

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service

WooIR: A New Open Page Stream Segmentation Dataset

Author: Kamps J.
Marx M.
van Heusden R.
Publication date: 01/01/2022

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Indexing and retrieval of broadcast news

Author: Abberley Dave
Kirby David
Renals Steve
Robinson Tony
Publication venue: 'Elsevier BV'
Publication date: 01/01/2000

This paper describes a spoken document retrieval (SDR) system for British and North American Broadcast News. The system is based on a connectionist large vocabulary speech recognizer and a probabilistic information retrieval system. We discuss the development of a realtime Broadcast News speech recognizer, and its integration into an SDR system. Two advances were made for this task: automatic segmentation and statistical query expansion using a secondary corpus. Precision and recall results using the Text Retrieval Conference (TREC) SDR evaluation infrastructure are reported throughout the paper, and we discuss the application of these developments to a large scale SDR task based on an archive of British English broadcast news.

CiteSeerX

Edinburgh Research Archive

Edinburgh Research Explorer