85 research outputs found
Formatting Time-Aligned ASR Transcripts for Readability, in
Abstract We address the problem of formatting the output of an automatic speech recognition (ASR) system for readability, while preserving wordlevel timing information of the transcript. Our system enriches the ASR transcript with punctuation, capitalization and properly written dates, times and other numeric entities, and our approach can be applied to other formatting tasks. The method we describe combines hand-crafted grammars with a class-based language model trained on written text and relies on Weighted Finite State Transducers (WFSTs) for the preservation of start and end time of each word
Joint Learning of Correlated Sequence Labelling Tasks Using Bidirectional Recurrent Neural Networks
The stream of words produced by Automatic Speech Recognition (ASR) systems is
typically devoid of punctuations and formatting. Most natural language
processing applications expect segmented and well-formatted texts as input,
which is not available in ASR output. This paper proposes a novel technique of
jointly modeling multiple correlated tasks such as punctuation and
capitalization using bidirectional recurrent neural networks, which leads to
improved performance for each of these tasks. This method could be extended for
joint modeling of any other correlated sequence labeling tasks.Comment: Accepted in Interspeech 201
Many uses, many annotations for large speech corpora: Switchboard and TDT as case studies
This paper discusses the challenges that arise when large speech corpora
receive an ever-broadening range of diverse and distinct annotations. Two case
studies of this process are presented: the Switchboard Corpus of telephone
conversations and the TDT2 corpus of broadcast news. Switchboard has undergone
two independent transcriptions and various types of additional annotation, all
carried out as separate projects that were dispersed both geographically and
chronologically. The TDT2 corpus has also received a variety of annotations,
but all directly created or managed by a core group. In both cases, issues
arise involving the propagation of repairs, consistency of references, and the
ability to integrate annotations having different formats and levels of detail.
We describe a general framework whereby these issues can be addressed
successfully.Comment: 7 pages, 2 figure
Access to recorded interviews: A research agenda
Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state-of-the-art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed
- âŚ