Search CORE

2,841 research outputs found

Spoken content retrieval: A survey of techniques and technologies

Author: Ani Nenkova
C A. Nenkova
K. Mckeown
Kathleen Mckeown
Publication venue: 'Now Publishers'
Publication date: 01/01/2012
Field of study

Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service

Automatic Phonetic Transcription of Non-Prompted Speech

Author: Ohala John J.
Schiel Florian
Publication venue
Publication date: 01/01/1999
Field of study

A reliable method for automatic phonetic transcription of non− prompted German speech has been developed at th

CiteSeerX

Open Access LMU

Recommended from our members

The Challenge of Spoken Language Systems: Research Directions for the Nineties

Author: Atlas Les
Beckman Mary
Biermann Alan
Bush Marcia
Clements Mark
Cohen Jordan
Cole Ron
Garcia Oscar
Hanson Brian
Hermansky Hynek
Hirschman Lynette
Levinson Steve
McKeown Kathleen
Morgan Nelson
Novick David G.
Ostendorf Mari
Oviatt Sharon
Price Patti
Silverman Harvey
Spitz Judy
Waibel Alex
Weinstein Clifford
Zahorian Steve
Zue Victor
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/1995
Field of study

A spoken language system combines speech recognition, natural language processing and human interface technology. It functions by recognizing the person's words, interpreting the sequence of words to obtain a meaning in terms of the application, and providing an appropriate response back to the user. Potential applications of spoken language systems range from simple tasks, such as retrieving information from an existing database (traffic reports, airline schedules), to interactive problem solving tasks involving complex planning and reasoning (travel planning, traffic routing), to support for multilingual interactions. We examine eight key areas in which basic research is needed to produce spoken language systems: (1) robust speech recognition; (2) automatic training and adaptation; (3) spontaneous speech; (4) dialogue models; (5) natural language response generation; (6) speech synthesis and speech generation; (7) multilingual systems; and (8) interactive multimodal systems. In each area, we identify key research challenges, the infrastructure needed to support research, and the expected benefits. We conclude by reviewing the need for multidisciplinary research, for development of shared corpora and related resources, for computational support and far rapid communication among researchers. The successful development of this technology will increase accessibility of computers to a wide range of users, will facilitate multinational communication and trade, and will create new research specialties and jobs in this rapidly expanding area

Columbia University Academic Commons

Recommended from our members

The Challenge of Spoken Language Systems: Research Directions for the Nineties

Author: McKeown Kathleen
Cole Ron
Hirschman Lynette
Atlas Les
Beckman Mary
Biermann Alan
Bush Marcia
Clements Mark
Cohen Jordan
Garcia Oscar
Hanson Brian
Hermansky Hynek
Levinson Steve
Morgan Nelson
Novick David G.
Ostendorf Mari
Oviatt Sharon
Price Patti
Silverman Harvey
Spitz Judy
Waibel Alex
Weinstein Clifford
Zahorian Steve
Zue Victor
Publication venue
Publication date: 01/01/1995
Field of study

Columbia University Academic Commons

TamPub Julkaisuarkisto - TamPub Institutional Repository

Trepo - Institutional Repository of Tampere University

Dialogue Act Recognition Approaches

Author: Cerisara Christophe
Král Pavel
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 26/01/2012
Field of study

This paper deals with automatic dialogue act (DA) recognition. Dialogue acts are sentence-level units that represent states of a dialogue, such as questions, statements, hesitations, etc. The knowledge of dialogue act realizations in a discourse or dialogue is part of the speech understanding and dialogue analysis process. It is of great importance for many applications: dialogue systems, speech recognition, automatic machine translation, etc. The main goal of this paper is to study the existing works about DA recognition and to discuss their respective advantages and drawbacks. A major concern in the DA recognition domain is that, although a few DA annotation schemes seem now to emerge as standards, most of the time, these DA tag-sets have to be adapted to the specificities of a given application, which prevents the deployment of standardized DA databases and evaluation procedures. The focus of this review is put on the various kinds of information that can be used to recognize DAs, such as prosody, lexical, etc., and on the types of models proposed so far to capture this information. Combining these information sources tends to appear nowadays as a prerequisite to recognize DAs

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Automated Metadata Extraction for Semantic Access to Spoken Word Archives

Author: de Jong Franciska M.G.
Heeren W.F.L.
Nijholt Antinus
Ordelman Roeland J.F.
van Hessen Adrianus J.
Publication venue: Centre for Applied Linguistics
Publication date: 17/01/2011
Field of study

University of Twente Research Information