4 research outputs found

    Radio Oranje: Enhanced Access to a Historical Spoken Word Collection

    Get PDF
    Access to historical audio collections is typically very restricted:\ud content is often only available on physical (analog) media and the\ud metadata is usually limited to keywords, giving access at the level\ud of relatively large fragments, e.g., an entire tape. Many spoken\ud word heritage collections are now being digitized, which allows the\ud introduction of more advanced search technology. This paper presents\ud an approach that supports online access and search for recordings of\ud historical speeches. A demonstrator has been built, based on the\ud so-called Radio Oranje collection, which contains radio speeches by\ud the Dutch Queen Wilhelmina that were broadcast during World War II.\ud The audio has been aligned with its original 1940s manual\ud transcriptions to create a time-stamped index that enables the speeches to be\ud searched at the word level. Results are presented together with\ud related photos from an external database

    Spoken content retrieval: A survey of techniques and technologies

    Get PDF
    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR

    Automatic translation of formal data specifications to voice data-input applications.

    Get PDF
    This thesis introduces a complete solution for automatic translation of formal data specifications to voice data-input applications. The objective of the research is to automatically generate applications for inputting data through speech from specifications of the structure of the data. The formal data specifications are XML DTDs. A new formalization called Grammar-DTD (G-DTD) is introduced as an extended DTD that contains grammars to describe valid values of the DTD elements and attributes. G-DTDs facilitate the automatic generation of Voice XML applications that correspond to the original DTD structure. The development of the automatic application-generator included identifying constraints on the G-DTD to ensure a feasible translation, using predicate calculus to build a knowledge base of inference rules that describes the mapping procedure, and writing an algorithm for the automatic translation based on the inference rules.Dept. of Computer Science. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2006 .H355. Source: Masters Abstracts International, Volume: 45-01, page: 0354. Thesis (M.Sc.)--University of Windsor (Canada), 2006

    A graphical interface for speech-based retrieval

    No full text
    This paper describes preliminary usability testing for a graphical interface designed to facilitate rapid browsing of speech records. Expert interviews and focus group discussions were used to assess the alignment between browsing behaviors employed by members of the intended user population and an early mockup of the interface. The results provide guidelines for the next iteration of prototype development and suggest that graphical representations offer a viable method for browsing audio and multimedia recordings
    corecore