9,683 research outputs found

    An introduction to crowdsourcing for language and multimedia technology research

    Language and multimedia technology research often relies on large, manually constructed datasets for training or evaluating algorithms and systems. Constructing these datasets is often expensive, and recruiting personnel to carry out the work poses significant challenges. Crowdsourcing methods, which use scalable pools of workers available on demand, offer a flexible means of rapid, low-cost construction of many of these datasets, supporting existing research requirements and potentially promoting new research initiatives that would otherwise not be possible.

    Relevance of ASR for the Automatic Generation of Keywords Suggestions for TV programs

    Semantic access to multimedia content in audiovisual archives depends to a large extent on the quantity and quality of the metadata, and particularly on the content descriptions attached to the individual items. However, given the growing amount of material created on a daily basis and the digitization of existing analogue collections, traditional manual annotation puts heavy demands on resources, especially for large audiovisual archives. One way to address this challenge is to introduce (semi-)automatic annotation techniques for generating and/or enhancing metadata. The NWO-funded CATCH-CHOICE project has investigated the extraction of keywords from textual resources related to the TV programs to be archived (context documents), in collaboration with the Dutch audiovisual archive Sound and Vision. Besides the program descriptions published by broadcasters on their websites, Automatic Speech Transcription (ASR) techniques from the CATCH-CHoral project also provide textual resources that might be relevant for suggesting keywords. This paper investigates the suitability of ASR for generating such keywords, which we evaluate against manual annotations of the documents and against keywords automatically generated from context documents.

    Spoken content retrieval: A survey of techniques and technologies

    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR, encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight into how these fields are integrated to support research and development, thus addressing the core challenges of SCR.

    Loss of visual working memory within seconds: The combined use of refreshable and non-refreshable features

    We re-examine the role of time in the loss of information from working memory, the limited information accessible for cognitive tasks. The controversial issue of whether working memory deteriorates over time was investigated using arrays of unconventional visual characters. Each array was followed by a post-perceptual mask, a variable retention interval (RI), and a recognition probe character. Dramatic forgetting across an unfilled RI of up to 6 s was observed. Adding a distracting task during the RI (repetition, subtraction, or parity judgment using spoken digits) lowered the level of recall, but not increasingly so across RIs. Also, arrays of English letters were not forgotten during the RI unless distracting stimuli were included, in contrast to the finding for unconventional characters. The results suggest that unconventional visual items include some features inevitably lost over time. Attention-related processing, however, assists in the retention of other features, and of English letters. We identify important constraints for working memory theories and propose that an equilibrium between forgetting and reactivation holds, but only for elements that are not inevitably lost over time.

    Context based multimedia information retrieval
