    Blip10000: a social video dataset containing SPUG content for tagging and retrieval

    The increasing amount of digital multimedia content available is inspiring potential new types of user interaction with video data. Users want to easily find content by searching and browsing. For this reason, techniques are needed that allow automatic categorisation, content search, and linking to related information. In this work, we present a dataset that contains comprehensive semi-professional user generated (SPUG) content, including audiovisual content, user-contributed metadata, automatic speech recognition transcripts, automatic shot boundary files, and social information for multiple 'social levels'. We describe the principal characteristics of this dataset and present results that have been achieved on different tasks.
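A dataset like this pairs each video with machine-generated sidecar files such as shot boundary listings. As a minimal sketch only, assuming a simple hypothetical format in which each line gives the start and end frame of one shot (the actual Blip10000 file layout is not specified here), such a file could be parsed like this:

```python
# Hypothetical shot-boundary listing: one shot per line as
# "<start_frame> <end_frame>". This format is an assumption for
# illustration, not taken from the Blip10000 documentation.

def parse_shot_boundaries(lines):
    """Return a list of (start_frame, end_frame) tuples."""
    shots = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        start, end = map(int, line.split()[:2])
        shots.append((start, end))
    return shots

sample = ["0 120", "121 340", "341 512"]
print(parse_shot_boundaries(sample))  # [(0, 120), (121, 340), (341, 512)]
```

Keeping the parser tolerant of blank and comment lines makes it easy to reuse across per-video metadata files.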

    TwNC: a Multifaceted Dutch News Corpus

    This contribution describes the Twente News Corpus (TwNC), a multifaceted corpus for Dutch that is being deployed in a number of NLP research projects, among which are tracks within the Dutch national research programme MultimediaN, the NWO programme CATCH, and the Dutch-Flemish programme STEVIN. The development of the corpus started in 1998 within a predecessor project, DRUID, and the corpus currently has a size of 530M words. The text part has been built from texts of four different sources: Dutch national newspapers, television subtitles, teleprompter (auto-cue) files, and both manually and automatically generated broadcast news transcripts along with the broadcast news audio. TwNC plays a crucial role in the development and evaluation of a wide range of tools and applications for the domain of multimedia indexing, such as large vocabulary speech recognition, cross-media indexing, cross-language information retrieval, etc. Part of the corpus was fed into the Dutch written text corpus in the context of the Dutch-Belgian STEVIN project D-COI, which was completed in 2007. The sections below will describe the rationale that was the starting point for the corpus development, outline the cross-media linking approach adopted within MultimediaN, and finally provide some facts and figures about the corpus.

    FLAX: Flexible and open corpus-based language collections development

    In this case study we present innovative work in building open corpus-based language collections by focusing on a description of the open-source multilingual Flexible Language Acquisition (FLAX) language project, which is an ongoing example of open materials development practices for language teaching and learning. We present language-learning contexts from across formal and informal language learning in English for Academic Purposes (EAP). Our experience relates to Open Educational Resource (OER) options and Open Educational Practices (OEP) that are available for developing and distributing online subject-specific language materials for use in academic and professional settings. We are particularly concerned with closing the gap in language teacher training, where competencies in materials development are still dominated by print-based proprietary course book publications. We are also concerned with the growing gap in language teaching practitioner competencies for understanding important issues of copyright and licensing that are changing rapidly in the context of digital and web literacy developments. These key issues are largely ignored in informal language teaching practitioner discussions and in formal research into teaching and materials development practices.

    The TREC-2002 video track report

    TREC-2002 saw the second running of the Video Track, the goal of which was to promote progress in content-based retrieval from digital video via open, metrics-based evaluation. The track used 73.3 hours of publicly available digital video (in MPEG-1/VCD format) downloaded by the participants directly from the Internet Archive (Prelinger Archives) (internetarchive, 2002) and some from the Open Video Project (Marchionini, 2001). The material comprised advertising, educational, industrial, and amateur films produced between the 1930s and the 1970s by corporations, nonprofit organizations, trade associations, community and interest groups, educational institutions, and individuals. 17 teams representing 5 companies and 12 universities (4 from Asia, 9 from Europe, and 4 from the US) participated in one or more of three tasks in the 2002 video track: shot boundary determination, feature extraction, and search (manual or interactive). Results were scored by NIST using manually created truth data for shot boundary determination and manual assessment of feature extraction and search results. This paper is an introduction to, and an overview of, the track framework (the tasks, data, and measures), the approaches taken by the participating groups, the results, and issues regarding the evaluation. For detailed information about the approaches and results, the reader should see the various site reports in the final workshop proceedings.
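Scoring shot boundary determination against manually created truth data amounts to matching each detected boundary to at most one reference boundary and reporting precision and recall. The sketch below illustrates the idea with a simple greedy matcher and an invented frame tolerance; it is not NIST's actual scoring code.

```python
# Illustrative only: greedy one-to-one matching of detected shot
# boundaries (frame numbers) against ground-truth boundaries, within
# a tolerance window. The tolerance value is an assumption.

def score_boundaries(detected, truth, tolerance=5):
    """Return (precision, recall) for boundary detection."""
    unmatched_truth = sorted(truth)
    hits = 0
    for d in sorted(detected):
        for t in unmatched_truth:
            if abs(d - t) <= tolerance:
                unmatched_truth.remove(t)  # each truth boundary matches once
                hits += 1
                break
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall

p, r = score_boundaries(detected=[100, 205, 420], truth=[102, 200, 300, 418])
print(f"precision={p:.2f} recall={r:.2f}")  # precision=1.00 recall=0.75
```

Here every detection lands near a true boundary (perfect precision), but one true boundary is missed, which lowers recall.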

    TRECVID 2004 - an overview


    TREC video retrieval evaluation: a case study and status report

    The TREC Video Retrieval Evaluation is a multiyear, international effort, funded by the US Advanced Research and Development Agency (ARDA) and the National Institute of Standards and Technology (NIST), to promote progress in content-based retrieval from digital video via open, metrics-based evaluation. Now beginning its fourth year, it aims over time to develop both a better understanding of how systems can effectively accomplish such retrieval and how one can reliably benchmark their performance. This paper can be seen as a case study in the development of video retrieval systems and their evaluation, as well as a report on their status to date. After an introduction to the evolution of the evaluation over the past three years, the paper reports on the most recent evaluation, TRECVID 2003: the evaluation framework (the 4 tasks of shot boundary determination, high-level feature extraction, story segmentation and typing, and search; 133 hours of US television news data; and the measures), the results, and the approaches taken by the 24 participating groups.

    Caption This: Creating Efficiency in Audiovisual Accessibility Using Automatic Speech Recognition Toolkit

    In the last several years, there has been a growing awareness of the need for digital accessibility in cultural heritage institutions. While initiatives to make content accessible and equitable for all patrons are vital for the continued growth and effectiveness of these institutions, they are not changes that can be made overnight. Remediating content requires time, knowledge, and effective tools. For many solo or siloed cultural heritage institutions, it can be difficult to commit the resources necessary for remediation. Nevertheless, these institutions will need to dedicate significant amounts of time to increasing accessibility in their digital collections, including audiovisual (A/V) content. For A/V collections, the process of making material accessible to all users is time-consuming and labor-intensive. It requires listening to the recording in real time, replaying the recording at different speeds to decipher difficult passages, and writing down every word, pause, and non-verbal communication with a time-stamp to indicate where in the recording the text occurred. Existing models of auto-generating caption files, such as uploading to YouTube, are known to be mediocre and do not remove the need for proofreading. This toolkit is intended to create an easily replicable, low-cost, efficient solution for transcribing and captioning library and archival video content, making A/V remediation feasible for institutions that lack the resources to undertake an in-depth transcription project.
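One step in such a captioning workflow is turning timestamped transcript segments (for example, the output of an automatic speech recognition engine) into a caption file that a proofreader can then correct. As a minimal sketch under the assumption of SubRip (SRT) output, with invented segment data:

```python
# Sketch: convert (start_sec, end_sec, text) transcript segments into
# SubRip (SRT) caption blocks. The segment data below is invented for
# illustration; real segments would come from an ASR engine.

def to_srt_timestamp(seconds):
    """Format seconds as the SRT 'HH:MM:SS,mmm' timestamp."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments):
    """segments: iterable of (start_sec, end_sec, text) tuples."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{to_srt_timestamp(start)} --> {to_srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)

srt = segments_to_srt([
    (0.0, 2.5, "Welcome to the archive."),
    (2.5, 5.0, "This recording was digitized recently."),
])
print(srt)
```

Emitting a standard caption format means the machine output can be loaded into any subtitle editor for the proofreading pass the abstract describes.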