5,741 research outputs found

    Automated speech and audio analysis for semantic access to multimedia

    Get PDF
    The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content, and as a consequence, improve the effectiveness of conceptual access tools. This paper overviews the various ways in which automatic speech and audio analysis can contribute to increased granularity of automatically extracted metadata. A number of techniques will be presented, including the alignment of speech and text resources, large vocabulary speech recognition, key word spotting and speaker classification. The applicability of techniques will be discussed from a media crossing perspective. The added value of the techniques and their potential contribution to the content value chain will be illustrated by the description of two (complementary) demonstrators for browsing broadcast news archives

    Multimedia search without visual analysis: the value of linguistic and contextual information

    Get PDF
    This paper addresses the focus of this special issue by analyzing the potential contribution of linguistic content and other non-image aspects to the processing of audiovisual data. It summarizes the various ways in which linguistic content analysis contributes to enhancing the semantic annotation of multimedia content, and, as a consequence, to improving the effectiveness of conceptual media access tools. A number of techniques are presented, including the time-alignment of textual resources, audio and speech processing, content reduction and reasoning tools, and the exploitation of surface features

    The FĂ­schlĂĄr-News-Stories system: personalised access to an archive of TV news

    Get PDF
    The “Físchlár” systems are a family of tools for capturing, analysis, indexing, browsing, searching and summarisation of digital video information. Físchlár-News-Stories, described in this paper, is one of those systems, and provides access to a growing archive of broadcast TV news. Físchlár-News-Stories has several notable features including the fact that it automatically records TV news and segments a broadcast news program into stories, eliminating advertisements and credits at the start/end of the broadcast. Físchlár-News-Stories supports access to individual stories via calendar lookup, text search through closed captions, automatically-generated links between related stories, and personalised access using a personalisation and recommender system based on collaborative filtering. Access to individual news stories is supported either by browsing keyframes with synchronised closed captions, or by playback of the recorded video. One strength of the Físchlár-News-Stories system is that it is actually used, in practice, daily, to access news. Several aspects of the Físchlár systems have been published before, bit in this paper we give a summary of the Físchlár-News-Stories system in operation by following a scenario in which it is used and also outlining how the underlying system realises the functions it offers

    From media crossing to media mining

    Get PDF
    This paper reviews how the concept of Media Crossing has contributed to the advancement of the application domain of information access and explores directions for a future research agenda. These will include themes that could help to broaden the scope and to incorporate the concept of medium-crossing in a more general approach that not only uses combinations of medium-specific processing, but that also exploits more abstract medium-independent representations, partly based on the foundational work on statistical language models for information retrieval. Three examples of successful applications of media crossing will be presented, with a focus on the aspects that could be considered a first step towards a generalized form of media mining

    Measuring the impact of temporal context on video retrieval

    Get PDF
    In this paper we describe the findings from the K-Space interactive video search experiments in TRECVid 2007, which examined the effects of including temporal context in video retrieval. The traditional approach to presenting video search results is to maximise recall by offering a user as many potentially relevant shots as possible within a limited amount of time. ‘Context’-oriented systems opt to allocate a portion of theresults presentation space to providing additional contextual cues about the returned results. In video retrieval these cues often include temporal information such as a shot’s location within the overall video broadcast and/or its neighbouring shots. We developed two interfaces with identical retrieval functionality in order to measure the effects of such context on user performance. The first system had a ‘recall-oriented’ interface, where results from a query were presented as a ranked list of shots. The second was ‘contextoriented’, with results presented as a ranked list of broadcasts. 10 users participated in the experiments, of which 8 were novices and 2 experts. Participants completed a number of retrieval topics using both the recall-oriented and context-oriented systems

    Dublin City University video track experiments for TREC 2003

    Get PDF
    In this paper, we describe our experiments for both the News Story Segmentation task and Interactive Search task for TRECVID 2003. Our News Story Segmentation task involved the use of a Support Vector Machine (SVM) to combine evidence from audio-visual analysis tools in order to generate a listing of news stories from a given news programme. Our Search task experiment compared a video retrieval system based on text, image and relevance feedback with a text-only video retrieval system in order to identify which was more effective. In order to do so we developed two variations of our FĂ­schlĂĄr video retrieval system and conducted user testing in a controlled lab environment. In this paper we outline our work on both of these two tasks

    Robust audio indexing for Dutch spoken-word collections

    Get PDF
    Abstract—Whereas the growth of storage capacity is in accordance with widely acknowledged predictions, the possibilities to index and access the archives created is lagging behind. This is especially the case in the oral history domain and much of the rich content in these collections runs the risk to remain inaccessible for lack of robust search technologies. This paper addresses the history and development of robust audio indexing technology for searching Dutch spoken-word collections and compares Dutch audio indexing in the well-studied broadcast news domain with an oral-history case-study. It is concluded that despite significant advances in Dutch audio indexing technology and demonstrated applicability in several domains, further research is indispensable for successful automatic disclosure of spoken-word collections

    A spoken document retrieval application in the oral history domain

    Get PDF
    The application of automatic speech recognition in the broadcast news domain is well studied. Recognition performance is generally high and accordingly, spoken document retrieval can successfully be applied in this domain, as demonstrated by a number of commercial systems. In other domains, a similar recognition performance is hard to obtain, or even far out of reach, for example due to lack of suitable training material. This is a serious impediment for the successful application of spoken document retrieval techniques for other data then news. This paper outlines our first steps towards a retrieval system that can automatically be adapted to new domains. We discuss our experience with a recently implemented spoken document retrieval application attached to a web-portal that aims at the disclosure of a multimedia data collection in the oral history domain. The paper illustrates that simply deploying an off-theshelf\ud broadcast news system in this task domain will produce error rates that are too high to be useful for retrieval tasks. By applying adaptation techniques on the acoustic level and language model level, system performance can be improved considerably, but additional research on unsupervised adaptation and search interfaces is required to create an adequate search environment based on speech transcripts

    Exploration of audiovisual heritage using audio indexing technology

    Get PDF
    This paper discusses audio indexing tools that have been implemented for the disclosure of Dutch audiovisual cultural heritage collections. It explains the role of language models and their adaptation to historical settings and the adaptation of acoustic models for homogeneous audio collections. In addition to the benefits of cross-media linking, the requirements for successful tuning and improvement of available tools for indexing the heterogeneous A/V collections from the cultural heritage domain are reviewed. And finally the paper argues that research is needed to cope with the varying information needs for different types of users
    • …
    corecore