    Multimedia search without visual analysis: the value of linguistic and contextual information

    This paper addresses the focus of this special issue by analyzing the potential contribution of linguistic content and other non-image aspects to the processing of audiovisual data. It summarizes the various ways in which linguistic content analysis contributes to enhancing the semantic annotation of multimedia content, and, as a consequence, to improving the effectiveness of conceptual media access tools. A number of techniques are presented, including the time-alignment of textual resources, audio and speech processing, content reduction and reasoning tools, and the exploitation of surface features.

    Automated speech and audio analysis for semantic access to multimedia

    The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content, and as a consequence, improve the effectiveness of conceptual access tools. This paper gives an overview of the various ways in which automatic speech and audio analysis can contribute to increased granularity of automatically extracted metadata. A number of techniques will be presented, including the alignment of speech and text resources, large vocabulary speech recognition, keyword spotting and speaker classification. The applicability of techniques will be discussed from a media crossing perspective. The added value of the techniques and their potential contribution to the content value chain will be illustrated by the description of two (complementary) demonstrators for browsing broadcast news archives.

    From media crossing to media mining

    This paper reviews how the concept of Media Crossing has contributed to the advancement of the application domain of information access and explores directions for a future research agenda. These will include themes that could help to broaden the scope and to incorporate the concept of medium-crossing in a more general approach that not only uses combinations of medium-specific processing, but that also exploits more abstract medium-independent representations, partly based on the foundational work on statistical language models for information retrieval. Three examples of successful applications of media crossing will be presented, with a focus on the aspects that could be considered a first step towards a generalized form of media mining.

    Access to recorded interviews: A research agenda

    Recorded interviews form a rich basis for scholarly inquiry. Examples include oral histories, community memory projects, and interviews conducted for broadcast media. Emerging technologies offer the potential to radically transform the way in which recorded interviews are made accessible, but this vision will demand substantial investments from a broad range of research communities. This article reviews the present state of practice for making recorded interviews available and the state of the art for key component technologies. A large number of important research issues are identified, and from that set of issues, a coherent research agenda is proposed.

    Metadata enrichment for digital heritage: users as co-creators

    This paper espouses the concept of metadata enrichment through an expert and user-focused approach to metadata creation and management. To this end, it is argued that the Web 2.0 paradigm enables users to be proactive metadata creators. As Shirky (2008, p. 47) argues, Web 2.0’s social tools enable “action by loosely structured groups, operating without managerial direction and outside the profit motive”. Lagoze (2010, p. 37) advises, “the participatory nature of Web 2.0 should not be dismissed as just a popular phenomenon [or fad]”. Carletti (2016) proposes a participatory digital cultural heritage approach where Web 2.0 approaches such as crowdsourcing can be used to enrich digital cultural objects. It is argued that “heritage crowdsourcing, community-centred projects or other forms of public participation”. On the other hand, the new collaborative approaches of Web 2.0 neither negate nor replace contemporary standards-based metadata approaches. Hence, this paper proposes a mixed metadata approach where user-created metadata augments expert-created metadata and vice versa. The metadata creation process is no longer the sole prerogative of the metadata expert. The Web 2.0 collaborative environment would now allow users to participate in both adding and re-using metadata. The expert-created (standards-based, top-down) and user-generated (socially-constructed, bottom-up) approaches to metadata are complementary rather than mutually exclusive. The two approaches are often mistakenly considered a dichotomy (Gruber, 2007; Wright, 2007). This paper espouses the importance of enriching digital information objects with descriptions pertaining to the about-ness of information objects. Such richness and diversity of description, it is argued, could chiefly be achieved by involving users in the metadata creation process.
This paper presents the importance of the paradigm of metadata enriching and metadata filtering for the cultural heritage domain. Metadata enriching states that a priori metadata that is instantiated and granularly structured by metadata experts is continually enriched through socially-constructed (post-hoc) metadata, whereby users are proactively engaged in co-creating metadata. The principle also states that metadata that is enriched is also contextually and semantically linked and openly accessible. In addition, metadata filtering states that metadata resulting from implementing the principle of enriching should be displayed for users in line with their needs and convenience. In both enriching and filtering, users should be considered as prosumers, resulting in what is called collective metadata intelligence.

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    After addressing the state of the art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we have identified and analyzed gaps within the European research effort during our second year. In this period we focused on three directions: technological issues, user-centred issues and use-cases, and socio-economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of the functional breakdown of a generic multimedia search engine, and secondly, representative use-case descriptions with a related discussion of the requirements for technological challenges. Both studies have been carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations in international conferences, and surveys addressed to EU project coordinators as well as national initiative coordinators. Based on the obtained feedback we identified two types of gaps, namely core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges, but have an impact on innovation progress. New socio-economic trends are presented as well as emerging legal challenges.

    Overview of VideoCLEF 2009: New perspectives on speech-based multimedia content enrichment

    VideoCLEF 2009 offered three tasks related to enriching video content for improved multimedia access in a multilingual environment. For each task, video data (Dutch-language television, predominantly documentaries) accompanied by speech recognition transcripts were provided. The Subject Classification Task involved automatic tagging of videos with subject theme labels. The best performance was achieved by approaching subject tagging as an information retrieval task and using both speech recognition transcripts and archival metadata. Alternatively, classifiers were trained using either the training data provided or data collected from Wikipedia or via general Web search. The Affect Task involved detecting narrative peaks, defined as points where viewers perceive heightened dramatic tension. The task was carried out on the “Beeldenstorm” collection containing 45 short-form documentaries on the visual arts. The best runs exploited affective vocabulary and audience-directed speech. Other approaches included using topic changes, elevated speaking pitch, increased speaking intensity and radical visual changes. The Linking Task, also called “Finding Related Resources Across Languages,” involved linking video to material on the same subject in a different language. Participants were provided with a list of multimedia anchors (short video segments) in the Dutch-language “Beeldenstorm” collection and were expected to return target pages drawn from English-language Wikipedia. The best performing methods used the transcript of the speech spoken during the multimedia anchor to build a query to search an index of the Dutch-language Wikipedia. The Dutch Wikipedia pages returned were used to identify related English pages. Participants also experimented with pseudo-relevance feedback, query translation and methods that targeted proper names.

    Collaboration in the Semantic Grid: a Basis for e-Learning

    The CoAKTinG project aims to advance the state of the art in collaborative mediated spaces for the Semantic Grid. This paper presents an overview of the hypertext and knowledge-based tools which have been deployed to augment existing collaborative environments, and the ontology which is used to exchange structure, promote enhanced process tracking, and aid navigation of resources before, after, and while a collaboration occurs. While the primary focus of the project has been supporting e-Science, this paper also explores the similarities and application of CoAKTinG technologies as part of a human-centred design approach to e-Learning.

    Characterizing the Landscape of Musical Data on the Web: State of the Art and Challenges

    Musical data can be analysed, combined, transformed and exploited for diverse purposes. However, despite the proliferation of digital libraries and repositories for music, infrastructures and tools, such uses of musical data remain scarce. As an initial step to help fill this gap, we present a survey of the landscape of musical data on the Web, available as a Linked Open Dataset: the musoW dataset of catalogued musical resources. We present the dataset and the methodology and criteria for its creation and assessment. We map the identified dimensions and parameters to existing Linked Data vocabularies, present insights gained from SPARQL queries, and identify significant relations between resource features. We present a thematic analysis of the original research questions associated with surveyed resources and identify the extent to which the collected resources are Linked Data-ready.

    Video browsing interfaces and applications: a review

    We present a comprehensive review of the state of the art in video browsing and retrieval systems, with special emphasis on interfaces and applications. There has been a significant increase in activity (e.g., storage, retrieval, and sharing) employing video data in the past decade, both for personal and professional use. The ever-growing amount of video content available for human consumption and the inherent characteristics of video data—which, if presented in its raw format, is rather unwieldy and costly—have become driving forces for the development of more effective solutions to present video contents and allow rich user interaction. As a result, there are many contemporary research efforts toward developing better video browsing solutions, which we summarize. We review more than 40 different video browsing and retrieval interfaces and classify them into three groups: applications that use video-player-like interaction, video retrieval applications, and browsing solutions based on video surrogates. For each category, we present a summary of existing work, highlight the technical aspects of each solution, and compare them against each other