18 research outputs found

    Overview of VideoCLEF 2008: Automatic generation of topic-based feeds for dual language audio-visual content

    Get PDF
    The VideoCLEF track, introduced in 2008, aims to develop and evaluate tasks related to analysis of and access to multilingual multimedia content. In its first year, VideoCLEF piloted the Vid2RSS task, whose main subtask was the classification of dual language video (Dutchlanguage television content featuring English-speaking experts and studio guests). The task offered two additional discretionary subtasks: feed translation and automatic keyframe extraction. Task participants were supplied with Dutch archival metadata, Dutch speech transcripts, English speech transcripts and 10 thematic category labels, which they were required to assign to the test set videos. The videos were grouped by class label into topic-based RSS-feeds, displaying title, description and keyframe for each video. Five groups participated in the 2008 VideoCLEF track. Participants were required to collect their own training data; both Wikipedia and general web content were used. Groups deployed various classifiers (SVM, Naive Bayes and k-NN) or treated the problem as an information retrieval task. Both the Dutch speech transcripts and the archival metadata performed well as sources of indexing features, but no group succeeded in exploiting combinations of feature sources to significantly enhance performance. A small scale fluency/adequacy evaluation of the translation task output revealed the translation to be of sufficient quality to make it valuable to a non-Dutch speaking English speaker. For keyframe extraction, the strategy chosen was to select the keyframe from the shot with the most representative speech transcript content. The automatically selected shots were shown, with a small user study, to be competitive with manually selected shots. Future years of VideoCLEF will aim to expand the corpus and the class label list, as well as to extend the track to additional tasks

    Classification of dual language audio-visual content: Introduction to the VideoCLEF 2008 pilot benchmark evaluation task

    Get PDF
    VideoCLEF is a new track for the CLEF 2008 campaign. This track aims to develop and evaluate tasks in analyzing multilingual video content. A pilot of a Vid2RSS task involving assigning thematic class labels to video kicks off the VideoCLEF track in 2008. Task participants deliver classification results in the form of a series of feeds, one for each thematic class. The data for the task are dual language television documentaries. Dutch is the dominant language and English-language content (mostly interviews) is embedded. Participants are provided with speech recognition transcripts of the data in both Dutch and English, and also with metadata generated by archivists. In addition to the classification task, participants can choose to participate in a translation task (translating the feed into a language of their choice) and a keyframe selection task (choosing a semantically appropriate keyframe for depiction of the videos in the feed)

    CHORUS Deliverable 4.5: Report of the 3rd CHORUS Conference

    Get PDF
    The third and last CHORUS conference on Multimedia Search Engines took place from the 26th to the 27th of May 2009 in Brussels, Belgium. About 100 participants from 15 European countries, the US, Japan and Australia learned about the latest developments in the domain. An exhibition of 13 stands presented 16 research projects currently ongoing around the world

    Retrieval of high-dimensional visual data: current state, trends and challenges ahead

    Get PDF
    Information retrieval algorithms have changed the way we manage and use various data sources, such as images, music or multimedia collections. First, free text information of documents from varying sources became accessible in addition to structured data in databases, initially for exact search and then for more probabilistic models. Novel approaches enable content-based visual search of images using computerized image analysis making visual image content searchable without requiring high quality manual annotations. Other multimedia data followed such as video and music retrieval, sometimes based on techniques such as extracting objects and classifying genre. 3D (surface) objects and solid textures have also been produced in quickly increasing quantities, for example in medical tomographic imaging. For these two types of 3D information sources, systems have become available to characterize the objects or textures and search for similar visual content in large databases. With 3D moving sequences (i.e., 4D), in particular medical imaging, even higher-dimensional data have become available for analysis and retrieval and currently present many multimedia retrieval challenges. This article systematically reviews current techniques in various fields of 3D and 4D visual information retrieval and analyses the currently dominating application areas. The employed techniques are analysed and regrouped to highlight similarities and complementarities among them in order to guide the choice of optimal approaches for new 3D and 4D retrieval problems. Opportunities for future applications conclude the article. 3D or higher-dimensional visual information retrieval is expected to grow quickly in the coming years and in this respect this article can serve as a basis for designing new applications

    Searching Spontaneous Conversational Speech:Proceedings of ACM SIGIR Workshop (SSCS2008)

    Get PDF

    Spoken content retrieval: A survey of techniques and technologies

    Get PDF
    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR

    Developing a distributed electronic health-record store for India

    Get PDF
    The DIGHT project is addressing the problem of building a scalable and highly available information store for the Electronic Health Records (EHRs) of the over one billion citizens of India

    Use Case Oriented Medical Visual Information Retrieval & System Evaluation

    Get PDF
    Large amounts of medical visual data are produced daily in hospitals, while new imaging techniques continue to emerge. In addition, many images are made available continuously via publications in the scientific literature and can also be valuable for clinical routine, research and education. Information retrieval systems are useful tools to provide access to the biomedical literature and fulfil the information needs of medical professionals. The tools developed in this thesis can potentially help clinicians make decisions about difficult diagnoses via a case-based retrieval system based on a use case associated with a specific evaluation task. This system retrieves articles from the biomedical literature when querying with a case description and attached images. This thesis proposes a multimodal approach for medical case-based retrieval with focus on the integration of visual information connected to text. Furthermore, the ImageCLEFmed evaluation campaign was organised during this thesis promoting medical retrieval system evaluation
    corecore