Relevance of ASR for the Automatic Generation of Keywords Suggestions for TV programs

Author: Brugman Hennie
Gazendam Luit
Heeren Willemijn
Malaisé Véronique
Ordelman Roeland
Publication venue: Association pour le traitement automatique des langues
Publication date: 01/01/2009

Semantic access to multimedia content in audiovisual archives depends to a large extent on the quantity and quality of the metadata, and particularly on the content descriptions attached to the individual items. However, given the growing amount of material created on a daily basis and the digitization of existing analogue collections, traditional manual annotation puts heavy demands on resources, especially for large audiovisual archives. One way to address this challenge is to introduce (semi-)automatic annotation techniques for generating and/or enhancing metadata. The NWO-funded CATCH-CHOICE project has investigated the extraction of keywords from textual resources related to the TV programs to be archived (context documents), in collaboration with the Dutch audiovisual archive Sound and Vision. Besides the descriptions of the programs published by the broadcasters on their Websites, Automatic Speech Transcription (ASR) techniques from the CATCH-CHoral project also provide textual resources that might be relevant for suggesting keywords. This paper investigates the suitability of ASR for generating such keywords, which we evaluate against manual annotations of the documents and against keywords automatically generated from context documents.

VU Research Portal

University of Twente Research Information

Sound and Vision Publications

Experiments in terabyte searching, genomic retrieval and novelty detection for TREC 2004

Author: Blott Stephen
Boydell Oisín
Camous Fabrice
Ferguson Paul
Gaughan Georgina
Gurrin Cathal
Jones Gareth J.F.
Murphy Noel
O'Connor Noel E.
Smeaton Alan F.
Smyth Barry
Wilkins Peter
Publication venue: 'University of Aden - Faculty of Economics and Administration'
Publication date: 01/11/2004

In TREC 2004, Dublin City University took part in three tracks: Terabyte (in collaboration with University College Dublin), Genomic and Novelty. In this paper we discuss each track separately and present separate conclusions from this work. In addition, we give a general description of a text retrieval engine that we have developed in the last year to support our experiments in large-scale, distributed information retrieval, and which underlies all of the track experiments described in this document.

Irish Universities

DCU Online Research Access Service

Finding predominant word senses in untagged text

Author: Carroll John
Koeling Rob
McCarthy Diana Frances
Weeds Julie
Publication date: 01/01/2004

In word sense disambiguation (WSD), the heuristic of choosing the most common sense is extremely powerful because the distribution of the senses of a word is often skewed. The problem with using the predominant, or first-sense, heuristic, aside from the fact that it does not take surrounding context into account, is that it assumes some quantity of hand-tagged data. Whilst there are a few hand-tagged corpora available for some languages, one would expect the frequency distribution of the senses of words, particularly topical words, to depend on the genre and domain of the text under consideration. We present work on the use of a thesaurus acquired from raw textual corpora and the WordNet similarity package to find predominant noun senses automatically. The acquired predominant senses give a precision of 64% on the nouns of the SENSEVAL-2 English all-words task. This is a very promising result given that our method does not require any hand-tagged text, such as SemCor. Furthermore, we demonstrate that our method discovers appropriate predominant senses for words from two domain-specific corpora.
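As a rough illustration of the idea in this abstract, the sketch below ranks the senses of a word by letting its thesaurus neighbours vote for the sense they are most similar to. The abstract does not give the scoring formula, so the weighting used here (neighbour weight times normalised sense similarity) is an assumption, and the function name `rank_senses`, the sense labels and all scores are hypothetical toy values rather than output of the WordNet similarity package.

```python
from collections import defaultdict


def rank_senses(senses, neighbours, sense_similarity):
    """Rank candidate senses of a word, most prevalent first.

    senses: list of sense labels for the target word.
    neighbours: list of (neighbour_word, distributional_weight) pairs,
        as might come from an automatically acquired thesaurus.
    sense_similarity: dict mapping (sense, neighbour_word) to a
        similarity score (hypothetical values in this sketch).
    """
    scores = defaultdict(float)
    for neighbour, weight in neighbours:
        # Normalise so each neighbour distributes one "vote" across senses.
        total = sum(sense_similarity.get((s, neighbour), 0.0) for s in senses)
        if total == 0.0:
            continue  # this neighbour says nothing about any sense
        for s in senses:
            share = sense_similarity.get((s, neighbour), 0.0) / total
            scores[s] += weight * share
    return sorted(senses, key=lambda s: scores[s], reverse=True)


# Toy data for the noun "star" (all values are made up for illustration).
SENSES = ["star.celestial", "star.celebrity"]
NEIGHBOURS = [("planet", 0.4), ("galaxy", 0.3), ("actor", 0.2)]
SIMILARITY = {
    ("star.celestial", "planet"): 0.9, ("star.celebrity", "planet"): 0.1,
    ("star.celestial", "galaxy"): 0.8, ("star.celebrity", "galaxy"): 0.2,
    ("star.celestial", "actor"): 0.1, ("star.celebrity", "actor"): 0.9,
}

ranking = rank_senses(SENSES, NEIGHBOURS, SIMILARITY)
print(ranking)
```

With neighbours drawn from a different corpus (for example one dominated by film reviews), the same function would be expected to favour a different sense, which is the domain dependence the abstract points to.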

CiteSeerX

Crossref

Sussex Research Online

Cross-concordances: terminology mapping and its effectiveness for information retrieval

Author: Mayr Philipp
Petras Vivien
Publication date: 01/01/2008

The German Federal Ministry for Education and Research funded a major terminology mapping initiative, which concluded in 2007. The task of this initiative was to organize, create and manage 'cross-concordances' between controlled vocabularies (thesauri, classification systems, subject heading lists), centred around the social sciences but quickly extending to other subject areas. 64 crosswalks with more than 500,000 relations were established. In the final phase of the project, a major evaluation effort was conducted to test and measure the effectiveness of the vocabulary mappings in an information system environment. The paper reports on the cross-concordance work and evaluation results. Comment: 19 pages, 4 figures, 11 tables, IFLA conference 200

arXiv.org e-Print Archive

E-LIS

ProThes: Thesaurus-based Meta-Search Engine for a Specific Application Domain

Author: Alshanski G.
Braslavski P.
Shishkin A.
Браславский П. И.
Publication venue: WWW2004 : NY USA
Publication date: 01/01/2004

In this poster we introduce ProThes, a pilot meta-search engine (MSE) for a specific application domain. ProThes combines three approaches: meta-search, a graphical user interface (GUI) for query specification, and thesaurus-based query techniques. ProThes attempts to employ domain-specific knowledge, which is represented by both a conceptual thesaurus and results ranking heuristics. Since the knowledge representation is separated from the MSE core, adjusting the system to a specific domain is trouble-free. The thesaurus allows for manual query building and automatic query techniques. This poster outlines the overall system architecture, thesaurus representation format, and query operations. ProThes is implemented on the J2EE platform as a Web service. The project was supported in part by the Russian Fund of Basic Research, grant # 03-07-90342.

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

Query expansion with naive bayes for searching distributed collections

Author: Yang Hui
Zhang Minjie
Publication date: 01/01/2002

The proliferation of online information resources increases the importance of effective and efficient distributed searching. However, the problem of word mismatch seriously hurts the effectiveness of distributed information retrieval. Automatic query expansion has been suggested as a technique for dealing with the fundamental issue of word mismatch. In this paper, we propose a method, query expansion with Naive Bayes, to address the problem, discuss its implementation in the IISS system, and present experimental results demonstrating its effectiveness. This technique not only enhances the discriminatory power of typical queries for choosing the right collections but also significantly improves retrieval results.

CiteSeerX

Open Research Online (The Open University)

Thesaurus-assisted search term selection and query expansion: a review of user-centred studies

Author: Chowdhury G.
Revie C.W.
Shiri A.A.
Publication date: 01/01/2002

This paper provides a review of the literature related to the application of domain-specific thesauri in the search and retrieval process. Focusing on studies which adopt a user-centred approach, the review presents a survey of the methodologies and results from empirical studies undertaken on the use of thesauri as sources of term selection for query formulation and expansion during the search process. It summarises the ways in which domain-specific thesauri from different disciplines have been used by various types of users and how these tools aid users in the selection of search terms. The review consists of two main sections covering, firstly, studies on thesaurus-aided search term selection and, secondly, those dealing with query expansion using thesauri. Both sections are illustrated with case studies that have adopted a user-centred approach.
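For readers unfamiliar with the technique this review surveys, the following is a minimal sketch of thesaurus-based query expansion: each query term is looked up in a thesaurus and its related terms are appended to the query. It is not taken from any of the reviewed studies; the function name `expand_query`, the relation labels and the toy thesaurus entries are all assumptions made for illustration.

```python
def expand_query(terms, thesaurus, relations=("synonyms", "narrower")):
    """Return the query terms plus related thesaurus terms, without duplicates.

    terms: list of query terms as typed by the user.
    thesaurus: dict mapping a lower-cased term to a dict of
        relation name -> list of related terms (hypothetical format).
    relations: which thesaurus relations to follow when expanding.
    """
    expanded = []
    seen = set()
    for term in terms:
        candidates = [term]
        entry = thesaurus.get(term.lower(), {})
        for relation in relations:
            candidates.extend(entry.get(relation, []))
        for candidate in candidates:
            key = candidate.lower()
            if key not in seen:  # keep first occurrence, preserve order
                seen.add(key)
                expanded.append(candidate)
    return expanded


# Toy thesaurus (entries invented for this example).
TOY_THESAURUS = {
    "car": {"synonyms": ["automobile"], "narrower": ["hatchback", "sedan"]},
    "insurance": {"synonyms": ["cover"], "narrower": []},
}

expanded_terms = expand_query(["car", "insurance"], TOY_THESAURUS)
print(expanded_terms)
```

In an interactive, user-centred setting the candidate terms would typically be shown to the searcher for selection rather than added automatically; restricting `relations` to `("synonyms",)` is one simple way to make the automatic variant more conservative.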

University of Strathclyde Institutional Repository