
99,434 research outputs found

Content-based video indexing for the support of digital library search

Author: Agrawal Rakesh
Apers Peter M.G.
Blok H.E.
Jonker Willem
Kersten M.
Petkovic M.
van Zwol Roelof
Windhouwer M.
Publication venue: IEEE Computer Society Press
Publication date: 01/01/2002

Presents a digital library search engine that combines efforts of the AMIS and DMW research projects, each covering significant parts of the problem of finding the required information in an enormous mass of data. The most important contributions of our work are the following: (1) We demonstrate a flexible solution for the extraction and querying of meta-data from multimedia documents in general. (2) Scalability and efficiency support are illustrated for full-text indexing and retrieval. (3) We show how, for a more limited domain, like an intranet, conceptual modelling can offer additional and more powerful query facilities. (4) In the limited domain case, we demonstrate how domain knowledge can be used to interpret low-level features into semantic content. In this short description, we focus on the first and fourth items.

CWI's Institutional Repository

Pure OAI Repository

University of Twente Research Information

International Migration, Integration and Social Cohesion online publications

Associative conceptual space-based information retrieval systems

Author: Berg J.H. (Jan) van den
Schuemie M.J. (Martijn)
Publication date: 01/01/1998

In this 'Information Era', with the availability of large collections of books, articles, journals, CD-ROMs, video films and so on, there exists an increasing need for intelligent information retrieval systems that enable users to find the desired information easily. Many attempts have been made to construct such retrieval systems, including the electronic ones used in libraries and the search engines for the World Wide Web. In many cases, however, the so-called 'precision' and 'recall' of these systems leave much to be desired. In this paper, a new AI-based retrieval system is proposed, inspired by, among other things, the WEBSOM algorithm. However, contrary to that approach, where domain knowledge is extracted from the full text of all books, we propose a system where certain specific meta-information is automatically assembled using only the index of every document. This knowledge extraction process results in a new type of concept space, the so-called Associative Conceptual Space, where the 'concepts' found in all documents are clustered using a Hebbian-type learning algorithm. Then, each document can be characterised by comparing the concepts occurring in it to those present in the associative conceptual space. Applying these characterisations, all documents can be clustered such that semantically similar documents lie close together on a Self-Organising Map. This map can easily be inspected by its user.
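The Hebbian-type clustering of index concepts mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the update rule, the `learning_rate` and `decay` parameters, and the helper names `hebbian_associations` and `strongest_neighbours` are all assumptions chosen for illustration. The sketch only shows the general idea that links between concepts grow stronger each time the concepts co-occur in a document's index.

```python
from collections import defaultdict
from itertools import combinations


def hebbian_associations(doc_indexes, learning_rate=0.5, decay=0.0):
    """Return association weights between index terms that co-occur.

    Hypothetical helper: each document is represented only by its list of
    index terms, and a link is strengthened whenever two terms appear in
    the same index.
    """
    weights = defaultdict(float)
    for terms in doc_indexes:
        # Optional passive decay of every existing link before the update.
        for pair in list(weights):
            weights[pair] *= 1.0 - decay
        # Strengthen links between co-occurring terms; weights saturate at 1.0.
        for a, b in combinations(sorted(set(terms)), 2):
            weights[(a, b)] += learning_rate * (1.0 - weights[(a, b)])
    return dict(weights)


def strongest_neighbours(term, weights, top_n=3):
    """List the terms most strongly associated with `term`."""
    scored = []
    for (a, b), w in weights.items():
        if term == a:
            scored.append((b, w))
        elif term == b:
            scored.append((a, w))
    return sorted(scored, key=lambda item: -item[1])[:top_n]


# Made-up document indexes, purely for illustration.
indexes = [
    ["retrieval", "index"],
    ["retrieval", "index"],
    ["retrieval", "map"],
]
weights = hebbian_associations(indexes)
print(strongest_neighbours("retrieval", weights))
```

In this toy run, "index" ends up more strongly linked to "retrieval" than "map" does, because the first pair co-occurs twice. A full system along the lines of the abstract would then describe each document by its concepts' positions in this space before mapping documents onto a Self-Organising Map; that step is not shown here.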

EUR Research Repository

Erasmus University Digital Repository

Ontology-based document representation for biomedical information retrieval

Author: Camous Fabrice
Publication venue: Dublin City University. School of Computing
Publication date: 01/01/2007

In the current era of fast sequencing of entire genomes, more data is becoming available for analysis. This data analysis, in turn, leads to an increasing amount of scientific publications. Consequently, biologists spend a considerable part of their time searching the biomedical literature. This avoids expensive experiment duplications in wet labs, and provides inspiration for new hypotheses. Unfortunately, the fast growth of biological information, in the form of free-text, has led to a lack of standard in the naming of biological entities. As a result, different genes are referred to with the same name, or acronym, and different names refer to the same gene. The ambiguity of free-text is problematic, as the success of a search often relies on the matching of a query term with a term contained in the document representation. Biomedical ontologies, when available, can help disambiguate the information expressed in free-text: they provide unique terms to represent concepts and therefore counterweight the occurrence of synonyms and polysems in free-text. They also contain information about the relationships between concepts. This information can be used to understand and evaluate semantic similarities between concepts. The largest repository of biomedical research literature in the world, MEDLINE, is an entry point to biomedical information for most biologists (Hersh et al., 2004). The Medical Subject Headings (MeSH) is the controlled vocabulary used in MEDLINE to annotate the conceptual content of biomedical articles. The annotations include information about the importance of MeSH concepts in the article, and their contexts. The MeSH ontology is organized in several hierarchies that indicate the level of specificity of the MeSH concepts. This hierarchical information can be used to generate semantic similarities between concepts.
Our motivation is the improvement of MEDLINE search, as it is still a central information access point for biologists in spite of the growing availability of full journal articles on the Web. In particular, we focus on the use of the MeSH ontology to represent and retrieve biomedical articles. Although MeSH is widely used by current MEDLINE search methods, we show that the information contained in MEDLINE MeSH annotations and the MeSH hierarchies is often overlooked. We hypothesize that MeSH-based document representation can improve MEDLINE information retrieval. Specifically, our hypothesis is that the integration of information about concept relevance (from the MEDLINE annotation), and interconcept similarities (from the MeSH hierarchies), will improve retrieval performance. We evaluate methods using such information to discriminate and compare MeSH concepts. Our methods are evaluated in the context of MEDLINE ad hoc document retrieval and document binary classifications. Our evaluations use standard datasets and metrics recently used at the Genomics track of the 2005 Text Retrieval Conference workshop.
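As an illustration of how hierarchical positions can yield a similarity score between concepts, here is a minimal sketch. It is not the method evaluated in the thesis. It assumes each concept is identified by a dot-separated tree number, and it uses a Wu-Palmer-style ratio of shared depth to total depth. The function names and the tree numbers in the usage lines are made up and do not correspond to real MeSH entries.

```python
def shared_depth(tree_a, tree_b):
    """Count the leading hierarchy levels that two tree numbers share."""
    depth = 0
    for level_a, level_b in zip(tree_a.split("."), tree_b.split(".")):
        if level_a != level_b:
            break
        depth += 1
    return depth


def hierarchy_similarity(tree_a, tree_b):
    """Wu-Palmer-style similarity: 2 * shared depth / (depth_a + depth_b).

    Returns 1.0 for identical positions and 0.0 when the two concepts
    sit in different top-level branches.
    """
    depth_a = len(tree_a.split("."))
    depth_b = len(tree_b.split("."))
    return 2.0 * shared_depth(tree_a, tree_b) / (depth_a + depth_b)


# Hypothetical tree numbers, for illustration only.
print(hierarchy_similarity("X01.100.200", "X01.100.300"))  # siblings
print(hierarchy_similarity("X01.100.200", "Y02.500"))      # unrelated branches
```

Under this measure, sibling concepts deep in the hierarchy score higher than concepts that only share a top-level category, which reflects the idea that deeper levels are more specific.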

Irish Universities

DCU Online Research Access Service

Automated speech and audio analysis for semantic access to multimedia

Author: Huijbregts Marijn
Jong Franciska de
Ordelman Roeland
Publication venue: Springer Verlag
Publication date: 01/01/2006

The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content, and as a consequence, improve the effectiveness of conceptual access tools. This paper overviews the various ways in which automatic speech and audio analysis can contribute to increased granularity of automatically extracted metadata. A number of techniques will be presented, including the alignment of speech and text resources, large vocabulary speech recognition, key word spotting and speaker classification. The applicability of techniques will be discussed from a media crossing perspective. The added value of the techniques and their potential contribution to the content value chain will be illustrated by the description of two (complementary) demonstrators for browsing broadcast news archives.

University of Twente Research Information

Multimedia search without visual analysis: the value of linguistic and contextual information

Author: Jong Franciska M.G. de
Vries Arjen P. de
Westerveld Thijs
Publication venue: IEEE Computer Society Press
Publication date: 01/01/2007

This paper addresses the focus of this special issue by analyzing the potential contribution of linguistic content and other non-image aspects to the processing of audiovisual data. It summarizes the various ways in which linguistic content analysis contributes to enhancing the semantic annotation of multimedia content, and, as a consequence, to improving the effectiveness of conceptual media access tools. A number of techniques are presented, including the time-alignment of textual resources, audio and speech processing, content reduction and reasoning tools, and the exploitation of surface features.

CiteSeerX

CWI's Institutional Repository

University of Twente Research Information

Conceptual biology, hypothesis discovery, and text mining: Swanson's legacy

Author: Bekhuis T
Publication venue: Springer Science and Business Media LLC
Publication date: 03/04/2006

Innovative biomedical librarians and information specialists who want to expand their roles as expert searchers need to know about profound changes in biology and parallel trends in text mining. In recent years, conceptual biology has emerged as a complement to empirical biology. This is partly in response to the availability of massive digital resources such as the network of databases for molecular biologists at the National Center for Biotechnology Information. Developments in text mining and hypothesis discovery systems based on the early work of Swanson, a mathematician and information scientist, are coincident with the emergence of conceptual biology. Very little has been written to introduce biomedical digital librarians to these new trends. In this paper, background for data and text mining, as well as for knowledge discovery in databases (KDD) and in text (KDT), is presented, then a brief review of Swanson's ideas, followed by a discussion of recent approaches to hypothesis discovery and testing. 'Testing' in the context of text mining involves partially automated methods for finding evidence in the literature to support hypothetical relationships. Concluding remarks follow regarding (a) the limits of current strategies for evaluation of hypothesis discovery systems and (b) the role of literature-based discovery in concert with empirical research. A report of an informatics-driven literature review for biomarkers of systemic lupus erythematosus is mentioned. Swanson's vision of the hidden value in the literature of science and, by extension, in biomedical digital databases, is still remarkably generative for information scientists, biologists, and physicians. © 2006 Bekhuis; licensee BioMed Central Ltd.

Springer - Publisher Connector

PubMed Central

D-Scholarship@Pitt

Natural Language Processing for Information Retrieval and Knowledge Discovery

Author: Liddy Elizabeth D.
Publication venue: Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
Publication date: 01/01/1998

Natural Language Processing (NLP) is a powerful technology for the vital tasks of information retrieval (IR) and knowledge discovery (KD) which, in turn, feed the visualization systems of the present and future and enable knowledge workers to focus more of their time on the vital tasks of analysis and prediction.

Illinois Digital Environment for Access to Learning and Scholarship Repository

Abstracts and Abstracting in Knowledge Discovery

Author: Lancaster F.W.
Pinto Maria
Publication venue: Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
Publication date: 01/01/1999


Illinois Digital Environment for Access to Learning and Scholarship Repository

Conceptual search – ESI, litigation and the issue of language

Author: Chaplin D.T.
Publication date: 25/06/2008

Across the globe, legal, business and technical practitioners charged with managing information are continually challenged by rapid-fire evolution and growth in the legal and technology fields. In the United States, new compliance requirements, amendments to the Federal Rules of Civil Procedure (FRCP) and corresponding case law, along with technical advances, have made litigation support one of the most exciting professions in the legal arena. In the UK, revisions to the Practice Direction to CPR Rule 31 require parties in civil litigation to consider the impacts associated with electronic documents. One emerging technology trend, both aiding and complicating the management of electronically stored information (ESI) in litigation in the US, EU and UK alike, is the notion of “conceptual search.” This paper focuses on the evolution of conceptual search technology, on predictions of where this science will take legal professionals and technical information managers in coming years, and on the advantages conceptual search can provide in dealing with the issue of language. The paper focuses primarily on the latent semantic analysis approach to conceptual search and on why this approach is advantageous when searching ESI regardless of the language used in the documents, even to the extent of allowing for cross-language searching and accurate searching of documents that co-mingle foreign terms with the native language.

UCL Discovery