92,401 research outputs found
Embedding Web-based Statistical Translation Models in Cross-Language Information Retrieval
Although more and more language pairs are covered by machine translation
services, there are still many pairs that lack translation resources.
Cross-language information retrieval (CLIR) is an application which needs
translation functionality of a relatively low level of sophistication since
current models for information retrieval (IR) are still based on a
bag-of-words. The Web provides a vast resource for the automatic construction
of parallel corpora which can be used to train statistical translation models
automatically. The resulting translation models can be embedded in several ways
in a retrieval model. In this paper, we will investigate the problem of
automatically mining parallel texts from the Web and different ways of
integrating the translation models within the retrieval process. Our
experiments on standard test collections for CLIR show that the Web-based
translation models can surpass commercial MT systems in CLIR tasks. These
results open the perspective of constructing a fully automatic query
translation device for CLIR at a very low cost.Comment: 37 page
A Molecular Biology Database Digest
Computational Biology or Bioinformatics has been defined as the application of mathematical
and Computer Science methods to solving problems in Molecular Biology that require large scale
data, computation, and analysis [18]. As expected, Molecular Biology databases play an essential
role in Computational Biology research and development. This paper introduces into current
Molecular Biology databases, stressing data modeling, data acquisition, data retrieval, and the
integration of Molecular Biology data from different sources. This paper is primarily intended
for an audience of computer scientists with a limited background in Biology
Bilingual episodic memory: an introduction
Our current models of bilingual memory are essentially accounts of semantic memory whose goal is to explain bilingual lexical access to underlying imagistic and conceptual referents. While this research has included episodic memory, it has focused largely on recall for words, phrases, and sentences in the service of understanding the structure of semantic memory. Building on the four papers in this special issue, this article focuses on larger units of episodic memory(from quotidian events with simple narrative form to complex autobiographical memories) in service of developing a model of bilingual episodic memory. This requires integrating theory and research on how culture-specific narrative traditions inform encoding and retrieval with theory and research on the relation between(monolingual) semantic and episodic memory(Schank, 1982; Schank & Abelson, 1995; Tulving, 2002). Then, taking a cue from memory-based text processing studies in psycholinguistics(McKoon & Ratcliff, 1998), we suggest that as language forms surface in the progressive retrieval of features of an event, they trigger further forms within the same language serving to guide a within-language/ within-culture retrieval
Exploiting Query Structure and Document Structure to Improve Document Retrieval Effectiveness
In this paper we present a systematic analysis of document
retrieval using unstructured and structured queries within
the score region algebra (SRA) structured retrieval framework. The behavior of di®erent retrieval models, namely
Boolean, tf.idf, GPX, language models, and Okapi, is tested
using the transparent SRA framework in our three-level structured retrieval system called TIJAH. The retrieval models are implemented along four elementary retrieval aspects: element and term selection, element score computation, score combination, and score propagation.
The analysis is performed on a numerous experiments
evaluated on TREC and CLEF collections, using manually
generated unstructured and structured queries. Unstructured queries range from the short title queries to long title
+ description + narrative queries. For generating structured
queries we exploit the knowledge of the document structure
and the content used to semantically describe or classify
documents. We show that such structured information can
be utilized in retrieval engines to give more precise answers to user queries then when using unstructured queries
A Survey on Retrieval of Mathematical Knowledge
We present a short survey of the literature on indexing and retrieval of
mathematical knowledge, with pointers to 72 papers and tentative taxonomies of
both retrieval problems and recurring techniques.Comment: CICM 2015, 20 page
- …