Query recovery of short user queries: on query expansion with stopwords
User queries to search engines are observed to predominantly contain inflected content words but lack stopwords and capitalization. Thus, they often resemble natural language queries after case folding and stopword removal. Query recovery aims to generate a linguistically well-formed query from a given user query, to serve as input for natural language processing tasks and cross-language information retrieval (CLIR). The evaluation of query translation shows that translation scores (NIST and BLEU) decrease after case folding, stopword removal, and stemming. A baseline method for query recovery reconstructs capitalization and stopwords, which considerably increases translation scores and significantly increases mean average precision for a standard CLIR task.
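The degradation that query recovery tries to undo can be illustrated with a minimal sketch. It is not the paper's method: the stopword list and the function name `reduce_query` are assumptions made for illustration only. It shows how case folding and stopword removal turn a natural language question into something resembling a typical user query.

```python
import re

# Tiny illustrative stopword list (an assumption; real systems use far larger lists).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "for", "to", "is", "are", "what", "and"}


def reduce_query(text: str) -> str:
    """Case-fold the text and drop stopwords (hypothetical helper)."""
    tokens = re.findall(r"\w+", text.lower())  # case folding + simple tokenization
    return " ".join(t for t in tokens if t not in STOPWORDS)


print(reduce_query("What are the effects of acid rain in Germany?"))
# prints: effects acid rain germany
```

Query recovery works in the opposite direction: given the reduced form, it tries to restore the capitalization and stopwords.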
DCU and UTA at ImageCLEFPhoto 2007
Dublin City University (DCU) and University of Tampere (UTA) participated in the ImageCLEF 2007 photographic ad-hoc retrieval task with several monolingual and bilingual runs. Our approach was language independent: text retrieval based on fuzzy s-gram query translation was combined with visual retrieval. Data fusion between text and image content was performed using unsupervised query-time weight generation approaches. Our baseline was a combination of dictionary-based query translation and visual retrieval, which achieved the best result. The best mixed modality runs using fuzzy s-gram translation achieved on average around 83% of the performance of the baseline. Performance was more similar when only top rank precision levels of P10 and P20 were considered. This suggests that fuzzy s-gram query translation combined with visual retrieval is a cheap alternative for cross-lingual image retrieval where only a small number of relevant items are required. Both sets of results emphasize the merit of our query-time weight generation schemes for data fusion, with the fused runs exhibiting marked performance increases over single modalities. This is achieved without the use of any prior training data.
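As a rough illustration of fusing text and image retrieval scores, the sketch below applies min-max normalization followed by a weighted linear combination. The abstract does not specify how the query-time weights are generated, so here they are simply passed in as parameters; the function names `min_max` and `fuse` and the sample scores are hypothetical and do not reproduce the DCU/UTA scheme.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1] so the two modalities are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(text_scores, image_scores, w_text, w_image):
    """Rank documents by a weighted sum of normalized text and image scores."""
    t, v = min_max(text_scores), min_max(image_scores)
    docs = set(t) | set(v)  # a document missing from one modality scores 0 there
    fused = {d: w_text * t.get(d, 0.0) + w_image * v.get(d, 0.0) for d in docs}
    return sorted(fused, key=fused.get, reverse=True)


# Made-up scores for four documents; the weights are assumed, not learned.
text = {"a": 2.0, "b": 1.0, "c": 0.0}
image = {"b": 0.9, "c": 0.1, "d": 0.5}
print(fuse(text, image, w_text=0.5, w_image=0.5))
```

In a query-time scheme, `w_text` and `w_image` would be derived per query rather than fixed in advance.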
Trio-One: Layering Uncertainty and Lineage on a Conventional DBMS
Trio is a new kind of database system that supports data, uncertainty, and lineage in a fully integrated manner. The first Trio prototype, dubbed Trio-One, is built on top of a conventional DBMS using data and query translation techniques together with a small number of stored procedures. This paper describes Trio-One's translation scheme and system architecture, showing how it efficiently and easily supports the Trio data model and query language.
PRIME: A System for Multi-lingual Patent Retrieval
Given the growing number of patents filed in multiple countries, users are
interested in retrieving patents across languages. We propose a multi-lingual
patent retrieval system, which translates a user query into the target
language, searches a multilingual database for patents relevant to the query,
and improves the browsing efficiency by way of machine translation and
clustering. Our system also extracts new translations from patent families
consisting of comparable patents, to enhance the translation dictionary.
Domain-specific query translation for multilingual access to digital libraries
Accurate high-coverage translation is a vital component of reliable cross-language information access (CLIR) systems. This is particularly true of access to archives such as Digital Libraries, which are often specific to certain domains. While general machine translation (MT) has been shown to be effective for CLIR tasks in information retrieval evaluation workshops, it is not well suited to specialized tasks where domain-specific translations are required. We demonstrate that effective query translation in the domain of cultural heritage (CH) can be achieved by augmenting a standard MT system with domain-specific phrase dictionaries automatically mined from the online Wikipedia. Experiments using our hybrid translation system with sample query logs from users of CH websites demonstrate a large improvement in the accuracy of domain-specific phrase detection and translation.
Object-oriented querying of existing relational databases
In this paper, we present algorithms which allow an object-oriented
querying of existing relational databases. Our goal is to provide an improved query
interface for relational systems with better query facilities than SQL. This
seems to be very important since, in real world applications, relational systems
are most commonly used and their dominance will remain in the near future. To
overcome the drawbacks of relational systems, especially the poor query facilities
of SQL, we propose a schema transformation and a query translation algorithm.
The schema transformation algorithm uses additional semantic information to enhance
the relational schema and transform it into a corresponding object-oriented
schema. If the additional semantic information can be deduced from an underlying
entity-relationship design schema, the schema transformation may be done
fully automatically. To query the created object-oriented schema, we use the
Structured Object Query Language (SOQL) which provides declarative query facilities
on objects. SOQL queries using the created object-oriented schema are
much shorter, easier to write and understand, and more intuitive than the corresponding
SQL queries, leading to enhanced usability and improved querying of
the database. The query translation algorithm automatically translates SOQL queries
into equivalent SQL queries for the original relational schema.