Search CORE

6,111 research outputs found

Learning to merge search results for efficient Distributed Information Retrieval

Author: Hiemstra Djoerd
Tjin-Kam-Jet Kien-Tsoi T.E.
Publication venue: Radboud University
Publication date: 01/01/2010
Field of study

Merging search results from different servers is a major problem in Distributed Information Retrieval. We used Regression-SVM and Ranking-SVM which would learn a function that merges results based on information that is readily available: i.e. the ranks, titles, summaries and URLs contained in the results pages. By not downloading additional information, such as the full document, we decrease bandwidth usage. CORI and Round Robin merging were used as our baselines; surprisingly, our results show that the SVM-methods do not improve over those baselines

CiteSeerX

Radboud Repository

University of Twente Research Information

Multiple Retrieval Models and Regression Models for Prior Art Search

Author: Lopez Patrice
Romary Laurent
Publication venue
Publication date: 01/01/2009
Field of study

This paper presents the system called PATATRAS (PATent and Article Tracking, Retrieval and AnalysiS) realized for the IP track of CLEF 2009. Our approach presents three main characteristics: 1. The usage of multiple retrieval models (KL, Okapi) and term index definitions (lemma, phrase, concept) for the three languages considered in the present track (English, French, German) producing ten different sets of ranked results. 2. The merging of the different results based on multiple regression models using an additional validation set created from the patent collection. 3. The exploitation of patent metadata and of the citation structures for creating restricted initial working sets of patents and for producing a final re-ranking regression model. As we exploit specific metadata of the patent documents and the citation relations only at the creation of initial working sets and during the final post ranking step, our architecture remains generic and easy to extend

arXiv.org e-Print Archive

HAL-CentraleSupelec

CiteSeerX

INRIA a CCSD electronic archive server

HAL-Rennes 1

Distributed Information Retrieval using Keyword Auctions

Author: Hiemstra D.
Publication venue: Centre for Telematics and Information Technology, University of Twente
Publication date: 01/01/2008
Field of study

This report motivates the need for large-scale distributed approaches to information retrieval, and proposes solutions based on keyword auctions

CiteSeerX

Radboud Repository

University of Twente Research Information

Neo: A Learned Query Optimizer

Author: Alizadeh Mohammad
Kraska Tim
Mao Hongzi
Marcus Ryan
Negi Parimarjan
Papaemmanouil Olga
Tatbul Nesime
Zhang Chi
Publication venue: 'VLDB Endowment'
Publication date: 07/04/2019
Field of study

Query optimization is one of the most challenging problems in database systems. Despite the progress made over the past decades, query optimizers remain extremely complex components that require a great deal of hand-tuning for specific workloads and datasets. Motivated by this shortcoming and inspired by recent advances in applying machine learning to data management challenges, we introduce Neo (Neural Optimizer), a novel learning-based query optimizer that relies on deep neural networks to generate query executions plans. Neo bootstraps its query optimization model from existing optimizers and continues to learn from incoming queries, building upon its successes and learning from its failures. Furthermore, Neo naturally adapts to underlying data patterns and is robust to estimation errors. Experimental results demonstrate that Neo, even when bootstrapped from a simple optimizer like PostgreSQL, can learn a model that offers similar performance to state-of-the-art commercial optimizers, and in some cases even surpass them

arXiv.org e-Print Archive

DSpace@MIT