64 research outputs found

A Reinforcement Learning-driven Translation Model for Search-Oriented Conversational Systems

Author: Aissa Wafa
Denoyer Ludovic
Soulier Laure
Publication date: 29/08/2018

Search-oriented conversational systems rely on information needs expressed in natural language (NL). We focus here on the understanding of NL expressions for building keyword-based queries. We propose a reinforcement-learning-driven translation model framework able to 1) learn the translation from NL expressions to queries in a supervised way, and 2) overcome the lack of large-scale datasets by framing the translation model as a word selection approach and injecting relevance feedback into the learning process. Experiments are carried out on two TREC datasets and outline the effectiveness of our approach. Comment: This is the author's pre-print version of the work. It is posted here for your personal use, not for redistribution. Please cite the definitive version which will be published in Proceedings of the 2018 EMNLP Workshop SCAI: The 2nd International Workshop on Search-Oriented Conversational AI - ISBN: 978-1-948087-75-

arXiv.org e-Print Archive

Overview of BioCreAtIvE: critical assessment of information extraction for biology

Author: Blaschke Christian
Hirschman Lynette
Valencia Alfonso
Yeh Alexander
Publication venue: BioMed Central
Publication date: 01/01/2005

Background: The goal of the first BioCreAtIvE challenge (Critical Assessment of Information Extraction in Biology) was to provide a set of common evaluation tasks to assess the state of the art for text mining applied to biological problems. The results were presented in a workshop held in Granada, Spain, March 28–31, 2004. The articles collected in this BMC Bioinformatics supplement entitled "A critical assessment of text mining methods in molecular biology" describe the BioCreAtIvE tasks, systems, results and their independent evaluation. Results: BioCreAtIvE focused on two tasks. The first dealt with extraction of gene or protein names from text, and their mapping into standardized gene identifiers for three model organism databases (fly, mouse, yeast). The second task addressed issues of functional annotation, requiring systems to identify specific text passages that supported Gene Ontology annotations for specific proteins, given full text articles. Conclusion: The first BioCreAtIvE assessment achieved a high level of international participation (27 groups from 10 countries). The assessment provided state-of-the-art performance results for a basic task (gene name finding and normalization), where the best systems achieved a balanced 80% precision / recall or better, which potentially makes them suitable for real applications in biology. The results for the advanced task (functional annotation from free text) were significantly lower, demonstrating the current limitations of text-mining approaches where knowledge extrapolation and interpretation are required. In addition, an important contribution of BioCreAtIvE has been the creation and release of training and test data sets for both tasks. There are 22 articles in this special issue, including six that provide analyses of results or data quality for the data sets, including a novel inter-annotator consistency assessment for the test set used in task 2.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digital.CSIC

Building and Evaluating Open-Domain Dialogue Corpora with Clarifying Questions

Author: Aliannejadi M.
Burtsev M.
Chuklin A.
Dalton J.
Kiseleva J.
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2021

International Migration, Integration and Social Cohesion online publications

CONQRR: Conversational Query Rewriting for Retrieval with Reinforcement Learning

Author: Hajishirzi Hannaneh
Luan Yi
Ostendorf Mari
Rashkin Hannah
Reitter David
Tomar Gaurav Singh
Wu Zeqiu
Publication date: 01/05/2022

Compared to standard retrieval tasks, passage retrieval for conversational question answering (CQA) poses new challenges in understanding the current user question, as each question needs to be interpreted within the dialogue context. Moreover, it can be expensive to re-train well-established retrievers such as search engines that were originally developed for non-conversational queries. To facilitate their use, we develop a query rewriting model, CONQRR, that rewrites a conversational question in context into a standalone question. It is trained with a novel reward function to directly optimize towards retrieval using reinforcement learning and can be adapted to any off-the-shelf retriever. We show that CONQRR achieves state-of-the-art results on a recent open-domain CQA dataset containing conversations from three different sources, and is effective for two different off-the-shelf retrievers. Our extensive analysis also shows the robustness of CONQRR to out-of-domain dialogues as well as to zero query rewriting supervision.

arXiv.org e-Print Archive

Corpus-Level End-to-End Exploration for Interactive Systems

Author: Tang Zhiwen
Yang Grace Hui
Publication venue: Association for the Advancement of Artificial Intelligence (AAAI)
Publication date: 03/04/2020

A core interest in building Artificial Intelligence (AI) agents is to let them interact with and assist humans. One example is Dynamic Search (DS), which models the process in which a human works with a search engine agent to accomplish a complex and goal-oriented task. Early DS agents using Reinforcement Learning (RL) have achieved only limited success because of (1) their lack of direct control over which documents to return and (2) the difficulty of recovering from wrong search trajectories. In this paper, we present a novel corpus-level end-to-end exploration (CE3) method to address these issues. In our method, an entire text corpus is compressed into a global low-dimensional representation, which enables the agent to gain access to the full state and action spaces, including the under-explored areas. We also propose a new form of retrieval function, whose linear approximation allows end-to-end manipulation of documents. Experiments on the Text REtrieval Conference (TREC) Dynamic Domain (DD) Track show that CE3 outperforms the state-of-the-art DS systems. Comment: Accepted into AAAI 202

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Question Rewriting in Conversational Question Answering

Author: Anantha R.
Longpre S.
Tu Z.
Vakulenko S.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2021

International Migration, Integration and Social Cohesion online publications

Patent Retrieval in Chemistry based on semantically tagged Named Entities

Author: Buckland Lori P.
Fluck Juliane
Friedrich Christoph M.
Gurulingappa Harsha
Hofmann-Apitius Martin
Klinger Roman
Mevissen Heinz-Theo
Müller Bernd
Voorhees Ellen M.
Publication date: 01/01/2009

Gurulingappa H, Müller B, Klinger R, et al. Patent Retrieval in Chemistry based on semantically tagged Named Entities. In: Voorhees EM, Buckland LP, eds. The Eighteenth Text RETrieval Conference (TREC 2009) Proceedings. Gaithersburg, Maryland, USA; 2009. This paper reports on the work that has been conducted by Fraunhofer SCAI for the TREC Chemistry (TREC-CHEM) track 2009. The team of Fraunhofer SCAI participated in two tasks, namely Technology Survey and Prior Art Search. The core of the framework is an index of 1.2 million chemical patents provided as a data set by TREC. For the technology survey, three runs were submitted based on semantic dictionaries and noun phrases. For the prior art search task, several fields were introduced into the index that contained normalized noun phrases, biomedical as well as chemical entities. Altogether, 36 runs were submitted for this task that were based on automatic querying with tokens, noun phrases and entities along with different search strategies.

Fraunhofer-ePrints

Publications at Bielefeld University

Integrating structure in the probabilistic model for Information Retrieval

Author: Géry Mathias
Largeron Christine
Thollard Franck
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 09/12/2008

In databases or in the World Wide Web, many documents are in a structured format (e.g. XML). We propose in this article to extend the classical IR probabilistic model in order to take into account the structure through the weighting of tags. Our approach includes a learning step in which the weight of each tag is computed. This weight estimates the probability that the tag distinguishes the terms which are the most relevant. Our model has been evaluated on a large collection during INEX IR evaluation campaigns.

HAL-UJM

Crossref