Learning to Expand: Reinforced Pseudo-relevance Feedback Selection for Information-seeking Conversations

Author: Chen Cen
Chen Haiqing
Huang Jun
Ji Feng
Pan Haojie
Qiu Minghui
Yang Liu
Publication date: 25/11/2020

Intelligent personal assistant systems for information-seeking conversations are increasingly popular in real-world applications, especially at e-commerce companies. As research on such conversation systems has developed, pseudo-relevance feedback (PRF) has proven effective at incorporating relevance signals from external documents. However, existing studies either rely on heuristic rules or require heavy manual labeling. In this work, we treat PRF selection as a learning task and propose a reinforcement-learning-based method that can be trained end to end without any human annotations. More specifically, we propose a reinforced selector that extracts useful PRF terms to enhance response candidates, and a BERT-based response ranker that ranks the PRF-enhanced responses. The performance of the ranker serves as the reward that guides the selector toward useful PRF terms, and thus boosts task performance. Extensive experiments on both standard benchmark and commercial datasets show the superiority of our reinforced PRF term selector over other potential soft or hard selection methods. Both qualitative case studies and quantitative analysis show that our model not only selects meaningful PRF terms to expand response candidates but also achieves the best results among all baseline methods on a variety of evaluation metrics. We have also deployed our method in online production at an e-commerce company, where it shows a significant improvement over the existing online ranking system.

arXiv.org e-Print Archive

Response Ranking with Deep Matching Networks and External Knowledge in Information-seeking Conversation Systems

Author: Arguello J.
Bordes A.
Dhingra B.
Hu B.
Kenter T.
Kingma D. P.
Li J.
Li J.
Lowe R.
Mikolov T.
Pang L.
Qiu M.
Ritter A.
Robertson S.
Shang L.
Sordoni A.
Thomas P.
Tian Z.
Wan S.
Wang H.
Wen T.
Wu Y.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 09/05/2018

Intelligent personal assistant systems with either text-based or voice-based conversational interfaces are becoming increasingly popular around the world. Retrieval-based conversation models have the advantage of returning fluent and informative responses. Most existing studies in this area focus on open-domain "chit-chat" conversations or on task- or transaction-oriented conversations. More research is needed on information-seeking conversations. Current conversational models also rarely model external knowledge beyond the dialog utterances. In this paper, we propose a learning framework on top of deep neural matching networks that leverages external knowledge for response ranking in information-seeking conversation systems. We incorporate external knowledge into deep neural models with pseudo-relevance feedback and QA correspondence knowledge distillation. Extensive experiments with three information-seeking conversation data sets, including both open benchmarks and commercial data, show that our methods outperform various baseline methods, including several deep text matching models and the state-of-the-art method for response selection in multi-turn conversations. We also analyze different response types, model variations, and ranking examples. Our models and research findings provide new insights into how to use external knowledge with deep neural models for response selection, and have implications for the design of the next generation of information-seeking conversation systems.

Comment: Accepted by the 41st International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2018), Ann Arbor, Michigan, U.S.A., July 8-12, 2018 (Full Oral Paper)

arXiv.org e-Print Archive

Crossref

Harnessing Evolution of Multi-Turn Conversations for Effective Answer Retrieval

Author: Aliannejadi Mohammad
Azzopardi Leif
Dalton Jeffrey
Devlin Jacob
Dietz Laura
Hashemi Seyyed Hadi
Loisel Alain
Nguyen Tri
Pennington Jeffrey
Robertson Stephen E.
Sun Yueming
Vaswani Ashish
Vtyurina Alexandra
Walker Marilyn A.
Williams Jason
Zhang Yongfeng
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2020

With the improvements in speech recognition and voice generation technologies in recent years, many companies have sought to develop conversation understanding systems that run on mobile phones or smart home devices through natural language interfaces. Conversational assistants, such as Google Assistant and Microsoft Cortana, can help users complete various types of tasks. This requires an accurate understanding of the user's information need as the conversation evolves over multiple turns. Finding relevant context in a conversation's history is challenging because of the complexity of natural language and the evolution of a user's information need. In this work, we present an extensive analysis of the language, relevance, and dependency of user utterances in multi-turn information-seeking conversations. To this end, we have annotated relevant utterances in the conversations released by the TREC CaST 2019 track. The annotation labels determine which of the previous utterances in a conversation can be used to improve the current one. Furthermore, we propose a neural utterance relevance model based on BERT fine-tuning, which outperforms competitive baselines. We study and compare the performance of multiple retrieval models that use different strategies to incorporate the user's context. The experimental results on both classification and retrieval tasks show that our proposed approach can effectively identify and incorporate the conversation context. We show that processing the current utterance using the predicted relevant utterance leads to a 38% relative improvement in terms of nDCG@20. Finally, to foster research in this area, we have released the dataset of annotations.

Comment: To appear in ACM CHIIR 2020, Vancouver, BC, Canada

arXiv.org e-Print Archive

Crossref

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Extracting audio summaries to support effective spoken document search

Author: Al-Maskari
Besser
Jayasinghe
Larson
Manning
Ordelman
Porter
Sahib
Sakai
Salton
Thong
Tombros
Publication venue: 'Wiley'

Crossref

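Més que mil paraules: Funcionament i estat de la qüestió de la cerca i recuperació d'informació multimèdia basada en el contingut [More Than a Thousand Words: Operation and State of the Art of Content-Based Multimedia Information Search and Retrieval]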

Author: Cantarell Gutiérrez Oriol
Publication date: 25/09/2019

Final degree projects in Information and Documentation, Facultat d'Informació i Mitjans Audiovisuals, Universitat de Barcelona, Academic year: 2018-2019, Gema Santos-Hermosa. The exponential growth of multimedia documentation since the popularization of the Internet, and especially since the advent of smartphones, means that retrieving this content through associated metadata now requires additional search methodologies and information retrieval systems adapted to the needs of this content. Content-based information search and retrieval is a highly active, multifaceted research field whose application requires multidisciplinary knowledge. It involves understanding how information is described, searched, and retrieved; how computer pattern recognition processes work; and, in some cases, how these systems interact with and are integrated into more complex systems that go beyond simple user query and response. In fact, these systems are so deeply integrated into others that both search and retrieval end up being internal processes of closed software. For example, in assisted driving, image capture and interpretation are handled entirely by the vehicle's on-board computer. The processes involved in the vehicle's recognition of moving objects are the same as those in a search for an image using another image: a moving body is identified and, depending on its type, the vehicle gradually adapts its speed or, when overtaking, its trajectory, to maintain safe distances. In other cases, there is direct interaction between the content-based information retrieval system and a user seeking a specific answer, sometimes without being fully aware of the search procedures the system has used.
For example, the user captures a song with Shazam, which returns the song title and the band; but the user's ultimate intention is probably not the information retrieval itself but being able to listen to the song again later, so the application offers the option of either buying it in a digital store or even adding it to a music streaming service. In short, multimedia information search and retrieval is part of many technologies, not only of the present but also of the future, and it is and will be part of various multimedia information retrieval services. This work aims to understand how these systems work and to establish the state of the art in content-based information search and retrieval. In other words, it aims to understand the application and implications of using systems in which, instead of entering a textual query, the user enters images, sounds, or even short video fragments to obtain other such items or associated textual information.

Diposit Digital de la Universitat de Barcelona