Search CORE

50 research outputs found

Strong and Efficient Baselines for Open Domain Conversational Question Answering

Author: Barlacchi Gianni
Coman Andrei C.
de Gispert Adrià
Publication venue
Publication date: 23/10/2023
Field of study

Unlike the Open Domain Question Answering (ODQA) setting, the conversational (ODConvQA) domain has received limited attention when it comes to reevaluating baselines for both efficiency and effectiveness. In this paper, we study the State-of-the-Art (SotA) Dense Passage Retrieval (DPR) retriever and Fusion-in-Decoder (FiD) reader pipeline, and show that it significantly underperforms when applied to ODConvQA tasks due to various limitations. We then propose and evaluate strong yet simple and efficient baselines, by introducing a fast reranking component between the retriever and the reader, and by performing targeted finetuning steps. Experiments on two ODConvQA tasks, namely TopiOCQA and OR-QuAC, show that our method improves the SotA results, while reducing the reader's latency by 60%. Finally, we provide new and valuable insights into the development of challenging baselines that serve as a reference for future, more intricate approaches, including those that leverage Large Language Models (LLMs). Comment: Accepted to EMNLP 2023 Findings

arXiv.org e-Print Archive

N-gram posterior probability confidence measures for statistical machine translation: an empirical study

Author: Adrià de Gispert
Gonzalo Iglesias
Graeme Blackwood
William Byrne
Publication venue: Springer Nature
Publication date: 01/01/2012
Field of study

Springer - Publisher Connector

Multi-representation Ensembles and Delayed SGD Updates Improve Syntax-based NMT

Author: Byrne WJ
De Gispert Adrià
Saunders Danielle
Stahlberg Felix
Publication venue: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Publication date: 01/01/2018
Field of study

We explore strategies for incorporating target syntax into Neural Machine Translation. We specifically focus on syntax in ensembles containing multiple sentence representations. We formulate beam search over such ensembles using WFSTs, and describe a delayed SGD update training procedure that is especially effective for long representations like linearized syntax. Our approach gives state-of-the-art performance on a difficult Japanese-English task. This work was supported by EPSRC grant EP/L027623/1.

arXiv.org e-Print Archive

Crossref

Apollo (Cambridge)

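Modelo estocástico de traducción basado en N-gramas de tuplas bilingües y combinación log-lineal de características [Stochastic translation model based on N-grams of bilingual tuples and log-linear feature combination]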

Author: Banchs Martínez Rafael Enrique
Crego Clemente Josep María
Gispert Ramis Adrià de
Lambert Patrik
Mariño Acebal José Bernardo
Rodríguez Fonollosa José Adrián
Ruiz Costa-Jussà Marta
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2005
Field of study

This paper introduces a stochastic machine translation system based on N-gram modelling of the joint probability of bilingual texts. The basic unit of the model is the tuple, a pair of word strings in the source language (to be translated) and the target language (the translation). Translation is driven by a log-linear combination of the N-gram translation model probability and other features, following the maximum entropy approach. Translation performance is evaluated on a speech translation task: translation between English and Spanish (in both directions) of transcribed speeches by members of the European Parliament. The system reaches state-of-the-art performance. This work was partially funded by CICYT through project TIC2002-04447-C02 (ALIADO) and by the European Union through project FP6-506738 (TC-STAR).

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Secretaría de Estado de Cultura

Pushdown automata in statistical machine translation

Author: Adrià de Gispert
Aho Alfred V.
Bar-Hillel Y.
Bill Byrne
Blackwood Graeme
Brants Thorsten
Chang Yin-Wen
Chelba Ciprian
Cyril Allauzen
Dyer Chris
Gonzalo Iglesias
Hopkins M.
Huang Liang
Koo Terry
Kumar Shankar
Ljolje Andrej
Michael Riley
Mohri Mehryar
Nederhof Mark-Jan
Roark Brian
Rush Alexander M.
Stolcke Andreas
Wu Dekai
Zens Richard
Publication venue: Computational Linguistics
Publication date: 01/01/2013
Field of study

This article describes the use of pushdown automata (PDA) in the context of statistical machine translation and alignment under a synchronous context-free grammar. We use PDAs to compactly represent the space of candidate translations generated by the grammar when applied to an input sentence. General-purpose PDA algorithms for replacement, composition, shortest path, and expansion are presented. We describe HiPDT, a hierarchical phrase-based decoder using the PDA representation and these algorithms. We contrast the complexity of this decoder with a decoder based on a finite state automata representation, showing that PDAs provide a more suitable framework to achieve exact decoding for larger synchronous context-free grammars and smaller language models. We assess this experimentally on a large-scale Chinese-to-English alignment and translation task. In translation, we propose a two-pass decoding strategy involving a weaker language model in the first-pass to address the results of PDA complexity analysis. We study in depth the experimental conditions and tradeoffs in which HiPDT can achieve state-of-the-art performance for large-scale SMT.

CiteSeerX

Crossref

Apollo (Cambridge)

Famílies botàniques de plantes medicinals [Botanical families of medicinal plants]

Author: Alcoverro Godoy Carmen
Alles Pascual Roser
Alvarez Lorenzo Paula
Anglí Herrero Oriol
Bagan Perez Nuria
Balaguer Pi Nuria
Balcells Mestre Maria
Ballesté Márquez Jéssica
Bartolomé Schneider Carla
Batlle de Balle Mercadé Laura
Belmonte Llorens Judit
Bujedo Moreno Sergi
Caballero Roman Aitor
Calafat Gestoso Mario
Camps Vilar Nuria
Cardús Agra Aida
Casas Serrano Rut
Castanheira Margarida
Castilla i Amorós Laia
Castro de la Cortina Laia
Chavero Pieres Marta
Ciocia Nicola
Collado Lorenzo Jessica
Corominas Auguets Mònica
Costa Santamaria Berta
Dalmases Gener Marc de
Dios Regadera Montserrat de
Domingo Gesteiro Adrián
Dorado Cordero Desirée
Escrigas Albó Helena Evangelista
Farré Carrera Laia
Farré Segura Jordi
Fernández Catalán Miren
Fernández Martínez Gerard
Ferreras Barrero Oriol
Forcen Arenas Meritxell
Formiga Ribas Estel
Fragüet Español Beatriz
Fuertes Flores Lara
Fàbregas Vàzquez Júlia
Garcia i Salvador Nestor
García Marquina Cristian
Garrell Soler Gemma
Garrido Lopez Ainoa
Giménez García del Moral María
Gispert Latorre Laia
Gomez de la Peña Celia
Gomez-Guiu Hormigos Ma. Lourdes
Grau Calzada Victoria
Grau Ortiz Miquel
Grima Arcos Núria
Gómez Fusté Clàudia Maria
Herráez Nieto Silvia
Hurtado Espino Silvia
Iglesias Rodrigo Mireia
Izquierdo Pérez Noelia
Kaichouh Agrirch Mimoun
Labraña Sánchez Carme
Lara Arteaga Maria Dolores
Lasurt Barés Claudia
Llarden Mediavilla Arnau
Llibre Perez Monica
Llibre Perez Sandra
Llorca Lorenzo Sonia
Llorente Lopez Xavier
Luque Castro Adrià
Luque Salvat Marc
López Alonso Javier
López Ruiz Sergio
Madurell i Blanes Laura
Manouchehri Aminian S.
March Rodríguez Elena
Martell Alonso Clàudia
Martinez Alguacil Helena
Martinez Bosch Laia
Martínez Riveros Héctor
Martínez Samitier Àlex
Mas Rincón Irene
Matas Ayala Anaïs
Miranda Jimenez Cristina
Molas Casellas Júlia
Molina Trullàs Júlia
Moner Gomez Sofia
Moral Anter David
Morera Nadal Júlia
Moya Martinez Mari Carmen
Munte Jesus Guillem
Nadal Serrano Maria de Lluc
Narvaez Serrano Daniel Lluis
Navarro Pinin Laura
Nevado Maza Sara
Oliver Sintes Cristóbal
Oros Olondriz Alberto
Ortega Herrero Natalia
Ortega Moreno Angel
Pachón Díaz Carles
Pagans Llivina Silvia
Pau Parra Alba
Perez Prats Marc
Pino Alamos Maria Pilar
Pons Hospital Santiago
Pucuji Tierra Lizeth Estefaní
Puig Puig Júlia
Raventós Aymar Cristina
Redondo Vahle Ana
Reyner Parra Andrés Joaquím
Riba Baqués Marta
Rio Martinez Helena
Rodriguez Isidro Pol
Roig Rossello Mariona
Roig Turner Gemma
Rojas Ligeron John Henry
Ros Peña Alba
Rosendo Masià Cristina
Rubio Petit Núria
Ruiz Avila Genesis
Ruíz Mateo Héctor
Sanchez Perez Cristina
Santomà Cardús Alex
Sanz Peñalver Sara
Soler Mallart Guillem
Tanyà Rovira Anna
Thorson Bofarull Leif
Torres Solera Olga
Torres Vila Maria
Turu Pedrola Marta
Vega Rodríguez Laura
Veiret Duart Gabriel
Vela Pérez Miriam
Ventura Molina Pere
Vidal Bernaltes Marta
Vintu Stefan Silviu
Ye Wenxi
Publication venue
Publication date: 01/09/2014
Field of study

Faculty of Pharmacy, Universitat de Barcelona. Degree: Bachelor's Degree in Pharmacy. Course: Pharmaceutical Botany. Academic year: 2013-2014. Coordinators: Joan Simon, Cèsar Blanché and Maria Bosch. The materials presented here are a collection of 175 assignments, each covering a botanical family of medicinal interest and each completed individually. The assignments were produced by all students in groups M-2 and M-3 of the Pharmaceutical Botany course during April and May of the 2013-14 academic year. All assignments were carried out on the GoogleDocs platform, supervised by the course professor, and reviewed and finally co-assessed by the students themselves. The main aim of the activity was to foster autonomous and collaborative learning in pharmaceutical botany.

Diposit Digital de la Universitat de Barcelona