35 research outputs found

    Segmentation-Free Streaming Machine Translation

    Streaming Machine Translation (MT) is the task of translating an unbounded input text stream in real time. The traditional cascade approach, which combines an Automatic Speech Recognition (ASR) system and an MT system, relies on an intermediate segmentation step that splits the transcription stream into sentence-like units. However, imposing a hard segmentation constrains the MT system and is a source of errors. This paper proposes a Segmentation-Free framework that enables the model to translate an unsegmented source stream by delaying the segmentation decision until the translation has been generated. Extensive experiments show that the proposed Segmentation-Free framework achieves a better quality-latency trade-off than competing approaches that use an independent segmentation model. Software, data and models will be released upon paper acceptance. Comment: 11 pages, 5 figures

    VivesDebate-Speech: A Corpus of Spoken Argumentation to Leverage Audio Features for Argument Mining

    In this paper, we describe VivesDebate-Speech, a corpus of spoken argumentation created to leverage audio features for argument mining tasks. The creation of this corpus represents an important contribution to the intersection of the speech processing and argument mining communities, and it is one of the most complete publicly available resources on this topic. Moreover, we have performed a set of first-of-their-kind experiments which show an improvement when integrating audio features into the argument mining pipeline. The provided results can be used as a baseline for future research.

    The spatial ultimatum game revisited

    We revisit the issue of the emergence of fair behavior in the framework of the spatial Ultimatum game, adding many important results and insights to the pioneering work by Page et al. [2000. The spatial Ultimatum game. Proc. R. Soc. London B 267, 2177], who showed in a specific example that in a two-dimensional setup evolution may lead to strategies with some degree of fairness. Within this spatial framework, we carry out a thorough simulation study and show that the emergence of altruism is a very generic phenomenon whose details depend on the dynamics considered. A very frequent feature is the spontaneous emergence and fixation of quasi-empathetic individuals, whose offers are very close to their acceptance thresholds. We present analytical arguments that allow an understanding of our results and give insights into the manner in which local effects in evolution may lead to such non-rational or apparently maladaptive behaviors. This work was supported in part by MICINN (Spain) through grants MOSAICO and RESINEE, and by Comunidad de Madrid (Spain) through grant MODELICO. J. I. is supported by a contract from the Consejería de Educación of the Comunidad de Madrid (Spain) and from the European Social Fund.

    Vrain at IroSvA 2019: Exploring classical and transfer learning approaches to short message irony detection

    This paper describes VRAIN’s participation in the IroSvA 2019: Irony Detection in Spanish Variants task of the Iberian Languages Evaluation Forum (IberLEF 2019). We describe the entire pre-processing, feature extraction, model selection and hyperparameter optimization carried out for our submissions to the shared task. A central part of our work is to provide an in-depth comparison of the performance of different classical machine learning techniques, as well as some recent transfer learning proposals for Natural Language Processing (NLP) classification problems.

    VivesDebate-Speech: A Corpus of Spoken Argumentation to Leverage Audio Features for Argument Mining

    In this paper, we describe VivesDebate-Speech, a corpus of spoken argumentation created to leverage audio features for argument mining tasks. The creation of this corpus represents an important contribution to the intersection of the speech processing and argument mining communities, and it is one of the most complete publicly available resources on this topic. Moreover, we have performed a set of first-of-their-kind experiments which show an improvement when integrating audio features into the argument mining pipeline. The provided results can be used as a baseline for future research. Comment: 5 pages; EMNLP 2023 Accepted Version

    From Simultaneous to Streaming Machine Translation by Leveraging Streaming History

    Simultaneous Machine Translation is the task of incrementally translating an input sentence before it is fully available. Currently, simultaneous translation is carried out by translating each sentence independently of the previously translated text. More generally, Streaming MT can be understood as an extension of Simultaneous MT to the incremental translation of a continuous input text stream. In this work, a state-of-the-art simultaneous sentence-level MT system is extended to the streaming setup by leveraging the streaming history. Extensive empirical results are reported on IWSLT Translation Tasks, showing that leveraging the streaming history leads to significant quality gains. In particular, the proposed system compares favorably to the best performing systems. Comment: ACL 2022 - Camera ready; v3: expanded data pre-processing

    Stream-level Latency Evaluation for Simultaneous Machine Translation

    Simultaneous machine translation has recently gained traction thanks to significant quality improvements and the advent of streaming applications. Simultaneous translation systems need to find a trade-off between translation quality and response time, and to this end multiple latency measures have been proposed. However, latency evaluations for simultaneous translation are estimated at the sentence level, without taking into account the sequential nature of a streaming scenario. Indeed, these sentence-level latency measures are not well suited for continuous stream translation, resulting in figures that are not coherent with the simultaneous translation policy of the system being assessed. This work proposes a stream-level adaptation of the current latency measures, based on a re-segmentation approach applied to the output translation, which is successfully evaluated under streaming conditions for a reference IWSLT task. The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation program under grant agreements no. 761758 (X5Gon) and 952215 (TAILOR) and the Erasmus+ Education program under grant agreement no. 20-226-093604-SCH; the Government of Spain's research project Multisub, ref. RTI2018-094879-B-I00 (MCIU/AEI/FEDER, EU) and FPU scholarships FPU18/04135; and the Generalitat Valenciana's research project Classroom Activity Recognition, ref. PROMETEO/2019/111. Iranzo-Sánchez, J.; Civera Saiz, J.; Juan, A. (2021). Stream-level Latency Evaluation for Simultaneous Machine Translation. The Association for Computational Linguistics. 664-670. http://hdl.handle.net/10251/182203

    The MLLP-UPV Spanish-Portuguese and Portuguese-Spanish Machine Translation Systems for WMT19 Similar Language Translation Task

    This paper describes the participation of the MLLP research group of the Universitat Politècnica de València in the WMT 2019 Similar Language Translation Shared Task. We have submitted systems for the Portuguese ↔ Spanish language pair, in both directions. We have submitted systems based on the Transformer architecture as well as a novel architecture, still in development, which we have called 2D alternating RNN. We have carried out domain adaptation through fine-tuning. The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement no. 761758 (X5gon); the Government of Spain's research project Multisub, ref. RTI2018-094879-B-I00 (MCIU/AEI/FEDER, EU) and the Generalitat Valenciana's predoctoral research scholarship ACIF/2017/055. Baquero-Arnal, P.; Iranzo-Sánchez, J.; Civera Saiz, J.; Juan, A. (2019). The MLLP-UPV Spanish-Portuguese and Portuguese-Spanish Machine Translation Systems for WMT19 Similar Language Translation Task. The Association for Computational Linguistics. 179-184. http://hdl.handle.net/10251/180621