35 research outputs found
Segmentation-Free Streaming Machine Translation
Streaming Machine Translation (MT) is the task of translating an unbounded
input text stream in real-time. The traditional cascade approach, which
combines an Automatic Speech Recognition (ASR) and an MT system, relies on an
intermediate segmentation step which splits the transcription stream into
sentence-like units. However, the incorporation of a hard segmentation
constrains the MT system and is a source of errors. This paper proposes a
Segmentation-Free framework that enables the model to translate an unsegmented
source stream by delaying the segmentation decision until the translation has
been generated. Extensive experiments show that the proposed Segmentation-Free
framework achieves a better quality-latency trade-off than competing approaches
that use an independent segmentation model. Software, data and models will be
released upon paper acceptance. Comment: 11 pages, 5 figures
VivesDebate-Speech: A Corpus of Spoken Argumentation to Leverage Audio Features for Argument Mining
In this paper, we describe VivesDebate-Speech, a corpus of spoken argumentation created to leverage audio features for argument mining tasks. The creation of this corpus represents an important contribution to the intersection of the speech processing and argument mining communities, and it is one of the most complete publicly available resources on this topic. Moreover, we have performed a set of first-of-their-kind experiments which show an improvement when integrating audio features into the argument mining pipeline. The provided results can be used as a baseline for future research.
The spatial ultimatum game revisited
We revisit the issue of the emergence of fair behavior in the framework of the spatial Ultimatum game, adding many important results and insights to the pioneering work by Page et al. [2000. The spatial Ultimatum game. Proc. R. Soc. London B 267, 2177], who showed in a specific example that in a two-dimensional setup evolution may lead to strategies with some degree of fairness. Within this spatial framework, we carry out a thorough simulation study and show that the emergence of altruism is a very generic phenomenon whose details depend on the dynamics considered. A very frequent feature is the spontaneous emergence and fixation of quasiempathetic individuals, whose offers are very close to their acceptance thresholds. We present analytical arguments that allow an understanding of our results and give insights into the manner in which local effects in evolution may lead to such non-rational or apparently maladaptive behaviors.
This work was supported in part by MICINN (Spain) through grants MOSAICO and RESINEE, and by Comunidad de Madrid (Spain) through grant MODELICO. J. I. is supported by a contract from the Consejería de Educación of the Comunidad de Madrid (Spain) and from the European Social Fund.
Vrain at IroSvA 2019: Exploring classical and transfer learning approaches to short message irony detection
This paper describes VRAIN’s participation in the IroSvA 2019: Irony Detection in Spanish Variants task of the Iberian Languages Evaluation Forum (IberLEF 2019). We describe the entire pre-processing, feature extraction, model selection and hyperparameter optimization carried out for our submissions to the shared task. A central part of our work is to provide an in-depth comparison of the performance of different classical machine learning techniques, as well as some recent transfer learning proposals for Natural Language Processing (NLP) classification problems.
VivesDebate-Speech: A Corpus of Spoken Argumentation to Leverage Audio Features for Argument Mining
In this paper, we describe VivesDebate-Speech, a corpus of spoken
argumentation created to leverage audio features for argument mining tasks. The
creation of this corpus represents an important contribution to the
intersection of the speech processing and argument mining communities, and it is
one of the most complete publicly available resources on this topic. Moreover, we
have performed a set of first-of-their-kind experiments which show an improvement
when integrating audio features into the argument mining pipeline. The provided
results can be used as a baseline for future research. Comment: 5 pages; EMNLP 2023 Accepted Version
From Simultaneous to Streaming Machine Translation by Leveraging Streaming History
Simultaneous Machine Translation is the task of incrementally translating an
input sentence before it is fully available. Currently, simultaneous
translation is carried out by translating each sentence independently of the
previously translated text. More generally, Streaming MT can be understood as
an extension of Simultaneous MT to the incremental translation of a continuous
input text stream. In this work, a state-of-the-art simultaneous sentence-level
MT system is extended to the streaming setup by leveraging the streaming
history. Extensive empirical results are reported on IWSLT Translation Tasks,
showing that leveraging the streaming history leads to significant quality
gains. In particular, the proposed system proves to compare favorably to the
best performing systems. Comment: ACL 2022 - Camera ready; v3: expanded data pre-processing
Stream-level Latency Evaluation for Simultaneous Machine Translation
Simultaneous machine translation has recently gained traction thanks to significant quality improvements and the advent of streaming applications. Simultaneous translation systems need to find a trade-off between translation quality and response time, and for this purpose multiple latency measures have been proposed. However, latency evaluations for simultaneous translation are estimated at the sentence level, not taking into account the sequential nature of a streaming scenario. Indeed, these sentence-level latency measures are not well suited for continuous stream translation, resulting in figures that are not coherent with the simultaneous translation policy of the system being assessed. This work proposes a stream-level adaptation of the current latency measures based on a re-segmentation approach applied to the output translation, which is successfully evaluated under streaming conditions for a reference IWSLT task.
The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation program under grant agreements no. 761758 (X5Gon) and 952215 (TAILOR) and the Erasmus+ Education program under grant agreement no. 20-226-093604-SCH; the Government of Spain's research project Multisub, ref. RTI2018-094879-B-I00 (MCIU/AEI/FEDER, EU) and FPU scholarships FPU18/04135; and the Generalitat Valenciana's research project Classroom Activity Recognition, ref. PROMETEO/2019/111.
Iranzo-Sánchez, J.; Civera Saiz, J.; Juan, A. (2021). Stream-level Latency Evaluation for Simultaneous Machine Translation. The Association for Computational Linguistics. 664-670. http://hdl.handle.net/10251/182203
The MLLP-UPV Spanish-Portuguese and Portuguese-Spanish Machine Translation Systems for WMT19 Similar Language Translation Task
This paper describes the participation of the MLLP research group of the Universitat Politècnica de València in the WMT 2019 Similar Language Translation Shared Task. We have submitted systems for the Portuguese ↔ Spanish language pair, in both directions. We have submitted systems based on the Transformer architecture as well as a novel architecture under development, which we have called 2D alternating RNN. We have carried out domain adaptation through fine-tuning.
The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement no. 761758 (X5gon); the Government of Spain's research project Multisub, ref. RTI2018-094879-B-I00 (MCIU/AEI/FEDER, EU) and the Generalitat Valenciana's predoctoral research scholarship ACIF/2017/055.
Baquero-Arnal, P.; Iranzo-Sánchez, J.; Civera Saiz, J.; Juan, A. (2019). The MLLP-UPV Spanish-Portuguese and Portuguese-Spanish Machine Translation Systems for WMT19 Similar Language Translation Task. The Association for Computational Linguistics. 179-184. http://hdl.handle.net/10251/180621