Considerations for meaningful sign language machine translation based on glosses

Author: Ebling Sarah
Jiang Zifan
Moryossef Amit
Müller Mathias
Rios Annette
Publication date: 28/11/2022

Automatic sign language processing is gaining popularity in Natural Language Processing (NLP) research (Yin et al., 2021). In machine translation (MT) in particular, sign language translation based on glosses is a prominent approach. In this paper, we review recent works on neural gloss translation. We find that limitations of glosses in general and limitations of specific datasets are not discussed in a transparent manner and that there is no common standard for evaluation. To address these issues, we put forward concrete recommendations for future research on gloss translation. Our suggestions advocate awareness of the inherent limitations of gloss-based approaches, realistic datasets, stronger baselines and convincing evaluation.

arXiv.org e-Print Archive

ZORA

Considerations for meaningful sign language machine translation based on glosses

Author: Ebling Sarah
Jiang Zifan
Moryossef Amit
Müller Mathias
Rios Annette
Publication venue: Association for Computational Linguistics
Publication date: 01/07/2023

ZORA

Gloss Attention for Gloss-free Sign Language Translation

Author: Jin Tao
Jin Weike
Tang Li
Yin Aoxiong
Zhao Zhou
Zhong Tianyun
Publication date: 14/07/2023

Most sign language translation (SLT) methods to date require gloss annotations to provide additional supervision; however, gloss annotations are not easy to acquire. To solve this problem, we first analyze existing models to confirm how gloss annotations make SLT easier. We find that glosses provide two kinds of information to the model: 1) they help the model implicitly learn the location of semantic boundaries in continuous sign language videos, and 2) they help the model understand the sign language video globally. We then propose gloss attention, which enables the model to keep its attention within video segments that have the same semantics locally, just as gloss helps existing models do. Furthermore, we transfer the knowledge of sentence-to-sentence similarity from the natural language model to our gloss attention SLT network (GASLT) to help it understand sign language videos at the sentence level. Experimental results on multiple large-scale sign language datasets show that our proposed GASLT model significantly outperforms existing methods. Our code is provided at https://github.com/YinAoXiong/GASLT

arXiv.org e-Print Archive
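
The abstract above describes gloss attention as keeping each frame's attention within a local video segment. The sketch below is a simplified illustration of that general idea only, assuming a fixed window around each frame. It is not the GASLT implementation (see the authors' repository for that), and the function name `local_window_attention` and the `window` parameter are hypothetical.

```python
import numpy as np


def local_window_attention(queries, keys, values, window=2):
    """Scaled dot-product attention restricted to a fixed local window.

    Hypothetical illustration: frame t may only attend to frames whose
    index differs from t by at most `window`. All inputs have shape (T, d).
    """
    num_frames, dim = queries.shape
    scores = queries @ keys.T / np.sqrt(dim)

    # Boolean mask that is True only for pairs of frames inside the window.
    idx = np.arange(num_frames)
    mask = np.abs(idx[:, None] - idx[None, :]) <= window

    # Frames outside the window get -inf so their softmax weight is zero.
    scores = np.where(mask, scores, -np.inf)

    # Numerically stable softmax over each row; the diagonal is always
    # inside the window, so every row has at least one finite score.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)

    return weights @ values, weights
```

Calling the function with per-frame feature vectors of shape (T, d) returns the attended features and a (T, T) weight matrix that is zero outside the band of width `window` around the diagonal.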