Recent progress in fine-grained gesture and action classification, and in
machine translation, points to the possibility of automated sign language
recognition becoming a reality. A key stumbling block in making progress
towards this goal is a lack of appropriate training data, stemming from the
high complexity of sign annotation and a limited supply of qualified
annotators. In this work, we introduce a new scalable approach to data
collection for sign recognition in continuous videos. We make use of
weakly-aligned subtitles for broadcast footage together with a keyword spotting
method to automatically localise sign instances for a vocabulary of 1,000 signs
in 1,000 hours of video. We make the following contributions: (1) We show how
to use mouthing cues from signers to obtain high-quality annotations from video
data; the result is the BSL-1K dataset, a collection of British Sign Language
(BSL) signs of unprecedented scale; (2) We show that we can use BSL-1K to train
strong sign recognition models for co-articulated signs in BSL and that these
models additionally form excellent pretraining for other sign languages and
benchmarks; we exceed the state of the art on both the MSASL and WLASL
benchmarks. Finally, (3) we propose new large-scale evaluation sets for the
tasks of sign recognition and sign spotting and provide baselines which we hope
will serve to stimulate research in this area.