Sign languages are the primary means of communication for many
hard-of-hearing people worldwide. Recently, to bridge the communication gap
between the hard-of-hearing community and the rest of the population, several
sign language translation datasets have been proposed to enable the development
of statistical sign language translation systems. However, there is a dearth of
sign language resources for the Indian sign language. This resource paper
introduces ISLTranslate, a translation dataset for continuous Indian Sign
Language (ISL) consisting of 31k ISL-English sentence/phrase pairs. To the best
of our knowledge, it is the largest translation dataset for continuous Indian
Sign Language. We provide a detailed analysis of the dataset. To validate the
performance of existing end-to-end Sign language to spoken language translation
systems, we benchmark the created dataset with a transformer-based model for
ISL translation.Comment: Accepted at ACL 2023 Findings, 8 Page