A Nested Attention Neural Hybrid Model for Grammatical Error Correction
Grammatical error correction (GEC) systems strive to correct both global errors in word order and usage, and local errors in spelling and inflection. Building on recent work in neural machine translation, we propose a new hybrid neural model with nested attention layers for GEC. Experiments show that the new model can effectively correct errors of both types by incorporating word- and character-level information, and that it significantly outperforms previous neural models for GEC as measured on the standard CoNLL-14 benchmark dataset. Further analysis shows that the superiority of the proposed model can be largely attributed to the nested attention mechanism, which proves particularly effective in correcting local errors that involve small edits in orthography.
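The abstract does not spell out the architecture, so the sketch below is only a rough illustration of how a nested (word-level over character-level) attention step could be wired up in PyTorch; all names such as NestedAttention, hidden_dim, and char_dim are hypothetical and not taken from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class NestedAttention(nn.Module):
    """Illustrative nested attention: word-level attention over encoder states,
    refined by character-level attention inside each source word.
    Shapes and module names are assumptions, not the paper's exact design."""

    def __init__(self, hidden_dim: int, char_dim: int):
        super().__init__()
        self.word_score = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.char_score = nn.Linear(char_dim, hidden_dim, bias=False)

    def forward(self, dec_state, word_states, char_states):
        # dec_state:   (batch, hidden_dim)               current decoder state
        # word_states: (batch, src_len, hidden_dim)      word-level encoder outputs
        # char_states: (batch, src_len, char_len, char_dim) per-word character encodings

        # Outer (word-level) attention over source positions.
        word_logits = torch.einsum("bh,bsh->bs", dec_state, self.word_score(word_states))
        word_attn = F.softmax(word_logits, dim=-1)                     # (batch, src_len)

        # Inner (character-level) attention, computed within every source word.
        char_logits = torch.einsum("bh,bsch->bsc", dec_state, self.char_score(char_states))
        char_attn = F.softmax(char_logits, dim=-1)                     # (batch, src_len, char_len)
        char_context = torch.einsum("bsc,bscd->bsd", char_attn, char_states)

        # Word-level weights gate how much each word's character summary contributes.
        context = torch.einsum("bs,bsh->bh", word_attn, word_states)
        char_context = torch.einsum("bs,bsd->bd", word_attn, char_context)
        return context, char_context

# Minimal usage with random tensors (batch 2, 7 source words, 12 chars per word).
attn = NestedAttention(hidden_dim=256, char_dim=64)
ctx, char_ctx = attn(torch.randn(2, 256), torch.randn(2, 7, 256), torch.randn(2, 7, 12, 64))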
It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
When scaled to hundreds of billions of parameters, pretrained language models such as GPT-3 (Brown et al., 2020) achieve remarkable few-shot performance. However, enormous amounts of compute are required for training and applying such big models, resulting in a large carbon footprint and making it difficult for researchers and practitioners to use them. We show that performance similar to GPT-3 can be obtained with language models that are much “greener” in that their parameter count is several orders of magnitude smaller. This is achieved by converting textual inputs into cloze questions that contain a task description, combined with gradient-based optimization; exploiting unlabeled data gives further improvements. We identify key factors required for successful natural language understanding with small language models.
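As a hedged illustration of the cloze reformulation the abstract describes, the snippet below rewrites a single input as a cloze question and scores candidate answers with a masked language model via Hugging Face Transformers. The pattern, verbalizer words, and model choice are assumptions made for this sketch, and the gradient-based fine-tuning step the abstract mentions is not shown.

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Hypothetical pattern/verbalizer pair for binary sentiment; the actual method
# combines several such patterns and fine-tunes the model on them.
MODEL = "roberta-base"  # assumption: any masked LM will do for the sketch
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForMaskedLM.from_pretrained(MODEL)
VERBALIZER = {"positive": " great", "negative": " terrible"}

def cloze_score(text: str) -> dict:
    """Rewrite the input as a cloze question and read off the masked-LM
    logits for each verbalizer token (no gradient step shown here)."""
    prompt = f"{text} It was {tokenizer.mask_token}."   # pattern with task description
    inputs = tokenizer(prompt, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    return {
        label: logits[tokenizer.convert_tokens_to_ids(tokenizer.tokenize(word))[0]].item()
        for label, word in VERBALIZER.items()
    }

print(cloze_score("A thoughtful, beautifully shot film."))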
A Dataset and Evaluation Metrics for Abstractive Compression of Sentences and Short Paragraphs
