    RUNTIME AUDIT OF NEURAL SEQUENCE MODELS FOR NLP

    Neural network sequence models have become a fundamental building block of natural language processing (NLP) applications. However, as these models grow in performance and adoption, the social harms caused by errors in their outputs are amplified as well. This thesis aims to mitigate such adverse effects by studying methods that generate user-interpretable auxiliary signals alongside model predictions, enabling efficient audits of the model output at runtime. We examine two types of auxiliary signals, generated for the input and the output of the model respectively. The first type explains which input tokens are important for a given prediction (Chapters 3 and 4), while the second estimates the quality of each output token (Chapters 5 and 6). For model explanations, our focus is on establishing a comprehensive, quantitative evaluation framework, enabling systematic comparison of different explanation methods across a diverse set of architectures and configurations. For quality estimation, where a solid evaluation framework already exists, we instead focus on improving the state of the art by introducing an end-task-oriented pre-training step based on a non-autoregressive neural machine translation architecture. Overall, we show that it is possible to generate high-quality auxiliary signals with little to no human supervision, and we provide best-practice guidance for future applications of these methods in NLP, such as conducting comprehensive quantitative evaluations of the auxiliary signals before deployment and selecting the evaluation metric that best suits the user's goal.
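    As an illustration of the first kind of auxiliary signal, the sketch below computes gradient-x-input saliency scores, a common input-attribution technique (shown here for illustration only, not necessarily the method developed in the thesis). The toy classifier, vocabulary size, and dimensions are assumptions made for the example.

        import torch
        import torch.nn as nn

        torch.manual_seed(0)

        vocab_size, embed_dim, num_classes = 100, 16, 2
        embedding = nn.Embedding(vocab_size, embed_dim)
        classifier = nn.Linear(embed_dim, num_classes)

        token_ids = torch.tensor([[5, 17, 42, 8]])   # one toy input sequence
        embeds = embedding(token_ids)                # (1, seq_len, embed_dim)
        embeds.retain_grad()                         # keep gradients w.r.t. embeddings

        logits = classifier(embeds.mean(dim=1))      # mean-pool, then classify
        pred = logits.argmax(dim=-1).item()

        # Backpropagate the predicted class score to the input embeddings.
        logits[0, pred].backward()

        # Gradient-x-input: one importance score per input token; a higher
        # score means the token mattered more for this prediction.
        saliency = (embeds.grad * embeds).sum(dim=-1).abs().squeeze(0)
        print(saliency)

    Scores like these give an auditor a ranking of input tokens to inspect alongside the prediction itself.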
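    For the second kind of signal, a token-level quality estimator can be as simple as a small head that maps each decoder state of an output hypothesis to a probability of that token being correct. The sketch below is a minimal illustration under that assumption; the head, dimensions, and random decoder states are hypothetical stand-ins, not the thesis's non-autoregressive architecture, and a real head would be trained on quality labels.

        import torch
        import torch.nn as nn

        torch.manual_seed(0)

        hidden_dim = 16
        qe_head = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Sigmoid())

        # Stand-in for decoder hidden states of a 5-token output hypothesis.
        decoder_states = torch.randn(1, 5, hidden_dim)

        # One quality score per output token; scores near 0 flag tokens that
        # a user auditing the output at runtime should double-check.
        token_quality = qe_head(decoder_states).squeeze(-1)   # shape (1, 5)
        print(token_quality)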