
    Beyond Hard Samples: Robust and Effective Grammatical Error Correction with Cycle Self-Augmenting

    Recent studies have revealed that grammatical error correction (GEC) methods in the sequence-to-sequence paradigm are vulnerable to adversarial attacks, and that simply using adversarial examples in the pre-training or post-training process can significantly enhance the robustness of GEC models to certain types of attack without much performance loss on clean data. In this paper, we further conduct a thorough robustness evaluation of cutting-edge GEC methods against four different types of adversarial attacks and propose a simple yet very effective Cycle Self-Augmenting (CSA) method accordingly. By leveraging augmenting data generated by the GEC models themselves during post-training and introducing regularization data for cycle training, our proposed method can effectively improve the robustness of well-trained GEC models at the cost of only a few additional training epochs. More concretely, further training on the regularization data prevents the GEC models from over-fitting on easy-to-learn samples and thus improves their generalization capability and robustness to unseen data (adversarial noise/samples). Meanwhile, the self-augmented data provide more high-quality pseudo pairs that improve model performance on the original test data. Experiments on four benchmark datasets and seven strong models indicate that our training method significantly enhances robustness against all four types of attacks without using purposely built adversarial examples in training. Evaluation results on clean data further confirm that CSA significantly improves the performance of four baselines and yields results nearly comparable to other state-of-the-art models. Our code is available at https://github.com/ZetangForward/CSA-GEC.
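    The abstract describes CSA only at a high level. As a purely illustrative sketch of one plausible reading of a single cycle, in the spirit of generic self-training, assuming a well-trained seq2seq GEC model exposing hypothetical correct() and fine_tune() methods (the actual algorithm is in the linked repository), one round of post-training might look like:

    ```python
    # Illustrative sketch of one Cycle Self-Augmenting (CSA) post-training round.
    # `model.correct` and `model.fine_tune` are hypothetical interfaces standing
    # in for a trained seq2seq GEC model; this is one reading of the abstract,
    # not the paper's released implementation.

    def csa_round(model, train_pairs, epochs=2):
        """Run one cycle: self-augment with model outputs, then post-train."""
        pseudo_pairs, regularization = [], []
        for source, target in train_pairs:
            prediction = model.correct(source)
            if prediction == target:
                # Easy-to-learn sample: set aside as regularization data so
                # further cycles do not over-fit on it.
                regularization.append((source, target))
            else:
                # Use the model's own output as an additional pseudo pair.
                pseudo_pairs.append((source, prediction))
        # Post-train for a few extra epochs on the self-augmented pseudo
        # pairs together with the regularization data.
        model.fine_tune(pseudo_pairs + regularization, epochs=epochs)
        return model
    ```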

    Neural Combinatory Constituency Parsing

    Tokyo Metropolitan University, doctoral thesis (Doctor of Information Science)

    Decoding linguistic information from EEG signals

    For many years, the fields of the cognitive neuroscience of language and natural language processing (NLP) have been relatively distinct and non-overlapping. Recent breakthrough research is starting to show that these two fields, in their common goal of understanding and modelling language, have a lot to offer each other. As developments in machine learning continue to break new ground, due in large part to the successful development of novel classifiers that can be efficiently trained to model highly nonlinear dynamic systems such as language, the open question is how well these models perform on human neural signals recorded during language processing. Recent results are beginning to show that various types of human signals (eye-tracking, fMRI, MEG) can be used to model various linguistic aspects of what is being concurrently processed by the brain. EEG is an inexpensive and relatively accessible way to record neural signals, and this thesis explores the extent to which linguistic information can be decoded from EEG data using state-of-the-art models common in NLP. Critically, an important foundation needs to be in place that can fully explore the types of linguistic signal that are decodable with EEG. This thesis attempts to lay that foundation, setting the stage for joint modelling of text and neural signals to advance the field of NLP. This research is also of interest to cognitive neuroscientists, as the data collected for this thesis will be openly accessible to all, with accompanying linguistic annotation, which can help to answer various questions about the spatiotemporal dynamics of reading naturalistic texts. In Chapter 1, I provide an overview of the major literature that has investigated the decoding of linguistic processing from neural signals, setting the research question in its historical context. This literature review serves as the basis for the two experimental chapters which follow and is thus subdivided into two main sections. Chapter 2 explores the various aspects of linguistic processing that are decodable from the novel EEG dataset collected for this thesis, with a strong emphasis on controlling for potential confounds as much as possible. Using a novel machine learning classifier, I show that with specialised training methods, generalisation to novel data is possible for part-of-speech decoding. In Chapter 3, I examine the preprocessing steps involved in preparing the data and show that, depending on the modelling goal, some steps are particularly useful for boosting the performance of linguistic decoding from EEG. Finally, in Chapter 4, I offer a broad review of the results and consider their implications and limitations.
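    The abstract centres on decoding part-of-speech information from EEG with machine-learning classifiers. As a purely illustrative sketch (not the thesis's actual pipeline; the classifier choice, feature shapes, synthetic data, and scikit-learn tooling here are all assumptions), a minimal cross-validated linear decoder over word-aligned EEG features might look like:

    ```python
    # Minimal sketch of part-of-speech decoding from word-aligned EEG epochs,
    # assuming features have already been extracted into X (n_words, n_features)
    # and y holds one POS label per word. The tooling (scikit-learn) and the
    # random placeholder data are assumptions for illustration only.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = rng.standard_normal((600, 64 * 50))  # e.g. 64 channels x 50 time points per word
    y = rng.integers(0, 2, size=600)         # e.g. binary noun-vs-verb labels

    # Standardise each feature, then fit a regularised linear decoder;
    # cross-validation guards against over-fitting to a single split.
    decoder = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    scores = cross_val_score(decoder, X, y, cv=5)
    print(f"mean decoding accuracy: {scores.mean():.3f}")
    ```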