376 research outputs found

    Controlled Generation with Prompt Insertion for Natural Language Explanations in Grammatical Error Correction

    In Grammatical Error Correction (GEC), it is crucial to ensure that the user understands the reason for each correction. Existing studies present tokens, examples, and hints as the basis for a correction but do not directly explain the reasons for it. Although methods that use Large Language Models (LLMs) to provide direct explanations in natural language have been proposed for various tasks, no such method exists for GEC. Generating explanations for GEC corrections involves aligning input and output tokens, identifying correction points, and presenting corresponding explanations consistently. However, it is not straightforward to specify a complex format for generating explanations, because explicit control of generation is difficult with prompts. This study introduces a method called controlled generation with Prompt Insertion (PI) so that LLMs can explain the reasons for corrections in natural language. In PI, LLMs first correct the input text, and then we automatically extract the correction points based on rules. The extracted correction points are sequentially inserted into the LLM's explanation output as prompts, guiding the LLMs to generate explanations for the correction points. We also create an Explainable GEC (XGEC) dataset of correction reasons by annotating NUCLE, CoNLL2013, and CoNLL2014. Although generations from GPT-3 and ChatGPT using original prompts miss some correction points, generation control using PI can explicitly guide the models to explain all correction points, contributing to improved performance in generating correction reasons. Comment: Work in progress
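
    The Prompt Insertion loop described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the extraction rules or the prompt format, so the `difflib` word alignment, the `before -> after` notation, and the `generate` callable (a stand-in for an LLM completion call) are all hypothetical.

```python
from difflib import SequenceMatcher
from typing import Callable, List


def extract_correction_points(source: str, corrected: str) -> List[str]:
    """Rule-based extraction of correction points via word-level alignment.

    Assumption: a plain difflib alignment stands in for the paper's rules.
    """
    src, tgt = source.split(), corrected.split()
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)
    points = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged tokens are not correction points
        before = " ".join(src[i1:i2]) or "(nothing)"
        after = " ".join(tgt[j1:j2]) or "(nothing)"
        points.append(f"{before} -> {after}")
    return points


def explain_with_prompt_insertion(
    source: str, corrected: str, generate: Callable[[str], str]
) -> str:
    """Insert each correction point into the output as a prompt, one at a time."""
    context = f"Input: {source}\nCorrected: {corrected}\nExplanations:\n"
    points = extract_correction_points(source, corrected)
    for number, point in enumerate(points, start=1):
        context += f"{number}. {point}: "  # inserted prompt (hypothetical format)
        context += generate(context).strip() + "\n"  # model continues from here
    return context


def fake_generate(context: str) -> str:
    """Hypothetical stand-in for an LLM call, so the sketch runs offline."""
    return "placeholder explanation"


print(explain_with_prompt_insertion(
    "She go to school on yesterday .",
    "She went to school yesterday .",
    fake_generate,
))
```

    Because every extracted point is written into the context before the model continues, no correction point can be skipped, which is the control property the abstract attributes to PI.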

    Reducing Sequence Length by Predicting Edit Operations with Large Language Models

    Large Language Models (LLMs) have demonstrated remarkable performance in various tasks and gained significant attention. LLMs are also used for local sequence transduction tasks, including grammatical error correction (GEC) and formality style transfer, where most tokens in a source text are kept unchanged. However, it is inefficient to generate all target tokens because a prediction error of a target token may cause a catastrophe in predicting subsequent tokens and because the computational cost grows quadratically with the target sequence length. This paper proposes to predict a set of edit operations for the source text for local sequence transduction tasks. Representing an edit operation with a span of the source text and changed tokens, we can reduce the length of the target sequence and thus the computational cost for inference. We apply instruction tuning for LLMs on the supervision data of edit operations. Experiments show that the proposed method achieves comparable performance to the baseline in four tasks (paraphrasing, formality style transfer, GEC, and text simplification), despite reducing the length of the target text by as small as 21%. Furthermore, we report that instruction tuning with the proposed method achieved state-of-the-art performance in the four tasks. Comment: Work in progress
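
    To make the idea of "a span of the source text and changed tokens" concrete, here is a minimal sketch. The `(start, end, replacement_tokens)` triple and the use of `difflib` to derive edits are assumptions made for illustration; the abstract does not specify the paper's actual output format.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# Hypothetical edit format: source token span [start, end) plus replacement tokens.
Edit = Tuple[int, int, List[str]]


def extract_edits(source_tokens: List[str], target_tokens: List[str]) -> List[Edit]:
    """Derive span-based edit operations that turn the source into the target."""
    matcher = SequenceMatcher(a=source_tokens, b=target_tokens, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # only changed spans need to be predicted
            edits.append((i1, i2, target_tokens[j1:j2]))
    return edits


def apply_edits(source_tokens: List[str], edits: List[Edit]) -> List[str]:
    """Rebuild the target by applying the edits to a copy of the source."""
    out = list(source_tokens)
    # Apply right to left so that earlier span indices remain valid.
    for start, end, replacement in sorted(edits, key=lambda e: e[0], reverse=True):
        out[start:end] = replacement
    return out


source = "He go to school yesterday .".split()
target = "He went to school yesterday .".split()
edits = extract_edits(source, target)
print(edits)
print(apply_edits(source, edits) == target)
```

    In this example a single short edit stands in for the six-token target sentence, which is the source of the sequence-length saving the abstract describes.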

    Interleukin-10 containing normal human serum inhibits granzyme B release but not perforin release from alloreactive and EBV-specific T cell clones

    Interleukin-10 (IL-10), also known as cytokine synthesis inhibitory factor, has pleiotropic effects in immunoregulation and inflammation. It is capable of inhibiting synthesis of pro-inflammatory cytokines like interferon γ (IFNγ), IL-2, IL-3, tumor necrosis factor α (TNFα) and granulocyte macrophage colony stimulating factor (GM-CSF) made by cells such as macrophages and T helper Type 1 cells. We observed that normal human serum, derived from a healthy individual but containing large amounts of IL-10 (arbitrarily designated as "IL-10 serum"), inhibited cytotoxic activity and interfered with granzyme B release from alloreactive cytotoxic T cell (CTL) clones in vitro, but did not affect perforin release. The addition of normal human serum containing high levels of anti-IL-10 IgG (arbitrarily designated as "anti-IL-10 IgG serum") neutralized the inhibitory effects of IL-10 serum. Moreover, we found that cytotoxic activity and granzyme B release from an Epstein-Barr virus (EBV)-specific CTL clone were similarly inhibited in the presence of IL-10 serum, while perforin release was unaffected. Anti-IL-10 IgG serum also appeared to neutralize the inhibitory effect of IL-10 serum on an EBV-specific CTL clone.

    SAIE Framework: Support Alone Isn't Enough -- Advancing LLM Training with Adversarial Remarks

    Large Language Models (LLMs) can justify or criticize their predictions through discussion with other models or humans, thereby enhancing their intrinsic understanding of instances. While proactive discussions enhance performance, this approach is currently limited to the inference phase. In this context, we posit a hypothesis: learning interactive discussions during training can improve the model's understanding of instances in the training step, as well as its logical and critical thinking ability and verbalized expression in the inference step. Our proposed SAIE training method involves both supportive and adversarial discussions between the learner and partner models. The learner model receives a remark from the partner through the discussion, and the parameters of the learner model are then updated based on this remark. That is, the teacher signal dynamically adjusts in response to the evolving model output throughout the training step. By bolstering the capacity for discussion and comprehension of instances, our experiments across datasets, including GSM8K, CommonsenseQA, and MMLU, reveal that models fine-tuned with our method consistently surpass those trained with standard fine-tuning techniques. Moreover, our approach demonstrates superior performance in multi-agent inference scenarios, boosting the models' reasoning abilities at the inference step. Comment: Work in progress

    OUTFOX: LLM-generated Essay Detection through In-context Learning with Adversarially Generated Examples

    Large Language Models (LLMs) have achieved human-level fluency in text generation, making it difficult to distinguish between human-written and LLM-generated texts. This poses a growing risk of misuse of LLMs and demands the development of detectors to identify LLM-generated texts. However, the detection accuracy of existing detectors degrades when LLM-generated texts are simply paraphrased. Furthermore, the effectiveness of these detectors in real-life situations, such as when students use LLMs for writing homework assignments (e.g., essays) and quickly learn how to evade these detectors, has not been explored. In this paper, we propose OUTFOX, a novel framework that improves the robustness of LLM-generated-text detectors by allowing both the detector and the attacker to consider each other's output, and we apply it to the domain of student essays. In our framework, the attacker uses the detector's prediction labels as examples for in-context learning and adversarially generates essays that are harder to detect. Meanwhile, the detector uses the adversarially generated essays as examples for in-context learning to learn to detect essays from a strong attacker. Our experiments show that our proposed detector, learned in-context from the attacker, improves the detection performance on the attacked dataset by up to +41.3 points F1-score. Conversely, our proposed attacker can drastically degrade the performance of the detector by up to -57.0 points F1-score compared to the paraphrasing method.

    The Impact of Debiasing on the Performance of Language Models in Downstream Tasks is Underestimated

    Pre-trained language models trained on large-scale data have learned serious levels of social biases. Consequently, various methods have been proposed to debias pre-trained models. Debiasing methods need to mitigate only discriminatory bias information from the pre-trained models, while retaining information that is useful for the downstream tasks. In previous research, whether useful information is retained has been confirmed by the performance of downstream tasks in debiased pre-trained models. On the other hand, it is not clear whether these benchmarks consist of data pertaining to social biases and are appropriate for investigating the impact of debiasing. For example, in gender-related social biases, data containing female words (e.g. "she, female, woman"), male words (e.g. "he, male, man"), and stereotypical words (e.g. "nurse, doctor, professor") are considered to be the most affected by debiasing. If there is not much data containing these words in a benchmark dataset for a target task, there is the possibility of erroneously evaluating the effects of debiasing. In this study, we compare the impact of debiasing on performance across multiple downstream tasks using a wide range of benchmark datasets that contain female, male, and stereotypical words. Experiments show that the effects of debiasing are consistently underestimated across all tasks. Moreover, the effects of debiasing could be reliably evaluated by separately considering instances containing female, male, and stereotypical words rather than all of the instances in a benchmark dataset. Comment: IJCNLP-AACL 202

    How You Prompt Matters! Even Task-Oriented Constraints in Instructions Affect LLM-Generated Text Detection

    To counter the misuse (e.g., plagiarism or spreading misinformation) of Large Language Models (LLMs), many recent works have presented LLM-generated-text detectors with promising detection performance. In situations where users instruct LLMs to generate texts (e.g., essay writing), there are various ways to write the instruction (e.g., what task-oriented constraint to include). In this paper, we discover that even a task-oriented constraint in the instruction can make current detectors perform inconsistently on the generated texts. Specifically, we focus on student essay writing as a realistic domain and manually create a task-oriented constraint for each essay-quality factor identified by Ke and Ng (2019). Our experiment shows that the detection performance variance of the current detector on texts generated by instruction with each task-oriented constraint is up to 20 times larger than the variance caused by generating texts multiple times and paraphrasing the instruction. Our finding calls for further research on developing robust detectors that can detect such distributional shifts caused by a task-oriented constraint in the instruction.

    The development of “Ultimate Rudder” for EEDI

    The EEDI (Energy Efficiency Design Index) became mandatory in January 2013, and ship owners have since required propulsion systems with higher efficiency than ever before. Hence, shipyards have been optimizing ESD (Energy Saving Device) systems in self-propulsion tests for each project. As a result, shipyards have installed rudder bulbs as an effective ESD. The rudder bulb has long been a popular ESD. Mewis 1) described that the rudder bulb was developed by Costa in 1952 and that the efficiency improvement from the rudder bulb for a container vessel was 1% on average. Fujii et al. 2) developed "MIPB (Mitsui Integrated Propeller Boss)" as an advanced rudder bulb. The feature of MIPB was a streamlined profile from propeller cap to rudder. According to their paper, the efficiency improvement from installing MIPB was 2-4%. Recently, NAKASHIMA PROPELLER Co., Ltd. developed ECO-Cap (economical propeller cap) 3) as a new ESD made of FRP (Fiber Reinforced Plastics). The strength of FRP is higher than that of NAB (Nickel Aluminium Bronze), so ECO-Cap was able to adopt thin fins on propeller caps for low resistance. Although the material used for energy-saving propeller caps was generally NAB, the research results on FRP showed that FRP could be used for ESDs because of properties such as light weight and flexibility. For these reasons, the authors saw a possibility of evolving the rudder bulb profile using FRP, which is easier to mold than NAB. This paper describes the development of the "Ultimate Rudder," a new design concept using FRP. The authors optimized the profile of the "Ultimate Rudder" by CFD and confirmed an efficiency increase from 4.9 to 5.4% in the self-propulsion test.

    Exploring Effectiveness of GPT-3 in Grammatical Error Correction: A Study on Performance and Controllability in Prompt-Based Methods

    Large-scale pre-trained language models such as GPT-3 have shown remarkable performance across various natural language processing tasks. However, the application of prompt-based methods with GPT-3 to Grammatical Error Correction (GEC) tasks, and the controllability of those methods, remain underexplored. Controllability in GEC is crucial for real-world applications, particularly in educational settings, where the ability to tailor feedback according to learner levels and specific error types can significantly enhance the learning process. This paper investigates the performance and controllability of prompt-based methods with GPT-3 for GEC tasks using zero-shot and few-shot settings. We explore the impact of task instructions and examples on GPT-3's output, focusing on controlling aspects such as minimal edits, fluency edits, and learner levels. Our findings demonstrate that GPT-3 could effectively perform GEC tasks, outperforming existing supervised and unsupervised approaches. We also show that GPT-3 could achieve controllability when appropriate task instructions and examples are given. Comment: Accepted in BEA 202

    Analysis of IGZO crystalline structure and its stability by first-principles calculations

    In-Ga-Zn oxide (IGZO), an oxide semiconductor, has been actively researched in recent years as a semiconductor material having features different from those of silicon [1]. IGZO is used as a transistor material in backplanes of commercially available displays. Transistors including crystalline IGZO have high stability and thus are suitable for mass production [2]. Our previous studies revealed that the selected area diffraction pattern of an IGZO film formed at room temperature by sputtering is a halo pattern, whereas diffraction spots are observed in the diffraction pattern obtained by nanobeam electron diffraction with a probe diameter of 1 nm [3,4]. These results suggest that the IGZO film has nanometer-sized crystalline structures rather than a completely amorphous structure. We named this film "nano-crystalline IGZO (nc-IGZO) film." Other researchers have reported that the nc-IGZO film has a crystalline-cluster composite structure, according to the analysis results obtained by grazing-incidence X-ray diffraction, anomalous X-ray scattering, and reverse-Monte-Carlo simulation [5]. In this study, an IGZO structure having a minute crystalline region, which was considered to exist in nc-IGZO as a local structure, was created by first-principles calculations and its stability was analyzed. The IGZO model having a crystalline region used in this study was obtained by a melt-quench method in the following manner. Note that the initial structure had a hexagonal-prism crystalline region at the center and an amorphous region (random atomic arrangement) around the crystalline region. The composition ratio was In:Ga:Zn:O = 1:1:1:4 and the density was 6.1 g/cm³.
    First, for structural relaxation with the crystalline region maintained, the amorphous region was melted in a quantum molecular dynamics simulation (3500 K, 6 ps) while the atomic arrangement of the crystalline region was fixed; the structure was then cooled to 500 K at a rate of 500 K/ps and held at 300 K for 5 ps. Finally, the entire structure including the crystalline region was optimized towards the target structure (Fig. 1). An amorphous model was also created for reference. The amorphous model was obtained by quantum molecular dynamics simulation of the entire structure under similar temperature conditions without fixing the atomic arrangement of the crystalline region, followed by structural optimization. The comparison between the two models showed that the total energy of the IGZO model having a crystalline region was lower than that of the amorphous model (not having a crystalline region). This suggests that the crystalline region contributes to structure stabilization.