Controlled Generation with Prompt Insertion for Natural Language Explanations in Grammatical Error Correction
In Grammatical Error Correction (GEC), it is crucial to ensure the user's
comprehension of a reason for correction. Existing studies present tokens,
examples, and hints as to the basis for correction but do not directly explain
the reasons for corrections. Although methods that use Large Language Models
(LLMs) to provide direct explanations in natural language have been proposed
for various tasks, no such method exists for GEC. Generating explanations for
GEC corrections involves aligning input and output tokens, identifying
correction points, and presenting corresponding explanations consistently.
However, it is not straightforward to specify a complex format to generate
explanations, because explicit control of generation is difficult with prompts.
This study introduces a method called controlled generation with Prompt
Insertion (PI) so that LLMs can explain the reasons for corrections in natural
language. In PI, LLMs first correct the input text, and then we automatically
extract the correction points based on rules. The extracted correction
points are sequentially inserted into the LLM's explanation output as prompts,
guiding the LLMs to generate explanations for the correction points. We also
create an Explainable GEC (XGEC) dataset of correction reasons by annotating
NUCLE, CoNLL2013, and CoNLL2014. Although generations from GPT-3 and ChatGPT
using original prompts miss some correction points, generation control with PI
can explicitly guide the models to describe explanations for all correction
points, contributing to improved performance in generating correction reasons.
Comment: Work in progress
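The two steps the abstract names, aligning input and output tokens and extracting correction points, can be sketched with standard sequence alignment. This is a minimal illustration (not the authors' code), using `difflib` for alignment and a hypothetical prompt format for the insertion step:

```python
# Sketch of rule-based correction-point extraction via token alignment.
from difflib import SequenceMatcher

def extract_correction_points(source_tokens, corrected_tokens):
    """Return (source_span, corrected_span) pairs for each non-equal region."""
    matcher = SequenceMatcher(a=source_tokens, b=corrected_tokens)
    points = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            points.append((source_tokens[i1:i2], corrected_tokens[j1:j2]))
    return points

def build_explanation_prompts(points):
    """Turn each correction point into a prompt to be inserted into the
    model's explanation output (hypothetical prompt wording)."""
    return [f"Explain the correction: {' '.join(s) or '(insertion)'} -> "
            f"{' '.join(c) or '(deletion)'}" for s, c in points]

src = "He go to school yesterday".split()
cor = "He went to school yesterday".split()
points = extract_correction_points(src, cor)
print(points)  # [(['go'], ['went'])]
print(build_explanation_prompts(points))
```

Inserting one such prompt per correction point is what lets the generation be steered to cover every point rather than relying on the model to enumerate them itself.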
Reducing Sequence Length by Predicting Edit Operations with Large Language Models
Large Language Models (LLMs) have demonstrated remarkable performance in
various tasks and gained significant attention. LLMs are also used for local
sequence transduction tasks, including grammatical error correction (GEC) and
formality style transfer, where most tokens in a source text are kept
unchanged. However, it is inefficient to generate all target tokens because a
prediction error of a target token may cause a catastrophe in predicting
subsequent tokens and because the computational cost grows quadratically with
the target sequence length. This paper proposes to predict a set of edit
operations for the source text for local sequence transduction tasks.
Representing an edit operation with a span of the source text and changed
tokens, we can reduce the length of the target sequence and thus the
computational cost for inference. We apply instruction tuning for LLMs on the
supervision data of edit operations. Experiments show that the proposed method
achieves performance comparable to the baseline in four tasks (paraphrasing,
formality style transfer, GEC, and text simplification), despite reducing the
length of the target text to as little as 21\%. Furthermore, we report that
instruction tuning with the proposed method achieved state-of-the-art
performance in the four tasks.
Comment: Work in progress
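The key idea, representing the target as edit operations over source spans rather than as a full token sequence, can be sketched concretely. The representation below (span indices plus replacement tokens) is an assumed format for illustration, not necessarily the paper's exact serialization:

```python
# Sketch: serialize a local transduction as edit operations instead of the
# full target sequence, and reconstruct the target from them.
from difflib import SequenceMatcher

def to_edit_operations(source, target):
    """Represent the target as ((start, end), replacement_tokens) edits."""
    ops = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(a=source, b=target).get_opcodes():
        if tag != "equal":
            ops.append(((i1, i2), target[j1:j2]))
    return ops

def apply_edit_operations(source, ops):
    """Reconstruct the target from the source and the edit operations."""
    out, pos = [], 0
    for (i1, i2), repl in ops:
        out.extend(source[pos:i1])  # copy the unchanged prefix
        out.extend(repl)            # splice in the replacement
        pos = i2
    out.extend(source[pos:])
    return out

src = "She have been play tennis since three years".split()
tgt = "She has been playing tennis for three years".split()
ops = to_edit_operations(src, tgt)
assert apply_edit_operations(src, ops) == tgt
print(len(ops), "edits vs", len(tgt), "target tokens")
```

Because most tokens in local transduction tasks are unchanged, the edit sequence is much shorter than the full target, which is what cuts the quadratic decoding cost.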
Interleukin-10 containing normal human serum inhibits granzyme B release but not perforin release from alloreactive and EBV-specific T cell clones
Interleukin-10 (IL-10), also known as cytokine synthesis inhibitory factor, has pleiotropic effects in immunoregulation and inflammation. It is capable of inhibiting synthesis of pro-inflammatory cytokines such as interferon γ (IFNγ), IL-2, IL-3, tumor necrosis factor α (TNFα), and granulocyte macrophage colony stimulating factor (GM-CSF) made by cells such as macrophages and T helper Type 1 cells. We observed that normal human serum, derived from a healthy individual but containing large amounts of IL-10 (arbitrarily designated as "IL-10 serum"), inhibited cytotoxic activity and interfered with granzyme B release from alloreactive cytotoxic T cell (CTL) clones _in vitro_, but did not affect perforin release. The addition of normal human serum containing high levels of anti-IL-10 IgG (arbitrarily designated as "anti-IL-10 IgG serum") neutralized the inhibitory effects of IL-10 serum. Moreover, we found that cytotoxic activity and granzyme B release from an Epstein-Barr virus (EBV)-specific CTL clone were similarly inhibited in the presence of IL-10 serum, while perforin release was unaffected. Anti-IL-10 IgG serum also appeared to neutralize the inhibitory effect of IL-10 serum on an EBV-specific CTL clone.
SAIE Framework: Support Alone Isn't Enough -- Advancing LLM Training with Adversarial Remarks
Large Language Models (LLMs) can justify or criticize their predictions
through discussion with other models or humans, thereby enhancing their
intrinsic understanding of instances. While proactive discussions enhance
performance, this approach is currently limited to the inference phase. In this
context, we posit a hypothesis: learning from interactive discussions during
training can improve a model's understanding of the training instances as well
as its logical/critical thinking and verbalized expression at inference time.
Our proposed SAIE training method involves
both supportive and adversarial discussions between the learner and partner
models. The learner model receives a remark from the partner through the
discussion, and the parameters of the learner model are then updated based on
this remark. That is, the teacher signal dynamically adjusts in response to the
evolving model output throughout the training step. By bolstering the capacity
for discussion and comprehension of instances, our experiments across datasets,
including GSM8K, CommonsenseQA, and MMLU, reveal that models fine-tuned with
our method consistently surpass those trained with standard fine-tuning
techniques. Moreover, our approach demonstrates superior performance in
multi-agent inference scenarios, boosting the models' reasoning abilities at
the inference step.
Comment: Work in progress
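The training loop the abstract describes (learner answers, partner remarks, learner updates, with the teacher signal adapting to the learner's current output) can be sketched as a toy simulation. Everything below is a hypothetical stand-in: real SAIE fine-tunes an LLM's parameters, whereas `learner_answer`, `partner_remark`, and `update` here are placeholder stubs and `skill` a stand-in for model parameters:

```python
# Toy sketch of the SAIE-style interaction loop with stub components.
import random

def learner_answer(example, skill):
    # Stub learner: answers correctly with probability `skill`.
    return example["gold"] if random.random() < skill else "wrong"

def partner_remark(example, answer):
    # Supportive remark if correct, adversarial remark if not.
    if answer == example["gold"]:
        return "support: your reasoning holds"
    return "adversarial: re-check step by step"

def update(skill, remark):
    # Stub parameter update: adversarial remarks drive larger corrections.
    return min(1.0, skill + (0.1 if remark.startswith("adversarial") else 0.02))

random.seed(0)
skill = 0.2
dataset = [{"q": "2+2?", "gold": "4"}] * 50
for ex in dataset:  # one SAIE-style pass over the data
    ans = learner_answer(ex, skill)
    remark = partner_remark(ex, ans)
    skill = update(skill, remark)  # teacher signal adapts to the model output
print(f"final skill: {skill:.2f}")
```

The point of the sketch is the loop structure: the remark is conditioned on the learner's current output, so the training signal changes as the model improves, unlike a static fine-tuning target.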
OUTFOX: LLM-generated Essay Detection through In-context Learning with Adversarially Generated Examples
Large Language Models (LLMs) have achieved human-level fluency in text
generation, making it difficult to distinguish between human-written and
LLM-generated texts. This poses a growing risk of misuse of LLMs and demands
the development of detectors to identify LLM-generated texts. However, the
accuracy of existing detectors degrades when LLM-generated texts are simply
paraphrased. Furthermore, the effectiveness of these detectors in real-life
situations, such as when students use LLMs for writing homework assignments
(e.g., essays) and quickly learn how to evade these detectors, has not been
explored. In this paper, we propose OUTFOX, a novel framework that improves the
robustness of LLM-generated-text detectors by allowing both the detector and
the attacker to consider each other's output, and we apply it to the domain of
student essays. In our framework, the attacker uses the detector's prediction
labels as examples for in-context learning and adversarially generates essays
that are harder to detect, while the detector uses the adversarially generated
essays as examples for in-context learning to learn to detect essays from a
strong attacker. Our experiments show that our proposed detector, learned
in-context from the attacker, improves detection performance on the attacked
dataset by up to +41.3 points in F1-score, while our proposed attacker can
drastically degrade the detector's performance by up to -57.0 points in
F1-score compared to the paraphrasing method.
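The adversarial loop (each side conditioning on the other's output) can be shown with a deliberately tiny toy: a marker-phrase detector standing in for the in-context detector, and string rewriting standing in for the attacker's adversarial generation. All components are illustrative stubs, not the OUTFOX implementation:

```python
# Toy sketch of the OUTFOX-style detector/attacker loop.
def detect(essay, markers):
    """Label an essay 'llm' if it contains any known giveaway phrase."""
    return "llm" if any(m in essay for m in markers) else "human"

# Round 1: the detector knows one giveaway phrase.
markers = {"in conclusion"}
essay = "The topic matters. in conclusion, it is vital."
assert detect(essay, markers) == "llm"

# Attacker step: the attacker sees the detector's label on this example
# and adversarially rewrites the flagged phrasing to evade detection.
evasive = essay.replace("in conclusion", "to sum up")
assert detect(evasive, markers) == "human"

# Detector step: the adversarially generated essay becomes a new
# in-context example, so the detector now catches the evasion.
markers.add("to sum up")
assert detect(evasive, markers) == "llm"
print("loop complete: detector hardened against the attacker")
```

Each round strengthens both sides, which is why the final detector is evaluated against a "strong attacker" rather than against naive paraphrasing alone.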
The Impact of Debiasing on the Performance of Language Models in Downstream Tasks is Underestimated
Pre-trained language models trained on large-scale data have learned serious
levels of social biases. Consequently, various methods have been proposed to
debias pre-trained models. Debiasing methods need to mitigate only
discriminatory bias information from the pre-trained models, while retaining
information that is useful for the downstream tasks. In previous research,
whether useful information is retained has been confirmed by the performance of
downstream tasks in debiased pre-trained models. On the other hand, it is not
clear whether these benchmarks consist of data pertaining to social biases and
are appropriate for investigating the impact of debiasing. For example, in
gender-related social biases, data containing female words (e.g. ``she, female,
woman''), male words (e.g. ``he, male, man''), and stereotypical words (e.g.
``nurse, doctor, professor'') are considered to be the most affected by
debiasing. If there is not much data containing these words in a benchmark
dataset for a target task, there is the possibility of erroneously evaluating
the effects of debiasing. In this study, we compare the impact of debiasing on
performance across multiple downstream tasks using a wide range of benchmark
datasets that contain female, male, and stereotypical words. Experiments
show that the effects of debiasing are consistently \emph{underestimated}
across all tasks. Moreover, the effects of debiasing can be more reliably
evaluated by separately considering instances containing female, male, and
stereotypical words than by considering all instances in a benchmark dataset.
Comment: IJCNLP-AACL 2023
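The evaluation split the abstract proposes, scoring bias-related instances separately from the full benchmark, is easy to make concrete. The word lists come from the abstract's own examples; the data and the `subset_scores` helper are illustrative:

```python
# Sketch: evaluate separately on instances containing gendered or
# stereotypical words vs. the whole benchmark.
FEMALE = {"she", "female", "woman"}
MALE = {"he", "male", "man"}
STEREO = {"nurse", "doctor", "professor"}

def subset_scores(instances):
    """Accuracy on bias-related instances vs. the full benchmark."""
    def acc(rows):
        return sum(r["correct"] for r in rows) / len(rows) if rows else None
    bias_words = FEMALE | MALE | STEREO
    related = [r for r in instances
               if bias_words & set(r["text"].lower().split())]
    return {"all": acc(instances), "bias_related": acc(related)}

data = [
    {"text": "The nurse checked the chart", "correct": 1},
    {"text": "She is a doctor", "correct": 0},
    {"text": "The weather is cold", "correct": 1},
    {"text": "Stocks rose sharply", "correct": 1},
]
print(subset_scores(data))  # {'all': 0.75, 'bias_related': 0.5}
```

When bias-related instances are a small fraction of the benchmark (as in this toy data), a drop on them is diluted in the overall score, which is the underestimation effect the paper measures.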
How You Prompt Matters! Even Task-Oriented Constraints in Instructions Affect LLM-Generated Text Detection
Against the misuse (e.g., plagiarism or spreading misinformation) of Large
Language Models (LLMs), many recent works have presented LLM-generated-text
detectors with promising detection performance. Spotlighting a situation where
users instruct LLMs to generate texts (e.g., essay writing), there are various
ways to write the instruction (e.g., what task-oriented constraint to include).
In this paper, we discover that even a task-oriented constraint in the
instruction can cause inconsistent performance of current detectors on the
generated texts. Specifically, we focus on student essay writing as a realistic
domain and manually create task-oriented constraints for each factor of essay
quality defined by Ke and Ng (2019). Our experiment shows that the detection
performance variance of the current detector on texts generated by instruction
with each task-oriented constraint is up to 20 times larger than the variance
caused by generating texts multiple times and paraphrasing the instruction. Our
finding calls for further research on developing robust detectors that can
detect such distributional shifts caused by a task-oriented constraint in the
instruction.
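The comparison the abstract reports, variance of detection F1 across constraints versus variance across repeated generations, is a plain variance ratio. The scores below are made-up illustrative numbers (the paper reports a ratio of up to 20x):

```python
# Sketch of the variance comparison with illustrative F1 scores.
from statistics import pvariance

f1_by_constraint = [0.92, 0.80, 0.88, 0.76, 0.84]  # one score per constraint
f1_by_rerun      = [0.88, 0.86, 0.90, 0.87, 0.89]  # fixed instruction, re-sampled

ratio = pvariance(f1_by_constraint) / pvariance(f1_by_rerun)
print(f"constraint variance is {ratio:.0f}x the re-generation variance")
```

A ratio well above 1 means the choice of task-oriented constraint shifts the generated-text distribution far more than sampling noise does, which is the distributional shift the paper asks detectors to handle.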
The development of "Ultimate Rudder" for EEDI
The EEDI (Energy Efficiency Design Index) became mandatory in January 2013,
and ship owners now require higher-efficiency propulsion systems than ever
before. Hence, shipyards have been optimizing ESD (Energy Saving Device)
systems in self-propulsion tests for each project. As a result, shipyards have
installed the rudder bulb as an effective ESD.
The rudder bulb has long been a popular ESD. Mewis1) described that the rudder
bulb was developed by Costa in 1952 and that the efficiency improvement from
the rudder bulb for a container vessel was 1% on average. Fujii et al.2)
developed "MIPB (Mitsui Integrated Propeller Boss)" as an advanced rudder bulb.
The feature of MIPB was a streamlined profile from the propeller cap to the
rudder. According to their paper, the efficiency improvement from installing
MIPB was 2-4%.
Recently, NAKASHIMA PROPELLER Co., Ltd. developed the ECO-Cap (economical
propeller cap)3) as a new ESD made of FRP (Fiber Reinforced Plastics). The
strength of FRP is higher than that of NAB (Nickel Aluminium Bronze), so the
ECO-Cap was able to adopt thin fins on the propeller cap for low resistance.
Although the material generally used for energy-saving propeller caps was NAB,
research results on FRP showed that it could be used for ESDs owing to
properties such as light weight and flexibility.
Given the above, the authors thought the rudder bulb profile could be evolved
using easily moldable FRP rather than NAB. This paper describes the development
of the "Ultimate Rudder," a new design concept using FRP. The authors optimized
the profile of the "Ultimate Rudder" by CFD and confirmed an efficiency
increase of 4.9 to 5.4% in self-propulsion tests.
Exploring Effectiveness of GPT-3 in Grammatical Error Correction: A Study on Performance and Controllability in Prompt-Based Methods
Large-scale pre-trained language models such as GPT-3 have shown remarkable
performance across various natural language processing tasks. However, applying
prompt-based methods with GPT-3 for Grammatical Error Correction (GEC) tasks
and their controllability remains underexplored. Controllability in GEC is
crucial for real-world applications, particularly in educational settings,
where the ability to tailor feedback according to learner levels and specific
error types can significantly enhance the learning process. This paper
investigates the performance and controllability of prompt-based methods with
GPT-3 for GEC tasks using zero-shot and few-shot settings. We explore the impact
of task instructions and examples on GPT-3's output, focusing on controlling
aspects such as minimal edits, fluency edits, and learner levels. Our findings
demonstrate that GPT-3 could effectively perform GEC tasks, outperforming
existing supervised and unsupervised approaches. We also showed that GPT-3
could achieve controllability when appropriate task instructions and examples
are given.
Comment: Accepted in BEA 2023
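The controllability study varies the task instruction (minimal edits, fluency edits, learner level) around the same input. A prompt-assembly sketch makes this concrete; the instruction wording and control options below are illustrative, not the paper's exact prompts:

```python
# Sketch: zero-/few-shot GEC prompt construction with control instructions.
def build_gec_prompt(sentence, control="minimal", examples=()):
    instructions = {
        "minimal": "Correct the grammatical errors with as few edits as possible.",
        "fluency": "Rewrite the sentence to be grammatical and fluent.",
        "beginner": "Correct the sentence using simple vocabulary for a beginner learner.",
    }
    lines = [instructions[control]]
    for src, tgt in examples:  # few-shot demonstrations, if any
        lines += [f"Input: {src}", f"Output: {tgt}"]
    lines += [f"Input: {sentence}", "Output:"]
    return "\n".join(lines)

prompt = build_gec_prompt(
    "He go to school yesterday",
    control="minimal",
    examples=[("She have a car", "She has a car")],
)
print(prompt)
```

Holding the input and examples fixed while swapping only the control instruction is what isolates the model's controllability from its raw correction ability.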
Analysis of IGZO crystalline structure and its stability by first-principles calculations
In-Ga-Zn oxide (IGZO), an oxide semiconductor, has been actively researched as a semiconductor material having features different from those of silicon in recent years [1]. IGZO is used as a transistor material in backplanes of commercially available displays. Transistors including crystalline IGZO have high stability and thus are suitable for mass production [2].
Our previous studies revealed that the selected area diffraction pattern of an IGZO film formed at room temperature by sputtering is a halo pattern, whereas diffraction spots are observed in the diffraction pattern obtained by nanobeam electron diffraction with a probe diameter of 1 nm [3,4]. These results suggest that the IGZO film has nanometer-sized crystalline structures rather than a completely amorphous structure. We named this film the "nano-crystalline IGZO (nc-IGZO) film." Other researchers have reported that the nc-IGZO film has a crystalline-cluster composite structure, according to the analysis results obtained by grazing-incidence X-ray diffraction, anomalous X-ray scattering, and reverse-Monte-Carlo simulation [5].
In this study, an IGZO structure having a minute crystalline region, which was considered to exist in nc-IGZO as a local structure, was created by first-principles calculations and its stability was analyzed. The IGZO model having a crystalline region used in this study was obtained by a melt-quench method in the following manner. Note that the initial structure had a hexagonal-prism crystalline region at the center and an amorphous region (random atomic arrangement) around the crystalline region. The composition ratio was In:Ga:Zn:O = 1:1:1:4 and the density was 6.1 g/cm3. First, for structural relaxation with the crystalline region maintained, the amorphous region was fused in quantum molecular dynamics simulation (3500 K, 6 ps) while the atomic arrangement of the crystalline region was fixed, and the structure was cooled to 500 K at a rate of 500 K/ps and held at 300 K for 5 ps. Finally, the entire structure including the crystalline region was optimized towards the target structure (Fig. 1). An amorphous model was also created for reference. The amorphous model was obtained by quantum molecular dynamics simulation of the entire structure under similar temperature conditions without fixing the atomic arrangement of the crystalline region, followed by structural optimization. The comparison between the two models showed that the total energy of the IGZO model having a crystalline region was lower than that of the amorphous model (not having a crystalline region). This suggests that the crystalline region contributes to structure stabilization.
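The quench schedule described above can be checked with plain arithmetic; this is a consistency check on the stated parameters, not a simulation:

```python
# Cooling from 3500 K to 500 K at 500 K/ps, per the melt-quench protocol.
t_melt, t_cool_end, rate = 3500.0, 500.0, 500.0  # K, K, K/ps
cooling_time_ps = (t_melt - t_cool_end) / rate
print(cooling_time_ps)  # 6.0 ps of cooling
```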