Preserving In-Context Learning ability in Large Language Model Fine-tuning
Pretrained large language models (LLMs) are strong in-context learners that
are able to perform few-shot learning without changing model parameters.
However, as we show, fine-tuning an LLM on any specific task generally destroys
its in-context ability. We discover an important cause of this loss, format
specialization, where the model overfits to the format of the fine-tuned task
and is unable to output anything beyond this format. We further show that
format specialization happens at the beginning of fine-tuning. To solve this
problem, we propose Prompt Tuning with MOdel Tuning (ProMoT), a simple yet
effective two-stage fine-tuning framework that preserves in-context abilities
of the pretrained model. ProMoT first trains a soft prompt for the fine-tuning
target task, and then fine-tunes the model itself with this soft prompt
attached. ProMoT offloads task-specific formats into the soft prompt that can
be removed when doing other in-context tasks. We fine-tune mT5 XXL with ProMoT
on natural language inference (NLI) and English-French translation and evaluate
the in-context abilities of the resulting models on 8 different NLP tasks.
ProMoT achieves performance on the fine-tuned tasks similar to that of
vanilla fine-tuning, but with a much smaller reduction in in-context learning
performance across the board. More importantly, ProMoT shows remarkable
generalization ability on tasks that have different formats, e.g. fine-tuning
on an NLI binary classification task improves the model's in-context ability to
do summarization (+0.53 Rouge-2 score compared to the pretrained model), making
ProMoT a promising method to build general purpose capabilities such as
grounding and reasoning into LLMs with small but high-quality datasets. When
extended to sequential or multi-task training, ProMoT can achieve even better
out-of-domain generalization performance.
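
The two-stage schedule described in this abstract can be sketched on a toy problem. The block below is only an illustration under loud assumptions: the "model" is a linear weight vector `w`, the "soft prompt" is a single vector `p` averaged with the input features, and the prompt is kept frozen during the second stage. None of these names, shapes, or data come from the paper, and nothing here reproduces mT5 or the reported results.

```python
# Toy illustration only: a linear "model" w and a one-vector "soft prompt" p.
# Attaching the prompt is modelled (hypothetically) as averaging it with the input.

def predict(w, p, x):
    pooled = [(pi + xi) / 2.0 for pi, xi in zip(p, x)]
    return sum(wi * hi for wi, hi in zip(w, pooled)), pooled

def mean_loss(w, p, data):
    return sum(0.5 * (predict(w, p, x)[0] - y) ** 2 for x, y in data) / len(data)

def train(w, p, data, steps, lr, update):
    """Full-batch gradient descent that updates only the prompt or only the model."""
    w, p = list(w), list(p)
    for _ in range(steps):
        grad_w = [0.0] * len(w)
        grad_p = [0.0] * len(p)
        for x, y in data:
            out, pooled = predict(w, p, x)
            err = (out - y) / len(data)
            for i in range(len(w)):
                grad_w[i] += err * pooled[i]    # d(out)/d(w_i) = pooled_i
                grad_p[i] += err * w[i] / 2.0   # d(out)/d(p_i) = w_i / 2
        if update == "prompt":
            p = [pi - lr * g for pi, g in zip(p, grad_p)]
        else:
            w = [wi - lr * g for wi, g in zip(w, grad_w)]
    return w, p

# Made-up task data: (features, target) pairs.
data = [([1.0, 0.0], 1.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 0.0), ([0.5, -0.5], 0.5)]
w0, p0 = [0.1, -0.2], [0.0, 0.0]

# Stage 1: train the soft prompt, model weights frozen.
w1, p1 = train(w0, p0, data, steps=200, lr=0.1, update="prompt")
# Stage 2: train the model with the learned prompt attached (prompt frozen here).
w2, p2 = train(w1, p1, data, steps=200, lr=0.1, update="model")
```

In this toy setting, removing the prompt for another task would correspond to calling `predict(w2, [0.0, 0.0], x)`.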
On the elliptical flow in asymmetric collisions and nuclear equation of state
We present elliptical flow results for collisions of different asymmetric
nuclei (10Ne20 + 13Al27, 18Ar40 + 21Sc45, 30Zn64 + 28Ni58, 36Kr86 + 41Nb93),
obtained with the Quantum Molecular Dynamics (QMD) model. General features
of elliptical flow are investigated with the help of theoretical simulations.
The simulations are performed at different beam energies between 40 and 105
MeV/nucleon. A clear transition from in-plane to out-of-plane elliptical flow
is seen for different fragments with increasing incident energy. A comparison with
experimental data is also made. Further, we predict, for the first time, that
the elliptical flow for different kinds of fragments follows a power-law
dependence ∝ C(A_tot)^τ for asymmetric systems.
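
For readers unfamiliar with the observable, elliptical flow is commonly quantified as the average of cos 2φ over emitted particles, which in terms of transverse momentum components is (px^2 - py^2)/(px^2 + py^2). The sketch below assumes that convention, with x taken in the reaction plane, so positive values mean in-plane and negative values mean out-of-plane emission. The abstract does not state its exact definition, and the function name and sample momenta are hypothetical.

```python
def elliptical_flow(momenta):
    """Average of (px^2 - py^2) / (px^2 + py^2) over (px, py) pairs.

    Assumed convention: x lies in the reaction plane, y is perpendicular to it.
    Particles with zero transverse momentum are skipped.
    """
    terms = []
    for px, py in momenta:
        pt2 = px * px + py * py
        if pt2 > 0.0:
            terms.append((px * px - py * py) / pt2)
    if not terms:
        raise ValueError("no particles with non-zero transverse momentum")
    return sum(terms) / len(terms)

# Made-up transverse momenta (arbitrary units), for illustration only.
in_plane = [(1.0, 0.0), (-2.0, 0.0)]
out_of_plane = [(0.0, 1.5)]
isotropic_pair = [(1.0, 0.0), (0.0, 1.0)]
```

A sign change of this average as the beam energy is varied is what the abstract refers to as the transition from in-plane to out-of-plane flow.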
The prominent role of the heaviest fragment in multifragmentation and phase transition for hot nuclei
The role played by the heaviest fragment in partitions of multifragmenting
hot nuclei is emphasized. Its size/charge distribution (mean value,
fluctuations and shape) gives information on properties of fragmenting nuclei
and on the associated phase transition. Comment: 11 pages, Proceedings of IWND09, August 23-25, Shanghai (China)
Coincidence measurement of residues and light particles in the reaction 56Fe+p at 1 GeV per nucleon with SPALADIN
The spallation of 56Fe in collisions with hydrogen at 1 A GeV has been
studied in inverse kinematics with the large-aperture setup SPALADIN at GSI.
Coincidences of residues with low-center-of-mass kinetic energy light particles
and fragments have been measured allowing the decomposition of the total
reaction cross-section into the different possible de-excitation channels.
Detailed information on the evolution of these de-excitation channels with
excitation energy has also been obtained. The comparison of the data with
predictions of several de-excitation models coupled to the INCL4 intra-nuclear
cascade model shows that only GEMINI can reasonably account for the bulk of
collected results, indicating that in a light system with no compression and
little angular momentum, multifragmentation might not be necessary to explain
the data. Comment: 4 pages, 5 figures, revised version accepted in Phys. Rev. Lett.
Z-dependent Barriers in Multifragmentation from Poissonian Reducibility and Thermal Scaling
We explore the natural limit of binomial reducibility in nuclear
multifragmentation by constructing excitation functions for intermediate mass
fragments (IMF) of a given element Z. The resulting multiplicity distributions
for each window of transverse energy are Poissonian. Thermal scaling is
observed in the linear Arrhenius plots made from the average multiplicity of
each element. "Emission barriers" are extracted from the slopes of the
Arrhenius plots and their possible origin is discussed. Comment: 15 pages including 4 .ps figures. Submitted to Phys. Rev. Letters.
Also available at http://csa5.lbl.gov/moretto
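
The slope extraction mentioned in this abstract can be illustrated with a straight-line fit. The sketch assumes thermal scaling of the form <n> = n0 * exp(-B/T), so that ln<n> is linear in 1/T with slope -B. How the temperature-like variable T is obtained from the transverse energy is not stated here and is left as an input. The function name and all numbers are hypothetical synthetic values, not data from the paper.

```python
import math

def arrhenius_fit(inv_temps, mean_mults):
    """Least-squares line through the points (1/T, ln<n>).

    Returns (slope, intercept); under <n> = n0 * exp(-B/T) the barrier is -slope.
    """
    ys = [math.log(n) for n in mean_mults]
    k = len(inv_temps)
    mean_x = sum(inv_temps) / k
    mean_y = sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in inv_temps)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(inv_temps, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic example: multiplicities generated from an assumed barrier of 8.0.
true_barrier = 8.0
temps = [3.0, 4.0, 5.0, 6.0]
inv_temps = [1.0 / t for t in temps]
mults = [2.0 * math.exp(-true_barrier / t) for t in temps]

slope, intercept = arrhenius_fit(inv_temps, mults)
barrier = -slope
```

Repeating such a fit for each element Z would give one barrier per Z, which is the Z dependence the title refers to.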
Rapidity distribution as a probe for elliptical flow at intermediate energies
The interplay between spectator and participant matter in heavy-ion
collisions is investigated within the isospin-dependent quantum molecular
dynamics (IQMD) model in terms of the rapidity distribution of light charged
particles. The effect of the type and size of the rapidity distribution on
elliptical flow is studied. The elliptical flow patterns show the important
role of the nearby spectator matter on the participant zone. This role is
further explained on the basis of the passing time of the spectator and the
expansion time of the participant zone. The transition from in-plane to
out-of-plane flow is observed only when the mid-rapidity region is included in
the rapidity bin; otherwise no transition occurs. The transition energy is
found to be highly sensitive to the size of the rapidity bin but only weakly
sensitive to the type of rapidity distribution. The theoretical results are
also compared with experimental findings and are found to be in good
agreement. Comment: 8 figures
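
The rapidity-bin selection discussed above can be sketched as follows. The block uses the standard definition y = 0.5 ln[(E + pz)/(E - pz)] and assumes, as a common but here unconfirmed choice, that bins are defined on the centre-of-mass rapidity scaled by the beam rapidity. The function names, the nucleon mass in MeV, and the sample momenta are illustrative assumptions.

```python
import math

def rapidity(energy, pz):
    """Rapidity along the beam axis: y = 0.5 * ln((E + pz) / (E - pz))."""
    return 0.5 * math.log((energy + pz) / (energy - pz))

def select_rapidity_bin(particles, y_beam, half_width):
    """Keep (energy, pz) pairs whose scaled rapidity satisfies |y / y_beam| <= half_width."""
    return [(e, pz) for e, pz in particles
            if abs(rapidity(e, pz) / y_beam) <= half_width]

# Made-up nucleons (mass 938 MeV) with different longitudinal momenta in MeV/c.
mass = 938.0
def on_shell(pz):
    return (math.sqrt(mass ** 2 + pz ** 2), pz)

particles = [on_shell(0.0), on_shell(100.0), on_shell(400.0)]
y_beam = rapidity(*on_shell(300.0))

# A bin centred on mid-rapidity; widening half_width admits more spectator-like particles.
mid_rapidity = select_rapidity_bin(particles, y_beam, half_width=0.5)
```

The flow would then be averaged over the selected particles only, which is how the bin size can influence the transition energy.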
Tracing the Evolution of Temperature in Near Fermi Energy Heavy Ion Collisions
The kinetic energy variation of emitted light clusters has been employed as a
clock to explore the time evolution of the temperature for thermalizing
composite systems produced in the reactions of 26A, 35A and 47A MeV Zn
with Ni, Mo and Au. For each system investigated, the
double isotope ratio temperature curve exhibits a high maximum apparent
temperature, in the range of 10-25 MeV, at high ejectile velocity. These
maximum values increase with increasing projectile energy and decrease with
increasing target mass. The time at which the maximum in the temperature curve
is reached ranges from 80 to 130 fm/c after contact. For each different target,
the subsequent cooling curves for all three projectile energies are quite
similar. Temperatures comparable to those of limiting temperature systematics
are reached 30 to 40 fm/c after the times corresponding to the maxima, at a
time when AMD-V transport model calculations predict entry into the final
evaporative or fragmentation stage of de-excitation of the hot composite
systems. Evidence for the establishment of thermal and chemical equilibrium is
discussed. Comment: 9 pages, 5 figures
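
As an illustration of a double isotope ratio thermometer, the sketch below uses the widely quoted Albergo-type expression for the d, t, 3He, 4He combination, T = 14.3 MeV / ln(1.59 R) with R = (Y_d/Y_t)/(Y_3He/Y_4He). The abstract does not say which isotopes or corrections were used, so the choice of thermometer, the function name, and the yields are assumptions for illustration only.

```python
import math

def double_ratio_temperature(y_d, y_t, y_he3, y_he4, b_diff=14.3, a=1.59):
    """Apparent temperature in MeV from the double yield ratio R.

    b_diff: binding-energy difference in MeV for the assumed H-He thermometer.
    a: spin and mass statistical factor for the same isotope set.
    """
    ratio = (y_d / y_t) / (y_he3 / y_he4)
    arg = a * ratio
    if arg <= 1.0:
        raise ValueError("a * R must exceed 1 for a positive temperature")
    return b_diff / math.log(arg)

# Synthetic yields constructed so that the formula should return 5 MeV.
target_t = 5.0
y_d = math.exp(14.3 / target_t) / 1.59
temperature = double_ratio_temperature(y_d, 1.0, 1.0, 1.0)
```

Evaluating such a ratio in successive ejectile velocity windows is one way a temperature-versus-time curve like the one described above could be built.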
Pion radii in nonlocal chiral quark model
The electromagnetic radius of the charged pion and the transition radius of
the neutral pion are calculated in the framework of the nonlocal chiral quark
model. It is shown in this model that the contributions of vector mesons to the
pion radii are noticeably suppressed in comparison with a similar contribution
in the local Nambu-Jona-Lasinio model. The form factor for the process
gamma* pi+ pi- is calculated for -1 GeV^2 < q^2 < 1.6 GeV^2. Our results are in
satisfactory agreement with experimental data. Comment: 7 pages, 7 figures