
    Preserving In-Context Learning ability in Large Language Model Fine-tuning

    Pretrained large language models (LLMs) are strong in-context learners that are able to perform few-shot learning without changing model parameters. However, as we show, fine-tuning an LLM on any specific task generally destroys its in-context ability. We discover an important cause of this loss, format specialization, where the model overfits to the format of the fine-tuned task and is unable to output anything beyond this format. We further show that format specialization happens at the beginning of fine-tuning. To solve this problem, we propose Prompt Tuning with MOdel Tuning (ProMoT), a simple yet effective two-stage fine-tuning framework that preserves the in-context abilities of the pretrained model. ProMoT first trains a soft prompt for the fine-tuning target task, and then fine-tunes the model itself with this soft prompt attached. ProMoT offloads task-specific formats into the soft prompt, which can be removed when doing other in-context tasks. We fine-tune mT5 XXL with ProMoT on natural language inference (NLI) and English-French translation and evaluate the in-context abilities of the resulting models on 8 different NLP tasks. ProMoT achieves similar performance on the fine-tuned tasks compared with vanilla fine-tuning, but with a much smaller reduction in in-context learning performance across the board. More importantly, ProMoT shows remarkable generalization ability on tasks that have different formats, e.g. fine-tuning on an NLI binary classification task improves the model's in-context ability to do summarization (+0.53 Rouge-2 score compared to the pretrained model), making ProMoT a promising method to build general-purpose capabilities such as grounding and reasoning into LLMs with small but high-quality datasets. When extended to sequential or multi-task training, ProMoT can achieve even better out-of-domain generalization performance.

    On the elliptical flow in asymmetric collisions and nuclear equation of state

    We present results on elliptical flow for collisions of different asymmetric nuclei ($_{10}$Ne$^{20}$ + $_{13}$Al$^{27}$, $_{18}$Ar$^{40}$ + $_{21}$Sc$^{45}$, $_{30}$Zn$^{64}$ + $_{28}$Ni$^{58}$, $_{36}$Kr$^{86}$ + $_{41}$Nb$^{93}$) obtained with the Quantum Molecular Dynamics (QMD) model. General features of elliptical flow are investigated with the help of theoretical simulations. The simulations are performed at different beam energies between 40 and 105 MeV/nucleon. A significant change from in-plane to out-of-plane elliptical flow is seen for different fragments as the incident energy varies. A comparison with experimental data is also made. Further, we predict, for the first time, that the elliptical flow for different kinds of fragments follows a power-law dependence $\propto C(A_{tot})^{\tau}$ for asymmetric systems.

    The prominent role of the heaviest fragment in multifragmentation and phase transition for hot nuclei

    The role played by the heaviest fragment in partitions of multifragmenting hot nuclei is emphasized. Its size/charge distribution (mean value, fluctuations and shape) gives information on properties of fragmenting nuclei and on the associated phase transition. Comment: 11 pages, Proceedings of IWND09, August 23-25, Shanghai (China)

    Coincidence measurement of residues and light particles in the reaction 56Fe+p at 1 GeV per nucleon with SPALADIN

    The spallation of $^{56}$Fe in collisions with hydrogen at 1 A GeV has been studied in inverse kinematics with the large-aperture setup SPALADIN at GSI. Coincidences of residues with low-center-of-mass kinetic energy light particles and fragments have been measured, allowing the decomposition of the total reaction cross-section into the different possible de-excitation channels. Detailed information on the evolution of these de-excitation channels with excitation energy has also been obtained. The comparison of the data with predictions of several de-excitation models coupled to the INCL4 intra-nuclear cascade model shows that only GEMINI can reasonably account for the bulk of collected results, indicating that in a light system with no compression and little angular momentum, multifragmentation might not be necessary to explain the data. Comment: 4 pages, 5 figures, revised version accepted in Phys. Rev. Lett.

    Z-dependent Barriers in Multifragmentation from Poissonian Reducibility and Thermal Scaling

    We explore the natural limit of binomial reducibility in nuclear multifragmentation by constructing excitation functions for intermediate mass fragments (IMF) of a given element Z. The resulting multiplicity distributions for each window of transverse energy are Poissonian. Thermal scaling is observed in the linear Arrhenius plots made from the average multiplicity of each element. "Emission barriers" are extracted from the slopes of the Arrhenius plots and their possible origin is discussed. Comment: 15 pages including 4 .ps figures. Submitted to Phys. Rev. Letters. Also available at http://csa5.lbl.gov/moretto

    Rapidity distribution as a probe for elliptical flow at intermediate energies

    The interplay between spectator and participant matter in heavy-ion collisions is investigated within the isospin-dependent quantum molecular dynamics (IQMD) model in terms of the rapidity distribution of light charged particles. The effect of different types and sizes of rapidity distributions on elliptical flow is studied. The elliptical flow patterns show the important role of the nearby spectator matter on the participant zone. This role is further explained on the basis of the passing time of the spectator and the expansion time of the participant zone. The transition from in-plane to out-of-plane flow is observed only when the mid-rapidity region is included in the rapidity bin; otherwise no transition occurs. The transition energy is found to be highly sensitive to the size of the rapidity bin, but only weakly sensitive to the type of the rapidity distribution. The theoretical results are also compared with the experimental findings and are found to be in good agreement. Comment: 8 figures

    Tracing the Evolution of Temperature in Near Fermi Energy Heavy Ion Collisions

    The kinetic energy variation of emitted light clusters has been employed as a clock to explore the time evolution of the temperature for thermalizing composite systems produced in the reactions of 26A, 35A and 47A MeV $^{64}$Zn with $^{58}$Ni, $^{92}$Mo and $^{197}$Au. For each system investigated, the double isotope ratio temperature curve exhibits a high maximum apparent temperature, in the range of 10-25 MeV, at high ejectile velocity. These maximum values increase with increasing projectile energy and decrease with increasing target mass. The time at which the maximum in the temperature curve is reached ranges from 80 to 130 fm/c after contact. For each different target, the subsequent cooling curves for all three projectile energies are quite similar. Temperatures comparable to those of limiting temperature systematics are reached 30 to 40 fm/c after the times corresponding to the maxima, at a time when AMD-V transport model calculations predict entry into the final evaporative or fragmentation stage of de-excitation of the hot composite systems. Evidence for the establishment of thermal and chemical equilibrium is discussed. Comment: 9 pages, 5 figures

    Pion radii in nonlocal chiral quark model

    The electromagnetic radius of the charged pion and the transition radius of the neutral pion are calculated in the framework of the nonlocal chiral quark model. It is shown in this model that the contributions of vector mesons to the pion radii are noticeably suppressed in comparison with a similar contribution in the local Nambu--Jona-Lasinio model. The form factor for the process $\gamma^* \pi^+ \pi^-$ is calculated for $-1\ \mathrm{GeV}^2 < q^2 < 1.6\ \mathrm{GeV}^2$. Our results are in satisfactory agreement with experimental data. Comment: 7 pages, 7 figures