23 research outputs found

    Towards Better Parameter-Efficient Fine-Tuning for Large Language Models: A Position Paper

    This paper examines the pressing need for better Parameter-Efficient Fine-Tuning (PEFT) of Large Language Models (LLMs). While LLMs possess remarkable capabilities, their extensive parameter requirements and associated computational demands hinder their practicality and scalability for real-world applications. Our position paper highlights the current state of the field and the need for further study, and identifies significant challenges and open issues that must be addressed to fully harness the abilities of LLMs. These challenges encompass novel efficient PEFT architectures, PEFT for different learning settings, PEFT combined with model compression techniques, and the exploration of PEFT for multi-modal LLMs. By presenting this position paper, we aim to stimulate further research and foster discussion of more efficient and accessible PEFT for LLMs.

    From Complex to Simple: Unraveling the Cognitive Tree for Reasoning with Small Language Models

    Reasoning is a distinctive human capacity, enabling us to address complex problems by breaking them down into a series of manageable cognitive steps. Yet complex logical reasoning is still cumbersome for language models. Based on the dual process theory in cognitive science, we are the first to unravel the cognitive reasoning abilities of language models. Our framework employs an iterative methodology to construct a Cognitive Tree (CogTree). The root node of this tree represents the initial query, while the leaf nodes consist of straightforward questions that can be answered directly. This construction involves two main components: the implicit extraction module (referred to as the intuitive system) and the explicit reasoning module (referred to as the reflective system). The intuitive system rapidly generates multiple responses by utilizing in-context examples, while the reflective system scores these responses using comparative learning. The scores guide the intuitive system in its subsequent generation step. Our experimental results on two popular and challenging reasoning tasks indicate that it is possible to achieve a performance level comparable to that of GPT-3.5 (with 175B parameters) using a significantly smaller language model (<=7B parameters) with fewer than 5% of GPT-3.5's parameters. Comment: EMNLP 202
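
    The construction loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `build_tree`, `propose`, `score` and `is_simple` are hypothetical, and the trained intuitive and reflective systems are replaced by toy string-based stand-ins so that the example runs on its own. It only shows the control flow: generate several candidate decompositions, score them, keep the best, and recurse until the leaves can be answered directly.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One question in the tree; a node without children is a leaf."""
    question: str
    children: List["Node"] = field(default_factory=list)


def build_tree(query: str,
               propose: Callable[[str], List[List[str]]],
               score: Callable[[str, List[str]], float],
               is_simple: Callable[[str], bool],
               max_depth: int = 3) -> Node:
    """Recursively decompose `query` (hypothetical helper, not from the paper).

    propose   -- stand-in for the intuitive system: returns candidate
                 decompositions, each a list of sub-questions
    score     -- stand-in for the reflective system: rates a candidate
    is_simple -- decides whether a question can be answered directly
    """
    node = Node(query)
    if max_depth == 0 or is_simple(query):
        return node
    candidates = propose(query)
    if not candidates:
        return node
    # Keep the candidate decomposition with the highest score.
    best = max(candidates, key=lambda c: score(query, c))
    node.children = [
        build_tree(sub, propose, score, is_simple, max_depth - 1)
        for sub in best
    ]
    return node


def leaves(node: Node) -> List[str]:
    """Collect the leaf questions from left to right."""
    if not node.children:
        return [node.question]
    out: List[str] = []
    for child in node.children:
        out.extend(leaves(child))
    return out


# Toy stand-ins (assumptions for illustration only).
def propose(q: str) -> List[List[str]]:
    if " and " not in q:
        return []
    # Two candidates: split at the first "and" only, or at every "and".
    return [q.split(" and ", 1), q.split(" and ")]


def score(q: str, candidate: List[str]) -> float:
    # Prefer finer-grained decompositions.
    return float(len(candidate))


def is_simple(q: str) -> bool:
    return " and " not in q


tree = build_tree("a and b and c", propose, score, is_simple)
print(leaves(tree))
```

    In a real system, `propose` and `score` would call language models, and the scores would also feed back into the next generation step, which this sketch does not model.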

    Making Small Language Models Better Multi-task Learners with Mixture-of-Task-Adapters

    Recently, Large Language Models (LLMs) have achieved amazing zero-shot learning performance over a variety of Natural Language Processing (NLP) tasks, especially text generation tasks. Yet the large size of LLMs often leads to high computational costs for model training and online deployment. In our work, we present ALTER, a system that effectively builds the multi-tAsk Learners with mixTure-of-task-adaptERs upon small language models (with <1B parameters) to address multiple NLP tasks simultaneously, capturing the commonalities and differences between tasks in order to support domain-specific applications. Specifically, in ALTER, we propose the Mixture-of-Task-Adapters (MTA) module as an extension to the transformer architecture, allowing the underlying model to capture intra-task and inter-task knowledge. A two-stage training method is further proposed to optimize the collaboration between adapters at a small computational cost. Experimental results over a mixture of NLP tasks show that our proposed MTA architecture and the two-stage training method achieve good performance. Based on ALTER, we have also produced MTA-equipped language models for various domains.

    Recent results on heavy-ion induced reactions of interest for neutrinoless double beta decay at INFN-LNS

    The possibility of using a special class of heavy-ion induced direct reactions, such as double charge exchange reactions, is discussed in view of their application to extract information that may help to determine the nuclear matrix elements entering the expression for the neutrinoless double beta decay half-life. The methodology of the experimental campaign presently running at INFN - Laboratori Nazionali del Sud is reported, and the experimental challenges characterizing such activity are described.

    In Situ ZrB2 Formation in B4C Ceramics and Its Strengthening Mechanism on Mechanical Properties

    To reduce the sintering temperature and improve the mechanical properties of B4C ceramics, ZrB2 was formed in situ using the SPS sintering method with ZrO2 and B4C as raw materials. Thermodynamic calculations revealed that CO pressure affected the formation of ZrB2 at temperatures from 814 °C to 1100 °C. The experimental results showed that the ZrB2 grain size was 4C ceramics. With an increase in ZrO2 content, the Vickers hardness and flexural strength of the B4C ceramics first increased and then decreased, while the fracture toughness continuously increased. When the content of ZrO2 was 15 wt%, the Vickers hardness, fracture toughness and flexural strength of the B4C ceramics were 35.5 ± 0.63 GPa, 3.6 ± 0.24 MPa·m^1/2 and 403 ± 10 MPa, respectively. These results suggest that ZrB2 inhibits B4C grain growth, eliminates crack tip stress, and provides fine grains to strengthen and toughen B4C ceramics.

    Association study of MCP-1 promoter polymorphisms with the susceptibility and progression of sepsis

    Previous studies have indicated that monocyte chemo-attractant protein 1 (MCP-1), also referred to as C-C motif chemokine ligand 2 (CCL2), plays a significant role in the pathogenesis of sepsis. This study investigated the clinical relevance of two MCP-1 gene polymorphisms to sepsis onset and progression. The Multiplex SNaPshot genotyping method was used to detect MCP-1 gene polymorphisms in the Chinese Han population (403 sepsis patients and 400 controls). MCP-1 mRNA expression levels were measured using real-time quantitative PCR, and enzyme-linked immunosorbent assays were used to analyze MCP-1, tumor necrosis factor-alpha (TNF-α), interleukin 6 (IL-6) and interleukin-1 beta (IL-1β) plasma concentrations. The rs1024611 polymorphism analysis showed lower frequencies of the minor homozygous genotype (AA) and allele (A) in sepsis patients compared to the healthy controls (19.4% vs. 31.5%, P = 0.0001 and 45.9% vs. 54.8%, P = 0.0004, respectively). Similarly, for rs2857656, the frequencies of the GG genotype and G allele were lower in sepsis patients compared to the controls (19.6% vs. 31.3%, P = 0.0002 and 46.0% vs. 54.5%, P = 0.0007, respectively). The rs1024611 AG/GG and rs2857656 GC/CC genotypes were both overrepresented in patients with severe sepsis (both P = 0.0005) and septic shock (P = 0.010 and P = 0.015, respectively) compared to the patients with mild sepsis. Moreover, among sepsis patients, the rs1024611 AG/GG and rs2857656 GC/CC carriers exhibited significant increases in expression levels of MCP-1 (P = 0.025), TNF-α (P = 0.034) and IL-6 (P = 0.043) compared with the rs1024611 AA or rs2857656 GG carriers. This study provides valuable clinical evidence that the MCP-1/CCL2 polymorphisms rs1024611 and rs2857656 are associated with sepsis susceptibility and development. We conclude that MCP-1/CCL2 plays a significant role in the pathogenesis of sepsis, which has potentially important therapeutic implications.

    Magnetostratigraphy and luminescence dating on a sedimentary sequence from northern East China Sea: Constraints on evolutionary history of eastern marginal seas of China since the Early Pleistocene

    Owing to the large and increasing population density in low-lying coastal regions, even small changes in sea level can have substantial societal and economic impacts. Alternations of terrestrial and marine sediments deposited in coastal areas or continental shelves are important and effective indicators of sea-level changes, and thus have been widely studied in the marginal seas of China over the past 30 years. However, sea-level change results from not only eustatic factors but also tectonic activity. The Zhe-Min (or Zhejiang-Fujian) Uplift (ZMU) was such an important factor in geomorphology, and formed a barrier preventing entry of sea water into the northern marginal seas of China, but its Quaternary history is poorly known. In this study, a new borehole (ECS-DZ1) was drilled in the Zhoushan Islands, northern East China Sea, to obtain information on the evolution of the ZMU during the Quaternary. Information from paleomagnetic and luminescence dating was combined with data on sedimentary changes. The main results are: (1) constrained by luminescence ages, the upper sedimentary units were extrapolated to have been deposited since ~0.2 Ma; (2) paleomagnetic results suggest that the ECS-DZ1 borehole sequence spans from the pre-Olduvai Matuyama reverse chron to the Brunhes normal chron, approximately constraining the age of the basal sedimentary unit to ~2.0 Ma; (3) a significant hiatus or erosion between two major sedimentary units possibly occurred between the late Early Pleistocene and Middle Pleistocene. As the Zhoushan Islands are within the ZMU and considering previous transgression studies around this region, it is inferred that the ZMU subsided at ~2.0 Ma, allowing seawater to invade northward in the Yellow Sea basin. The ZMU might have been uplifted again no later than 1.0 Ma, causing a sedimentary hiatus or lacustrine development. After ~0.2 Ma, the ZMU subsided completely, allowing large transgressions to develop across the northern marginal seas of China in the context of global sea-level changes.

    The linkage disequilibrium (LD) block (rs1024611 and rs2857656) and their locations in the promoter region of the MCP-1 gene.

    According to the GRCh38.p7 primary assembly, the human MCP-1 gene is located on Homo sapiens chromosome 17 (34,255,277–34,257,203). The blue bar represents the 5'-flanking region of the MCP-1 gene, and the three dark green bars represent exon 1, exon 2 and exon 3, respectively. In the figure, rs1024611 and rs2857656 are located upstream of the transcriptional start site (at -2508 bp and -289 bp, respectively). The haplotype block (rs1024611-rs2857656, D' value = 0.995, r² = 0.988) was generated using Haploview 4.2.
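
    For readers unfamiliar with the two LD measures quoted in this caption, the following minimal sketch shows how D' and r² are conventionally computed for a pair of biallelic loci. It assumes the haplotype and allele frequencies are already known; the function name `ld_stats` and the input frequencies are hypothetical and are not taken from the study, and the sketch does not reproduce how Haploview derives haplotype frequencies from genotype data.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab -- frequency of the haplotype carrying allele A at locus 1
            and allele B at locus 2
    p_a  -- frequency of allele A at locus 1
    p_b  -- frequency of allele B at locus 2
    """
    # Raw disequilibrium: observed haplotype frequency minus the
    # frequency expected if the two loci were independent.
    d = p_ab - p_a * p_b

    # The largest |D| attainable given the allele frequencies
    # depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # Squared correlation between the alleles at the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical frequencies, for illustration only.
d, d_prime, r2 = ld_stats(p_ab=0.4, p_a=0.5, p_b=0.5)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

    Values of D' and r² close to 1, as reported for the rs1024611-rs2857656 block, indicate that the alleles at the two sites are almost always inherited together.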