    One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning

    We present Generalized LoRA (GLoRA), an advanced approach for universal parameter-efficient fine-tuning tasks. Enhancing Low-Rank Adaptation (LoRA), GLoRA employs a generalized prompt module to optimize pre-trained model weights and adjust intermediate activations, providing more flexibility and capability across diverse tasks and datasets. Moreover, GLoRA facilitates efficient parameter adaptation by employing a scalable, modular, layer-wise structure search that learns an individual adapter for each layer. Originating from a unified mathematical formulation, GLoRA exhibits strong transfer learning, few-shot learning and domain generalization abilities, as it adapts to new tasks through not only weights but also additional dimensions like activations. Comprehensive experiments demonstrate that GLoRA outperforms all previous methods in natural, specialized, and structured vision benchmarks, achieving superior accuracy with fewer parameters and computations. The proposed method also shows considerable enhancements on LLaMA-1 and LLaMA-2 compared to the original LoRA in the language domain. Furthermore, our structural re-parameterization design ensures that GLoRA incurs no extra inference cost, rendering it a practical solution for resource-limited applications. Code and models are available at: https://github.com/Arnav0400/ViT-Slim/tree/master/GLoRA. Comment: Technical report. v2: Add LLaMA-1&2 results. Code and models at https://github.com/Arnav0400/ViT-Slim/tree/master/GLoR
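
The "no extra inference cost" claim rests on re-parameterization: after training, the adapter is folded into the base weight. The sketch below illustrates this with plain LoRA only, not GLoRA's own formulation, which the abstract does not spell out. The shapes, the rank, and the names `forward_with_adapter` and `forward_merged` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank = 8, 16, 2  # assumed toy sizes

# Frozen pre-trained weight and a low-rank update (plain LoRA, not GLoRA).
W0 = rng.standard_normal((d_out, d_in))
B = rng.standard_normal((d_out, rank))
A = rng.standard_normal((rank, d_in))


def forward_with_adapter(x):
    """Training-time path: base layer plus a separate low-rank branch."""
    return x @ W0.T + (x @ A.T) @ B.T


# Fold the adapter into the base weight once, after training.
W_merged = W0 + B @ A


def forward_merged(x):
    """Inference-time path: one matrix multiply, same shape as the base layer."""
    return x @ W_merged.T


x = rng.standard_normal((4, d_in))
outputs_match = np.allclose(forward_with_adapter(x), forward_merged(x))
```

Because `W_merged` has the same shape as `W0`, the merged layer needs no additional parameters or operations at inference time.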

    Learning Disentangled Semantic Representations for Zero-Shot Cross-Lingual Transfer in Multilingual Machine Reading Comprehension

    Multilingual pre-trained models are able to zero-shot transfer knowledge from rich-resource to low-resource languages in machine reading comprehension (MRC). However, inherent linguistic discrepancies in different languages could make answer spans predicted by zero-shot transfer violate syntactic constraints of the target language. In this paper, we propose a novel multilingual MRC framework equipped with a Siamese Semantic Disentanglement Model (SSDM) to disassociate semantics from syntax in representations learned by multilingual pre-trained models. To explicitly transfer only semantic knowledge to the target language, we propose two groups of losses tailored for semantic and syntactic encoding and disentanglement. Experimental results on three multilingual MRC datasets (i.e., XQuAD, MLQA, and TyDi QA) demonstrate the effectiveness of our proposed approach over models based on mBERT and XLM-100. Code is available at: https://github.com/wulinjuan/SSDM_MRC. Comment: Accepted to ACL 2022 (main conference).

    Distribution of pathogen in the Bohai sea in spring and summer

    The aquaculture pathogen Vibrio spp. is widespread and harmful to mariculture animals, and can even cause human enterogastritis. However, little is known about the abundance and distribution of marine pathogens in the Bohai Sea. In the present study, the distributions of typical pathogens, including Escherichia coli, Vibrio parahaemolyticus, Vibrio alginolyticus, Vibrio fluvialis and Vibrio harveyi, were investigated using a protein microarray method on Bohai Sea samples collected in spring and summer of 2005. The results showed that: (1) Temporally, the tested pathogens were more abundant in summer than in spring: total pathogenic Vibrios averaged 3.05 × 10^4/L in spring and 2.48 × 10^5/L in summer. (2) Spatially, in summer, pathogenic Vibrios in Bohai Bay were 4.87, 10.52 and 7.15 times higher than those in Liaodong Bay, Laizhou Bay and the Central Bohai Sea, respectively (p = 0.034, 0.013 and 0.012, respectively). (3) Total pathogenic Vibrios in the coastal area were 4.68 times higher than those in the central area (p = 0.0279 < 0.05), showing a declining trend in abundance. (4) The pathogenic Vibrios varied between spring and summer, with the greatest variance in V. fluvialis; V. parahaemolyticus and V. harveyi showed no significant variance. Bohai Bay was heavily polluted and relatively unfit for mariculture. V. fluvialis dominated in the Bohai Sea and was a possible major pathogen of vibriosis. MOST [2001AA63-5070, 2003AA635160

    Agriculture intensifies soil moisture decline in Northern China

    Northern China is one of the most densely populated regions in the world. Agricultural activities have intensified since the 1980s to provide food security for the country. However, this intensification has likely contributed to an increasing scarcity of water resources, which may in turn be endangering food security. Based on in-situ measurements of soil moisture collected in agricultural plots during 1983–2012, we find that topsoil (0–50 cm) volumetric water content during the growing season has declined significantly (p < 0.01), with a trend of −0.011 to −0.015 m³ m⁻³ per decade. Observed discharge declines for the three large river basins are consistent with the effects of agricultural intensification, although other factors (e.g., dam construction) have likely contributed to these trends. Practices like fertilizer application have favoured biomass growth and increased transpiration rates, thus reducing available soil water. In addition, the rapid proliferation of water-expensive crops (e.g., maize) and the expansion of the area dedicated to food production have also contributed to soil drying. Adoption of alternative agricultural practices that can meet the immediate food demand without compromising future water resources seems critical for the sustainability of the food production system.

    Initializing Models with Larger Ones

    Weight initialization plays an important role in neural network training. Widely used initialization methods are proposed and evaluated for networks that are trained from scratch. However, the growing number of pretrained models now offers new opportunities for tackling this classical problem of weight initialization. In this work, we introduce weight selection, a method for initializing smaller models by selecting a subset of weights from a pretrained larger model. This enables the transfer of knowledge from pretrained weights to smaller models. Our experiments demonstrate that weight selection can significantly enhance the performance of small models and reduce their training time. Notably, it can also be used together with knowledge distillation. Weight selection offers a new approach to leverage the power of pretrained models in resource-constrained settings, and we hope it can be a useful tool for training small models in the large-model era. Code is available at https://github.com/OscarXZQ/weight-selection
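
The idea of initializing a small model from a subset of a larger model's weights can be sketched as follows. The abstract does not say which elements are chosen, so the evenly spaced selection rule here is an assumption made for illustration, and `select_weights` is a hypothetical helper rather than the authors' API.

```python
import numpy as np


def select_weights(large, small_shape):
    """Return a sub-tensor of `large` with shape `small_shape`.

    Elements are picked at evenly spaced indices along every axis
    (an assumed selection rule, not taken from the abstract).
    """
    if large.ndim != len(small_shape):
        raise ValueError("tensors must have the same number of dimensions")
    selected = large
    for axis, (big, small) in enumerate(zip(large.shape, small_shape)):
        if small > big:
            raise ValueError("target dimension exceeds source dimension")
        # Evenly spaced source indices along this axis.
        idx = (np.arange(small) * big) // small
        selected = np.take(selected, idx, axis=axis)
    return selected


# Toy example: shrink a 6x8 "pretrained" weight matrix to 3x4.
large_weight = np.arange(6 * 8, dtype=np.float32).reshape(6, 8)
small_weight = select_weights(large_weight, (3, 4))
```

In practice the selected tensor would be copied into the matching layer of the smaller model before training starts. Which layers of the larger model to draw from is a separate design choice that this sketch leaves open.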