One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
We present Generalized LoRA (GLoRA), an advanced approach for universal
parameter-efficient fine-tuning tasks. Building on Low-Rank Adaptation (LoRA),
GLoRA employs a generalized prompt module to optimize pre-trained model weights
and adjust intermediate activations, providing more flexibility and capability
across diverse tasks and datasets. Moreover, GLoRA facilitates efficient
parameter adaptation by employing a scalable, modular, layer-wise structure
search that learns an individual adapter for each layer. Originating from a unified
mathematical formulation, GLoRA exhibits strong transfer learning, few-shot
learning and domain generalization abilities, as it adapts to new tasks through
not only weights but also additional dimensions like activations. Comprehensive
experiments demonstrate that GLoRA outperforms all previous methods in natural,
specialized, and structured vision benchmarks, achieving superior accuracy with
fewer parameters and computations. On LLaMA-1 and LLaMA-2, the proposed method
also shows considerable improvements over the original LoRA in the language
domain. Furthermore, our structural re-parameterization design ensures
that GLoRA incurs no extra inference cost, rendering it a practical solution
for resource-limited applications. Code and models are available at:
https://github.com/Arnav0400/ViT-Slim/tree/master/GLoRA.
Comment: Technical report. v2: Add LLaMA-1&2 results.
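The abstract describes GLoRA as a generalization of LoRA that adapts both weights and activations. As a minimal sketch of the underlying low-rank adaptation idea that GLoRA builds on (not the authors' implementation; the dimensions, rank, and zero-initialization of one factor are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 16, 16, 4

# Frozen pretrained weight (stands in for one layer of a large model)
W0 = rng.normal(size=(d_out, d_in))

# LoRA learns a low-rank update B @ A on top of the frozen W0.
# B is zero-initialized so training starts exactly at the pretrained model.
A = rng.normal(scale=0.01, size=(rank, d_in))
B = np.zeros((d_out, rank))

def lora_forward(x):
    # Effective weight is W0 + B @ A; only A and B would be trained
    return (W0 + B @ A) @ x

x = rng.normal(size=d_in)
# With B at zero, the adapted layer reproduces the frozen layer
assert np.allclose(lora_forward(x), W0 @ x)
```

GLoRA extends this scheme with additional per-layer components that also scale the pretrained weights and shift intermediate activations, and searches layer-wise over which components to enable; since W0 + B @ A can be merged into a single matrix after training, such updates add no inference cost.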
Learning Disentangled Semantic Representations for Zero-Shot Cross-Lingual Transfer in Multilingual Machine Reading Comprehension
Multilingual pre-trained models can transfer knowledge zero-shot from
high-resource to low-resource languages in machine reading comprehension (MRC).
However, inherent linguistic discrepancies in different languages could make
answer spans predicted by zero-shot transfer violate syntactic constraints of
the target language. In this paper, we propose a novel multilingual MRC
framework equipped with a Siamese Semantic Disentanglement Model (SSDM) to
disassociate semantics from syntax in representations learned by multilingual
pre-trained models. To explicitly transfer only semantic knowledge to the
target language, we propose two groups of losses tailored for semantic and
syntactic encoding and disentanglement. Experimental results on three
multilingual MRC datasets (i.e., XQuAD, MLQA, and TyDi QA) demonstrate the
effectiveness of our proposed approach over models based on mBERT and XLM-100.
Code is available at: https://github.com/wulinjuan/SSDM_MRC.
Comment: Accepted to ACL 2022 (main conference).
Distribution of pathogens in the Bohai Sea in spring and summer
The aquacultural pathogen Vibrio spp. is widespread and harmful to mariculture animals, and can even cause gastroenteritis in humans. However, little is known about the abundance and distribution of marine pathogens in the Bohai Sea. In the present study, the distributions of typical pathogens, including Escherichia coli, Vibrio parahaemolyticus, Vibrio alginolyticus, Vibrio fluvialis and Vibrio harveyi, were investigated using a protein microarray method on Bohai Sea samples collected in the spring and summer of 2005. The results showed that: (1) temporally, the tested pathogens were more abundant in summer than in spring, with total pathogenic Vibrios averaging 3.05 × 10^4/L in spring versus 2.48 × 10^5/L in summer; (2) spatially, in summer, pathogenic Vibrios in Bohai Bay were 4.87, 10.52 and 7.15 times higher than in Liaodong Bay, Laizhou Bay and the central Bohai Sea, respectively (p = 0.034, 0.013 and 0.012, respectively); (3) total pathogenic Vibrios in coastal areas were 4.68 times higher than in the central area (p = 0.0279), showing a declining trend in abundance from coast to center; (4) all the pathogenic Vibrios varied between spring and summer, with the greatest variance in V. fluvialis, while neither V. parahaemolyticus nor V. harveyi varied significantly. Bohai Bay was heavily polluted and relatively unsuitable for mariculture. V. fluvialis dominated in the Bohai Sea and is a possible major pathogen of vibriosis.
MOST [2001AA63-5070, 2003AA635160]
Agriculture intensifies soil moisture decline in Northern China
Northern China is one of the most densely populated regions in the world. Agricultural activities have intensified since the 1980s to provide food security to the country. However, this intensification has likely contributed to an increasing scarcity of water resources, which may in turn be endangering food security. Based on in-situ measurements of soil moisture collected in agricultural plots during 1983–2012, we find that topsoil (0–50 cm) volumetric water content during the growing season has declined significantly (p < 0.01), with a trend of −0.011 to −0.015 m^3 m^-3 per decade. Observed discharge declines for the three large river basins are consistent with the effects of agricultural intensification, although other factors (e.g. dam construction) have likely contributed to these trends. Practices like fertilizer application have favoured biomass growth and increased transpiration rates, thus reducing available soil water. In addition, the rapid proliferation of water-expensive crops (e.g., maize) and the expansion of the area dedicated to food production have also contributed to soil drying. Adoption of alternative agricultural practices that can meet immediate food demand without compromising future water resources seems critical for the sustainability of the food production system.
Initializing Models with Larger Ones
Weight initialization plays an important role in neural network training.
Widely used initialization methods were designed and evaluated for networks
trained from scratch. However, the growing number of pretrained models now
offers new opportunities for tackling this classical problem of weight
initialization. In this work, we introduce weight selection, a method for
initializing smaller models by selecting a subset of weights from a pretrained
larger model. This enables the transfer of knowledge from pretrained weights to
smaller models. Our experiments demonstrate that weight selection can
significantly enhance the performance of small models and reduce their training
time. Notably, it can also be used together with knowledge distillation. Weight
selection offers a new approach to leverage the power of pretrained models in
resource-constrained settings, and we hope it can be a useful tool for training
small models in the large-model era. Code is available at
https://github.com/OscarXZQ/weight-selection
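The abstract describes weight selection as initializing a smaller model by selecting a subset of a pretrained larger model's weights. A minimal sketch of the idea, assuming a single dense layer and using a simple "take the leading sub-block" selection rule (the dimensions and the specific selection rule are illustrative assumptions, not necessarily the paper's exact procedure):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for one layer of a pretrained larger model: 8 -> 8
W_large = rng.normal(size=(8, 8))

def weight_select(W, out_dim, in_dim):
    # Initialize a smaller layer by selecting a sub-block of the larger
    # layer's weights (here: the leading out_dim x in_dim block)
    return W[:out_dim, :in_dim].copy()

# Smaller "student" layer: 4 -> 4, initialized from the pretrained weights
W_small = weight_select(W_large, 4, 4)
assert W_small.shape == (4, 4)
```

The student then trains normally from this initialization instead of from random weights; the selection transfers some structure learned by the larger model at zero extra training cost.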