10 research outputs found

    DyLoRA: Parameter Efficient Tuning of Pre-trained Models using Dynamic Search-Free Low-Rank Adaptation

    With the ever-growing size of pretrained models (PMs), fine-tuning them has become more expensive and resource-hungry. As a remedy, low-rank adapters (LoRA) keep the main pretrained weights of the model frozen and introduce only some learnable truncated-SVD modules (so-called LoRA blocks). While LoRA blocks are parameter-efficient, they suffer from two major problems: first, their size is fixed and cannot be modified after training (for example, changing the rank of the LoRA blocks requires retraining them from scratch); second, optimizing their rank requires an exhaustive search. In this work, we introduce a dynamic low-rank adaptation (DyLoRA) technique to address these two problems together. DyLoRA trains LoRA blocks for a range of ranks instead of a single rank by sorting the representation learned by the adapter module at different ranks during training. We evaluate our solution on natural language understanding (GLUE benchmark) and language generation tasks (E2E, DART and WebNLG) using pretrained models of different sizes, such as RoBERTa and GPT. Our results show that with DyLoRA we can train dynamic, search-free models at least 4 to 7 times (depending on the task) faster than LoRA without significantly compromising performance. Moreover, our models perform consistently well over a much larger range of ranks than LoRA.
    Comment: Accepted to EACL 202
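    The rank-range training idea above can be sketched in a few lines. The following NumPy toy (all names and dimensions are illustrative assumptions, not the paper's code) shows how a single pair of LoRA matrices can serve every rank up to r_max by simple prefix truncation:

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r_max = 8, 8, 4
W0 = rng.normal(size=(d_out, d_in))        # frozen pretrained weight
A = rng.normal(size=(r_max, d_in)) * 0.01  # LoRA down-projection
B = np.zeros((d_out, r_max))               # LoRA up-projection (zero-initialized)

def dylora_forward(x, b):
    """Forward pass using only the first b rows/columns of the adapter.

    DyLoRA-style training would sample b from {1..r_max} at each step, so
    that every rank prefix of the adapter remains usable at inference.
    """
    delta = B[:, :b] @ A[:b, :]            # rank-b truncation of the adapter
    return (W0 + delta) @ x

x = rng.normal(size=d_in)
# At inference, any rank b <= r_max can be selected without retraining:
y_rank2 = dylora_forward(x, 2)
y_rank4 = dylora_forward(x, 4)
```

    Because B is zero-initialized, every rank initially reproduces the frozen model exactly; training then fills in the adapter while keeping the rank prefixes nested.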

    Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT)

    The rapid advancement of large language models (LLMs) has revolutionized natural language processing (NLP). While these models excel at understanding and generating human-like text, their widespread deployment can be prohibitively expensive. SortedNet is a recent training technique for enabling dynamic inference in deep neural networks. It leverages network modularity to create sub-models with varying computational loads, sorting them by their computation/accuracy characteristics in a nested manner. We extend SortedNet to generative NLP tasks, making large language models dynamic without any pretraining, simply by replacing standard Supervised Fine-Tuning (SFT) with Sorted Fine-Tuning (SoFT) at the same cost. Our approach boosts model efficiency, eliminating the need for multiple models for various scenarios during inference. We show that this approach unlocks the potential of the intermediate layers of transformers for generating the target output. Our sub-models remain integral components of the original model, minimizing storage requirements and transition costs between different computational/latency budgets. By applying this approach to LLaMa 2 13B tuned on the Stanford Alpaca dataset and comparing it with normal tuning and early exit via the PandaLM benchmark, we show that Sorted Fine-Tuning can deliver models twice as fast as the original while maintaining or exceeding performance.
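    The depth-wise sub-model idea above can be illustrated with a minimal sketch. This NumPy toy (the tanh blocks, dimensions and shared head are illustrative assumptions, not the paper's architecture) shows how sub-models of different depths share one set of weights and one output head:

```python
import numpy as np

rng = np.random.default_rng(1)
n_layers, d = 6, 16
layers = [rng.normal(scale=0.1, size=(d, d)) for _ in range(n_layers)]
head = rng.normal(scale=0.1, size=(d, d))  # output head shared by every exit

def soft_forward(x, exit_layer):
    """Run only the first `exit_layer` blocks, then the shared head."""
    h = x
    for W in layers[:exit_layer]:
        h = np.tanh(W @ h)
    return head @ h

# SoFT-style training would compute the loss at several exit depths per
# batch, so intermediate layers learn to produce usable representations.
x = rng.normal(size=d)
fast = soft_forward(x, 3)   # half-depth sub-model
full = soft_forward(x, 6)   # full model
```

    Since the shallow sub-model is a literal prefix of the full one, no extra weights are stored and switching depth at inference is just a change of loop bound.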

    SortedNet, a Place for Every Network and Every Network in its Place: Towards a Generalized Solution for Training Many-in-One Neural Networks

    As the size of deep learning models continues to grow, finding optimal models under memory and computation constraints becomes increasingly important. Although the architecture and constituent building blocks of neural networks usually allow them to be used in a modular way, their training process is not aware of this modularity. Consequently, conventional neural network training lacks the flexibility to adapt the computational load of the model during inference. This paper proposes SortedNet, a generalized and scalable solution that harnesses the inherent modularity of deep neural networks across various dimensions for efficient dynamic inference. Our training considers a nested architecture for the sub-models with shared parameters and trains them together with the main model in a sorted and probabilistic manner. This sorted training of sub-networks enables us to scale the number of sub-networks to hundreds using a single round of training. We use a novel updating scheme during training that combines random sampling of sub-networks with gradient accumulation to improve training efficiency. Furthermore, the sorted nature of our training leads to search-free sub-network selection at inference time, and the nested architecture of the resulting sub-networks leads to minimal storage requirements and efficient switching between sub-networks at inference. Our general dynamic training approach is demonstrated across various architectures and tasks, including large language models and pre-trained vision models. Experimental results show the efficacy of the proposed approach in achieving efficient sub-networks while outperforming state-of-the-art dynamic training approaches. Our findings demonstrate the feasibility of training up to 160 different sub-models simultaneously, showcasing the extensive scalability of our proposed method while maintaining 96% of the model performance.
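    The nested, shared-parameter sub-models described above can be sketched along the width dimension. In this NumPy toy (widths, shapes and the ReLU layer are illustrative assumptions, not the paper's setup), every sub-model reuses a prefix of the full model's hidden units:

```python
import numpy as np

rng = np.random.default_rng(2)
d_in, d_hidden_max, d_out = 4, 8, 2
W1 = rng.normal(scale=0.1, size=(d_hidden_max, d_in))
W2 = rng.normal(scale=0.1, size=(d_out, d_hidden_max))
widths = [2, 4, 8]  # nested sub-model widths sharing one parameter set

def forward(x, w):
    """Sub-model of width w uses the first w hidden units of the full model."""
    h = np.maximum(0.0, W1[:w, :] @ x)     # ReLU over a width-w slice
    return W2[:, :w] @ h

# Sorted/probabilistic training would sample a width per step and update
# only that slice; here we just show every nested sub-model is usable.
x = rng.normal(size=d_in)
outs = [forward(x, w) for w in widths]
```

    Because the sub-networks are nested slices, storing the largest model stores all of them, and selecting a compute budget at inference reduces to choosing the slice width.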

    Effect of carboxymethyl cellulose edible coating containing Zataria multiflora essential oil and grape seed extract on chemical attributes of rainbow trout meat

    Meat products, especially fish meat, are very susceptible to lipid oxidation and microbial spoilage. In this study, gas chromatography-mass spectrometry (GC-MS) analysis of the components of Zataria multiflora essential oil (ZEO) was first performed; then two concentrations of ZEO (1% and 2%) and two concentrations of grape seed extract (GSE) (0.5% and 1%) were used in a carboxymethyl cellulose coating, alone and in combination, and their antioxidant effects on rainbow trout meat were evaluated over a 20-day period using the thiobarbituric acid reactive substances (TBARS) test. Their effects on total volatile basic nitrogen (TVBN) and pH were evaluated as well. The main components of ZEO are thymol and carvacrol. These components significantly decreased the production of thiobarbituric acid (TBA) and the TVBN and pH levels of fish meat. The initial pH, TVBN and TBA contents were 6.62, 12.67 mg N per 100 g and 0.19 mg kg-1, respectively. In most treatments, significant (p < 0.05) effects on the aforementioned factors were seen during storage at 4 °C. The results indicated that the use of ZEO and GSE as natural antioxidant agents was effective in reducing undesirable chemical reactions during storage of fish meat.

    Investigating the Effects of Rural ICT Centers’ Services Quality on Customers’ Satisfaction (Case Study: Rural ICT Centers of Gillan)

    Information and communication technology (ICT) is regarded as a means to sustainable rural development: reducing poverty, bridging the digital divide and preventing the migration of people from rural areas to cities. To achieve these goals and to deliver governmental services and other essential services needed by rural communities, ten thousand rural ICT centers, with an investment of $280 million, have been put into use by the government. The purpose of this study is to assess the dimensions of the service quality of rural ICT centers in Guilan using the Parasuraman SERVQUAL model and its impact on satisfaction. Using a descriptive correlational research method, 384 customers of these centers were selected randomly. The standard SERVQUAL questionnaire was used to measure the service quality dimensions, and satisfaction questionnaires were used to measure customer satisfaction. Structural equation modeling was used to analyze the data. The results suggest that the service quality dimensions were ranked in this order: reliability, empathy, guarantee, accountability and tangibility; the satisfaction dimensions were ranked as satisfaction with personnel and total satisfaction with services. AMOS software was used for data analysis and presentation of the results.

    Optimization of Turbine Blade Cooling Using Combined Cooling Techniques

    This paper presents an analysis and optimization of turbine blade cooling systems. Since the temperature of combustion gases is very high, sometimes reaching 2400 K, the turbine blade cannot sustain the resulting thermal stress. Moreover, higher efficiency in advanced gas turbines requires an increase in inlet temperature. Common blade cooling methods are film cooling, convection cooling, impingement cooling and combined cooling. In this paper, a numerical solution of the thermal and flow fields for the film cooling technique on the AGTB symmetrical turbine blade was obtained, and the results were validated against experimental data. The turbine blade geometry was then changed and two combined cooling techniques (impingement/convection cooling and impingement/film cooling) were evaluated. The low-Reynolds-number k-epsilon turbulence model (AKN) was used for the turbulent flow simulations at various blowing ratios for two blade thicknesses. Comparison of the results with the available experimental and numerical data showed that the AKN model is capable of predicting the turbulent flow and heat transfer in turbine blade cooling. The combined techniques (impingement/convection cooling and impingement/film cooling) yielded higher cooling effectiveness and a more uniform temperature distribution than film cooling alone.
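    For readers unfamiliar with the quantities mentioned above, the standard definitions of the blowing ratio and the adiabatic film cooling effectiveness (textbook conventions, not taken from this paper) are:

```latex
M = \frac{\rho_c u_c}{\rho_\infty u_\infty},
\qquad
\eta = \frac{T_\infty - T_{aw}}{T_\infty - T_c}
```

    where \(\rho_c, u_c\) are the coolant density and velocity at the hole exit, \(\rho_\infty, u_\infty\) the mainstream values, \(T_{aw}\) the adiabatic wall temperature and \(T_c\) the coolant temperature; \(\eta = 1\) means the film holds the wall at the coolant temperature.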

    Advances in the design of nanomaterial-based electrochemical affinity and enzymatic biosensors for metabolic biomarkers: A review
