57 research outputs found

    Estimation of spectral lines using expectation propagation

    We consider line spectral estimation (LSE) from general linear or nonlinear measurements obtained through a generalized linear model (GLM). This paper develops an expectation propagation (EP) based LSE (EPLSE) method. The proposed method automatically estimates the model order and noise variance, and can deal with nonlinear measurements. Numerical experiments show the excellent performance of EPLSE.
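    The signal model behind LSE can be sketched with a toy example: a complex sinusoid observed in noise, with its frequency recovered from a fine FFT grid. This is only the measurement model and a simple maximum-peak baseline, not the paper's EP inference; the frequency, noise level, and grid size are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 256
f_true = 0.217            # normalized frequency (illustrative value)
n = np.arange(N)

# Line-spectral measurement: complex exponential plus circular Gaussian noise.
y = np.exp(2j * np.pi * f_true * n) \
    + 0.1 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))

# Simple baseline estimator: locate the periodogram peak on a zero-padded grid.
K = 1 << 14
spectrum = np.abs(np.fft.fft(y, K))
f_hat = np.fft.fftfreq(K)[np.argmax(spectrum)]
print(round(f_hat, 3))    # close to 0.217
```

An EP-based method would replace the grid search with iterative message passing that also infers the number of sinusoids and the noise variance.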

    Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models

    Finetuning large language models (LLMs) has been empirically effective on a variety of downstream tasks. Existing approaches to finetuning an LLM either focus on parameter-efficient finetuning, which only updates a small number of trainable parameters, or attempt to reduce the memory footprint during the training phase of the finetuning. Typically, the memory footprint during finetuning stems from three contributors: model weights, optimizer states, and intermediate activations. However, existing works still require considerable memory, and none can simultaneously mitigate the memory footprint of all three sources. In this paper, we present Quantized Side Tuning (QST), which enables memory-efficient and fast finetuning of LLMs through a dual-stage process. First, QST quantizes an LLM's model weights into 4-bit to reduce the memory footprint of the LLM's original weights. Second, QST introduces a side network separated from the LLM, which utilizes the hidden states of the LLM to make task-specific predictions. Using a separate side network avoids performing backpropagation through the LLM, thus reducing the memory requirement of the intermediate activations. Furthermore, QST leverages several low-rank adaptors and gradient-free downsample modules to significantly reduce the number of trainable parameters, so as to save the memory footprint of the optimizer states. Experiments show that QST can reduce the total memory footprint by up to 2.3× and speed up the finetuning process by up to 3× while achieving competent performance compared with the state of the art. Compared with full finetuning, QST can reduce the total memory footprint by up to 7×.
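    The core memory trick can be illustrated with a minimal NumPy sketch: a quantized backbone is used in forward mode only, and gradients are computed solely for a small side head on top of its hidden states. The shapes, the per-tensor 4-bit quantization stub, and the single linear side head are illustrative assumptions, not QST's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

def quantize_4bit(w):
    # Symmetric per-tensor 4-bit quantization stub: integer levels in [-7, 7].
    scale = np.abs(w).max() / 7.0
    return np.round(w / scale).clip(-7, 7).astype(np.int8), scale

W_backbone = rng.standard_normal((16, 8))
q, s = quantize_4bit(W_backbone)

def backbone(x):
    # Dequantize on the fly; the backbone is frozen, so no gradients are kept.
    return np.tanh(x @ (q * s))

# Tiny regression task on top of the frozen hidden states.
X = rng.standard_normal((64, 16))
y = X.sum(axis=1, keepdims=True)
H = backbone(X)                      # forward pass only through the backbone

W_side = np.zeros((8, 1))            # the only trainable parameters
for _ in range(200):                 # plain gradient descent on the side head
    err = H @ W_side - y
    W_side -= 0.01 * (H.T @ err) / len(X)

print(H.shape, W_side.shape)
```

Because the update loop only touches `H` and `W_side`, neither backbone activations nor backbone optimizer states need to be stored, which is the source of the memory savings the abstract describes.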

    Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism

    Transformer models have achieved state-of-the-art performance in various application domains and have gradually become the foundation of advanced large deep learning (DL) models. However, training these models efficiently over multiple GPUs remains challenging due to the large number of parallelism choices. Existing DL systems either rely on manual effort to make distributed training plans or apply parallelism combinations within a very limited search space. In this paper, we propose Galvatron, a new system framework that incorporates multiple popular parallelism dimensions and automatically finds the most efficient hybrid parallelism strategy. To explore such a vast search space effectively, we 1) use a decision tree for decomposition and pruning based on reasonable intuitions, and then 2) design a dynamic programming search algorithm to generate the optimal plan. Evaluations on four representative Transformer workloads show that Galvatron can automatically perform distributed training under different GPU memory budgets. In all evaluated scenarios, Galvatron achieves superior system throughput compared to previous work with limited parallelism.
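    The dynamic-programming idea can be sketched as a small search: pick one parallelism strategy per layer, each with a time cost and a memory cost, so that total memory stays under budget and total time is minimal. The strategy table and its costs are made-up illustrative numbers, not Galvatron's actual cost model.

```python
# Per-layer strategy table: name -> (time_cost, memory_cost). Illustrative only.
STRATEGIES = {
    "data":     (1.0, 4),
    "tensor":   (1.4, 2),
    "pipeline": (1.8, 1),
}

def best_plan(n_layers, mem_budget):
    """DP over layers: dp maps memory-used -> (best total time, plan)."""
    dp = {0: (0.0, [])}
    for _ in range(n_layers):
        nxt = {}
        for used, (t, plan) in dp.items():
            for name, (tc, mc) in STRATEGIES.items():
                u = used + mc
                if u > mem_budget:
                    continue            # prune plans over the memory budget
                cand = (t + tc, plan + [name])
                if u not in nxt or cand[0] < nxt[u][0]:
                    nxt[u] = cand
        dp = nxt
    return min(dp.values()) if dp else (float("inf"), [])

time_cost, plan = best_plan(n_layers=4, mem_budget=10)
print(time_cost, plan)
```

With this toy table, the optimum keeps the fastest strategy on one layer and spends the remaining budget on cheaper ones; Galvatron's real search additionally uses a decision tree to prune the per-layer strategy set before running the DP.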

    Improving Automatic Parallel Training via Balanced Memory Workload Optimization

    Transformer models have emerged as the leading approach for achieving state-of-the-art performance across various application domains, serving as the foundation for advanced large-scale deep learning (DL) models. However, efficiently training these models across multiple GPUs remains a complex challenge due to the abundance of parallelism options. Existing DL systems either require manual efforts to design distributed training plans or limit parallelism combinations to a constrained search space. In this paper, we present Galvatron-BMW, a novel system framework that integrates multiple prevalent parallelism dimensions and automatically identifies the most efficient hybrid parallelism strategy. To effectively navigate this vast search space, we employ a decision tree approach for decomposition and pruning based on intuitive insights. We further utilize a dynamic programming search algorithm to derive the optimal plan. Moreover, to improve resource utilization and enhance system efficiency, we propose a bi-objective optimization workflow that focuses on workload balance. Our evaluations on different Transformer models demonstrate the capabilities of Galvatron-BMW in automating distributed training under varying GPU memory constraints. Across all tested scenarios, Galvatron-BMW consistently achieves superior system throughput, surpassing previous approaches that rely on limited parallelism strategies. (arXiv admin note: substantial text overlap with arXiv:2211.1387)
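    One ingredient of workload balance can be sketched as a classic min-max partition: split a sequence of per-layer memory costs into a fixed number of contiguous pipeline stages so the heaviest stage is as light as possible. The costs and stage count below are made-up illustrative numbers, not Galvatron-BMW's actual workflow.

```python
def balance_stages(costs, n_stages):
    """Min-max linear partition: dp[k][i] is the smallest achievable
    maximum stage load when costs[:i] are split into k contiguous stages."""
    n = len(costs)
    prefix = [0]
    for c in costs:
        prefix.append(prefix[-1] + c)
    INF = float("inf")
    dp = [[INF] * (n + 1) for _ in range(n_stages + 1)]
    dp[0][0] = 0
    for k in range(1, n_stages + 1):
        for i in range(1, n + 1):
            for j in range(k - 1, i):           # j = end of previous stage
                load = prefix[i] - prefix[j]    # load of stage k = costs[j:i]
                dp[k][i] = min(dp[k][i], max(dp[k - 1][j], load))
    return dp[n_stages][n]

print(balance_stages([4, 1, 3, 2, 5, 2], n_stages=3))
```

A bi-objective workflow would trade such a balance score off against the throughput objective rather than optimizing either alone.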

    Synergistic treatment of osteosarcoma with biomimetic nanoparticles transporting doxorubicin and siRNA

    Introduction: Osteosarcoma tumors are the most common malignant bone tumors in children and adolescents. Their treatment usually requires surgical removal of all detectable cancerous tissue and multidrug chemotherapy; however, the prognosis for patients with unresectable or recurrent osteosarcoma is unfavorable. To make chemotherapy safer and more effective for osteosarcoma patients, biomimetic nanoparticles (NPs) camouflaged by mesenchymal stem cell membranes (MSCMs) were synthesized to induce osteosarcoma cell apoptosis by co-delivering the anticancer drug doxorubicin hydrochloride (DOX) and a small interfering RNA (siRNA). Importantly, these NPs have high biocompatibility and tumor-homing ability. This study aimed to improve the efficacy of osteosarcoma therapy by using the synergistic combination of DOX and an siRNA targeting the apoptosis suppressor gene survivin.
    Methods: Biomimetic NPs (DOX/siSUR-PLGA@MSCM NPs) were synthesized by co-loading DOX and survivin siRNA (siSUR) into poly(lactide-co-glycolide acid) (PLGA) via a double-emulsion solvent evaporation method. The NPs were camouflaged by MSCMs to deliver both DOX and survivin-targeting siRNA, and were characterized and evaluated in terms of cellular uptake, in vitro release, in vitro and in vivo antitumor effects, and biosafety.
    Results: DOX/siSUR-PLGA@MSCM NPs had good tumor-homing ability due to the MSCM modification. The drug-laden biomimetic NPs had good antitumor effects in homozygous MG63 tumor-bearing mice due to the synergistic effect of the drug combination.
    Conclusion: DOX/siSUR-PLGA@MSCM NPs may show improved therapeutic effects in osteosarcoma patients through the combination of a chemotherapeutic drug and gene therapy, based on their good tumor targeting and biosafety.

    Characteristics of slamming pressure and force for trimaran hull

    In this paper, the characteristics of the impact pressure and force of a trimaran section were studied by computational fluid dynamics (CFD). The time-domain features of the slamming pressure and force showed a strong correlation with the penetration depth regardless of the specific way of water entry. The effects of velocity and acceleration on the impact pressure and force were analyzed. It was found that the initial impact of the main hull and the wet-deck slamming were predominantly affected by the entry velocity, whilst the acceleration had almost no effect on the initial impact. The impact velocity presented a quadratic relation with the slamming pressure/force, and the relation between acceleration and wet-deck slamming pressure/force was linear. These results are consistent with the patterns implied by analytical models such as the Wagner or modified Logvinovich model (MLM) theories.
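    The reported relation, quadratic in entry velocity and linear in acceleration, amounts to a model of the form p ≈ c1·v² + c2·a, which can be recovered by linear least squares. The synthetic data and coefficients below are illustrative stand-ins, not values from the CFD study.

```python
import numpy as np

rng = np.random.default_rng(2)
v = rng.uniform(1.0, 5.0, 40)        # entry velocity samples (illustrative)
a = rng.uniform(-2.0, 2.0, 40)       # acceleration samples (illustrative)

# Synthetic slamming pressure following the reported functional form.
p = 3.0 * v**2 + 1.5 * a + 0.01 * rng.standard_normal(40)

# The model is linear in the features (v**2, a), so ordinary least squares
# recovers both coefficients at once.
A = np.column_stack([v**2, a])
(c1, c2), *_ = np.linalg.lstsq(A, p, rcond=None)
print(round(c1, 2), round(c2, 2))    # recovers ~3.0 and ~1.5
```

Fitting both terms jointly is what lets the velocity-squared and acceleration contributions be separated even when both vary between water-entry cases.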

    Phylogenomic analyses provide insights into primate evolution

    Comparative analysis of primate genomes within a phylogenetic context is essential for understanding the evolution of human genetic architecture and primate diversity. We present such a study of 50 primate species spanning 38 genera and 14 families, including 27 genomes first reported here, many from previously less well-represented groups, the New World monkeys and the Strepsirrhini. Our analyses reveal heterogeneous rates of genomic rearrangement and gene evolution across primate lineages. Thousands of genes under positive selection in different lineages play roles in the nervous, skeletal, and digestive systems and may have contributed to primate innovations and adaptations. Our study reveals that many key genomic innovations occurred at the Simiiformes ancestral node and may have had an impact on the adaptive radiation of the Simiiformes and human evolution.