SpotServe: Serving Generative Large Language Models on Preemptible Instances
The high computational and memory requirements of generative large language models (LLMs) make it challenging to serve them cheaply. This paper aims to reduce the monetary cost of serving LLMs by leveraging preemptible GPU instances on modern clouds, which offer access to spare GPUs at a much lower price than regular instances but may be preempted by the cloud at any time. Serving LLMs on preemptible instances requires addressing the challenges induced by frequent instance preemptions and the need to migrate instances to handle these preemptions.
This paper presents SpotServe, the first distributed LLM serving system on preemptible instances. Several key techniques in SpotServe realize fast and reliable serving of generative LLMs on cheap preemptible instances. First, SpotServe dynamically adapts the LLM parallelization configuration to changing instance availability and fluctuating workloads, balancing the trade-off among overall throughput, inference latency, and monetary cost. Second, to minimize the cost of migrating instances for dynamic reparallelization, SpotServe formulates instance migration as a bipartite graph matching problem and uses the Kuhn-Munkres algorithm to identify an optimal migration plan that minimizes communication. Finally, to take advantage of the grace period offered by modern clouds, we introduce stateful inference recovery, a new inference mechanism that commits inference progress at a much finer granularity and allows SpotServe to cheaply resume inference upon preemption.
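To make the matching formulation concrete, here is a minimal sketch using `scipy.optimize.linear_sum_assignment` (an implementation of the Kuhn-Munkres/Hungarian algorithm) under assumed inputs; it illustrates the idea only and is not SpotServe's actual code. Surviving instances form one side of the bipartite graph, slots in the new parallel configuration the other, and an edge's weight is the bytes of model state an instance would have to fetch to fill that slot:

```python
# Hypothetical sketch of the migration-plan formulation described above,
# not SpotServe's actual implementation. We assume each surviving instance
# already holds some model shards and each slot in the new parallel
# configuration needs a known set of shards; assigning an instance to a
# slot costs the bytes of the shards it is missing for that slot.
import numpy as np
from scipy.optimize import linear_sum_assignment  # Kuhn-Munkres / Hungarian

def migration_plan(held, needed, shard_bytes):
    """held[i]: set of shard ids on surviving instance i.
    needed[j]: set of shard ids required by slot j of the new config.
    shard_bytes[s]: size of shard s in bytes."""
    cost = np.zeros((len(held), len(needed)))
    for i, have in enumerate(held):
        for j, want in enumerate(needed):
            # Communication cost = bytes that must be fetched from peers.
            cost[i, j] = sum(shard_bytes[s] for s in want - have)
    rows, cols = linear_sum_assignment(cost)  # optimal matching
    return list(zip(rows, cols)), cost[rows, cols].sum()

# Toy example: 3 instances, 3 slots, two 1 GB shards.
held = [{0}, {1}, {0, 1}]
needed = [{0, 1}, {0}, {1}]
plan, moved = migration_plan(held, needed, {0: 1 << 30, 1: 1 << 30})
print(plan, moved)  # [(0, 1), (1, 2), (2, 0)] with 0 bytes moved
```

Minimizing total edge weight over such an assignment is exactly the problem the Kuhn-Munkres algorithm solves in polynomial time, which is why the migration plan can be recomputed quickly after each preemption.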
We evaluate SpotServe on real spot-instance preemption traces and a variety of popular LLMs, and show that it can reduce P99 tail latency by 2.4-9.1× compared with the best existing LLM serving systems. We also show that SpotServe can leverage the price advantage of preemptible instances, saving 54% of the monetary cost compared with using only on-demand instances.
Comment: ASPLOS 2024
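The stateful inference recovery mechanism described above can be pictured with a toy decoding loop (all names here are illustrative assumptions, not SpotServe's API): because autoregressive decoding emits one token per step, committing the generated prefix after each step lets a preempted request resume from its last committed token instead of restarting from scratch.

```python
# Illustrative sketch of per-token progress commits (an assumption about
# the mechanism, not SpotServe's actual code). `decode_step` stands in
# for one forward pass of a real LLM; the committed prefix (and, in a
# real system, the KV cache) is what survives a preemption.
committed = {}  # request_id -> list of generated token ids

def decode_step(prompt, prefix):
    # Placeholder for a real model call; returns the next token id.
    return (len(prompt) + len(prefix)) % 50257

def generate(request_id, prompt, max_tokens, preempted):
    prefix = committed.get(request_id, [])  # resume from last commit
    while len(prefix) < max_tokens:
        if preempted():
            # Grace period: progress is already committed, return cleanly.
            return prefix, False
        prefix = prefix + [decode_step(prompt, prefix)]
        committed[request_id] = prefix  # commit after every token
    return prefix, True

# First attempt is preempted after 3 steps; the retry resumes at token 3.
calls = iter([False, False, False, True])
out, done = generate("req-1", [1, 2, 3], 8, lambda: next(calls))
assert not done and len(out) == 3
out, done = generate("req-1", [1, 2, 3], 8, lambda: False)
assert done and len(out) == 8
```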
Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism
Transformer models have achieved state-of-the-art performance across various application domains and are gradually becoming the foundation of advanced large deep learning (DL) models. However, training these models efficiently over multiple GPUs remains challenging due to the large number of parallelism choices. Existing DL systems either rely on manual effort to craft distributed training plans or consider parallelism combinations only within a very limited search space. In this work, we propose Galvatron, a new system framework that incorporates multiple popular parallelism dimensions and automatically finds the most efficient hybrid parallelism strategy. To explore such an unusually large search space, we 1) use a decision tree to decompose and prune the space based on reasonable intuitions, and then 2) design a dynamic programming search algorithm to generate the optimal plan.
Evaluations on four representative Transformer workloads show that Galvatron can automatically perform distributed training under different GPU memory budgets. Across all evaluated scenarios, Galvatron consistently achieves higher system throughput than prior work with limited parallelism dimensions.
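As a rough illustration of the dynamic programming search the abstract alludes to (with an invented per-layer cost model; Galvatron's real search space, cost model, and decision-tree pruning are far richer), the sketch below picks one parallelism strategy per layer to minimize total time under a GPU memory budget:

```python
# Toy dynamic-programming search over per-layer parallelism strategies,
# illustrating the idea only; the strategies, costs, and memory figures
# below are invented, not Galvatron's measured values.
import math

# (name, time_per_layer_ms, memory_per_layer_GB) -- assumed cost model.
STRATEGIES = [("data", 1.0, 4.0), ("tensor", 1.4, 2.0), ("pipeline", 1.8, 1.5)]

def search(num_layers, mem_budget_gb, granularity=0.5):
    buckets = int(mem_budget_gb / granularity) + 1
    # best[m] = (total_time, plan) when exactly m*granularity GB is used.
    best = [(math.inf, [])] * buckets
    best[0] = (0.0, [])
    for _ in range(num_layers):
        nxt = [(math.inf, [])] * buckets
        for m, (t, plan) in enumerate(best):
            if math.isinf(t):
                continue
            for name, lt, lm in STRATEGIES:
                m2 = m + int(math.ceil(lm / granularity))
                if m2 < buckets and t + lt < nxt[m2][0]:
                    nxt[m2] = (t + lt, plan + [name])
        best = nxt
    return min(best)  # fastest plan that fits within the memory budget

time_ms, plan = search(num_layers=4, mem_budget_gb=10)
print(time_ms, plan)  # mixes strategies: "data" where memory allows, else
                      # the lower-memory (but slower) alternatives
```

The real system additionally prunes the per-layer strategy set with a decision tree before running the search, which keeps the dynamic program tractable as the number of parallelism dimensions grows.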
A Mini Review of S-Nitrosoglutathione Loaded Nano/Micro-Formulation Strategies
As a potential therapeutic agent, the clinical application of S-nitrosoglutathione (GSNO) is limited by its instability. Different formulations have therefore been developed to protect GSNO from degradation and to deliver and release it at physiological concentrations at the site of action. Owing to GSNO's high water solubility and small molecular size, the biggest challenges in the encapsulation step are low encapsulation efficiency and burst release. This review summarizes the different nano/micro-formulation strategies for GSNO delivery systems to provide a reference for subsequent researchers interested in GSNO encapsulation.