Reduce, Reuse, Recycle: Improving Training Efficiency with Distillation
Methods for improving the efficiency of deep network training (i.e., the
resources required to achieve a given level of model quality) are of immediate
benefit to deep learning practitioners. Distillation is typically used to
compress models or improve model quality, but it's unclear whether distillation
actually improves training efficiency. Can the quality improvements of
distillation be converted into training speed-ups, or do they simply increase
final model quality with no resource savings? We conducted a series of
experiments to investigate whether and how distillation can be used to
accelerate training, using ResNet-50 trained on ImageNet and BERT trained on
C4 with a masked language modeling objective and evaluated on GLUE, on common
enterprise hardware (8x NVIDIA A100 GPUs).
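The experiments rely on a standard distillation objective; as a point of
reference, here is a minimal sketch of the usual Hinton-style loss in PyTorch,
where temperature and alpha are illustrative hyperparameters rather than
values taken from the paper:

    import torch.nn.functional as F

    # Minimal Hinton-style distillation loss (illustrative, not the paper's
    # exact objective): weighted sum of hard-label cross-entropy and the KL
    # divergence between temperature-softened teacher and student distributions.
    def distillation_loss(student_logits, teacher_logits, labels,
                          temperature=2.0, alpha=0.5):
        hard = F.cross_entropy(student_logits, labels)
        soft = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=-1),
            F.log_softmax(teacher_logits / temperature, dim=-1),
            reduction="batchmean",
            log_target=True,
        ) * temperature ** 2  # rescale so soft-target gradients match the hard loss
        return alpha * hard + (1.0 - alpha) * soft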
We found that distillation can speed up training by up to 1.96x for ResNet-50
trained on ImageNet and up to 1.42x for BERT when evaluated on GLUE.
Furthermore, distillation for BERT yields optimal results when it is performed
for only the first 20-50% of training; one way such a schedule could look is
sketched below.
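A minimal sketch of such a schedule, reusing the distillation_loss defined
above; kd_fraction is a hypothetical knob standing in for the paper's 20-50%
range:

    # Hypothetical schedule: distill only during the first kd_fraction of
    # training, then fall back to the plain hard-label loss for the rest.
    def training_loss(student_logits, teacher_logits, labels, step,
                      total_steps, kd_fraction=0.3):
        if step < kd_fraction * total_steps:
            return distillation_loss(student_logits, teacher_logits, labels)
        return F.cross_entropy(student_logits, labels)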
We also observed that training with distillation is almost always more
efficient than training without it, even when using the poorest-quality model
as a teacher, for both ResNet-50 and BERT. Finally, we found that it's
possible to gain the benefit of distilling from an ensemble of n teacher
models, which has an O(n) runtime cost per step, by randomly sampling a single
teacher from the pool on each step, which has only an O(1) runtime cost.
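A minimal sketch of this O(1) sampling strategy, assuming a Python list of
frozen PyTorch teacher models (all names here are illustrative):

    import random
    import torch

    # O(1) ensemble distillation: rather than running all n teachers and
    # averaging their logits (O(n) forward passes per step), draw one teacher
    # uniformly at random each step, costing a single forward pass.
    def sample_teacher_logits(teachers, inputs):
        teacher = random.choice(teachers)
        with torch.no_grad():  # teachers are frozen; no gradients needed
            return teacher(inputs)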
Taken together, these results show that distillation can substantially improve
training efficiency in both image classification and language modeling, and
that a few simple optimizations to distillation protocols can further enhance
these efficiency improvements.