CES-KD: Curriculum-based Expert Selection for Guided Knowledge Distillation
Knowledge distillation (KD) is an effective tool for compressing deep
classification models for edge devices. However, the performance of KD is
affected by the large capacity gap between the teacher and student networks.
Recent methods have resorted to a multiple teacher assistant (TA) setting for
KD, which sequentially decreases the size of the teacher model to gradually
bridge the size gap between these models. This paper proposes a new technique
called Curriculum Expert Selection for Knowledge Distillation (CES-KD) to
efficiently enhance the learning of a compact student under the capacity gap
problem. This technique is built upon the hypothesis that a student network
should be guided gradually using a stratified teaching curriculum, as it learns
easy (hard) data samples better and faster from a lower (higher) capacity
teacher network. Specifically, our method is a gradual TA-based KD technique
that selects a single teacher per input image based on a curriculum driven by
the difficulty in classifying the image. In this work, we empirically verify
our hypothesis and rigorously experiment with CIFAR-10, CIFAR-100, CINIC-10,
and ImageNet datasets, and show improved accuracy on VGG-like, ResNet, and
WideResNet architectures.
Comment: ICPR202
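The per-image teacher selection described in this abstract can be illustrated with a minimal sketch. The function name, the difficulty score in [0, 1], and the uniform mapping from difficulty to teacher index are all illustrative assumptions, not the authors' implementation:

```python
# Hypothetical sketch of curriculum-style expert selection: easier samples
# are routed to lower-capacity teachers, harder samples to higher-capacity
# ones. This is NOT the CES-KD code, just the routing idea.

def select_teacher(difficulty, num_teachers):
    """Map a difficulty score in [0, 1] to a teacher index.

    Teachers are assumed ordered from lowest capacity (index 0)
    to highest capacity (index num_teachers - 1).
    """
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty must lie in [0, 1]")
    # min() keeps difficulty == 1.0 inside the valid index range.
    return min(int(difficulty * num_teachers), num_teachers - 1)

# With 4 teachers, an easy sample uses the smallest teacher, a hard one
# the largest.
print(select_teacher(0.1, 4))  # 0
print(select_teacher(0.9, 4))  # 3
```

In the paper itself the difficulty signal is driven by how hard the image is to classify; any such per-sample scalar would slot into this routing scheme.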
FMAS: Fast Multi-Objective SuperNet Architecture Search for Semantic Segmentation
We present FMAS, a fast multi-objective neural architecture search framework
for semantic segmentation. FMAS subsamples the structure and pre-trained
parameters of DeepLabV3+, without fine-tuning, dramatically reducing training
time during search. To further reduce candidate evaluation time, we use a
subset of the validation dataset during the search. Only the final, Pareto
non-dominated, candidates are ultimately fine-tuned using the complete training
set. We evaluate FMAS by searching for models that effectively trade accuracy
and computational cost on the PASCAL VOC 2012 dataset. FMAS finds competitive
designs quickly, e.g., taking just 0.5 GPU days to discover a DeepLabV3+
variant that reduces FLOPs and parameters by 10% and 20% respectively,
for less than 3% increased error. We also search on an edge device called
GAP8 and use its latency as the metric. FMAS is capable of finding a 2.2x
faster network with 7.61% MIoU loss.
Comment: Accepted as a full paper by the TinyML Research Symposium 202
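The Pareto filtering step mentioned in this abstract, where only non-dominated candidates are kept for fine-tuning, can be sketched as follows. The function name and the two-objective tuple layout are illustrative assumptions:

```python
# Hedged sketch of Pareto non-dominated filtering over two minimized
# objectives (error, cost). Not the FMAS code, just the selection rule
# the abstract refers to.

def pareto_front(candidates):
    """Return the candidates not dominated by any other.

    Each candidate is a tuple (error, cost); both objectives are minimized.
    A candidate is dominated if another candidate is no worse in both
    objectives and is a distinct point.
    """
    front = []
    for c in candidates:
        dominated = any(
            o[0] <= c[0] and o[1] <= c[1] and o != c
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front

# Four hypothetical (error, cost) candidates; the third is dominated by
# the second (worse error AND worse cost), so it is filtered out.
models = [(0.10, 5.0), (0.12, 3.0), (0.15, 4.0), (0.09, 9.0)]
print(pareto_front(models))  # [(0.1, 5.0), (0.12, 3.0), (0.09, 9.0)]
```

In a real search the objectives would be validation error and FLOPs or measured latency, but the dominance test is the same.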
BD-KD: Balancing the Divergences for Online Knowledge Distillation
Knowledge distillation (KD) has gained a lot of attention in the field of
model compression for edge devices thanks to its effectiveness in compressing
large powerful networks into smaller lower-capacity models. Online
distillation, in which both the teacher and the student are learning
collaboratively, has also gained much interest due to its ability to improve on
the performance of the networks involved. The Kullback-Leibler (KL) divergence
ensures the proper knowledge transfer between the teacher and student. However,
most online KD techniques suffer from bottlenecks when there is a capacity
gap between the networks. When the models are trained cooperatively and
simultaneously, the KL distance becomes incapable of properly minimizing the
divergence between the teacher's and student's distributions. Alongside
accuracy, critical edge-device applications are in
need of well-calibrated compact networks. Confidence calibration provides a
sensible way of getting trustworthy predictions. We propose BD-KD: Balancing of
Divergences for online Knowledge Distillation. We show that adaptively
balancing between the reverse and forward divergences shifts the focus of the
training strategy to the compact student network without limiting the teacher
network's learning process. We demonstrate that, by performing this balancing
design at the level of the student distillation loss, we improve upon both
performance accuracy and calibration of the compact student network. We
conducted extensive experiments using a variety of network architectures and
show improvements on multiple datasets including CIFAR-10, CIFAR-100,
Tiny-ImageNet, and ImageNet. We illustrate the effectiveness of our approach
through comprehensive comparisons and ablations with current state-of-the-art
online and offline KD techniques.
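The balancing idea in this abstract, mixing the forward and reverse KL divergences in the student's distillation loss, can be sketched numerically. The weight `alpha`, the temperature value, and all function names here are illustrative assumptions rather than the paper's exact formulation:

```python
import numpy as np

# Hedged sketch of a balanced-divergence distillation loss: a convex mix of
# the forward KL(teacher || student) and the reverse KL(student || teacher).
# Illustrative only; BD-KD's actual loss may differ in detail.

def softmax(logits, temperature=1.0):
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()                      # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def kl(p, q, eps=1e-12):
    """KL divergence KL(p || q) between discrete distributions."""
    p, q = np.asarray(p), np.asarray(q)
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

def balanced_kd_loss(teacher_logits, student_logits,
                     alpha=0.5, temperature=4.0):
    pt = softmax(teacher_logits, temperature)
    ps = softmax(student_logits, temperature)
    # alpha = 1 recovers standard forward-KL distillation;
    # alpha = 0 trains on the reverse divergence only.
    return alpha * kl(pt, ps) + (1.0 - alpha) * kl(ps, pt)

loss = balanced_kd_loss([3.0, 1.0, 0.2], [2.5, 1.2, 0.1])
print(loss >= 0.0)  # True; the loss is zero only when the two match
```

Because the two divergences penalize mismatches asymmetrically (forward KL is mode-covering, reverse KL mode-seeking), adjusting their balance shifts which errors the student is pushed hardest to fix.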
Efficient Fine-Tuning of Compressed Language Models with Learners
Fine-tuning BERT-based models is resource-intensive in memory, computation,
and time. While many prior works aim to improve inference efficiency via
compression techniques, e.g., pruning, these works do not explicitly address
the computational challenges of training to downstream tasks. We introduce
Learner modules and priming, novel methods for fine-tuning that exploit the
overparameterization of pre-trained language models to gain benefits in
convergence speed and resource utilization. Learner modules navigate the double
bind of 1) training efficiently by fine-tuning a subset of parameters, and 2)
training effectively by ensuring quick convergence and high metric scores. Our
results on DistilBERT demonstrate that learners perform on par with or surpass
the baselines. Learners train 7x fewer parameters than state-of-the-art methods
on GLUE. On CoLA, learners fine-tune 20% faster, and have significantly lower
resource utilization.
Comment: 8 pages, 9 figures, 2 tables, presented at the ICML 2022 workshop on
Hardware-Aware Efficient Training (HAET 2022)
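The core mechanism this abstract relies on, fine-tuning only a small subset of parameters while the rest stay frozen, can be sketched framework-free. The parameter names and the substring-matching rule are illustrative assumptions, not the Learner-module API:

```python
# Hedged sketch of parameter-efficient fine-tuning: partition a model's
# named parameters into a small trainable group and a frozen group.
# Names below are hypothetical, not from the paper.

def split_trainable(named_params, trainable_keys):
    """Partition parameters into trainable and frozen groups by name."""
    trainable, frozen = {}, {}
    for name, param in named_params.items():
        if any(key in name for key in trainable_keys):
            trainable[name] = param
        else:
            frozen[name] = param
    return trainable, frozen

# Toy parameter dictionary standing in for a pre-trained encoder.
params = {
    "encoder.layer0.weight": [0.1, 0.2],
    "encoder.layer1.weight": [0.3, 0.4],
    "learner.adapter.weight": [0.0, 0.0],
    "classifier.weight": [0.5, 0.6],
}
trainable, frozen = split_trainable(params, ["learner", "classifier"])
print(sorted(trainable))  # ['classifier.weight', 'learner.adapter.weight']
```

In a deep-learning framework the same split would be expressed by toggling each parameter's gradient flag and passing only the trainable group to the optimizer, which is where the memory and speed savings come from.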
Commissioning of the CMS High Level Trigger
The CMS experiment will collect data from the proton-proton collisions
delivered by the Large Hadron Collider (LHC) at a centre-of-mass energy up to
14 TeV. The CMS trigger system is designed to cope with unprecedented
luminosities and LHC bunch-crossing rates up to 40 MHz. The unique CMS trigger
architecture only employs two trigger levels. The Level-1 trigger is
implemented using custom electronics, while the High Level Trigger (HLT) is
based on software algorithms running on a large cluster of commercial
processors, the Event Filter Farm. We present the major functionalities of the
CMS High Level Trigger system as of the starting of LHC beams operations in
September 2008. The validation of the HLT system in the online environment with
Monte Carlo simulated data and its commissioning during cosmic rays data taking
campaigns are discussed in detail. We conclude with a description of HLT
operations with the first circulating LHC beams, before the incident that
occurred on 19 September 2008.
Palaeoenvironmental control on distribution of crinoids in the Bathonian (Middle Jurassic) of England and France
Bulk sampling of a number of different marine and marginal marine lithofacies in the British Bathonian has allowed us to assess the palaeoenvironmental distribution of crinoids for the first time. Although remains are largely fragmentary, many species have been identified by comparison with articulated specimens from elsewhere, whilst the large and unbiased sample sizes allowed assessment of relative proportions of different taxa. Results indicate that the distribution of crinoids corresponds well to particular facies. Ossicles of Chariocrinus and Balanocrinus dominate in deeper-water and lower-energy facies, with the former extending further into shallower-water facies than the latter. Isocrinus dominates in shallower-water carbonate facies, accompanied by rarer comatulids, and was also present in the more marine parts of lagoons. Pentacrinites remains are abundant in very high-energy oolite shoal lithofacies. The presence of millericrinids within one, partly allochthonous, lithofacies suggests the presence of an otherwise unknown hard substrate from which they have been transported. These results are compared to crinoid assemblages from other Mesozoic localities, and it is evident that the same morphological adaptations are present within crinoids from similar lithofacies throughout the Jurassic and Early Cretaceous.
Theory of Low-Mass Stars and Substellar Objects
Since the discovery of the first bona-fide brown dwarfs and extra-solar
planets in 1995, the field of low mass stars and substellar objects has
considerably progressed, both from theoretical and observational
viewpoints. Recent developments in the physics entering the modeling of these
objects have led to significant improvements in the theory and to a better
understanding of their mechanical and thermal properties. This theory can now
be confronted with observations directly in various observational diagrams
(color-color, color-magnitude, mass-magnitude, mass-spectral type), a stringent
and unavoidable constraint which became possible only recently, with the
generation of synthetic spectra. In this paper, we present the current
state-of-the-art general theory of low-mass stars and sub-stellar objects, from
one solar mass to one Jupiter mass, regarding primarily their interior
structure and evolution. This review is a natural complement to the previous
review on the atmosphere of low-mass stars and brown dwarfs (Allard et al.
1997). Special attention is devoted to the comparison of the theory with
various available observations. The contribution of low-mass stellar and
sub-stellar objects to the Galactic mass budget is also analysed.
Comment: 81 pages, Latex file, uses aasms4.sty, review for Annual Review of
Astronomy and Astrophysics, vol. 38 (2000)