53 research outputs found

    Elixir: Train a Large Language Model on a Small GPU Cluster

    In recent years, the number of parameters in a single deep learning (DL) model has grown much faster than GPU memory capacity. People without access to a large number of GPUs resort to heterogeneous training systems that store model parameters in CPU memory. Existing heterogeneous systems are based on parallelization plans scoped to the whole model: they apply one consistent parallel training method to all the operators in the computation. Engineers therefore need to invest considerable effort to incorporate a new type of model parallelism and patch its compatibility with other forms of parallelism. For example, Mixture-of-Experts (MoE) is still incompatible with ZeRO-3 in DeepSpeed. Current systems also face efficiency problems at small scale, since they are designed and tuned for large-scale training. In this paper, we propose Elixir, a new parallel heterogeneous training system designed for efficiency and flexibility. Elixir utilizes the memory and computing resources of both GPU and CPU. For flexibility, Elixir generates parallelization plans at the granularity of operators. Any new type of model parallelism can be incorporated by assigning a parallel pattern to the operator. For efficiency, Elixir implements a hierarchical distributed memory management scheme to accelerate inter-GPU communications and CPU-GPU data transmissions. As a result, Elixir can train a 30B OPT model on an A100 with 40GB CUDA memory while reaching 84% of the efficiency of PyTorch GPU training. With its super-linear scalability, the training efficiency becomes the same as PyTorch GPU training on multiple GPUs. Also, large MoE models can be trained 5.3x faster than dense models of the same size. Elixir is now integrated into ColossalAI and is available on its main branch.

    Search for dark matter produced in association with bottom or top quarks in √s = 13 TeV pp collisions with the ATLAS detector

    A search for weakly interacting massive particle dark matter produced in association with bottom or top quarks is presented. Final states containing third-generation quarks and missing transverse momentum are considered. The analysis uses 36.1 fb⁻¹ of proton–proton collision data recorded by the ATLAS experiment at √s = 13 TeV in 2015 and 2016. No significant excess of events above the estimated backgrounds is observed. The results are interpreted in the framework of simplified models of spin-0 dark-matter mediators. For colour-neutral spin-0 mediators produced in association with top quarks and decaying into a pair of dark-matter particles, mediator masses below 50 GeV are excluded assuming a dark-matter candidate mass of 1 GeV and unitary couplings. For scalar and pseudoscalar mediators produced in association with bottom quarks, the search sets limits on the production cross-section of 300 times the predicted rate for mediators with masses between 10 and 50 GeV and assuming a dark-matter mass of 1 GeV and unitary coupling. Constraints on colour-charged scalar simplified models are also presented. Assuming a dark-matter particle mass of 35 GeV, mediator particles with mass below 1.1 TeV are excluded for couplings yielding a dark-matter relic density consistent with measurements.

    Space advanced technology demonstration satellite

    The Space Advanced Technology demonstration satellite (SATech-01), a mission for low-cost space science and new technology experiments organized by the Chinese Academy of Sciences (CAS), was successfully launched into a Sun-synchronous orbit at an altitude of approximately 500 km on July 27, 2022, from the Jiuquan Satellite Launch Centre. Serving as an experimental platform for space science exploration and the demonstration of advanced common technologies in orbit, SATech-01 is equipped with 16 experimental payloads, including the solar upper transition region imager (SUTRI), the lobster eye imager for astronomy (LEIA), the high energy burst searcher (HEBS), and a High Precision Magnetic Field Measurement System based on a CPT Magnetometer (CPT). It also incorporates an imager with freeform optics, an integrated thermal imaging sensor, and a multi-functional integrated imager, among others. This paper provides an overview of SATech-01, including a technical description of the satellite and its scientific payloads, along with their on-orbit performance.

    Measurements of top-quark pair differential cross-sections in the eμ channel in pp collisions at √s = 13 TeV using the ATLAS detector

    Search for single production of vector-like quarks decaying into Wb in pp collisions at √s = 8 TeV with the ATLAS detector

    Measurement of the W boson polarisation in tt̄ events from pp collisions at √s = 8 TeV in the lepton + jets channel with ATLAS

    Measurement of jet fragmentation in Pb+Pb and pp collisions at √s_NN = 2.76 TeV with the ATLAS detector at the LHC

    Measurement of the charge asymmetry in top-quark pair production in the lepton-plus-jets final state in pp collision data at √s = 8 TeV with the ATLAS detector

    Search for new phenomena in events containing a same-flavour opposite-sign dilepton pair, jets, and large missing transverse momentum in √s = 13 TeV pp collisions with the ATLAS detector

    Charged-particle distributions at low transverse momentum in √s = 13 TeV pp interactions measured with the ATLAS detector at the LHC
