446 research outputs found

    Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models

    Finetuning large language models (LLMs) has been empirically effective on a variety of downstream tasks. Existing approaches to finetuning an LLM either focus on parameter-efficient finetuning, which only updates a small number of trainable parameters, or attempt to reduce the memory footprint during the training phase of the finetuning. Typically, the memory footprint during finetuning stems from three contributors: model weights, optimizer states, and intermediate activations. However, existing works still require considerable memory, and none can simultaneously reduce the memory footprint of all three sources. In this paper, we present Quantized Side Tuning (QST), which enables memory-efficient and fast finetuning of LLMs by operating through a dual-stage process. First, QST quantizes an LLM's model weights into 4-bit to reduce the memory footprint of the LLM's original weights; QST also introduces a side network separated from the LLM, which utilizes the hidden states of the LLM to make task-specific predictions. Using a separate side network avoids performing backpropagation through the LLM, thus reducing the memory requirement of the intermediate activations. Furthermore, QST leverages several low-rank adaptors and gradient-free downsample modules to significantly reduce the trainable parameters, so as to save the memory footprint of the optimizer states. Experiments show that QST can reduce the total memory footprint by up to 2.3$\times$ and speed up the finetuning process by up to 3$\times$ while achieving competent performance compared with the state-of-the-art. When it comes to full finetuning, QST can reduce the total memory footprint up to 7$\times$.

    SpotServe: Serving Generative Large Language Models on Preemptible Instances

    The high computational and memory requirements of generative large language models (LLMs) make it challenging to serve them cheaply. This paper aims to reduce the monetary cost for serving LLMs by leveraging preemptible GPU instances on modern clouds, which offer access to spare GPUs at a much cheaper price than regular instances but may be preempted by the cloud at any time. Serving LLMs on preemptible instances requires addressing challenges induced by frequent instance preemptions and the necessity of migrating instances to handle these preemptions. This paper presents SpotServe, the first distributed LLM serving system on preemptible instances. Several key techniques in SpotServe realize fast and reliable serving of generative LLMs on cheap preemptible instances. First, SpotServe dynamically adapts the LLM parallelization configuration for dynamic instance availability and fluctuating workload, while balancing the trade-off among the overall throughput, inference latency and monetary costs. Second, to minimize the cost of migrating instances for dynamic reparallelization, the task of migrating instances is formulated as a bipartite graph matching problem, which uses the Kuhn-Munkres algorithm to identify an optimal migration plan that minimizes communications. Finally, to take advantage of the grace period offered by modern clouds, we introduce stateful inference recovery, a new inference mechanism that commits inference progress at a much finer granularity and allows SpotServe to cheaply resume inference upon preemption. We evaluate on real spot instance preemption traces and various popular LLMs and show that SpotServe can reduce the P99 tail latency by 2.4-9.1x compared with the best existing LLM serving systems. We also show that SpotServe can leverage the price advantage of preemptible instances, saving 54% monetary cost compared with only using on-demand instances. Comment: ASPLOS 202
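
The bipartite matching step mentioned in the abstract can be illustrated with a small sketch of the Kuhn-Munkres (Hungarian) algorithm. This is not SpotServe's implementation: the function name `kuhn_munkres`, the cost matrix, and the reading of `cost[i][j]` as "data that instance i would have to receive to take position j in the new parallel configuration" are all assumptions made for illustration.

```python
def kuhn_munkres(cost):
    """Minimum-cost assignment of rows to columns (rows <= columns).

    Standard O(n^2 * m) potentials-based form of the Kuhn-Munkres algorithm.
    Returns (assignment, total) where assignment[i] is the column given to row i.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0.0] * (n + 1)      # row potentials (1-indexed)
    v = [0.0] * (m + 1)      # column potentials (1-indexed)
    p = [0] * (m + 1)        # p[j] = row currently matched to column j (0 = none)
    way = [0] * (m + 1)      # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path that was just found.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [None] * n
    for j in range(1, m + 1):
        if p[j]:
            assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return assignment, total


# Hypothetical migration costs: rows are surviving instances, columns are
# positions in the new parallel configuration (made-up units of data moved).
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
assignment, total = kuhn_munkres(cost)
print(assignment, total)
```

For this toy matrix the cheapest plan moves 5 units in total; the real system's cost model and constraints are described in the paper, not here.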

    Overall survival and cancer-specific survival were improved in local treatment of metastatic prostate cancer

    Background: For metastatic prostate cancer (mPCa), radical prostatectomy (RP) and radiation therapy (RT) may improve overall survival (OS) and cancer-specific survival (CSS). Compared with RT, RP shows significant advantages in improving patient outcomes. External beam radiation therapy (EBRT) even slightly elevates cancer-specific mortality (CSM) with no statistical difference in OS compared with no local treatment (NLT).
    Objective: To evaluate OS and CSS after local treatment (LT) (including RP and RT) versus NLT in mPCa.
    Design, setting, and participants: Within the Surveillance, Epidemiology and End Results (SEER) database (2000-2018), 20098 patients with metastatic prostate cancer were selected for this study, of whom 19433 had no local treatment, 377 underwent RP, and 288 received RT.
    Outcome measurements and statistical analysis: Multivariable competing risks regression analysis after propensity score matching (PSM) was used to calculate CSM. Multivariable Cox regression analysis was used to identify the risk factors. Kaplan-Meier methods were used to calculate OS.
    Results and limitations: A total of 20098 patients were included: NLT (n = 19433), RP (n = 377) and RT (n = 288). In a competing risk regression analysis after PSM (ratio 1:1), RP resulted in a significantly lower CSM (hazard ratio [HR] 0.36, 95% confidence interval [CI] 0.29-0.45) than NLT, while RT showed a slightly lower CSM (HR 0.77, 95% CI 0.63-0.95). In a competing risk regression analysis after PSM (ratio 1:1), RP led to a lower CSM (HR 0.56, 95% CI 0.41-0.76) versus RT. All-cause mortality (ACM) also showed a downward trend for RP (HR 0.37, 95% CI 0.31-0.45) and RT (HR 0.66, 95% CI 0.56-0.79). In terms of OS, RP and RT significantly improved the survival probability compared with NLT, with the effect of RP being more pronounced. Older age, Gleason scores ≥8, AJCC T3-T4 stage, AJCC N1, and AJCC M1b-M1c were all associated with higher CSM (P < 0.05). The same results held true for ACM. The limitation of this article is that it is not possible to assess the effect of differences in systemic therapy on CSM in mPCa patients, and clinical trials are needed to verify the results.
    Conclusions: For patients with mPCa, both RP and RT are beneficial, and the efficacy of RP is better than that of RT from the perspective of CSM and ACM. Older age, higher Gleason scores and more advanced AJCC TNM stage all put patients at higher risk of dying.
    Patient summary: A large population-based cancer database showed that, in addition to first-line therapy (hormonal treatment), RP and radiotherapy can also benefit patients with mPCa.
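
The abstract states that OS was calculated with Kaplan-Meier methods. As a minimal sketch of what that estimator computes, the product-limit formula S(t) = ∏ (1 − d_i / n_i) can be written as below. The function name `kaplan_meier` and the four-patient toy data are hypothetical and have no connection to the SEER cohort; the study's own analysis pipeline is not described in the abstract.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each subject
    events : 1 if death was observed at that time, 0 if the subject was censored
    Returns a list of (time, survival probability) at each time with a death.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        # Both deaths and censored subjects leave the risk set after time t.
        at_risk -= leaving
    return curve


# Toy data only: four subjects, one of them censored at t = 2.
toy_times = [1, 2, 2, 3]
toy_events = [1, 1, 0, 1]
for t, s in kaplan_meier(toy_times, toy_events):
    print(t, round(s, 3))
```

Censored subjects contribute to the number at risk up to their censoring time but never count as deaths, which is the feature that distinguishes this estimate from a simple fraction of survivors.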

    Minute-cadence Observations of the LAMOST Fields with the TMTS: III. Statistic Study of the Flare Stars from the First Two Years

    Tsinghua University-Ma Huateng Telescopes for Survey (TMTS) aims to detect fast-evolving transients in the Universe, which has led to the discovery of thousands of short-period variables and eclipsing binaries since 2020. In this paper, we present the observed properties of 125 flare stars identified by the TMTS within the first two years, with an attempt to constrain their eruption physics. As expected, most of these flares were recorded in late-type red stars with $G_{\rm BP}-G_{\rm RP} > 2.0$ mag; however, the flares associated with bluer stars tend to be on average more energetic and have broader profiles. The peak flux ($F_{\rm peak}$) of the flare is found to depend strongly on the equivalent duration (ED) of the energy release, i.e., $F_{\rm peak} \propto {\rm ED}^{0.72\pm0.04}$, which is consistent with results derived from the Kepler and Evryscope samples. This relation is likely related to the magnetic loop emission, while for the more popular non-thermal electron heating model a specific time evolution may be required to generate this relation. We notice that flares produced by hotter stars have a flatter $F_{\rm peak} \propto {\rm ED}$ relation compared to that from cooler stars. This is related to the statistical discrepancy in light-curve shape of flare events with different colors. In spectra from LAMOST, we find that flare stars have apparently stronger H$\alpha$ emission than inactive stars, especially at the low temperature end, suggesting that chromospheric activity plays an important role in producing flares. On the other hand, the subclass having frequent flares is found to show H$\alpha$ emission of similar strength in their spectra to that recorded with only a single flare but similar effective temperature, implying that the chromospheric activity may not be the only trigger for eruptions. Comment: 17 pages, 15 figures, 2 tables, refereed version. For associated data files, see https://cdsarc.cds.unistra.fr/viz-bin/cat/J/MNRAS/523/219
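
A relation of the form $F_{\rm peak} \propto {\rm ED}^{\alpha}$ is a straight line in log-log space, so its exponent can be estimated with an ordinary least-squares fit to the logarithms. The sketch below shows that idea on synthetic, noise-free points generated with the quoted exponent 0.72. The function name `fit_power_law`, the normalization 0.05, and the ED values are assumptions for illustration; the paper's actual fitting procedure and data are not reproduced here.

```python
import math


def fit_power_law(x, y):
    """Fit y = norm * x**alpha by least squares on (log10 x, log10 y)."""
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    alpha = sxy / sxx                       # slope in log-log space
    norm = 10 ** (mean_y - alpha * mean_x)  # intercept converted back to linear units
    return alpha, norm


# Synthetic points only (not TMTS measurements): an exact power law
# with exponent 0.72 and an arbitrary normalization.
ed = [10.0, 30.0, 100.0, 300.0, 1000.0]
f_peak = [0.05 * e ** 0.72 for e in ed]

alpha, norm = fit_power_law(ed, f_peak)
print(round(alpha, 3), round(norm, 3))
```

With real, noisy measurements the recovered exponent would carry an uncertainty such as the quoted ±0.04, which this unweighted fit does not estimate.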

    A spectral data release for 104 Type II Supernovae from the Tsinghua Supernova Group

    We present 206 unpublished optical spectra of 104 type II supernovae obtained by the Xinglong 2.16m telescope and Lijiang 2.4m telescope during the period from 2011 to 2018, spanning the phases from about 1 to 200 days after the SN explosion. The spectral line identifications, evolution of line velocities and pseudo equivalent widths, as well as correlations between some important spectral parameters are presented. Our sample displays a large range in expansion velocities. For instance, the Fe~{\sc ii} $5169$ velocities measured from spectra at $t\sim 50$ days after the explosion vary from ${\rm 2000\ km\ s^{-1}}$ to ${\rm 5500\ km\ s^{-1}}$, with an average value of ${\rm 3872 \pm 949\ km\ s^{-1}}$. Power-law functions can be used to fit the velocity evolution, with the power-law exponent quantifying the velocity decline rate. We found an anticorrelation existing between H$\beta$ velocity at mid-plateau phase and its velocity decay exponent, SNe II with higher velocities tending to have smaller velocity decay rate. Moreover, we noticed that the velocity decay rate inferred from the Balmer lines (i.e., H$\alpha$ and H$\beta$) have moderate correlations with the ratio of absorption to emission for H$\alpha$ (a/e). In our sample, two objects show possibly flash-ionized features at early phases. Besides, we noticed that multiple high-velocity components may exist on the blue side of hydrogen lines of SN 2013ab, possibly suggesting that these features arise from complex line forming region. All our spectra can be found in WISeREP and Zenodo

    Robust estimation of bacterial cell count from optical density

    Optical density (OD) is widely used to estimate the density of cells in liquid culture, but cannot be compared between instruments without a standardized calibration protocol and is challenging to relate to actual cell count. We address this with an interlaboratory study comparing three simple, low-cost, and highly accessible OD calibration protocols across 244 laboratories, applied to eight strains of constitutive GFP-expressing E. coli. Based on our results, we recommend calibrating OD to estimated cell count using serial dilution of silica microspheres, which produces highly precise calibration (95.5% of residuals <1.2-fold), is easily assessed for quality control, also assesses instrument effective linear range, and can be combined with fluorescence calibration to obtain units of Molecules of Equivalent Fluorescein (MEFL) per cell, allowing direct comparison and data fusion with flow cytometry measurements: in our study, fluorescence per cell measurements showed only a 1.07-fold mean difference between plate reader and flow cytometry data

    Multidifferential study of identified charged hadron distributions in $Z$-tagged jets in proton-proton collisions at $\sqrt{s}=13$ TeV

    Jet fragmentation functions are measured for the first time in proton-proton collisions for charged pions, kaons, and protons within jets recoiling against a $Z$ boson. The charged-hadron distributions are studied longitudinally and transversely to the jet direction for jets with transverse momentum $20 < p_{\textrm{T}} < 100$ GeV and in the pseudorapidity range $2.5 < \eta < 4$. The data sample was collected with the LHCb experiment at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 1.64 fb$^{-1}$. Triple differential distributions as a function of the hadron longitudinal momentum fraction, hadron transverse momentum, and jet transverse momentum are also measured for the first time. This helps constrain transverse-momentum-dependent fragmentation functions. Differences in the shapes and magnitudes of the measured distributions for the different hadron species provide insights into the hadronization process for jets predominantly initiated by light quarks. Comment: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-013.html (LHCb public pages)

    Study of the $B^{-} \to \Lambda_{c}^{+} \bar{\Lambda}_{c}^{-} K^{-}$ decay

    The decay $B^{-} \to \Lambda_{c}^{+} \bar{\Lambda}_{c}^{-} K^{-}$ is studied in proton-proton collisions at a center-of-mass energy of $\sqrt{s}=13$ TeV using data corresponding to an integrated luminosity of 5 $\mathrm{fb}^{-1}$ collected by the LHCb experiment. In the $\Lambda_{c}^{+} K^{-}$ system, the $\Xi_{c}(2930)^{0}$ state observed at the BaBar and Belle experiments is resolved into two narrower states, $\Xi_{c}(2923)^{0}$ and $\Xi_{c}(2939)^{0}$, whose masses and widths are measured to be $m(\Xi_{c}(2923)^{0}) = 2924.5 \pm 0.4 \pm 1.1\,\mathrm{MeV}$, $m(\Xi_{c}(2939)^{0}) = 2938.5 \pm 0.9 \pm 2.3\,\mathrm{MeV}$, $\Gamma(\Xi_{c}(2923)^{0}) = 4.8 \pm 0.9 \pm 1.5\,\mathrm{MeV}$, and $\Gamma(\Xi_{c}(2939)^{0}) = 11.0 \pm 1.9 \pm 7.5\,\mathrm{MeV}$, where the first uncertainties are statistical and the second systematic. The results are consistent with a previous LHCb measurement using a prompt $\Lambda_{c}^{+} K^{-}$ sample. Evidence of a new $\Xi_{c}(2880)^{0}$ state is found with a local significance of $3.8\,\sigma$, whose mass and width are measured to be $2881.8 \pm 3.1 \pm 8.5\,\mathrm{MeV}$ and $12.4 \pm 5.3 \pm 5.8\,\mathrm{MeV}$, respectively. In addition, evidence of a new decay mode $\Xi_{c}(2790)^{0} \to \Lambda_{c}^{+} K^{-}$ is found with a significance of $3.7\,\sigma$. The relative branching fraction of $B^{-} \to \Lambda_{c}^{+} \bar{\Lambda}_{c}^{-} K^{-}$ with respect to the $B^{-} \to D^{+} D^{-} K^{-}$ decay is measured to be $2.36 \pm 0.11 \pm 0.22 \pm 0.25$, where the first uncertainty is statistical, the second systematic and the third originates from the branching fractions of charm hadron decays. Comment: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-028.html (LHCb public pages)

    Measurement of the ratios of branching fractions $\mathcal{R}(D^{*})$ and $\mathcal{R}(D^{0})$

    The ratios of branching fractions $\mathcal{R}(D^{*})\equiv\mathcal{B}(\bar{B}\to D^{*}\tau^{-}\bar{\nu}_{\tau})/\mathcal{B}(\bar{B}\to D^{*}\mu^{-}\bar{\nu}_{\mu})$ and $\mathcal{R}(D^{0})\equiv\mathcal{B}(B^{-}\to D^{0}\tau^{-}\bar{\nu}_{\tau})/\mathcal{B}(B^{-}\to D^{0}\mu^{-}\bar{\nu}_{\mu})$ are measured, assuming isospin symmetry, using a sample of proton-proton collision data corresponding to 3.0 fb$^{-1}$ of integrated luminosity recorded by the LHCb experiment during 2011 and 2012. The tau lepton is identified in the decay mode $\tau^{-}\to\mu^{-}\nu_{\tau}\bar{\nu}_{\mu}$. The measured values are $\mathcal{R}(D^{*})=0.281\pm0.018\pm0.024$ and $\mathcal{R}(D^{0})=0.441\pm0.060\pm0.066$, where the first uncertainty is statistical and the second is systematic. The correlation between these measurements is $\rho=-0.43$. Results are consistent with the current average of these quantities and are at a combined 1.9 standard deviations from the predictions based on lepton flavor universality in the Standard Model. Comment: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-039.html (LHCb public pages)
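
For a quick read of the quoted values, the statistical and systematic uncertainties are often combined in quadrature into a single total. The sketch below does only that arithmetic, under the assumption that the two components are independent for each measurement; it ignores the quoted correlation $\rho=-0.43$ between the two ratios and is not the combination procedure used in the paper. The helper name `total_uncertainty` is hypothetical.

```python
import math


def total_uncertainty(stat, syst):
    """Combine two uncertainties in quadrature (assumes they are independent)."""
    return math.hypot(stat, syst)


# (central value, statistical, systematic) as quoted in the abstract.
measurements = {
    "R(D*)": (0.281, 0.018, 0.024),
    "R(D0)": (0.441, 0.060, 0.066),
}

for name, (value, stat, syst) in measurements.items():
    total = total_uncertainty(stat, syst)
    print(f"{name} = {value:.3f} +/- {total:.3f} (stat and syst in quadrature)")
```

Under this assumption the totals come to roughly 0.030 for $\mathcal{R}(D^{*})$ and 0.089 for $\mathcal{R}(D^{0})$.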

    Measurement of forward charged hadron flow harmonics in peripheral PbPb collisions at √sNN = 5.02 TeV with the LHCb detector

    Flow harmonic coefficients, $v_n$, which are the key to studying the hydrodynamics of the quark-gluon plasma (QGP) created in heavy-ion collisions, have been measured in various collision systems and kinematic regions and using various particle species. The study of flow harmonics in a wide pseudorapidity range is particularly valuable to understand the temperature dependence of the shear viscosity to entropy density ratio of the QGP. This paper presents the first LHCb results of the second- and the third-order flow harmonic coefficients of charged hadrons as a function of transverse momentum in the forward region, corresponding to pseudorapidities between 2.0 and 4.9, using the data collected from PbPb collisions in 2018 at a center-of-mass energy of 5.02 TeV. The coefficients measured using the two-particle angular correlation analysis method are smaller than the central-pseudorapidity measurements at ALICE and ATLAS from the same collision system but share similar features