41 research outputs found

    Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models

    Full text link
    Deep learning recommendation models (DLRMs) are used across many business-critical services at Facebook and are the single largest AI application in terms of infrastructure demand in its data centers. In this paper we discuss the SW/HW co-designed solution for high-performance distributed training of large-scale DLRMs. We introduce a high-performance scalable software stack based on PyTorch and pair it with the new evolution of the Zion platform, namely ZionEX. We demonstrate the capability to train very large DLRMs with up to 12 trillion parameters and show that we can attain a 40X speedup in terms of time to solution over previous systems. We achieve this by (i) designing the ZionEX platform with a dedicated scale-out network, provisioned with high bandwidth, optimal topology and efficient transport; (ii) implementing an optimized PyTorch-based training stack supporting both model and data parallelism; (iii) developing sharding algorithms capable of hierarchical partitioning of the embedding tables along row and column dimensions and load balancing them across multiple workers; (iv) adding high-performance core operators while retaining flexibility to support optimizers with fully deterministic updates; and (v) leveraging reduced-precision communications, a multi-level memory hierarchy (HBM+DDR+SSD) and pipelining. Furthermore, we develop and briefly comment on distributed data ingestion and other supporting services that are required for robust and efficient end-to-end training in production environments.
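As a rough illustration of item (iii), the sketch below splits a table row-wise and then assigns shards to workers with a greedy "largest shard first" heuristic. This is not the sharding algorithm from the paper: the function names, the table sizes, and the use of row count as the cost of a shard are all assumptions made for the example.

```python
import heapq


def split_rows(table_name, num_rows, num_shards):
    """Row-wise partition of one table into near-equal contiguous ranges."""
    base, extra = divmod(num_rows, num_shards)
    shards, start = [], 0
    for i in range(num_shards):
        size = base + (1 if i < extra else 0)
        shards.append((f"{table_name}[rows {start}:{start + size}]", size))
        start += size
    return shards


def balance_shards(shard_costs, num_workers):
    """Greedy assignment: place the costliest shard on the least-loaded worker."""
    heap = [(0, w) for w in range(num_workers)]  # (current load, worker id)
    heapq.heapify(heap)
    assignment = {w: [] for w in range(num_workers)}
    ordered = sorted(shard_costs.items(), key=lambda kv: kv[1], reverse=True)
    for name, cost in ordered:
        load, w = heapq.heappop(heap)
        assignment[w].append(name)
        heapq.heappush(heap, (load + cost, w))
    loads = {w: sum(shard_costs[n] for n in names)
             for w, names in assignment.items()}
    return assignment, loads


if __name__ == "__main__":
    # Hypothetical tables: one large table split in two, two small ones kept whole.
    costs = dict(split_rows("user", 10, 2))
    costs.update({"item": 7, "ad": 3})
    assignment, loads = balance_shards(costs, num_workers=2)
    print(assignment)
    print(loads)
```

A production sharder would also weigh lookup frequency, embedding dimension, and memory tier, and would support column-wise splits; the greedy heuristic here only shows the load-balancing idea.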

    KMT-2022-BLG-0440Lb: A New $q < 10^{-4}$ Microlensing Planet with the Central-Resonant Caustic Degeneracy Broken

    Full text link
    We present the observations and analysis of a high-magnification microlensing planetary event, KMT-2022-BLG-0440, for which the weak and short-lived planetary signal was covered by both the KMTNet survey and follow-up observations. The binary-lens models with a central caustic provide the best fits, with a planet/host mass ratio, $q = 0.75$--$1.00 \times 10^{-4}$ at $1\sigma$. The binary-lens models with a resonant caustic and a brown-dwarf mass ratio are both excluded by $\Delta\chi^2 > 70$. The binary-source model can fit the anomaly well but is rejected by the "color argument" on the second source. From Bayesian analyses, it is estimated that the host star is likely a K or M dwarf located in the Galactic disk, the planet probably has a Neptune-mass, and the projected planet-host separation is $1.9^{+0.6}_{-0.7}$ or $4.6^{+1.4}_{-1.7}$ au, subject to the close/wide degeneracy. This is the third $q < 10^{-4}$ planet from a high-magnification planetary signal ($A \gtrsim 65$). Together with another such planet, KMT-2021-BLG-0171Lb, the ongoing follow-up program for the KMTNet high-magnification events has demonstrated its ability in detecting high-magnification planetary signals for $q < 10^{-4}$ planets, which are challenging for the current microlensing surveys. Comment: MNRAS accepted

    Modeling of regional differentiation of land-use degree in China

    No full text

    Immobilization of by-product sulfate salt slag from high-salt organic wastewater with fly ash in lightweight aggregate ceramsite

    No full text
    The lightweight aggregate ceramsite (LAC) was prepared from by-product sulfate salt slag (BPSS) of high-salt organic wastewater with fly ash. The BPSS fixation rate, leaching toxicity, morphological structures and potential environmental risks of heavy metals in LAC were investigated. BPSS can be fixed in LAC when the mass ratio of fly ash:kaolin:clay was 7:1:2, the addition of BPSS was 28%, the heating rate was 8 °C min⁻¹, and the calcination temperature was 1100 °C. The characteristics of the LAC met the requirements for Chinese lightweight aggregate standards (GB/T17431.2-2010). The Total Organic Carbon (TOC) content of the aqueous leaching liquor in LAC was less than 0.5 mg·L⁻¹, and the fixation rate of heavy metals was more than 99%, which meets the requirements of GB 5085.3-2007. The BPSS immobilization mechanisms were mainly related to the formation of new crystal phases, including Leucite (KAlSi2O6), Albite (Na2O·Al2O3·6SiO2), Potash Feldspar (K2O·Al2O3·6SiO2), Jadeite (NaAlSi2O6), Hauyne ([Na,Ca]8[Si,Al]12O24[SO4]2), Nosean (Na8Al6Si6O24SO4), and Sodalite (Na8Al6Si6O24[MnO4]2) by incorporation of heavy metals in the high-temperature curing reaction. This work provides an effective method for the harmless treatment and recycling of by-product salt residues from high-salt organic wastewater.

    Optimization Preparation and Evaluation of Chitosan Grafted Norfloxacin as a Hemostatic Sponge

    No full text
    Considering the great harm to the human body caused by severe and massive bleeding, in this study, chitosan-grafted norfloxacin (CTS-NF) composites were prepared with chitosan (CTS) and norfloxacin (NF) as raw materials by a 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide-mediated coupling method to address the slow hemostasis and poor anti-infective effects of current dressings on the market. The effects of the mass ratio of CTS to NF (MCTS/MNF), reaction temperature T and reaction time t on the grafting rate (η%) of the products were investigated through single-factor tests. The preparation process was optimized with η% as the evaluation index by means of the Box–Behnken test design and response surface analysis. The antimicrobial activity was evaluated by inhibition zone assay, and the hemostatic activity of the prepared composites was evaluated in vitro and in vivo. The results suggested that the optimum preparation conditions were a mass ratio of CTS to NF (MCTS/MNF) of 5:3, a reaction temperature of 65 °C, and a reaction time of 4 h. Under these conditions, the η% of CTS-NF was 45.5%. The CTS-NF composites displayed significant antimicrobial activities. Moreover, in vitro hemostasis results revealed that the CTS-NF composite had a lower blood clotting index and absorbed red blood cells to promote aggregation. In the in vivo ear and liver hemostasis tests, the CTS-NF groups showed short hemostatic times (49.75 ± 3.32 s and 50.00 ± 7.21 s) and less blood loss (0.07 ± 0.010 g and 0.075 ± 0.013 g). The results showed that CTS-NF reduced the bleeding time and volume, exhibiting a significant coagulation effect. Therefore, the CTS-NF sponge is expected to be a new, effective hemostatic and antibacterial material in the future.
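For readers unfamiliar with the Box–Behnken design named in this abstract, the sketch below generates the coded run list for three factors (12 edge-midpoint runs plus centre points). The factor levels used in the example are placeholders chosen for illustration, not the levels tested in the study, and the number of centre points is likewise an assumption.

```python
from itertools import combinations, product


def box_behnken(num_factors=3, center_points=3):
    """Coded Box-Behnken runs: vary two factors at +/-1, hold the rest at 0."""
    runs = []
    for i, j in combinations(range(num_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0] * num_factors
            run[i], run[j] = a, b
            runs.append(tuple(run))
    # Replicated centre points estimate pure error.
    runs.extend([(0,) * num_factors] * center_points)
    return runs


def decode(run, levels):
    """Map coded values (-1, 0, +1) to real settings given (low, mid, high)."""
    return tuple(lv[c + 1] for c, lv in zip(run, levels))


if __name__ == "__main__":
    # Hypothetical levels: mass ratio, temperature in deg C, time in h.
    levels = [(1.0, 1.5, 2.0), (55, 65, 75), (3, 4, 5)]
    for run in box_behnken():
        print(run, decode(run, levels))
```

Each run would be carried out experimentally and the measured grafting rate fitted with a second-order response surface; the fitting step is omitted here.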

    High rate performance of the carbon encapsulated Li4Ti5O12 for lithium ion battery

    No full text
    Li4Ti5O12 (LTO) is an attractive alternative anode material with excellent cyclic performance and high rate capability after coating modification with conductive materials. Anatase TiO2 and glucose were used for the synthesis of the carbon-coated LTO (C@LTO). XRD results showed that all the major diffractions from the spinel structure of LTO, such as (111), (311) and (400), can be found in the C@LTO, but no carbon diffraction peaks were observed. Electrochemical Impedance Spectroscopy (EIS) data show that the C@LTO resistance was nearly half of the LTO value. Rate performance tests showed that the capacity of C@LTO was higher than that of the pure LTO at 0.1 C, 0.2 C, 1 C, 2 C, 5 C and 10 C, which indicates that this is a promising approach to prepare high-performance LTO anodes. Keywords: Li-ion batteries, Rate performance, Carbon materials, Li4Ti5O12 anode

    A Particle-Scale Model of Surface Tension for Two-Phase Flow: Model Description and Validation

    No full text
    A particle-scale surface tension force (STF) model is proposed here to be incorporated in the smoothed particle hydrodynamics (SPH) method. This model is based on the identification of interface geometry and the gradient of densities across the interface. A single-phase square bubble and a square bubble immersed in fluids are simulated by the STF model, together with a combined kernel in SPH, to validate its suitability for simulating immersed bubble motion. Two cases of rising bubbles, i.e., a single rising bubble and a pair of rising bubbles, are simulated for demonstration. The rising velocity, density, surface tension force, interfacial curvature, the power of the STF, and the smoothing length of the rising bubble and surrounding fluids are all computed by the current STF model to study the characteristics of the immersed bubble's motion and coalescence. The current model provides a way to capture the interfacial interactions in two-phase flows at particle scales.
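To make the SPH ingredients mentioned in this abstract concrete, the sketch below implements a standard 2D cubic spline smoothing kernel and the usual summation density. The abstract refers to a "combined kernel" whose form is not given here, so the cubic spline is a stand-in, and the function names are illustrative only.

```python
import math


def cubic_spline_2d(r, h):
    """Standard 2D cubic spline kernel with support radius 2h."""
    q = r / h
    sigma = 10.0 / (7.0 * math.pi * h * h)  # 2D normalisation constant
    if q < 1.0:
        return sigma * (1.0 - 1.5 * q * q + 0.75 * q ** 3)
    if q < 2.0:
        return sigma * 0.25 * (2.0 - q) ** 3
    return 0.0


def density(i, positions, masses, h):
    """Summation density at particle i: rho_i = sum_j m_j W(|x_i - x_j|, h)."""
    xi, yi = positions[i]
    return sum(m * cubic_spline_2d(math.hypot(xi - x, yi - y), h)
               for (x, y), m in zip(positions, masses))


if __name__ == "__main__":
    # Four hypothetical unit-mass particles on a small square lattice.
    pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
    print(density(0, pts, [1.0] * 4, h=0.1))
```

A surface tension model of the kind described would build on such densities, for example by differencing them across the interface to locate it and estimate curvature; that step depends on details not given in the abstract and is left out.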