Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models
Deep learning recommendation models (DLRMs) are used across many
business-critical services at Facebook and are the single largest AI
application in terms of infrastructure demand in its data-centers. In this
paper we discuss a SW/HW co-designed solution for high-performance
distributed training of large-scale DLRMs. We introduce a high-performance
scalable software stack based on PyTorch and pair it with the new evolution of
the Zion platform, namely ZionEX. We demonstrate the capability to train very large
DLRMs with up to 12 trillion parameters and show that we can attain a 40X speedup
in terms of time to solution over previous systems. We achieve this by (i)
designing the ZionEX platform with a dedicated scale-out network, provisioned
with high bandwidth, optimal topology and efficient transport; (ii) implementing
an optimized PyTorch-based training stack supporting both model and data
parallelism; (iii) developing sharding algorithms capable of hierarchical
partitioning of the embedding tables along row and column dimensions and load
balancing them across multiple workers; (iv) adding high-performance core
operators while retaining flexibility to support optimizers with fully
deterministic updates; and (v) leveraging reduced-precision communications,
a multi-level memory hierarchy (HBM+DDR+SSD) and pipelining. Furthermore, we
develop and briefly comment on distributed data ingestion and other supporting
services that are required for robust and efficient end-to-end training in
production environments.
KMT-2022-BLG-0440Lb: A New Microlensing Planet with the Central-Resonant Caustic Degeneracy Broken
We present the observations and analysis of a high-magnification microlensing
planetary event, KMT-2022-BLG-0440, for which the weak and short-lived
planetary signal was covered by both the KMTNet survey and follow-up
observations. The binary-lens models with a central caustic provide the best
fits, with a planet/host mass ratio, -- at
. The binary-lens models with a resonant caustic and a brown-dwarf
mass ratio are both excluded by . The binary-source model
can fit the anomaly well but is rejected by the "color argument" on the
second source. From Bayesian analyses, it is estimated that the host star is
likely a K or M dwarf located in the Galactic disk, the planet probably has a
Neptune mass, and the projected planet-host separation is
or au, subject to the close/wide degeneracy. This is the
third planet from a high-magnification planetary signal (). Together with another such planet, KMT-2021-BLG-0171Lb, the
ongoing follow-up program for the KMTNet high-magnification events has
demonstrated its ability to detect high-magnification planetary signals for
planets, which are challenging for the current microlensing
surveys. Comment: MNRAS accepted
Immobilization of by-product sulfate salt slag from high-salt organic wastewater with fly ash in lightweight aggregate ceramsite
Lightweight aggregate ceramsite (LAC) was prepared from by-product sulfate salt slag (BPSS) of high-salt organic wastewater with fly ash. The BPSS fixation rate, leaching toxicity, morphological structures and potential environmental risks of heavy metals in LAC were investigated. BPSS can be fixed in LAC when the mass ratio of fly ash : kaolin : clay is 7:1:2, the addition of BPSS is 28%, the heating rate is 8 °C min⁻¹, and the calcination temperature is 1100 °C. The characteristics of the LAC met the requirements of the Chinese lightweight aggregate standard (GB/T 17431.2-2010). The total organic carbon (TOC) content of the aqueous leaching liquor of LAC was less than 0.5 mg·L⁻¹, and the fixation rate of heavy metals was more than 99%, which meets the requirements of GB 5085.3-2007. The BPSS immobilization mechanisms were mainly related to the formation of new crystal phases, including leucite (KAlSi2O6), albite (Na2O·Al2O3·6SiO2), potash feldspar (K2O·Al2O3·6SiO2), jadeite (NaAlSi2O6), hauyne ([Na,Ca]8[Si,Al]12O24[SO4]2), nosean (Na8Al6Si6O24SO4), and sodalite (Na8Al6Si6O24[MnO4]2), which incorporate heavy metals during the high-temperature curing reaction. This work provides an effective method for the harmless treatment and recycling of by-product salt residues from high-salt organic wastewater.
Optimization Preparation and Evaluation of Chitosan Grafted Norfloxacin as a Hemostatic Sponge
Considering the great harm to the human body caused by severe and massive bleeding, in this study chitosan-grafted norfloxacin (CTS-NF) composites were prepared from chitosan (CTS) and norfloxacin (NF) by a 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide-mediated coupling method, to address the slow hemostasis and poor anti-infective effects of dressings currently on the market. The effects of the mass ratio of CTS to NF (MCTS/MNF), reaction temperature T and reaction time t on the grafting rate (η%) of the products were investigated through single-factor tests. The preparation process was optimized, with η% as the evaluation index, by means of a Box–Behnken design and response surface analysis. The antimicrobial activity was evaluated by inhibition zone assay, and the hemostatic activity of the prepared composites was evaluated in vitro and in vivo. The results suggested that the optimum preparation conditions were a mass ratio of CTS to NF (MCTS/MNF) of 5:3, a reaction temperature of 65 °C, and a reaction time of 4 h. Under these conditions, the η% of CTS-NF was 45.5%. The CTS-NF composites displayed significant antimicrobial activity. Moreover, in vitro hemostasis results revealed that the CTS-NF composite had a lower blood clotting index and absorbed red blood cells to promote aggregation. In in vivo ear and liver hemostasis tests, the CTS-NF groups showed short hemostatic times (49.75 ± 3.32 s and 50.00 ± 7.21 s) and less blood loss (0.07 ± 0.010 g and 0.075 ± 0.013 g). The results showed that CTS-NF reduced the bleeding time and volume, exhibiting a significant coagulation effect. Therefore, the CTS-NF sponge is expected to be a new, effective hemostatic and antibacterial material in the future.
High rate performance of the carbon encapsulated Li4Ti5O12 for lithium ion battery
Li4Ti5O12 (LTO) is an attractive alternative anode material with excellent cycling performance and high rate capability after coating with conductive materials. Anatase TiO2 and glucose were used to synthesize carbon-coated LTO (C@LTO). XRD results showed that all the major diffraction peaks of the spinel structure of LTO, such as (111), (311) and (400), can be found in C@LTO, but no carbon diffraction peaks were observed. Electrochemical impedance spectroscopy (EIS) data show that the resistance of C@LTO was nearly half that of LTO. Rate performance tests showed that the capacity of C@LTO was higher than that of pure LTO at 0.1 C, 0.2 C, 1 C, 2 C, 5 C and 10 C, which indicates that this is a promising approach for preparing high-performance LTO anodes. Keywords: Li-ion batteries, Rate performance, Carbon materials, Li4Ti5O12 anode
The effect of gradient conductivity of doped BiFeO3 as filler on the surface insulation performance of epoxy composite
A Particle-Scale Model of Surface Tension for Two-Phase Flow: Model Description and Validation
A particle-scale surface tension force (STF) model is proposed here for incorporation into the smoothed particle hydrodynamics (SPH) method. The model is based on the identification of the interface geometry and the gradient of densities across the interface. A single-phase square bubble and a square bubble immersed in fluid are simulated with the STF model, together with a combined kernel in SPH, to validate its suitability for simulating immersed bubble motion. Two cases of rising bubbles, i.e., a single rising bubble and a pair of rising bubbles, are simulated for demonstration. The rising velocity, density, surface tension force, interfacial curvature, power of the STF, and smoothing length of the rising bubble and surrounding fluids are all computed with the current STF model to study the characteristics of the immersed bubble's motion and coalescence. The current model provides a way to capture the interfacial interactions in two-phase flows at particle scales.