84 research outputs found

    ATP: Adaptive Tensor Parallelism for Foundation Models

    Foundation models have impressive performance and generalization capabilities across a wide range of applications. The increasing size of the models introduces great challenges for the training. Tensor parallelism is a critical technique that is currently used in almost all foundation model training and has a significant impact on overall training performance. However, current tensor parallelism in machine learning frameworks misses optimization opportunities in fitting various interconnection topologies. In this work, we present ATP, an adaptive tensor parallelism framework for foundation models, which can automatically select the optimal parallel strategy on different interconnections. We propose column- and row-first tensor parallelism based on 2D device meshes and construct a search space. Combined with the hierarchical communication matrix, ATP can identify the optimal strategy in the search space. We also propose chunk-based overlapping to reduce communication overhead. Our evaluations show ATP consistently outperforms the state-of-the-art approaches for various model sizes and interconnects, achieving end-to-end training performance improvements of up to 37-64% on specific interconnects. Based on our theoretical model, the communication overhead of ATP decreases with scaling, indicating a qualitative leap forward

    Sequential optimization for efficient high-quality object proposal generation

    We are motivated by the need for a generic object proposal generation algorithm which achieves a good balance between object detection recall, proposal localization quality and computational efficiency. We propose a novel object proposal algorithm, BING++, which inherits the virtue of good computational efficiency of BING [1] but significantly improves its proposal localization quality. At a high level we formulate the problem of object proposal generation from a novel probabilistic perspective, based on which our BING++ manages to improve the localization quality by employing edges and segments to estimate object boundaries and update the proposals sequentially. We propose learning the parameters efficiently by searching for approximate solutions in a quantized parameter space for complexity reduction. We demonstrate the generalization of BING++ with the same fixed parameters across different object classes and datasets. Empirically, our BING++ runs at half the speed of BING on CPU, but significantly improves the localization quality by 18.5 and 16.7 percent on the VOC2007 and Microsoft COCO datasets, respectively. Compared with other state-of-the-art approaches, BING++ can achieve comparable performance, but run significantly faster

    Sequential Optimization for Efficient High-Quality Object Proposal Generation

    We are motivated by the need for a generic object proposal generation algorithm which achieves a good balance between object detection recall, proposal localization quality and computational efficiency. We propose a novel object proposal algorithm, BING++, which inherits the virtue of good computational efficiency of BING but significantly improves its proposal localization quality. At a high level we formulate the problem of object proposal generation from a novel probabilistic perspective, based on which our BING++ manages to improve the localization quality by employing edges and segments to estimate object boundaries and update the proposals sequentially. We propose learning the parameters efficiently by searching for approximate solutions in a quantized parameter space for complexity reduction. We demonstrate the generalization of BING++ with the same fixed parameters across different object classes and datasets. Empirically, our BING++ runs at half the speed of BING on CPU, but significantly improves the localization quality by 18.5% and 16.7% on the VOC2007 and Microsoft COCO datasets, respectively. Compared with other state-of-the-art approaches, BING++ can achieve comparable performance, but run significantly faster. Comment: Accepted by TPAM

    Charging and discharging in thermal energy storage unit with fin-stone hybrid structure for enhancing heat transfer of phase change materials

    This work proposes a fin-stone hybrid structure integrating fins (popular thermal enhancers) and natural stones (widely used sensible heat storage media) to enhance the heat transfer of phase change materials for on-site thermal energy storage applications, with advantages of low cost, environmental friendliness, and easy accessibility. 3D numerical models of charging and discharging in shell-and-tube heat storage units with various configurations, including fins, the fin-stone hybrid structure, stones, and no heat transfer enhancement, were constructed, and the performance evaluation and comparison were carried out. Compared to fins, fin-stone hybrid structures with 20 mm-, 30 mm-, and 40 mm-sized stones shorten the charging time by 67%, 54%, and 56%, and the discharging time by 73%, 60%, and 46%, respectively. Small stones have better heat transfer enhancement, which is attributed to the small volume, large surface area, and contact with the tube and fins. The advantage of the fin-stone hybrid structure, i.e. the shortening of phase change time, is more significant in charging than in discharging, in comparison with stones, as both heat conduction and natural convection are enhanced. Moreover, the hybrid structure exhibits satisfactory temperature stability with a 48.9 °C temperature change in charging and 37.2 °C in discharging, each lower than the fins, which is beneficial to stabilise the heat transfer fluid outlet temperature. The yearly supplied energy of the hybrid structure with 20 mm-sized stones is 121% and 72% more than that of fins and stones, respectively

    Tin Nanoparticles Encapsulated Carbon Nanoboxes as High-Performance Anode for Lithium-Ion Batteries

    One of the crucial challenges for applying Sn as an anode of lithium-ion batteries (LIBs) is the dramatic volume change during lithiation/delithiation process, which causes a rapid capacity fading and then deteriorated battery performance. To address this issue, herein, we report the design and fabrication of Sn encapsulated carbon nanoboxes (denoted as Sn@C) with yolk@shell architectures. In this design, the carbon shell can facilitate the good transport kinetics whereas the hollow space between Sn and carbon shell can accommodate the volume variation during repeated charge/discharge process. Accordingly, this composite electrode exhibits a high reversible capacity of 675 mAh g−1 at a current density of 0.8 A g−1 after 500 cycles and preserves as high as 366 mAh g−1 at a higher current density of 3 A g−1 even after 930 cycles. The enhanced electrochemical performance can be ascribed to the crystal size reduction of Sn cores and the formation of polymeric gel-like layer outside the electrode surface after long-term cycles, resulting in improved capacity and enhanced rate performance

    BING: Binarized normed gradients for objectness estimation at 300fps

    Training a generic objectness measure to produce object proposals has recently become of significant interest. We observe that generic objects with well-defined closed boundaries can be detected by looking at the norm of gradients, with a suitable resizing of their corresponding image windows to a small fixed size. Based on this observation and computational reasons, we propose to resize the window to 8 × 8 and use the norm of the gradients as a simple 64D feature to describe it, for explicitly training a generic objectness measure. We further show how the binarized version of this feature, namely binarized normed gradients (BING), can be used for efficient objectness estimation, which requires only a few atomic operations (e.g., add, bitwise shift, etc.). To improve localization quality of the proposals while maintaining efficiency, we propose a novel fast segmentation method and demonstrate its effectiveness for improving BING’s localization performance, when used in multithresholding straddling expansion (MTSE) postprocessing. On the challenging PASCAL VOC2007 dataset, using 1000 proposals per image and intersection-over-union threshold of 0.5, our proposal method achieves a 95.6% object detection rate and 78.6% mean average best overlap in less than 0.005 second per image
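
The 64D feature described in the abstract above can be illustrated with a minimal sketch. It assumes the window has already been resized to an 8 × 8 grayscale patch, and it makes several choices the abstract does not state: central differences with clamped borders, an L1 gradient norm clipped to 255, and a single-threshold binarization. The function names `normed_gradient_feature` and `binarize` are hypothetical and are not taken from the authors' code.

```python
def normed_gradient_feature(patch):
    """Return a 64-value normed-gradient feature for an 8x8 grayscale patch."""
    size = 8
    assert len(patch) == size and all(len(row) == size for row in patch)
    feature = []
    for y in range(size):
        for x in range(size):
            # Central differences with clamped borders (an assumption).
            gx = patch[y][min(x + 1, size - 1)] - patch[y][max(x - 1, 0)]
            gy = patch[min(y + 1, size - 1)][x] - patch[max(y - 1, 0)][x]
            # L1 norm of the gradient, clipped to 8 bits (an assumption).
            feature.append(min(abs(gx) + abs(gy), 255))
    return feature


def binarize(feature, threshold=128):
    """Simplified binarization: keep only the top bit of each 8-bit value."""
    return [1 if value >= threshold else 0 for value in feature]


# Example patch: a vertical edge between a dark left half and a bright right half.
patch = [[0, 0, 0, 0, 255, 255, 255, 255] for _ in range(8)]
feature = normed_gradient_feature(patch)
bits = binarize(feature)
```

In this example only the two columns on either side of the edge respond, so the binarized feature marks the boundary and nothing else. Because the binarized values are single bits, a feature of this kind can be compared with bitwise operations, which is consistent with the abstract's note about atomic operations.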

    Investigation of hearing loss in elderly vertigo and dizziness patients in the past 10 years

    Background: Vertigo and hearing loss are both prevalent in the elderly. This study retrospectively analyzed hearing test results from elderly patients experiencing vertigo and dizziness at ENT outpatient over a 10-year period, in order to study the patterns of hearing loss in this patient population. Methods: Nine thousand three hundred eighty four patients over 50 years old underwent retrospective collection and screening of outpatient diagnosis, pure tone audiometry, acoustic immittance measurement (tympanogram) and auditory brainstem response (ABR) test. The patient's audiograms are divided into 7 subtypes according to a set of fixed criteria. Meanwhile, K-Means clustering analysis method was used to classify the audiogram. Results: The Jerger classification of tympanogram in elderly patients with vertigo and dizziness showed the majority falling under type A. The leading audiogram shapes were flat (27.81% in right ear and 26.89% in left ear), high-frequency gently sloping (25.97% in right ear and 27.34% in left ear), and high-frequency steeply sloping (21.60% in right ear and 22.53% in left ear). Meniere's disease (MD; 30.87%), benign recurrent vertigo (BRV; 19.07%), and benign paroxysmal positional vertigo (BPPV; 15.66%) were the most common etiologies in elderly vestibular diseases. We observed statistically significant differences in hearing thresholds among these vestibular diseases (P < 0.001). K-Means clustering analysis suggested that the optimal number of clusters was three, with sample sizes for the three clusters being 2,747, 2,413, and 4,139, respectively. The ANOVA statistical results of each characteristic value showed P < 0.001. Conclusion: The elderly patients often have mild to moderate hearing loss as a concomitant symptom with vertigo. Female patients have better hearing thresholds than males. The dominant audiometric shapes in this patient population were flat, high-frequency gently sloping, and high-frequency steeply sloping according to a set of fixed criteria. This study highlights the need for tailored strategies in managing hearing loss in elderly patients with vertigo and dizziness

    Real-space imaging of polar and elastic nano-textures in thin films via inversion of diffraction data

    Exploiting the emerging nanoscale periodicities in epitaxial, single-crystal thin films is an exciting direction in quantum materials science: confinement and periodic distortions induce novel properties. The structural motifs of interest are ferroelastic, ferroelectric, multiferroic, and, more recently, topologically protected magnetization and polarization textures. A critical step towards heterostructure engineering is understanding their nanoscale structure, best achieved through real-space imaging. X-ray Bragg coherent diffractive imaging visualizes sub-picometer crystalline displacements with tens of nanometers spatial resolution. Yet, it is limited to objects spatially confined in all three dimensions and requires highly coherent, laser-like x-rays. Here we lift the confinement restriction by developing real-space imaging of periodic lattice distortions: we combine an iterative phase retrieval algorithm with unsupervised machine learning to invert the diffuse scattering in conventional x-ray reciprocal-space mapping into real-space images of polar and elastic textures in thin epitaxial films. We first demonstrate our imaging in PbTiO3/SrTiO3 superlattices to be consistent with published phase-field model calculations. We then visualize strain-induced ferroelastic domains emerging during the metal-insulator transition in Ca2RuO4 thin films. Instead of homogeneously transforming into a low-temperature structure (like in bulk), the strained Mott insulator splits into nanodomains with alternating lattice constants, as confirmed by cryogenic scanning transmission electron microscopy. Our study reveals the type, size, orientation, and crystal displacement field of the nano-textures. The non-destructive imaging of textures promises to improve models for their dynamics and enable advances in quantum materials and microelectronics

    Construction of a cross-species cell landscape at single-cell level.

    Individual cells are basic units of life. Despite extensive efforts to characterize the cellular heterogeneity of different organisms, cross-species comparisons of landscape dynamics have not been achieved. Here, we applied single-cell RNA sequencing (scRNA-seq) to map organism-level cell landscapes at multiple life stages for mice, zebrafish and Drosophila. By integrating the comprehensive dataset of > 2.6 million single cells, we constructed a cross-species cell landscape and identified signatures and common pathways that changed throughout the life span. We identified structural inflammation and mitochondrial dysfunction as the most common hallmarks of organism aging, and found that pharmacological activation of mitochondrial metabolism alleviated aging phenotypes in mice. The cross-species cell landscape, together with other published datasets, was stored in an integrated online portal, Cell Landscape. Our work provides a valuable resource for studying lineage development, maturation and aging