ATP: Adaptive Tensor Parallelism for Foundation Models
Foundation models have impressive performance and generalization capabilities
across a wide range of applications. The increasing size of the models
introduces great challenges for the training. Tensor parallelism is a critical
technique that is currently used in almost all foundation model training and
has a significant impact on overall training performance. However, current
tensor parallelism in machine learning frameworks misses optimization
opportunities in fitting various interconnection topologies. In this work, we
present ATP, an adaptive tensor parallelism framework for foundation models,
which can automatically select the optimal parallel strategy on different
interconnections. We propose column- and row-first tensor parallelism based on
2D device meshes and construct a search space. Combined with the hierarchical
communication matrix, ATP can identify the optimal strategy in the search
space. We also propose chunk-based overlapping to reduce communication
overhead. Our evaluations show ATP consistently outperforms the
state-of-the-art approaches for various model sizes and interconnects,
achieving end-to-end training performance improvements of up to 37-64% on
specific interconnects. Based on our theoretical model, the communication
overhead of ATP decreases with scaling, indicating a qualitative leap forward
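As a rough illustration of the search described in this abstract, the sketch below enumerates the 2D device meshes available for a given device count and picks the one with the lowest cost. The cost function is an assumption and is not ATP's model: it charges a ring all-reduce term per mesh axis, with one hypothetical bandwidth per axis standing in for the hierarchical communication matrix. The names `mesh_shapes`, `toy_cost`, and `best_mesh` are illustrative only.

```python
from typing import List, Tuple


def mesh_shapes(num_devices: int) -> List[Tuple[int, int]]:
    """Return every (d1, d2) with d1 * d2 == num_devices."""
    return [(d1, num_devices // d1)
            for d1 in range(1, num_devices + 1)
            if num_devices % d1 == 0]


def toy_cost(d1: int, d2: int, bw1: float, bw2: float,
             volume: float = 1.0) -> float:
    """Hypothetical communication cost of a (d1, d2) mesh.

    Assumption: each axis performs a ring all-reduce, which moves
    2 * (d - 1) / d times the message size per device. The message on
    one axis is taken to be the total volume divided by the size of the
    other axis. bw1 and bw2 are made-up per-axis bandwidths.
    """
    axis1 = 2.0 * (d1 - 1) / d1 * (volume / d2) / bw1
    axis2 = 2.0 * (d2 - 1) / d2 * (volume / d1) / bw2
    return axis1 + axis2


def best_mesh(num_devices: int, bw1: float, bw2: float) -> Tuple[int, int]:
    """Pick the mesh shape with the lowest toy cost."""
    return min(mesh_shapes(num_devices),
               key=lambda s: toy_cost(s[0], s[1], bw1, bw2))


if __name__ == "__main__":
    for shape in mesh_shapes(8):
        print(shape, round(toy_cost(*shape, bw1=10.0, bw2=1.0), 3))
    print("selected:", best_mesh(8, bw1=10.0, bw2=1.0))
```

Under this toy model the search favours placing more devices on the faster axis. A real system would measure the interconnect rather than assume two bandwidth numbers.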
Sequential optimization for efficient high-quality object proposal generation
We are motivated by the need for a generic object proposal generation algorithm which achieves good balance between object detection recall, proposal localization quality and computational efficiency. We propose a novel object proposal algorithm, BING++, which inherits the virtue of good computational efficiency of BING [1] but significantly improves its proposal localization quality. At a high level, we formulate the problem of object proposal generation from a novel probabilistic perspective, based on which BING++ improves the localization quality by employing edges and segments to estimate object boundaries and update the proposals sequentially. We propose learning the parameters efficiently by searching for approximate solutions in a quantized parameter space for complexity reduction. We demonstrate the generalization of BING++ with the same fixed parameters across different object classes and datasets. Empirically, BING++ runs at half the speed of BING on CPU but improves localization quality by 18.5 and 16.7 percent on the VOC2007 and Microsoft COCO datasets, respectively. Compared with other state-of-the-art approaches, BING++ achieves comparable performance but runs significantly faster
Sequential Optimization for Efficient High-Quality Object Proposal Generation
We are motivated by the need for a generic object proposal generation
algorithm which achieves good balance between object detection recall, proposal
localization quality and computational efficiency. We propose a novel object
proposal algorithm, BING++, which inherits the virtue of good computational
efficiency of BING but significantly improves its proposal localization
quality. At high level we formulate the problem of object proposal generation
from a novel probabilistic perspective, based on which our BING++ manages to
improve the localization quality by employing edges and segments to estimate
object boundaries and update the proposals sequentially. We propose learning
the parameters efficiently by searching for approximate solutions in a
quantized parameter space for complexity reduction. We demonstrate the
generalization of BING++ with the same fixed parameters across different object
classes and datasets. Empirically, BING++ runs at half the speed of BING on
CPU but improves localization quality by 18.5% and 16.7% on the VOC2007 and
Microsoft COCO datasets, respectively. Compared with other state-of-the-art
approaches, BING++ achieves comparable performance but runs significantly
faster. Comment: Accepted by TPAM
Charging and discharging in thermal energy storage unit with fin-stone hybrid structure for enhancing heat transfer of phase change materials
This work proposes a fin-stone hybrid structure integrating fins (popular thermal enhancers) and natural stones (widely used sensible heat storage media) to enhance the heat transfer of phase change materials for on-site thermal energy storage applications, with advantages of low cost, environmental friendliness, and easy accessibility. 3D numerical models of charging and discharging in shell-and-tube heat storage units with various configurations, including fins, the fin-stone hybrid structure, stones, and no heat transfer enhancement, were constructed, and their performance was evaluated and compared. Compared to fins, fin-stone hybrid structures with 20 mm-, 30 mm-, and 40 mm-sized stones shorten the charging time by 67%, 54%, and 56%, and the discharging time by 73%, 60%, and 46%, respectively. Small stones provide better heat transfer enhancement, which is attributed to their small volume, large surface area, and contact with the tube and fins. The advantage of the fin-stone hybrid structure, i.e. the shortening of phase change time, is more significant in charging than in discharging, in comparison with stones, as both heat conduction and natural convection are enhanced. Moreover, the hybrid structure exhibits satisfactory temperature stability with a 48.9 °C temperature change in charging and 37.2 °C in discharging, each lower than that of the fins, which is beneficial for stabilising the heat transfer fluid outlet temperature. The yearly supplied energy of the hybrid structure with 20 mm-sized stones is 121% and 72% more than that of fins and stones, respectively
Tin Nanoparticles Encapsulated Carbon Nanoboxes as High-Performance Anode for Lithium-Ion Batteries
One of the crucial challenges for applying Sn as an anode of lithium-ion batteries (LIBs) is the dramatic volume change during the lithiation/delithiation process, which causes rapid capacity fading and, in turn, deteriorated battery performance. To address this issue, we report the design and fabrication of Sn encapsulated carbon nanoboxes (denoted as Sn@C) with yolk@shell architectures. In this design, the carbon shell facilitates good transport kinetics, whereas the hollow space between the Sn and the carbon shell accommodates the volume variation during repeated charge/discharge processes. Accordingly, this composite electrode exhibits a high reversible capacity of 675 mAh g−1 at a current density of 0.8 A g−1 after 500 cycles and preserves as much as 366 mAh g−1 at a higher current density of 3 A g−1 even after 930 cycles. The enhanced electrochemical performance can be ascribed to the crystal size reduction of the Sn cores and the formation of a polymeric gel-like layer outside the electrode surface after long-term cycling, resulting in improved capacity and enhanced rate performance
BING: Binarized normed gradients for objectness estimation at 300fps
Training a generic objectness measure to produce object proposals has recently become of significant interest. We observe that generic objects with well-defined closed boundaries can be detected by looking at the norm of gradients, with a suitable resizing of their corresponding image windows to a small fixed size. Based on this observation and computational reasons, we propose to resize the window to 8 × 8 and use the norm of the gradients as a simple 64D feature to describe it, for explicitly training a generic objectness measure. We further show how the binarized version of this feature, namely binarized normed gradients (BING), can be used for efficient objectness estimation, which requires only a few atomic operations (e.g., add, bitwise shift, etc.). To improve localization quality of the proposals while maintaining efficiency, we propose a novel fast segmentation method and demonstrate its effectiveness for improving BING’s localization performance, when used in multithresholding straddling expansion (MTSE) postprocessing. On the challenging PASCAL VOC2007 dataset, using 1000 proposals per image and an intersection-over-union threshold of 0.5, our proposal method achieves a 95.6% object detection rate and 78.6% mean average best overlap in less than 0.005 seconds per image
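The 64D feature mentioned in this abstract can be sketched as follows, under stated assumptions. The abstract only says that a window is resized to 8 × 8 and described by the norm of its gradients. The specific choices here are illustrative and not taken from the source: clamped central differences, an L1 norm capped at 255, and a binarization that keeps the top bits of each value. The function names are hypothetical, and the resizing step is assumed to have happened already.

```python
from typing import List, Sequence


def normed_gradient_feature(window: Sequence[Sequence[int]]) -> List[int]:
    """Flatten an already-resized 8 x 8 grayscale window into 64 values.

    Assumption: gradients use central differences with clamped borders,
    and the norm is |gx| + |gy| capped at 255 so it fits in one byte.
    """
    h, w = len(window), len(window[0])
    feature = []
    for y in range(h):
        for x in range(w):
            gx = window[y][min(x + 1, w - 1)] - window[y][max(x - 1, 0)]
            gy = window[min(y + 1, h - 1)][x] - window[max(y - 1, 0)][x]
            feature.append(min(abs(gx) + abs(gy), 255))
    return feature


def keep_top_bits(feature: Sequence[int], n_bits: int = 4) -> List[int]:
    """Approximate each byte by its n_bits most significant bits.

    This stands in for the binarization step; only shifts are needed.
    """
    shift = 8 - n_bits
    return [(v >> shift) << shift for v in feature]


if __name__ == "__main__":
    # Hypothetical window: dark left half, bright right half.
    edge = [[0] * 4 + [200] * 4 for _ in range(8)]
    feat = normed_gradient_feature(edge)
    print(len(feat), feat[:8])
    print(keep_top_bits(feat)[:8])
```

A linear objectness score would then be a dot product between this feature and a learned 64D weight vector. That training step is not shown here.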
Investigation of hearing loss in elderly vertigo and dizziness patients in the past 10 years
Background: Vertigo and hearing loss are both prevalent in the elderly. This study retrospectively analyzed hearing test results from elderly patients experiencing vertigo and dizziness in an ENT outpatient setting over a 10-year period, in order to study the patterns of hearing loss in this patient population. Methods: Outpatient diagnoses, pure tone audiometry, acoustic immittance measurement (tympanogram), and auditory brainstem response (ABR) test results were retrospectively collected and screened for 9,384 patients over 50 years old. The patients' audiograms were divided into 7 subtypes according to a set of fixed criteria. In addition, K-Means clustering analysis was used to classify the audiograms. Results: The Jerger classification of tympanograms in elderly patients with vertigo and dizziness showed the majority falling under type A. The leading audiogram shapes were flat (27.81% in the right ear and 26.89% in the left ear), high-frequency gently sloping (25.97% in the right ear and 27.34% in the left ear), and high-frequency steeply sloping (21.60% in the right ear and 22.53% in the left ear). Meniere's disease (MD; 30.87%), benign recurrent vertigo (BRV; 19.07%), and benign paroxysmal positional vertigo (BPPV; 15.66%) were the most common etiologies among elderly vestibular diseases. We observed statistically significant differences in hearing thresholds among these vestibular diseases (P < 0.001). K-Means clustering analysis suggested that the optimal number of clusters was three, with sample sizes for the three clusters being 2,747, 2,413, and 4,139, respectively. The ANOVA results for each characteristic value showed P < 0.001. Conclusion: Elderly patients often have mild to moderate hearing loss as a concomitant symptom of vertigo. Female patients have better hearing thresholds than males. The dominant audiometric shapes in this patient population were flat, high-frequency gently sloping, and high-frequency steeply sloping according to a set of fixed criteria. This study highlights the need for tailored strategies in managing hearing loss in elderly patients with vertigo and dizziness
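The clustering step named in this abstract can be sketched with a minimal K-Means (Lloyd's algorithm) in plain Python. Everything specific here is an assumption made for illustration: the six audiograms are invented, the five features stand for hearing thresholds in dB HL at five hypothetical test frequencies, and the centroids are seeded with the first k points for determinism. The study's actual features and initialization are not stated in the abstract.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Sequence[float]], k: int,
           iters: int = 20) -> Tuple[List[int], List[List[float]]]:
    """Lloyd's algorithm; centroids start at the first k points (assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k),
                      key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


# Invented thresholds (dB HL) at five frequencies, low to high.
audiograms = [
    [40, 40, 40, 40, 40],  # flat
    [20, 25, 35, 45, 55],  # gently sloping
    [15, 15, 20, 60, 80],  # steeply sloping
    [45, 45, 40, 45, 45],  # flat
    [25, 30, 40, 50, 60],  # gently sloping
    [10, 15, 25, 65, 85],  # steeply sloping
]

if __name__ == "__main__":
    labels, centroids = kmeans(audiograms, k=3)
    print(labels)
    for c in centroids:
        print([round(v, 1) for v in c])
```

On real data the number of clusters would be chosen with a criterion such as the elbow method or silhouette score. The abstract reports three clusters as optimal but does not say which criterion was used.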
Real-space imaging of polar and elastic nano-textures in thin films via inversion of diffraction data
Exploiting the emerging nanoscale periodicities in epitaxial, single-crystal
thin films is an exciting direction in quantum materials science: confinement
and periodic distortions induce novel properties. The structural motifs of
interest are ferroelastic, ferroelectric, multiferroic, and, more recently,
topologically protected magnetization and polarization textures. A critical
step towards heterostructure engineering is understanding their nanoscale
structure, best achieved through real-space imaging. X-ray Bragg coherent
diffractive imaging visualizes sub-picometer crystalline displacements with
tens of nanometers spatial resolution. Yet, it is limited to objects spatially
confined in all three dimensions and requires highly coherent, laser-like
x-rays. Here we lift the confinement restriction by developing real-space
imaging of periodic lattice distortions: we combine an iterative phase
retrieval algorithm with unsupervised machine learning to invert the diffuse
scattering in conventional x-ray reciprocal-space mapping into real-space
images of polar and elastic textures in thin epitaxial films. We first
demonstrate our imaging in PbTiO3/SrTiO3 superlattices to be consistent with
published phase-field model calculations. We then visualize strain-induced
ferroelastic domains emerging during the metal-insulator transition in Ca2RuO4
thin films. Instead of homogeneously transforming into a low-temperature
structure (like in bulk), the strained Mott insulator splits into nanodomains
with alternating lattice constants, as confirmed by cryogenic scanning
transmission electron microscopy. Our study reveals the type, size,
orientation, and crystal displacement field of the nano-textures. The
non-destructive imaging of textures promises to improve models for their
dynamics and enable advances in quantum materials and microelectronics
Construction of a cross-species cell landscape at single-cell level
Individual cells are the basic units of life. Despite extensive efforts to characterize the cellular heterogeneity of different organisms, cross-species comparisons of landscape dynamics have not been achieved. Here, we applied single-cell RNA sequencing (scRNA-seq) to map organism-level cell landscapes at multiple life stages for mice, zebrafish and Drosophila. By integrating the comprehensive dataset of > 2.6 million single cells, we constructed a cross-species cell landscape and identified signatures and common pathways that changed throughout the life span. We identified structural inflammation and mitochondrial dysfunction as the most common hallmarks of organism aging, and found that pharmacological activation of mitochondrial metabolism alleviated aging phenotypes in mice. The cross-species cell landscape and other published datasets are stored in an integrated online portal, Cell Landscape. Our work provides a valuable resource for studying lineage development, maturation and aging