3 research outputs found

    PowerFusion: A Tensor Compiler with Explicit Data Movement Description and Instruction-level Graph IR

    Deep neural networks (DNNs) are critical in many domains. To accelerate DNN computation, tensor compilers have been proposed to generate efficient code for different domain-specific accelerators. Existing tensor compilers mainly focus on optimizing computation efficiency. However, memory access is becoming a key performance bottleneck because the computational performance of accelerators is increasing much faster than memory performance. The lack of a direct description of memory access and data dependence in the intermediate representation (IR) of current tensor compilers makes it very challenging to generate memory-efficient code. In this paper, we propose IntelliGen, a tensor compiler that can generate high-performance code for memory-intensive operators by considering both computation and data movement optimizations. IntelliGen represents a DNN program using GIR, which includes primitives indicating its computation, data movement, and parallel strategies. This information is then composed into an instruction-level dataflow graph, which is used to perform holistic optimizations by searching over different memory access patterns and computation operations and to generate memory-efficient code on different hardware. We evaluate IntelliGen on NVIDIA GPU, AMD GPU, and Cambricon MLU, showing speedups of up to 1.97x, 2.93x, and 16.91x (1.28x, 1.23x, and 2.31x on average), respectively, compared to the most performant current frameworks.
    Comment: 12 pages, 14 figures

    Li-Ca Alloy Composite Anode with Ant-Nest-Like Lithiophilic Channels in Carbon Cloth Enabling High-Performance Li Metal Batteries

    No full text
    Constructing a three-dimensional (3D) multifunctional hosting architecture and then thermally infusing molten Li to produce an advanced Li composite is an effective strategy for stable Li metal anodes. However, pure liquid Li is difficult to spread across the surface of various substrates due to its large surface tension and poor wettability, hindering the production and application of Li composite anodes. Herein, heteroatomic Ca is doped into molten Li to generate a Li-Ca alloy, which greatly regulates the surface tension of the molten alloy and improves its wettability against carbon cloth (CC). Moreover, a secondary network composed of the CaLi₂ intermetallic compound, with interconnected ant-nest-like lithiophilic channels, is formed in situ across the primary scaffold of the CC matrix by infiltrating molten Li-Ca alloy into CC and then cooling (LCAC). This network has a larger, lithiophilic surface that enables uniform Li deposition into the interior space of the hybrid scaffold without Li dendrites. As a result, LCAC exhibits a long-term lifespan of 1100 h at a current density of 5 mA cm⁻² with a fixed areal capacity of 5 mAh cm⁻². Remarkably, full cells paired with a practical-level LiFePO₄ cathode of 2.45 mAh cm⁻² deliver superior performance.

    A comprehensive evaluation of the phenotype-first and data-driven approaches in analyzing facial morphological traits

    No full text
    Summary: The phenotype-first approach (PFA) and data-driven approach (DDA) have both greatly facilitated anthropological studies and the mapping of trait-associated genes. However, the pros and cons of the two approaches are poorly understood. Here, we systematically evaluated the two approaches and analyzed 14,838 facial traits in 2,379 Han Chinese individuals. Interestingly, the PFA explained more facial variation than the DDA in the top 100 and top 1,000 phenotypes, but not in the top 10. Accordingly, the ratio of heterogeneous traits extracted from the PFA was much greater, while more homogeneous traits were found using the DDA for different sex, age, and BMI groups. Notably, our results demonstrated that the sex factor accounted for 30% of phenotypic variation in all traits extracted. Furthermore, we linked DDA phenotypes to PFA phenotypes with explicit biological explanations. These findings provide new insights into the analysis of multidimensional phenotypes and expand the understanding of phenotyping approaches.