329 research outputs found

    Accelerating Generic Graph Neural Networks via Architecture, Compiler, Partition Method Co-Design

    Graph neural networks (GNNs) have shown significant accuracy improvements in a variety of graph learning domains, sparking considerable research interest. To translate these accuracy improvements into practical applications, it is essential to develop high-performance and efficient hardware acceleration for GNN models. However, designing GNN accelerators faces two fundamental challenges: the high bandwidth requirement of GNN models and the diversity of GNN models. Previous works have addressed the first challenge by using more expensive memory interfaces to achieve higher bandwidth. For the second challenge, existing works either support specific GNN models or have generic designs with poor hardware utilization. In this work, we tackle both challenges simultaneously. First, we identify a new type of partition-level operator fusion, which we utilize to internally reduce the high bandwidth requirement of GNNs. Next, we introduce partition-level multi-threading to schedule the concurrent processing of graph partitions, utilizing different hardware resources. To further reduce the extra on-chip memory required by multi-threading, we propose fine-grained graph partitioning to generate denser graph partitions. Importantly, these three methods make no assumptions about the targeted GNN models, addressing the challenge of model variety. We implement these methods in a framework called SwitchBlade, consisting of a compiler, a graph partitioner, and a hardware accelerator. Our evaluation demonstrates that SwitchBlade achieves an average speedup of $1.85\times$ and energy savings of $19.03\times$ compared to the NVIDIA V100 GPU. Additionally, SwitchBlade delivers performance comparable to state-of-the-art specialized accelerators.

    Three-dimensional bio-printing

    Three-dimensional (3D) printing technology has been widely used in various manufacturing operations, including the automotive, defence and space industries. 3D printing has the advantages of personalization, flexibility and high resolution, and is therefore becoming increasingly visible in high-tech fields. Three-dimensional bio-printing technology also holds promise for future use in medical applications. At present, 3D bio-printing is mainly used in medicine for simulating and reconstructing some hard tissues or for preparing drug-delivery systems. The fabrication of 3D structures with living cells and bioactive moieties spatially distributed throughout will become realisable, but the fabrication of complex tissues and organs is still at the exploratory stage. This review summarizes the development of 3D bio-printing and its potential in medical applications, and discusses the current challenges faced by 3D bio-printing.

    Efficient Adaptive Activation Rounding for Post-Training Quantization

    Post-training quantization attracts increasing attention due to its convenience in deploying quantized neural networks. Although rounding-to-nearest remains the prevailing method for DNN quantization, prior research has demonstrated that it is suboptimal for weight quantization and has proposed optimizing weight rounding schemes by minimizing output error rather than the traditional weight quantization error. Our study reveals that similar rounding challenges also extend to activation quantization. Although the idea generalizes easily, the difficulty lies in the dynamic nature of activations: rounding must adapt to varying activations, and the method is subject to runtime overhead. To tackle this, we propose the AQuant quantization framework, which takes a novel perspective and reduces output error by adjusting the rounding schemes of activations. Instead of using the constant rounding border 0.5 of the rounding-to-nearest operation, we make the border a function of the activation value, so that the adaptive border changes how each activation is rounded. To deal with the runtime overhead, we use a coarse-grained version of the border function. Finally, we introduce our framework to optimize the border function. Extensive experiments show that AQuant achieves notable improvements compared to state-of-the-art works and pushes the accuracy of ResNet-18 up to 60.31% under 2-bit weight and activation quantization.
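The idea of replacing the fixed 0.5 rounding border with a value-dependent one can be illustrated with a minimal sketch. This is not AQuant's actual border function, which the abstract does not specify; the names `round_with_border`, `coarse_border`, and `adaptive_round`, and the per-integer-bin lookup table, are hypothetical choices made only for illustration.

```python
import math


def round_with_border(x, border=0.5):
    """Round x up when its fractional part reaches `border`, else round down.

    With border=0.5 this is ordinary rounding-to-nearest (ties rounded up).
    """
    floor = math.floor(x)
    return floor + (1 if x - floor >= border else 0)


def coarse_border(x, table, default=0.5):
    """Hypothetical coarse-grained border function.

    One border per integer bin, looked up by floor(x). In a real framework the
    table entries would be optimized; here they are supplied by hand.
    """
    return table.get(math.floor(x), default)


def adaptive_round(x, table):
    """Round x using the border selected for the bin that contains x."""
    return round_with_border(x, coarse_border(x, table))
```

For example, with the hand-picked table `{2: 0.3}`, `adaptive_round(2.4, {2: 0.3})` rounds up to 3, whereas rounding-to-nearest would give 2. Keeping one border per bin makes the runtime cost a single lookup, which is one plausible reading of the "coarse-grained" border mentioned above.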

    AdaptGear: Accelerating GNN Training via Adaptive Subgraph-Level Kernels on GPUs

    Graph neural networks (GNNs) are powerful tools for exploring and learning from graph structures and features. As such, achieving high-performance execution for GNNs becomes crucially important. Prior works have proposed to exploit the sparsity (i.e., low density) in the input graph to accelerate GNNs, using a full-graph-level or block-level sparsity format. We show that they fail to balance the sparsity benefit and kernel execution efficiency. In this paper, we propose a novel system, referred to as AdaptGear, that addresses the challenge of optimizing GNN performance by leveraging kernels tailored to the density characteristics at the subgraph level. Meanwhile, we also propose a method that dynamically chooses the optimal set of kernels for a given input graph. Our evaluation shows that AdaptGear can achieve a significant performance improvement, up to $6.49\times$ ($1.87\times$ on average), over the state-of-the-art works on two mainstream NVIDIA GPUs across various datasets.

    Haemophilus parasuis Infection Disrupts Adherens Junctions and Initializes EMT Dependent on Canonical Wnt/β-Catenin Signaling Pathway

    In this study, animal experimentation verified that the canonical Wnt/β-catenin signaling pathway was activated under a reduced activity of p-β-catenin (Ser33/37/Thr41) and an increased accumulation of β-catenin in the lungs and kidneys of pigs infected with a highly virulent strain of H. parasuis. In PK-15 and NPTr cells, it was also confirmed that infection with a high-virulence strain of H. parasuis induced cytoplasmic accumulation and nuclear translocation of β-catenin. H. parasuis infection caused a sharp degradation of E-cadherin and an increase of the epithelial cell monolayer permeability, as well as a broken interaction between β-catenin and E-cadherin dependent on Wnt/β-catenin signaling pathway. Moreover, Wnt/β-catenin signaling pathway also contributed to the initiation of epithelial-mesenchymal transition (EMT) during high-virulence strain of H. parasuis infection with expression changes of epithelial/mesenchymal markers, increased migratory capabilities as well as the morphologically spindle-like switch in PK-15 and NPTr cells. Therefore, we originally speculated that H. parasuis infection activates the canonical Wnt/β-catenin signaling pathway leading to a disruption of the epithelial barrier, altering cell structure and increasing cell migration, which results in severe acute systemic infection characterized by fibrinous polyserositis during H. parasuis infection.

    Study of the $B^{-} \to \Lambda_{c}^{+} \bar{\Lambda}_{c}^{-} K^{-}$ decay

    The decay $B^{-} \to \Lambda_{c}^{+} \bar{\Lambda}_{c}^{-} K^{-}$ is studied in proton-proton collisions at a center-of-mass energy of $\sqrt{s}=13$ TeV using data corresponding to an integrated luminosity of 5 $\mathrm{fb}^{-1}$ collected by the LHCb experiment. In the $\Lambda_{c}^{+} K^{-}$ system, the $\Xi_{c}(2930)^{0}$ state observed at the BaBar and Belle experiments is resolved into two narrower states, $\Xi_{c}(2923)^{0}$ and $\Xi_{c}(2939)^{0}$, whose masses and widths are measured to be $m(\Xi_{c}(2923)^{0}) = 2924.5 \pm 0.4 \pm 1.1\,\mathrm{MeV}$, $m(\Xi_{c}(2939)^{0}) = 2938.5 \pm 0.9 \pm 2.3\,\mathrm{MeV}$, $\Gamma(\Xi_{c}(2923)^{0}) = 4.8 \pm 0.9 \pm 1.5\,\mathrm{MeV}$, and $\Gamma(\Xi_{c}(2939)^{0}) = 11.0 \pm 1.9 \pm 7.5\,\mathrm{MeV}$, where the first uncertainties are statistical and the second systematic. The results are consistent with a previous LHCb measurement using a prompt $\Lambda_{c}^{+} K^{-}$ sample. Evidence of a new $\Xi_{c}(2880)^{0}$ state is found with a local significance of $3.8\,\sigma$, whose mass and width are measured to be $2881.8 \pm 3.1 \pm 8.5\,\mathrm{MeV}$ and $12.4 \pm 5.3 \pm 5.8\,\mathrm{MeV}$, respectively. In addition, evidence of a new decay mode $\Xi_{c}(2790)^{0} \to \Lambda_{c}^{+} K^{-}$ is found with a significance of $3.7\,\sigma$. The relative branching fraction of $B^{-} \to \Lambda_{c}^{+} \bar{\Lambda}_{c}^{-} K^{-}$ with respect to the $B^{-} \to D^{+} D^{-} K^{-}$ decay is measured to be $2.36 \pm 0.11 \pm 0.22 \pm 0.25$, where the first uncertainty is statistical, the second systematic and the third originates from the branching fractions of charm hadron decays. Comment: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-028.html (LHCb public pages)

    Measurement of the ratios of branching fractions $\mathcal{R}(D^{*})$ and $\mathcal{R}(D^{0})$

    The ratios of branching fractions $\mathcal{R}(D^{*})\equiv\mathcal{B}(\bar{B}\to D^{*}\tau^{-}\bar{\nu}_{\tau})/\mathcal{B}(\bar{B}\to D^{*}\mu^{-}\bar{\nu}_{\mu})$ and $\mathcal{R}(D^{0})\equiv\mathcal{B}(B^{-}\to D^{0}\tau^{-}\bar{\nu}_{\tau})/\mathcal{B}(B^{-}\to D^{0}\mu^{-}\bar{\nu}_{\mu})$ are measured, assuming isospin symmetry, using a sample of proton-proton collision data corresponding to 3.0 $\mathrm{fb}^{-1}$ of integrated luminosity recorded by the LHCb experiment during 2011 and 2012. The tau lepton is identified in the decay mode $\tau^{-}\to\mu^{-}\nu_{\tau}\bar{\nu}_{\mu}$. The measured values are $\mathcal{R}(D^{*})=0.281\pm0.018\pm0.024$ and $\mathcal{R}(D^{0})=0.441\pm0.060\pm0.066$, where the first uncertainty is statistical and the second is systematic. The correlation between these measurements is $\rho=-0.43$. Results are consistent with the current average of these quantities and are at a combined 1.9 standard deviations from the predictions based on lepton flavor universality in the Standard Model. Comment: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-039.html (LHCb public pages)

    Can Digital Finance Contribute to Agricultural Carbon Reduction? Evidence from China

    Existing research covers digital finance's carbon reduction impacts in industrial and urban settings but leaves a gap in understanding its effects in agriculture. This study addresses this gap by examining the relationship and mechanism between digital finance and agricultural carbon reduction. Two hypotheses are proposed to guide the study: (1) the development of digital finance could reduce agricultural carbon emissions; (2) the development of digital finance could significantly promote agricultural green innovation, empowering agricultural carbon emission reduction. Employing panel data spanning 31 provinces from 2011 to 2020, we empirically investigate the relationship between digital finance development and the reduction of agricultural carbon emissions. The results indicate that digital financial development significantly reduces agricultural carbon emissions. Mechanism analysis further elucidates the pivotal role of digital finance in facilitating agricultural green innovation, resulting in a decline in agricultural carbon emissions. Additionally, heterogeneity analysis reveals that the impact of digital finance on agricultural carbon emission reduction is particularly pronounced in regions with higher income levels and greater educational attainment. The study offers empirical evidence on the nexus between digital finance and agricultural carbon emissions from a developing-country perspective. It could provide innovative ideas and experiences from China for global agricultural low-carbon development practices.

    A Nonintrusive Load Monitoring Method for Microgrid EMS Using Bi-LSTM Algorithm

    Nonintrusive load monitoring in smart microgrids aims to obtain the energy consumption of individual appliances from the aggregated energy data. Energy disaggregation in a microgrid energy management system (EMS) is generally confronted with misidentification of the load type. This paper proposes a classification strategy for a nonintrusive load identification scheme based on the bidirectional long short-term memory (Bi-LSTM) algorithm. The sliding window algorithm is used to extract the features of detected load events and obtain the load features of data samples. To identify these load features accurately, the steady-state information is combined with them as the input of the Bi-LSTM model during training. Building on the long short-term memory (LSTM) network and the recurrent neural network (RNN), Bi-LSTM offers stronger recognition ability. Finally, precision (P), recall (R), accuracy (A), and F1 values are used as the evaluation metrics for nonintrusive load identification. The experimental results show the accuracy of the Bi-LSTM identification method for load start and stop state feature matching; moreover, the method can identify relatively low-power and multistate appliances.
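The four evaluation metrics named above have standard definitions in terms of confusion-matrix counts. The following is a minimal sketch for a single load class, assuming binary counts of true positives, false positives, false negatives, and true negatives; the function name `classification_metrics` and the convention of returning 0.0 for an undefined ratio are illustrative assumptions, not details taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, accuracy, and F1 from confusion counts.

    tp/fp/fn/tn are the true-positive, false-positive, false-negative, and
    true-negative counts for one load class. Undefined ratios (zero
    denominator) are reported as 0.0, which is a convention chosen here.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"P": precision, "R": recall, "A": accuracy, "F1": f1}
```

For a multi-appliance setting, one common choice is to compute these per appliance class and then average them, although the abstract does not say which averaging scheme was used.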