Sharpness-Aware Minimization with Dynamic Reweighting
Deep neural networks are often overparameterized and may not easily achieve
model generalization. Adversarial training has shown effectiveness in improving
generalization by regularizing the change of loss on top of adversarially
chosen perturbations. The recently proposed sharpness-aware minimization (SAM)
algorithm conducts adversarial weight perturbation, encouraging the model to
converge to a flat minimum. SAM finds a common adversarial weight perturbation
per-batch. Although per-instance adversarial weight perturbations are stronger
adversaries and they can potentially lead to better generalization performance,
their computational cost is very high and thus it is impossible to use
per-instance perturbations efficiently in SAM. In this paper, we tackle this
efficiency bottleneck and propose sharpness-aware minimization with dynamic
reweighting (δ-SAM). Our theoretical analysis suggests that it is
possible to approach the stronger, per-instance adversarial weight
perturbations using reweighted per-batch weight perturbations. δ-SAM
dynamically reweights the perturbation within each batch according to
theoretically principled weighting factors, serving as a good approximation to
per-instance perturbation. Experiments on various natural language
understanding tasks demonstrate the effectiveness of δ-SAM.
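For orientation, the per-batch perturbation that the abstract attributes to SAM can be sketched as below. This is a minimal illustration of the standard SAM update (one shared perturbation per batch), not of the dynamic reweighting, whose weighting factors the abstract does not give. The names `sam_step`, `toy_loss`, `toy_grad`, and `CURVATURE`, the toy quadratic loss, and the values of `lr` and `rho` are all assumptions made for illustration.

```python
import math

def sam_step(w, grad_fn, lr=0.05, rho=0.05):
    """One SAM update on parameter list w, using a single per-batch perturbation."""
    g = grad_fn(w)                               # gradient at the current weights
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:                              # stationary point: nothing to perturb
        return list(w)
    # Adversarial weight perturbation: a step of length rho along the gradient.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    g_adv = grad_fn(w_adv)                       # gradient at the perturbed weights
    # Descend from the ORIGINAL weights using the perturbed-point gradient.
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]

# Hypothetical stand-in for a batch loss: L(w) = 0.5 * sum(c_i * w_i^2)
CURVATURE = [1.0, 10.0]

def toy_loss(w):
    return 0.5 * sum(c * wi * wi for c, wi in zip(CURVATURE, w))

def toy_grad(w):
    return [c * wi for c, wi in zip(CURVATURE, w)]

w0 = [1.0, 1.0]
w1 = sam_step(w0, toy_grad)
```

A per-instance variant would repeat the perturbation step once per example in the batch, which is the cost the abstract describes as prohibitive.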
Experimental and physical model of the melting zone in the interface of the explosive cladding bar
A local melting zone at sections of the cladding interface is a distinctive phenomenon of the explosive cladding technique. The thickness and morphology of the melting zone in the Ti/NiCr explosive cladding bar are investigated by means of optical microscopy. Results show that the distribution of the melting zone in the interface of the Ti/NiCr explosive cladding bar is uniform and axisymmetric, and that the boundaries of the melting zone are circular arcs whose centers point toward the center of the NiCr bar. Bamboo-shaped cracks form in the melting zone. The thickness of the melting zone decreases as the stand-off distance and the thickness of the explosive are reduced. A physical model of the melting zone in the interface of the explosive cladding bar is proposed.
An analytical modeling for high-velocity impacts on woven Kevlar composite laminates
In this paper, an analytical model based on energy balance is built to study high-velocity impacts on woven Kevlar composite laminates by a cylindrical projectile. During impact, the laminate absorbs energy through four mechanisms: laminate crushing, linear momentum transfer, tensile fiber failure, and shear plugging. The model is then simplified to obtain the residual velocity and ballistic limit. The analytical results are validated against experimental results, and a perturbation analysis is carried out to identify the sources of error.
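The energy-balance idea behind the residual velocity and ballistic limit can be illustrated with a deliberately simplified sketch. It assumes a constant projectile mass and treats the total energy absorbed by the laminate as a single given number, independent of impact velocity. The paper's actual model, which splits that energy across four mechanisms, is not reproduced here, and the function names and numeric values are hypothetical.

```python
import math

def residual_velocity(mass_kg, v_impact, e_absorbed_j):
    """Residual velocity from 0.5*m*v_i^2 = E_absorbed + 0.5*m*v_r^2.

    Returns 0.0 when the projectile does not have enough kinetic energy
    to perforate the laminate.
    """
    v_sq = v_impact ** 2 - 2.0 * e_absorbed_j / mass_kg
    return math.sqrt(v_sq) if v_sq > 0.0 else 0.0

def ballistic_limit(mass_kg, e_absorbed_j):
    """Impact velocity at which the residual velocity just reaches zero."""
    return math.sqrt(2.0 * e_absorbed_j / mass_kg)

# Hypothetical numbers: a 10 g projectile and 200 J absorbed by the laminate.
v_r = residual_velocity(0.01, 300.0, 200.0)
v_bl = ballistic_limit(0.01, 200.0)
```

In a fuller model the absorbed energy would itself depend on the impact velocity, so the ballistic limit would have to be found by solving the balance equation rather than from a closed form.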
Exome sequencing revealed PDE11A as a novel candidate gene for early-onset Alzheimer's disease
To identify novel risk genes and better understand the molecular pathway underlying Alzheimer's disease (AD), whole-exome sequencing was performed in 215 early-onset AD (EOAD) patients and 255 unrelated healthy controls of Han Chinese ethnicity. Subsequent validation, computational annotation and in vitro functional studies were performed to evaluate the role of candidate variants in EOAD. We identified two rare missense variants in the phosphodiesterase 11A (PDE11A) gene in individuals with EOAD. Both variants are located in evolutionarily highly conserved amino acids, are predicted to alter the protein conformation and are classified as pathogenic. Furthermore, we found significantly decreased protein levels of PDE11A in brain samples of AD patients. Expression of PDE11A variants and knockdown experiments with specific short hairpin RNA (shRNA) for PDE11A both resulted in an increase of AD-associated Tau hyperphosphorylation at multiple epitopes in vitro. PDE11A variants or PDE11A shRNA also caused increased cyclic adenosine monophosphate (cAMP) levels, protein kinase A (PKA) activation and cAMP response element-binding protein phosphorylation. In addition, pretreatment with a PKA inhibitor (H89) suppressed PDE11A variant-induced Tau phosphorylation. This study offers insight into the involvement of Tau phosphorylation via the cAMP/PKA pathway in EOAD pathogenesis and provides a potential new target for intervention.
ConvKyber: Unleashing the Power of AI Accelerators for Faster Kyber with Novel Iteration-based Approaches
The remarkable performance capabilities of AI accelerators offer promising opportunities for accelerating cryptographic algorithms, particularly in the context of lattice-based cryptography. However, current approaches to leveraging AI accelerators often remain at a rudimentary level of implementation, overlooking the intricate internal mechanisms of these devices. Consequently, a significant share of computational resources is underutilized.
In this paper, we present a comprehensive exploration of NVIDIA Tensor Cores and introduce a novel framework tailored specifically for Kyber. Firstly, we propose two innovative approaches that efficiently break down Kyber's NTT into iterative matrix multiplications, resulting in approximately a 75% reduction in costs compared to the state-of-the-art scanning-based methods. Secondly, by reversing the internal mechanisms, we precisely manipulate the internal resources of Tensor Cores using assembly-level code instead of inefficient standard interfaces, eliminating memory accesses and redundant function calls. Finally, building upon our highly optimized NTT, we provide a complete implementation for all parameter sets of Kyber. Our implementation surpasses the state-of-the-art Tensor Core based work, achieving remarkable speed-ups of 1.93x, 1.65x, 1.22x and 3.55x for polyvec_ntt, KeyGen, Enc and Dec in Kyber-1024, respectively. Even when considering execution latency, our throughput-oriented full Kyber implementation maintains an acceptable execution latency. For instance, the execution latency ranges from 1.02 to 5.68 milliseconds for Kyber-1024 on R3080 when achieving the peak throughput.
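The general idea of writing a number-theoretic transform as a matrix product, which is what makes it a fit for matrix-multiply hardware, can be sketched as follows. This is a plain cyclic NTT expressed as one dense matrix-vector product over Kyber's modulus. It is neither Kyber's actual incomplete negacyclic NTT nor the iterative decomposition proposed in the paper. The helper names are hypothetical; the constants q = 3329 and the 256th root of unity 17 are taken from the Kyber specification.

```python
Q = 3329      # Kyber modulus
ZETA = 17     # primitive 256th root of unity mod Q (per the Kyber specification)

def ntt_matrix(n, root, q):
    """Vandermonde-style matrix W with W[i][j] = root^(i*j) mod q."""
    return [[pow(root, i * j, q) for j in range(n)] for i in range(n)]

def matvec_mod(mat, vec, q):
    """Matrix-vector product with every output entry reduced mod q."""
    return [sum(a * b for a, b in zip(row, vec)) % q for row in mat]

def cyclic_ntt(vec):
    """Length-n cyclic NTT as a single matrix product (n must divide 256)."""
    n = len(vec)
    root = pow(ZETA, 256 // n, Q)          # primitive n-th root of unity
    return matvec_mod(ntt_matrix(n, root, Q), vec, Q)

def cyclic_intt(vec_hat):
    """Inverse transform: multiply by the inverse-root matrix, scale by n^-1."""
    n = len(vec_hat)
    inv_root = pow(pow(ZETA, 256 // n, Q), -1, Q)   # modular inverse (Python 3.8+)
    n_inv = pow(n, -1, Q)
    out = matvec_mod(ntt_matrix(n, inv_root, Q), vec_hat, Q)
    return [(x * n_inv) % Q for x in out]

coeffs = [1, 2, 3, 4, 5, 6, 7, 8]
coeffs_hat = cyclic_ntt(coeffs)
recovered = cyclic_intt(coeffs_hat)
```

A dense n-by-n product costs O(n^2) multiplications, so a practical design would split the transform into smaller matrix products sized for the hardware tiles; how the paper does this is not stated in the abstract.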
Dual lactate clearance in the viability assessment of livers donated after circulatory death with ex situ normothermic machine perfusion
Perfusate lactate clearance (LC) is considered one of the useful indicators of liver viability assessment during normothermic machine perfusion (NMP); however, the applicable scope and potential mechanisms of LC remain poorly defined in the setting of liver donation after circulatory death.
Methods: The ex situ NMP of end-ischemic human livers was performed using the OrganOx Metra device. We further studied the extracellular signal-regulated kinases (phospho-extracellular signal-regulated kinase1/2 [pERK1/2]) pathway and several clinical parameters of these livers with successful LC (sLC, n = 5) compared with non-sLC (nLC, n = 5) in the perfusate (<2.2 mmol/L at 2 h, n = 5, rapid retrieval without normothermic regional perfusion).
Results: We found the pERK1/2 level was substantially higher in the nLC livers than in the sLC livers (n = 5) at 2- and 6-h NMP (
Conclusions: The dual LC in perfusate and bile can be helpful in evaluating the hypoxic injury of hepatocytes and cholangiocytes during the NMP of donation after circulatory death in liver donors.