63 research outputs found

    Utilizing the Double-Precision Floating-Point Computing Power of GPUs for RSA Acceleration

    Implementations of asymmetric cryptographic algorithms (e.g., RSA and Elliptic Curve Cryptography) on Graphics Processing Units (GPUs) have been researched for over a decade. The basic idea of most previous contributions is to exploit the highly parallel GPU architecture by porting integer-based algorithms from general-purpose CPUs to GPUs to achieve high performance. However, the cryptographic computing potential of GPUs, especially that of their more powerful floating-point instructions, has not been comprehensively investigated. In this paper, we fully exploit the floating-point computing power of GPUs through several designs, including a floating-point-based Montgomery multiplication/exponentiation algorithm and a Chinese Remainder Theorem (CRT) implementation on the GPU. For practical use of the proposed algorithm, we introduce a new method for converting the input/output between octet strings and floating-point numbers, which fully utilizes the GPU and further improves overall performance by about 5%. The performance of RSA-2048/3072/4096 decryption on an NVIDIA GeForce GTX TITAN reaches 42,211/12,151/5,790 operations per second, respectively, which is 13 times the performance of the previous fastest floating-point-based implementation (published in Eurocrypt 2009). The RSA-4096 decryption outperforms the fastest existing integer-based result by 23%.
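The abstract above names two building blocks, Montgomery exponentiation and CRT-based RSA decryption. The following is a minimal integer-only sketch of how they fit together, assuming toy primes and Python big integers; it is not the paper's floating-point GPU design, and the function names (`montgomery_setup`, `redc`, `mont_pow`, `rsa_crt_decrypt`) are hypothetical, chosen here for illustration.

```python
def montgomery_setup(n, bits):
    """Precompute constants for an odd modulus n, with R = 2**bits > n."""
    r = 1 << bits
    n_prime = (-pow(n, -1, r)) % r   # n * n_prime == -1 (mod R); needs Python 3.8+
    r2 = (r * r) % n                 # R^2 mod n, used to enter Montgomery form
    return n_prime, r2


def redc(t, n, bits, n_prime):
    """Montgomery reduction: return t * R^-1 mod n for 0 <= t < n * R."""
    mask = (1 << bits) - 1
    m = ((t & mask) * n_prime) & mask
    u = (t + m * n) >> bits          # exact division by R
    return u - n if u >= n else u


def mont_pow(base, exp, n):
    """Left-to-right square-and-multiply using Montgomery multiplication."""
    bits = n.bit_length()
    n_prime, r2 = montgomery_setup(n, bits)
    x = redc((base % n) * r2, n, bits, n_prime)   # base in Montgomery form
    acc = redc(r2, n, bits, n_prime)              # 1 in Montgomery form (R mod n)
    for bit in bin(exp)[2:]:
        acc = redc(acc * acc, n, bits, n_prime)
        if bit == "1":
            acc = redc(acc * x, n, bits, n_prime)
    return redc(acc, n, bits, n_prime)            # leave Montgomery form


def rsa_crt_decrypt(c, p, q, d):
    """RSA decryption via CRT: two half-size exponentiations, then Garner recombination."""
    dp, dq = d % (p - 1), d % (q - 1)
    q_inv = pow(q, -1, p)
    m1 = mont_pow(c % p, dp, p)
    m2 = mont_pow(c % q, dq, q)
    h = (q_inv * (m1 - m2)) % p
    return m2 + h * q


if __name__ == "__main__":
    # Toy parameters only (assumption): p = 61, q = 53, e = 17, d = 2753.
    p, q, e, d = 61, 53, 17, 2753
    ciphertext = pow(65, e, p * q)
    print(rsa_crt_decrypt(ciphertext, p, q, d))
```

The CRT split is what makes the two exponentiations independent of each other, which is the property that parallel hardware can exploit; the sketch makes no claim about how the paper maps this onto floating-point units.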

    Adaptive Tuning of Robotic Polishing Skills based on Force Feedback Model

    Acquiring human skills offers an efficient approach to tackling complex task planning challenges. When executing a learned skill model for a continuous contact task, such as robot polishing in an uncertain environment, the robot needs to adaptively modify the skill model to suit the environment and perform the desired task. The environmental perturbation of the polishing task is mainly reflected in the variation of contact force. Therefore, adjusting the task skill model by providing feedback on the contact force deviation is an effective way to meet the task requirements. In this study, a phase-modulated diagonal recurrent neural network (PMDRNN) is proposed for force feedback model learning in the robotic polishing task. The contact between the tool and the workpiece in the polishing task can be considered a dynamic system. In comparison to the existing feedforward phase-modulated neural network (PMNN), PMDRNN combines the diagonal recurrent network structure with the phase-modulated neural network layer to improve the learning performance of the feedback model for dynamic systems. Specifically, data from real-world robot polishing experiments are used to learn the feedback model. PMDRNN demonstrates a significant reduction in the training error of the feedback model when compared to PMNN. Building upon this, the combination of PMDRNN and dynamic movement primitives (DMPs) can be used for real-time adjustment of skills for polishing tasks and effectively improves the robustness of the task skill model. Finally, real-world robotic polishing experiments are conducted to demonstrate the effectiveness of the approach. Comment: This paper has been accepted by The 2023 IEEE International Conference on Robotics and Biomimetics (IEEE ROBIO 2023).

    Spatiotemporal patterns and spatial risk factors for visceral leishmaniasis from 2007 to 2017 in Western and Central China: a modelling analysis

    Visceral leishmaniasis (VL) is a neglected disease caused by trypanosomatid protozoa in the genus Leishmania, which is transmitted by phlebotomine sandflies. Although this vector-borne disease was eliminated in several regions of China during the last century, reported human VL cases have rebounded in Western and Central China in recent decades. However, understanding of the spatial epidemiology of the disease remains vague, as do the spatial risk factors driving the spatial heterogeneity of VL. In this study, we analyzed the spatiotemporal patterns of annual human VL cases in Western and Central China from 2007 to 2017. Based on the related spatial maps, the boosted regression tree (BRT) model was adopted to explore the relationships between VL and spatial correlates and to predict both the existing and potential infection risk zones of VL in Western and Central China. The mined links reveal that elevation, minimum temperature, relative humidity, and annual accumulated precipitation make great contributions to the spatial heterogeneity of VL. The maps show that Xinjiang Uygur Autonomous Region, Gansu, western Inner Mongolia Autonomous Region, and Sichuan are predicted to fall in the highest infection risk zones of VL. Approximately 61.60 million residents lived in the high-risk regions of VL in Western and Central China. Our results provide a better understanding of how spatial risk factors drive VL spread and identify the potential endemic risk regions of VL, thereby enhancing the biosurveillance capacity of public health authorities.

    ConvKyber: Unleashing the Power of AI Accelerators for Faster Kyber with Novel Iteration-based Approaches

    The remarkable performance capabilities of AI accelerators offer promising opportunities for accelerating cryptographic algorithms, particularly in the context of lattice-based cryptography. However, current approaches to leveraging AI accelerators often remain at a rudimentary level of implementation, overlooking the intricate internal mechanisms of these devices. Consequently, a significant share of computational resources is underutilized. In this paper, we present a comprehensive exploration of NVIDIA Tensor Cores and introduce a novel framework tailored specifically for Kyber. Firstly, we propose two innovative approaches that efficiently break down Kyber's NTT into iterative matrix multiplications, resulting in approximately a 75% reduction in costs compared to the state-of-the-art scanning-based methods. Secondly, by reverse engineering the internal mechanisms, we precisely manipulate the internal resources of Tensor Cores using assembly-level code instead of inefficient standard interfaces, eliminating memory accesses and redundant function calls. Finally, building upon our highly optimized NTT, we provide a complete implementation for all parameter sets of Kyber. Our implementation surpasses the state-of-the-art Tensor Core based work, achieving remarkable speed-ups of 1.93x, 1.65x, 1.22x and 3.55x for polyvec_ntt, KeyGen, Enc and Dec in Kyber-1024, respectively. Even when considering execution latency, our throughput-oriented full Kyber implementation maintains an acceptable execution latency. For instance, the execution latency ranges from 1.02 to 5.68 milliseconds for Kyber-1024 on R3080 when achieving the peak throughput.

    A Novel High-performance Implementation of CRYSTALS-Kyber with AI Accelerator

    Public-key cryptography, including conventional cryptosystems and post-quantum cryptography, involves computation-intensive workloads. Noting the extraordinary computing power of AI accelerators, in this paper we further explore the feasibility of introducing AI accelerators into high-performance cryptographic computing. Since AI accelerators are dedicated to machine learning or neural networks, the biggest challenge is how to transform cryptographic workloads into their operations while ensuring the correctness of the results and bringing convincing performance gains. After investigating and analysing the workload of the NVIDIA AI accelerator, Tensor Core, we choose to utilize it to accelerate polynomial multiplication, usually the most time-consuming part of lattice-based cryptography. We take measures to accommodate the matrix-multiply-and-add mode of Tensor Core and make a trade-off between precision and performance, to leverage it as a high-performance NTT box performing NTT/INTT through CUDA C++ WMMA APIs. Meanwhile, we take CRYSTALS-Kyber, the candidate to be standardized by NIST, as a case study on RTX 3080 with the Ampere Tensor Core. The empirical results show that the customized NTT of a polynomial vector (n=256, k=4) with our NTT box obtains a speedup of around 6.47x over the state-of-the-art implementation on the same GPU platform. Compared with the AVX2 implementation submitted to NIST, our Kyber-1024 can achieve a speedup of 26x, 36x, and 35x for each phase.
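The idea of fitting an NTT to a matrix-multiply unit can be illustrated by writing the transform as a matrix-vector product. The sketch below is a toy under stated assumptions: it uses a full cyclic NTT with modulus 17, length 8, and root of unity 2, rather than Kyber's q = 3329, n = 256 negacyclic transform, and it runs in plain Python rather than on Tensor Cores. All names (`ntt_matrix`, `mat_vec`, `cyclic_mul`, and so on) are hypothetical.

```python
Q = 17      # toy prime modulus (assumption; Kyber uses q = 3329)
N = 8       # toy transform length (assumption; Kyber uses n = 256)
OMEGA = 2   # primitive 8th root of unity mod 17: 2**4 == -1 and 2**8 == 1 (mod 17)


def ntt_matrix(omega):
    """Build the N x N matrix W with W[i][j] = omega**(i*j) mod Q."""
    return [[pow(omega, i * j, Q) for j in range(N)] for i in range(N)]


def mat_vec(mat, vec):
    """Matrix-vector product mod Q: the operation a matrix-multiply unit would perform."""
    return [sum(m * v for m, v in zip(row, vec)) % Q for row in mat]


def ntt(vec):
    """Forward transform as a single matrix-vector product."""
    return mat_vec(ntt_matrix(OMEGA), vec)


def intt(vec):
    """Inverse transform: use omega^-1 and scale by N^-1 mod Q."""
    n_inv = pow(N, -1, Q)
    omega_inv = pow(OMEGA, -1, Q)
    return [(n_inv * x) % Q for x in mat_vec(ntt_matrix(omega_inv), vec)]


def cyclic_mul(a, b):
    """Multiply polynomials modulo x**N - 1 via pointwise products in the NTT domain."""
    fa, fb = ntt(a), ntt(b)
    return intt([(x * y) % Q for x, y in zip(fa, fb)])


def schoolbook_cyclic_mul(a, b):
    """Reference O(N^2) cyclic convolution used to check the NTT route."""
    out = [0] * N
    for i in range(N):
        for j in range(N):
            out[(i + j) % N] = (out[(i + j) % N] + a[i] * b[j]) % Q
    return out


if __name__ == "__main__":
    a = [1, 2, 3, 4, 0, 0, 0, 0]
    b = [5, 6, 7, 0, 0, 0, 0, 1]
    print(cyclic_mul(a, b) == schoolbook_cyclic_mul(a, b))
```

A dense N x N matrix costs more arithmetic than a butterfly-based NTT, so this formulation only pays off when the hardware performs matrix products far faster than scalar operations, which is the trade-off the abstract alludes to.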

    Germline Predisposition and Copy Number Alteration in Pre-stage Lung Adenocarcinomas Presenting as Ground-Glass Nodules

    Objective: Synchronous multiple ground-glass nodules (SM-GGNs) are a distinct entity of lung cancer that has been emerging increasingly in recent years in China. The molecular mechanisms of oncogenesis in SM-GGNs remain elusive. Methods: We investigated single nucleotide variations (SNV), insertions and deletions (INDEL), somatic copy number variations (CNV), and germline mutations of 69 SM-GGN samples collected from 31 patients, using target sequencing (TRS) and whole exome sequencing (WES). Results: In the entire cohort, many known driver mutations were found, including EGFR (21.7%), BRAF (14.5%), and KRAS (6%). However, only one out of the 31 patients had the same somatic missense or truncated events within SM-GGNs, indicating independent origins for almost all of these SM-GGNs. Many germline mutations with a low frequency in the Chinese population, and genes harboring both germline and somatic variations, were discovered in these pre-stage GGNs. These GGNs also bore large segments of copy number gains and/or losses. The CNV segment number tended to be positively correlated with the germline mutations (r = 0.57). The CNV sizes were correlated with the somatic mutations (r = 0.55). A moderate correlation (r = 0.54) was also shown between the somatic and germline mutations. Conclusion: Our data suggest that the precancerous unstable CNVs with potentially predisposing genetic backgrounds may foster the onset of driver mutations and the development of independent SM-GGNs during the local stimulation of mutagens.

    Steady-State Simulation Techniques for Process Systems

    Impact of the Kunming–Bangkok Highway on Land Use Changes along the Route between Laos and Thailand

    Road construction fragments the landscape, reduces connectivity, and drives land use changes. To our knowledge, little is known about the scope and intensity of the effects of cross-border roads on changes in land use. Here, using land use data products provided by the US Agency for International Development’s SERVIR Mekong project, we apply GIS-based spatial analysis to quantitatively analyze and compare the effects of the cross-border road on land use changes within a 30 km buffer area along the Kunming–Bangkok Highway between Laos and Thailand. The results show the following: The greater the distance from the highway, the smaller the overall changes in land use within the buffer zone. A comparison of the situation before and after the road was opened in 2013 revealed significant differences in the most influential land use type, agricultural expansion, i.e., from 47.07% to 52.07% (the buffer zone was 1 km). In particular, 57.32% (1381.93 ha) and 40.08% (966.46 ha) of the land occupied by forests had been converted into land for plantation and agriculture, respectively, from 2013 to 2018. The scope of the impact of the operational route on the dynamics of land use was inconsistent. The largest impact before the road became operational was within 4 km of the buffer zone (0.26 to 0.24). Once the road had been opened, the range of its impact was beyond 10 km (0.63 to 0.57). The work here can provide a scientific basis for regional transportation planning and the sustainable use of land resources.