
    HEAD: an FHE-based Privacy-preserving Cloud Computing Protocol with Compact Storage and Efficient Computation

    Fully homomorphic encryption (FHE) provides a natural solution for privacy-preserving cloud computing, but a straightforward FHE protocol may suffer from high computational overhead and a large ciphertext expansion rate, especially for computation-intensive tasks over large data; these are the main obstacles to practical privacy-preserving cloud computing. In this paper, we present HEAD, a generic privacy-preserving cloud computing protocol that can be built on most mainstream FHE schemes (typically BGV- or GSW-style) with more compact storage and lower computational cost than the straightforward FHE counterpart. In particular, our protocol enjoys a ciphertext/plaintext expansion rate of 1 (i.e., no expansion) at the cloud computing server, instead of a factor of hundreds of thousands. This is achieved by means of "pseudorandomly masked" ciphertexts and efficient transformations of them into FHE ciphertexts to facilitate privacy-preserving cloud computing. Depending on the underlying FHE scheme in use, our HEAD protocol can be instantiated with three masking techniques, namely modulo-subtraction masking, modulo-division masking, and XOR masking, to support decimal integer, real, or binary messages. Thanks to these masking techniques, various homomorphic computation tasks are made more efficient and less prone to noise accumulation. Furthermore, our multi-input masking and unmasking operations are more flexible than FHE SIMD batching, supporting an on-demand configuration of FHE during each cloud computing request. We evaluate the performance of HEAD on the BFV, BGV, CKKS, and FHEW schemes using the PALISADE and SEAL libraries, which confirms the theoretical analysis of the storage savings and the reductions in computational complexity and noise accumulation. For example, in the BFV computation optimization, the overhead of summing or multiplying eight ciphertexts is reduced from 336.3 ms to 6.3 ms, or from 1219.4 ms to 9.5 ms, respectively. We also embed HEAD into a mainstream database, PostgreSQL, in a client-server cloud storage and computing style. Compared with a straightforward FHE protocol, our experiments show that HEAD incurs no ciphertext expansion and achieves at least an order of magnitude saving in server-side computing time for various tasks (on a hundred ciphertexts), at a reasonable price in client pre-processing time and communication. Our storage advantage not only gets around the database storage limitation but also reduces the I/O overhead.
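
    The masking idea can be made concrete with a small sketch. The following Python code is an illustration only, not the HEAD construction: the plaintext modulus, the choice of HMAC-SHA256 as the PRF, and all names are assumptions. It shows modulo-subtraction masking, where the stored value is exactly the size of the plaintext (expansion rate 1), and only a party holding the PRF key (or, as in HEAD, an FHE encryption of the mask) can remove the mask.
```python
# Minimal sketch of modulo-subtraction masking (illustrative only; the HEAD
# protocol's actual PRF, parameters, and FHE-based unmasking differ).
import hmac, hashlib, secrets

T = 2**32                      # assumed plaintext modulus for the sketch
key = secrets.token_bytes(32)  # client-held secret key

def prf(k: bytes, index: int) -> int:
    """Pseudorandom mask for a given slot index (HMAC-SHA256 as the PRF here)."""
    digest = hmac.new(k, index.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % T

def mask(m: int, index: int) -> int:
    """Client: store (m - r) mod T at the server -- same size as the plaintext."""
    return (m - prf(key, index)) % T

def unmask(c: int, index: int) -> int:
    """Key holder: recover m = (c + r) mod T.  In HEAD the server instead removes
    the mask homomorphically, using an FHE encryption of r sent per request."""
    return (c + prf(key, index)) % T

masked = mask(123456789, index=0)      # stored server-side, no expansion
assert unmask(masked, index=0) == 123456789
```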

    Elliptic Curve Cryptography on Modern Processor Architectures

    Elliptic Curve Cryptography (ECC) has been adopted by the US National Security Agency (NSA) in Suite B as part of its Cryptographic Modernisation Program. Additionally, it has been favoured by a host of mobile devices due to its superior performance characteristics. ECC is also the building block on which the exciting field of pairing-based and identity-based cryptography rests. This widespread use means that there is potentially a lot to be gained by researching efficient implementations on modern processors such as IBM's Cell Broadband Engine and Philips' next-generation smart card cores. ECC operations can be thought of as a pyramid of building blocks, from instructions on a core, through modular operations on a finite field, point addition and doubling, and elliptic curve scalar multiplication, up to application-level protocols. In this thesis we examine an implementation of these components for ECC, focusing on a range of optimisation techniques for the Cell's SPU and the MIPS smart card. We show significant performance improvements that can be achieved through the adoption of these techniques.
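
    As a toy illustration of the lower layers of this pyramid (field arithmetic, point addition and doubling, and scalar multiplication), the sketch below implements textbook double-and-add on a tiny short-Weierstrass curve. The curve parameters are arbitrary toy values, not a standardised curve, and the code reflects none of the thesis's Cell SPU or MIPS optimisations.
```python
# Toy short-Weierstrass curve y^2 = x^3 + a*x + b over GF(p); parameters are
# illustrative only (NOT a standardised curve, no side-channel protections).
p, a, b = 97, 2, 3
G = (3, 6)           # on the curve: 6^2 = 36 and 3^3 + 2*3 + 3 = 36 (mod 97)
O = None             # point at infinity

def add(P, Q):
    """Point addition (also handles doubling and the identity)."""
    if P is O: return Q
    if Q is O: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return O                                           # P + (-P) = O
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p   # tangent slope
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p          # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def scalar_mul(k, P):
    """Left-to-right double-and-add: the scalar-multiplication layer."""
    R = O
    for bit in bin(k)[2:]:
        R = add(R, R)                      # double
        if bit == "1":
            R = add(R, P)                  # add
    return R

Q = scalar_mul(20, G)
assert Q is O or (Q[1]**2 - (Q[0]**3 + a*Q[0] + b)) % p == 0   # result stays on the curve
```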

    Cryptographic extensions for custom and GPU-like architectures

    This PhD thesis deals with the exploration of hardware architectures dedicated to cryptographic applications, in particular solutions based on reconfigurable hardware such as FPGAs. The thesis presents the results achieved in accelerating operations essential to homomorphic cryptography, specifically the integer multiplication of very long operands, based on the Schönhage-Strassen algorithm and implemented on ad-hoc FPGA hardware. The thesis then reports the exploration of novel approaches to cryptographic acceleration based on dedicated, software-programmable vector architectures, with corresponding implementations of symmetric and public-key operations (namely, AES encryption and Montgomery multiplication) achieving improved performance.
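
    As a concrete example of one of the accelerated public-key primitives, the sketch below shows textbook Montgomery multiplication (REDC) in Python. The modulus and variable names are illustrative assumptions; the thesis's vector-architecture implementation is naturally very different.
```python
# Textbook Montgomery multiplication (REDC) sketch; the modulus is a toy
# choice for illustration, unrelated to the thesis's vector hardware.
n = 2**127 - 1                  # any odd modulus works; a Mersenne prime here
k = n.bit_length()
R = 1 << k                      # Montgomery radix: a power of two greater than n
n_prime = -pow(n, -1, R) % R    # n' with n * n' ≡ -1 (mod R)

def redc(t: int) -> int:
    """Montgomery reduction: returns t * R^{-1} mod n, for 0 <= t < n*R."""
    m = (t * n_prime) & (R - 1)        # multiplication mod R is just a mask
    u = (t + m * n) >> k               # exact division by R (low k bits are zero)
    return u - n if u >= n else u

def mont_mul(a: int, b: int) -> int:
    """Compute a * b mod n via Montgomery form (to-form, multiply, from-form)."""
    a_bar = (a * R) % n
    b_bar = (b * R) % n
    return redc(redc(a_bar * b_bar))   # second REDC leaves Montgomery form

a, b = 123456789123456789, 987654321987654321
assert mont_mul(a, b) == (a * b) % n
```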

    Cryptographic algorithm acceleration using CUDA enabled GPUs in typical system configurations

    Encrypting data is becoming more and more necessary. As the size of datasets continues to grow, the speed of encryption must increase to keep up or it will become a bottleneck. CUDA GPUs have been shown to offer performance improvements over conventional CPUs for some data-intensive problems. This thesis evaluates the applicability of CUDA GPUs to accelerating the execution of cryptographic algorithms, which are increasingly used for growing amounts of data and thus will require significantly faster encryption and hashing throughput. Specifically, the CUDA environment was used to implement and experiment with three distinct cryptographic algorithms -- AES, SHA-2, and Keccak -- in order to show the applicability to various classes of cryptographic algorithm. They were implemented in a system that emulates the conditions present in a real-world environment, and the effects of offloading these tasks from the CPU to the GPU were assessed. Speedups of up to 2.6x relative to the CPU were seen for single-kernel AES, but SHA-2 and Keccak did not perform as well on the GPU as on the CPU. Multi-kernel AES saw speedups over single-kernel AES of up to 1.4x, 1.65x, and 1.8x for two, three, and four kernels, respectively. This translates to speedups between 3.6x and 4.7x over CPU implementations of AES. Introducing a CPU load had a minimal effect on throughput, whereas a GPU load was seen to decrease throughput by as much as 4%. Overall, CUDA GPUs appear to have potential for improving encryption throughput if a parallelizable algorithm is selected.
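
    The closing observation, that GPUs help only when a parallelizable algorithm is selected, comes down to whether independent blocks can be processed concurrently, as in CTR-style encryption. The sketch below illustrates that data-parallel pattern in Python with a process pool; SHA-256 stands in for the AES block function, and nothing here corresponds to the thesis's CUDA kernels.
```python
# Data-parallel encryption pattern (CTR-style): each 16-byte keystream block
# depends only on (key, counter), so blocks can be computed independently --
# the property that makes GPU offload worthwhile.  SHA-256 stands in for the
# AES block function here; this is an illustration, not the thesis's CUDA code.
import hashlib
from concurrent.futures import ProcessPoolExecutor

KEY = b"\x00" * 16   # fixed toy key for the illustration

def keystream_block(counter: int) -> bytes:
    """Stand-in 'block cipher': 16 bytes derived from (key, counter)."""
    return hashlib.sha256(KEY + counter.to_bytes(16, "big")).digest()[:16]

def ctr_encrypt(plaintext: bytes) -> bytes:
    blocks = [plaintext[i:i + 16] for i in range(0, len(plaintext), 16)]
    with ProcessPoolExecutor() as pool:                  # independent work items
        keystream = list(pool.map(keystream_block, range(len(blocks))))
    return b"".join(bytes(p ^ k for p, k in zip(blk, ks))
                    for blk, ks in zip(blocks, keystream))

if __name__ == "__main__":
    data = b"sixteen byte msg" * 4
    ct = ctr_encrypt(data)
    assert ctr_encrypt(ct) == data       # XOR with the same keystream inverts it
```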

    Cloud-based homomorphic encryption for privacy-preserving machine learning in clinical decision support

    While privacy and security concerns dominate public cloud services, Homomorphic Encryption (HE) is seen as an emerging solution that ensures secure processing of sensitive data via untrusted networks in the public cloud or by third-party cloud vendors. It relies on the fact that some encryption algorithms display the property of homomorphism, which allows them to manipulate data meaningfully while still in encrypted form, although there are major stumbling blocks to overcome before the technology is considered mature for production cloud environments. Such a framework would find particular relevance in Clinical Decision Support (CDS) applications deployed in the public cloud. CDS applications have an important computational and analytical role over confidential healthcare information, with the aim of supporting decision-making in clinical practice. Machine Learning (ML) is employed in CDS applications, which typically learn and can personalise actions based on individual behaviour. A relatively simple-to-implement, common and consistent framework is sought that can overcome most limitations of Fully Homomorphic Encryption (FHE) in order to offer an expanded and flexible set of HE capabilities. In the absence of a significant breakthrough in FHE efficiency and practical use, a solution relying on client interactions appears to be the best-known approach for meeting the requirements of private CDS-based computation, so long as security is not significantly compromised. A hybrid solution is introduced that intersperses limited two-party interactions amongst the main homomorphic computations, allowing exchange of both numerical and logical cryptographic contexts in addition to resolving other major FHE limitations. Interactions involve the use of client-based ciphertext decryptions, blinded by data obfuscation techniques to maintain privacy. This thesis explores the middle ground whereby HE schemes can provide improved and efficient arbitrary computational functionality over a significantly reduced two-party network interaction model involving data obfuscation techniques. This compromise allows the powerful capabilities of HE to be leveraged, providing a more uniform, flexible and general approach to privacy-preserving system integration that is suitable for cloud deployment. The proposed platform is uniquely designed to make HE more practical for mainstream clinical application use, equipped with a rich set of capabilities and a potentially very complex depth of HE operations. Such a solution would be suitable for the long-term privacy-preserving processing requirements of a cloud-based CDS system, which would typically require complex combinatorial logic, workflow and ML capabilities.
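
    One round of the blinded client decryption described above can be sketched as follows: the server additively blinds a ciphertext, the client decrypts and returns a fresh encryption of the blinded value without learning the underlying plaintext, and the server removes the blind homomorphically, obtaining a refreshed ciphertext. A toy textbook Paillier scheme with insecure hard-coded primes stands in for the HE scheme; the thesis's actual protocol, schemes, and obfuscation techniques differ.
```python
# Toy Paillier (additively homomorphic) with insecure hard-coded primes, used
# only to illustrate one blinded client-decryption round (a ciphertext refresh).
import math, secrets

p, q = 1009, 1013                 # toy primes -- far too small for real use
n, n2, g = p * q, (p * q) ** 2, p * q + 1
lam = math.lcm(p - 1, q - 1)
mu = pow((pow(g, lam, n2) - 1) // n, -1, n)

def enc(m):                       # anyone with the public key can encrypt
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def dec(c):                       # client only (needs lam, mu)
    return ((pow(c, lam, n2) - 1) // n * mu) % n

def add_plain(c, k):              # homomorphic addition of a known constant
    return (c * pow(g, k % n, n2)) % n2

# --- one blinded-refresh interaction -------------------------------------
x = 42
c = enc(x)                        # server holds Enc(x); it cannot decrypt

r = secrets.randbelow(n)          # server: random blind
blinded = add_plain(c, r)         # server -> client: Enc(x + r)

fresh = enc(dec(blinded))         # client: decrypt (sees only x + r), re-encrypt

refreshed = add_plain(fresh, -r)  # server: remove the blind homomorphically
assert dec(refreshed) == x        # decryption here only checks correctness
```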

    Hardware acceleration using FPGAs for adaptive radiotherapy

    Adaptive radiotherapy (ART) seeks to improve the accuracy of radiotherapy by adapting the treatment based on up-to-date images of the patient's anatomy captured at the time of treatment delivery. The amount of image data, combined with the clinical time requirements for ART, necessitates automatic image analysis to adapt the treatment plan. Currently, the computational effort of the image processing and plan adaptation means they cannot be completed in a clinically acceptable timeframe. This thesis investigates the use of hardware acceleration on Field Programmable Gate Arrays (FPGAs) to accelerate algorithms for segmenting bony anatomy in Computed Tomography (CT) scans, in order to reduce the plan adaptation time for ART. An assessment was made of the overhead incurred by transferring image data to an FPGA-based hardware accelerator using the industry-standard DICOM protocol over an Ethernet connection. The transfer rate was found to be likely to limit the performance of hardware accelerators for ART, highlighting the need for an alternative method of integrating hardware accelerators with existing radiotherapy equipment. A clinically validated segmentation algorithm was adapted for implementation in hardware. This was shown to process three-dimensional CT images up to 13.81 times faster than the original software implementation, and the segmentations produced by the two implementations showed strong agreement. Modifications to the hardware implementation were proposed for segmenting four-dimensional CT scans. These were shown to process image volumes 14.96 times faster than the original software implementation, and the segmentations produced by the two implementations showed strong agreement in most cases. A second, novel method for segmenting four-dimensional CT data was also proposed. Its hardware implementation executed 1.95 times faster than the software implementation; however, the algorithm was found to be unsuitable for the global segmentation task examined here, although it may be suitable as a refining segmentation in the context of a larger ART algorithm.

    On the Cryptanalysis of Public-Key Cryptography

    Nowadays, the most popular public-key cryptosystems are based on either the integer factorization or the discrete logarithm problem. The feasibility of solving these mathematical problems in practice is studied, and techniques are presented to speed up the underlying arithmetic on parallel architectures. The fastest known approach to solving the discrete logarithm problem in groups of elliptic curves over finite fields is the Pollard rho method. The negation map can be used to speed up this calculation by a factor of √2. It is well known that the random walks used by Pollard rho, when combined with the negation map, get trapped in fruitless cycles. We show that previously published approaches to deal with this problem are plagued by recurring cycles, and we propose effective alternative countermeasures. Furthermore, fast modular arithmetic is introduced which can take advantage of prime moduli of a special form using efficient "sloppy reduction." The effectiveness of these techniques is demonstrated by solving a 112-bit elliptic curve discrete logarithm problem using a cluster of PlayStation 3 game consoles: breaking a public-key standard and setting a new world record. The elliptic curve method (ECM) for integer factorization is the asymptotically fastest method to find relatively small factors of large integers. From a cryptanalytic point of view, the performance of ECM gives information about secure parameter choices for some cryptographic protocols. We optimize ECM by proposing carry-free arithmetic modulo Mersenne numbers (numbers of the form 2^M − 1) especially suitable for parallel architectures. Our implementation of these techniques on a cluster of PlayStation 3 game consoles set a new record by finding a 241-bit prime factor of 2^1181 − 1. A normal form for elliptic curves introduced by Edwards results in the fastest elliptic curve arithmetic in practice. Techniques to reduce the temporary storage and enhance the performance even further in the setting of ECM are presented. Our results enable one to run ECM efficiently on resource-constrained platforms such as graphics processing units.
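
    The appeal of Mersenne moduli is that reduction needs only shifts and additions, since 2^M ≡ 1 (mod 2^M − 1). The sketch below shows this folding identity in Python as a generic textbook illustration; it says nothing about the carry-free SIMD arithmetic actually used on the PlayStation 3 cluster.
```python
# Reduction modulo a Mersenne number 2^M - 1 by folding: since 2^M ≡ 1, the
# high bits can simply be added back onto the low bits.  Generic illustration.
import random

M = 127
MOD = (1 << M) - 1          # 2^127 - 1, used here only as an example modulus

def reduce_mersenne(x: int) -> int:
    """Return x mod (2^M - 1) using shifts and additions only."""
    while x >> M:                        # fold until x fits in M bits
        x = (x & MOD) + (x >> M)         # low part + high part
    return 0 if x == MOD else x          # 2^M - 1 ≡ 0

for _ in range(1000):
    x = random.getrandbits(4 * M)
    assert reduce_mersenne(x) == x % MOD
```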

    Towards Practical Secure Neural Network Inference: The Journey So Far and the Road Ahead

    Neural networks (NNs) have become one of the most important tools for artificial intelligence (AI). Well-designed and trained NNs can perform inference (e.g., make decisions or predictions) on unseen inputs with high accuracy. Using NNs often involves sensitive data: depending on the specific use case, the input to the NN and/or the internals of the NN (e.g., the weights and biases) may be sensitive. Thus, there is a need for techniques for performing NN inference securely, ensuring that sensitive data remains secret. In the past few years, several approaches have been proposed for secure neural network inference, achieving increasingly good results in terms of efficiency, security, accuracy, and applicability, and thus marking substantial progress towards practical secure neural network inference. The proposed approaches make use of many different techniques, such as homomorphic encryption and secure multi-party computation. The aim of this survey is to give an overview of the main approaches proposed so far, their different properties, and the techniques used. In addition, remaining challenges towards large-scale deployment are identified.
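
    As a minimal illustration of one building block used by such approaches, the sketch below evaluates a linear layer on additively secret-shared inputs: each party applies the (here public) weights to its own share locally, and the two result shares sum to the true output. The modulus, names, and two-party setup are assumptions for the sketch; real secure-inference protocols also protect the model and handle non-linear layers, which this sketch does not attempt.
```python
# Additive secret sharing of a linear layer y = W x + b over a prime field.
# Minimal two-party illustration: weights are public, only the input is shared.
import secrets

P = 2**61 - 1                     # prime modulus for the arithmetic shares

def share(vec):
    """Split vec into two additive shares modulo P."""
    s1 = [secrets.randbelow(P) for _ in vec]
    s2 = [(v - a) % P for v, a in zip(vec, s1)]
    return s1, s2

def linear_local(W, b, x_share, add_bias):
    """Each party evaluates the public linear layer on its own share.
    The bias is added by only one party so it is not counted twice."""
    out = [sum(w * x for w, x in zip(row, x_share)) % P for row in W]
    if add_bias:
        out = [(o + bi) % P for o, bi in zip(out, b)]
    return out

def reconstruct(s1, s2):
    return [(a + c) % P for a, c in zip(s1, s2)]

W = [[1, 2, 3], [4, 5, 6]]        # public toy weights
b = [7, 8]
x = [10, 20, 30]                  # private input
x1, x2 = share(x)
y1 = linear_local(W, b, x1, add_bias=True)     # party 1, on its share only
y2 = linear_local(W, b, x2, add_bias=False)    # party 2, on its share only
assert reconstruct(y1, y2) == [147, 328]       # equals W x + b in the clear
```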

    Homomorphic Encryption for Machine Learning in Medicine and Bioinformatics

    Machine learning techniques are an excellent tool for the medical community for analyzing large amounts of medical and genomic data. On the other hand, ethical concerns and privacy regulations prevent the free sharing of this data. Encryption methods such as fully homomorphic encryption (FHE) provide a means to evaluate functions over encrypted data. Using FHE, machine learning models such as deep learning, decision trees, and naive Bayes have been implemented for private prediction using medical data. FHE has also been shown to enable secure genomic algorithms, such as paternity testing, and the secure application of genome-wide association studies. This survey provides an overview of fully homomorphic encryption and its applications in medicine and bioinformatics. The high-level concepts behind FHE and its history are introduced, and details on current open-source implementations are provided, as is the state of FHE for privacy-preserving techniques in machine learning and bioinformatics, along with future growth opportunities for FHE.