    Huffman-based Code Compression Techniques for Embedded Systems


    NASA SERC 1990 Symposium on VLSI Design

    This document contains papers presented at the first annual NASA Symposium on VLSI Design. NASA's involvement in this event demonstrates a need for research and development in high-performance computing, which addresses problems faced by the scientific and industrial communities. High-performance computing is needed for: (1) real-time manipulation of large data sets; (2) advanced systems control of spacecraft; (3) digital data transmission, error correction, and image compression; and (4) expert-system control of spacecraft. Clearly, a valuable technology for meeting these needs is Very Large Scale Integration (VLSI). This conference addresses the following issues in VLSI design: (1) system architectures; (2) electronics; (3) algorithms; and (4) CAD tools.

    Quantum Algorithmic Techniques for Fault-Tolerant Quantum Computers

    Quantum computers have the potential to push the limits of computation in areas such as quantum chemistry, cryptography, optimization, and machine learning. Even though many quantum algorithms show asymptotic improvements compared to classical ones, the overhead of running quantum computers limits when quantum computing becomes useful. Thus, by optimizing components of quantum algorithms, we can bring the regime of quantum advantage closer. My work focuses on developing efficient subroutines for quantum computation, specifically algorithms for scalable, fault-tolerant quantum computers. While it is possible that even noisy quantum computers can outperform classical ones for specific tasks, high depth, and therefore fault tolerance, is likely required for most applications. In this thesis, I introduce three sets of techniques that can be used by themselves or as subroutines in other algorithms. The first components are coherent versions of classical sort and shuffle. We require that a quantum shuffle prepare a uniform superposition over all permutations of a sequence. The quantum sort is used within the shuffle as well as in the next algorithm in this thesis. The quantum shuffle is an essential part of state preparation for quantum chemistry computation in first quantization. Second, I review the progress of Hamiltonian simulation and give a new algorithm for simulating time-dependent Hamiltonians. This algorithm scales polylogarithmically in the inverse error, and its query complexity does not depend on the derivatives of the Hamiltonian. Time-dependent Hamiltonian simulation was recently used for interaction-picture simulation with applications to quantum chemistry. Finally, I present a fully quantum Boltzmann machine. I show that our algorithm can train on quantum data and learn a classical description of quantum states. This type of machine learning can be used for tomography, Hamiltonian learning, and approximate quantum cloning.
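
    For intuition, the classical analogue of the quantum shuffle described above is the Fisher-Yates algorithm, which outputs a single uniformly random permutation; the quantum shuffle instead prepares a coherent superposition over all permutations at once. A minimal classical sketch in Python, for contrast only (this is not the thesis's quantum subroutine):

        import random

        def fisher_yates_shuffle(seq):
            # Classical shuffle: returns ONE uniformly random permutation.
            # The quantum shuffle is the coherent analogue, superposing all
            # n! permutations of the input sequence simultaneously.
            items = list(seq)
            for i in range(len(items) - 1, 0, -1):
                j = random.randint(0, i)  # uniform over 0..i inclusive
                items[i], items[j] = items[j], items[i]
            return items

        print(fisher_yates_shuffle([1, 2, 3, 4]))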

    Improving post-quantum cryptography through cryptanalysis

    Large quantum computers pose a threat to our public-key cryptographic infrastructure. The possible responses are: do nothing, accepting that quantum computers might be used to break widely deployed protocols; mitigate the threat by switching entirely to symmetric-key protocols; or mitigate the threat by switching to different public-key protocols. Each user of public-key cryptography will make one of these choices, and we should not expect consensus. Some users will do nothing, perhaps because they view the threat as being too remote. And some users will find that they never needed public-key cryptography in the first place. The work that I present here is for people who need public-key cryptography and want to switch to new protocols. Each of the three articles raises the security estimate of a cryptosystem by showing that some attack is less effective than was previously believed. Each article thereby reduces the cost of using a protocol by letting the user choose smaller (or more efficient) parameters at a fixed level of security. In Part 1, I present joint work with Samuel Jaques in which we revise security estimates for the Supersingular Isogeny Key Exchange (SIKE) protocol. We show that known quantum claw-finding algorithms do not outperform classical claw-finding algorithms. This allows us to recommend 434-bit primes for use in SIKE at the same security level at which 503-bit primes had previously been recommended. In Part 2, I present joint work with Martin Albrecht, Vlad Gheorghiu, and Eamonn Postlethwaite that examines the impact of quantum search on sieving algorithms for the shortest vector problem. Cryptographers commonly assume that the cost of solving the shortest vector problem in dimension $d$ is $2^{(0.265\ldots + o(1))d}$ quantumly and $2^{(0.292\ldots + o(1))d}$ classically. These are upper bounds based on a near-neighbor search algorithm due to Becker--Ducas--Gama--Laarhoven. Naively, one might think that $d$ must be at least $483$ ($\approx 128/0.265$) to avoid attacks that cost fewer than $2^{128}$ operations. Our analysis accounts for terms in the $o(1)$ that were previously ignored. In a realistic model of quantum computation, we find that applying the Becker--Ducas--Gama--Laarhoven algorithm in dimension $d > 376$ will cost more than $2^{128}$ operations. We also find reason to believe that the classical algorithm will outperform the quantum algorithm in dimensions $d < 288$. In Part 3, I present solo work on a variant of post-quantum RSA. The original pqRSA proposal by Bernstein--Heninger--Lou--Valenta uses terabyte keys of the form $n = p_1 p_2 p_3 p_4 \cdots p_i \cdots p_{2^{31}}$ where each $p_i$ is a 4096-bit prime. My variant uses terabyte keys of the form $n = p_1^2 p_2^3 p_3^5 p_4^7 \cdots p_i^{\pi_i} \cdots p_{20044}^{225287}$ where each $p_i$ is a 4096-bit prime and $\pi_i$ is the $i$-th prime. Prime generation is the most expensive part of post-quantum RSA in practice, so the smaller number of prime factors in my proposal gives a large speedup in key generation. The repeated factors help an attacker identify an element of small order, and thereby allow the attacker to use a small-order variant of Shor's algorithm. I analyze small-order attacks and discuss the cost of the classical pre-computation that they require.
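
    As a quick sanity check of the arithmetic quoted above, the naive dimension thresholds follow directly from the leading sieving exponents (ignoring the $o(1)$ corrections that the thesis analyzes):

        SECURITY_BITS = 128
        QUANTUM_EXP = 0.265    # leading sieving exponent, quantum
        CLASSICAL_EXP = 0.292  # leading sieving exponent, classical

        # Naive minimum dimension d with 2^(c*d) ~ 2^128, o(1) ignored.
        print(round(SECURITY_BITS / QUANTUM_EXP))    # 483
        print(round(SECURITY_BITS / CLASSICAL_EXP))  # 438

    The thesis's point is that these naive figures overstate the attack: once the $o(1)$ terms are costed in a realistic model of quantum computation, the Becker--Ducas--Gama--Laarhoven attack already exceeds $2^{128}$ operations for $d > 376$.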

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Reconfigurable Architectures for Cryptographic Systems

    Field Programmable Gate Arrays (FPGAs) are suitable platforms for implementing cryptographic algorithms in hardware due to their flexibility, good performance, and low power consumption. Computer security is becoming increasingly important, and security requirements such as key sizes are quickly evolving. This creates the need for customisable hardware designs for cryptographic operations capable of covering a large design space. In this thesis we explore the four design dimensions relevant to cryptography (speed, area, power consumption, and security of the crypto-system) by developing parametric designs for public-key generation and encryption as well as side-channel attack countermeasures. There are four contributions. First, we present new architectures for Montgomery multiplication and exponentiation based on variable pipelining and variable serial replication. Our implementations of these architectures are compared to the best implementations in the literature, and the design space is explored in terms of speed and area trade-offs. Second, we generalise our Montgomery multiplier design ideas by developing a parametric model that allows rapid optimisation of a general class of algorithms containing loops with dependencies carried from one iteration to the next. By predicting the throughput and the area of a design, our model facilitates and speeds up design space exploration. Third, we develop new architectures for primality testing, including the first hardware architecture for the NIST-approved Lucas primality test. We explore the area, speed, and power consumption trade-offs by comparing our Lucas architectures on CPU, FPGA, and ASIC. Finally, we tackle the security issue by presenting two novel power attack countermeasures based on on-chip power monitoring. Our constant-power framework uses a closed-loop control system to keep the power consumption of any FPGA implementation constant. Our attack detection framework uses a network of ring oscillators to detect the insertion of a shunt resistor-based power measurement circuit on a device's power rail. This countermeasure is lightweight and has a relatively low power overhead compared to existing masking and hiding countermeasures.
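
    For reference, Montgomery multiplication (the kernel behind the architectures above) computes a*b*R^(-1) mod N for R = 2^k without any trial division. A bit-serial Python sketch of the standard algorithm, shown here only to illustrate the loop-carried dependency that the thesis's variable pipelining targets (it is not the FPGA design itself):

        def montgomery_multiply(a, b, N, k):
            # Computes a * b * 2^(-k) mod N, for odd N and 0 <= a, b < N.
            # Each iteration depends on the previous one -- the loop-carried
            # dependency that the parametric pipelined designs exploit.
            result = 0
            for i in range(k):
                if (a >> i) & 1:
                    result += b
                if result & 1:      # N is odd, so adding N makes result even
                    result += N
                result >>= 1        # exact halving: result is now even
            return result if result < N else result - N

        # Example: N = 7, R = 2^3 = 8; since 8 = 1 (mod 7), the result is
        # simply 3*4 mod 7 = 5.
        assert montgomery_multiply(3, 4, 7, 3) == 5

    Operands are normally converted into the Montgomery domain once, so the per-multiplication savings compound inside an exponentiation loop.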

    Cloud-based homomorphic encryption for privacy-preserving machine learning in clinical decision support

    While privacy and security concerns dominate public cloud services, Homomorphic Encryption (HE) is seen as an emerging solution that ensures secure processing of sensitive data via untrusted networks in the public cloud or by third-party cloud vendors. It relies on the fact that some encryption algorithms display the property of homomorphism, which allows them to manipulate data meaningfully while still in encrypted form, although there are major stumbling blocks to overcome before the technology is considered mature for production cloud environments. Such a framework would find particular relevance in Clinical Decision Support (CDS) applications deployed in the public cloud. CDS applications have an important computational and analytical role over confidential healthcare information, with the aim of supporting decision-making in clinical practice. Machine Learning (ML) is employed in CDS applications, which typically learn and can personalise actions based on individual behaviour. A relatively simple-to-implement, common, and consistent framework is sought that can overcome most limitations of Fully Homomorphic Encryption (FHE) in order to offer an expanded and flexible set of HE capabilities. In the absence of a significant breakthrough in FHE efficiency and practical use, a solution relying on client interactions appears to be the best known approach for meeting the requirements of private CDS-based computation, so long as security is not significantly compromised. A hybrid solution is introduced that intersperses limited two-party interactions amongst the main homomorphic computations, allowing exchange of both numerical and logical cryptographic contexts in addition to resolving other major FHE limitations. Interactions involve the use of client-based ciphertext decryptions, blinded by data obfuscation techniques to maintain privacy. This thesis explores the middle ground whereby HE schemes can provide improved and efficient arbitrary computational functionality over a significantly reduced two-party network interaction model involving data obfuscation techniques. This compromise allows the powerful capabilities of HE to be leveraged, providing a more uniform, flexible, and general approach to privacy-preserving system integration which is suitable for cloud deployment. The proposed platform is uniquely designed to make HE more practical for mainstream clinical application use, equipped with a rich set of capabilities and a potentially very complex depth of HE operations. Such a solution would be suitable for the long-term privacy-preserving processing requirements of a cloud-based CDS system, which would typically require complex combinatorial logic, workflow, and ML capabilities.
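
    The homomorphic property itself can be seen in miniature with textbook RSA, where multiplying two ciphertexts yields a valid encryption of the product of the plaintexts. The toy Python sketch below is illustrative only: the parameters are insecure, and the schemes relevant to this thesis are FHE constructions with a much richer operation set:

        # Unpadded RSA is multiplicatively homomorphic: Enc(a)*Enc(b) = Enc(a*b).
        p, q = 61, 53
        n = p * q                  # 3233 (toy modulus, far too small to be secure)
        e, d = 17, 2753            # e*d = 1 mod phi(n)

        def enc(m):
            return pow(m, e, n)

        def dec(c):
            return pow(c, d, n)

        c1, c2 = enc(6), enc(7)
        product_cipher = (c1 * c2) % n      # computed without ever decrypting
        assert dec(product_cipher) == 42    # Enc(6) * Enc(7) decrypts to 6*7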

    Towards the development of a reliable reconfigurable real-time operating system on FPGAs

    In the last two decades, Field Programmable Gate Arrays (FPGAs) have rapidly developed from simple "glue logic" into a powerful platform capable of implementing a System on Chip (SoC). Modern FPGAs achieve not only high performance compared with General-Purpose Processors (GPPs), thanks to hardware parallelism and dedicated logic, but also better programming flexibility than Application-Specific Integrated Circuits (ASICs). Moreover, the hardware programming flexibility of FPGAs is further harnessed for both performance and manipulability, which makes Dynamic Partial Reconfiguration (DPR) possible. DPR allows a part or parts of a circuit to be reconfigured at run-time without interrupting the rest of the chip's operation. As a result, hardware resources can be exploited more efficiently, since chip resources can be reused by swapping hardware tasks in and out of the chip in a time-multiplexed fashion. In addition, DPR improves fault tolerance against transient errors and permanent damage; for example, Single Event Upsets (SEUs) can be mitigated by reconfiguring the FPGA to avoid error accumulation. Furthermore, power and heat can be reduced by removing finished or idle tasks from the chip. For all these reasons, DPR has significantly promoted Reconfigurable Computing (RC) and has become a very active research topic. However, since hardware integration is increasing at an exponential rate, and applications are becoming more complex with the growth of user demands, high-level application design and low-level hardware implementation are increasingly separated and layered. As a consequence, users can obtain little advantage from DPR without the support of system-level middleware. To bridge the gap between the high-level application and the low-level hardware implementation, this thesis presents important contributions towards a Reliable, Reconfigurable and Real-Time Operating System (R3TOS), which facilitates user exploitation of DPR from the application level by managing the complex hardware in the background. In R3TOS, hardware tasks behave just like software tasks: they can be created, scheduled, and mapped to different computing resources on the fly. The novel contributions of this work are: (1) a novel implementation of an efficient task scheduler and allocator; (2) a novel real-time scheduling algorithm (FAEDF) and two efficacious allocation algorithms (EAC and EVC), which schedule tasks in real time and circumvent emerging faults while maintaining more compact empty areas; (3) the design and implementation of a fault-tolerant microprocessor that harnesses existing FPGA resources, such as Error Correction Code (ECC) and configuration primitives; (4) a novel symmetric multiprocessing (SMP)-based architecture that supports a shared-memory programming interface; and (5) two demonstrations of the integrated system: (a) the K-Nearest Neighbour classifier, a non-parametric classification algorithm widely used in various fields of data mining, and (b) pairwise sequence alignment, namely the Smith-Waterman algorithm, used for identifying similarities between two biological sequences. R3TOS gives considerably higher flexibility to support scalable multi-user, multitasking applications, whereby resources can be dynamically managed with respect to user requirements and hardware availability. Benefiting from this, not only can hardware resources be used more efficiently, but system performance can also be significantly increased. Results show that scheduling and allocation efficiency has been improved by up to 2x, and overall system performance is further improved by ~2.5x. Future work includes the development of a Network on Chip (NoC), which is expected to further increase communication throughput, as well as the standardization and automation of our system design, which will be carried out in line with the enablement of other high-level synthesis tools, to allow application developers to benefit from the system in a more efficient manner.
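
    FAEDF's name suggests an Earliest-Deadline-First variant, so for context a minimal Python sketch of plain EDF dispatch follows (an assumed baseline for illustration, not the thesis's algorithm, which additionally accounts for hardware task placement and emerging faults):

        import heapq

        def edf_dispatch_order(tasks):
            # tasks: list of (name, absolute_deadline) pairs.
            # Returns task names in dispatch order: nearest deadline first.
            heap = [(deadline, name) for name, deadline in tasks]
            heapq.heapify(heap)
            order = []
            while heap:
                order.append(heapq.heappop(heap)[1])
            return order

        print(edf_dispatch_order([("t1", 30), ("t2", 10), ("t3", 20)]))
        # -> ['t2', 't3', 't1']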

    Computer Aided Verification

    This open access two-volume set, LNCS 11561 and 11562, constitutes the refereed proceedings of the 31st International Conference on Computer Aided Verification, CAV 2019, held in New York City, USA, in July 2019. The 52 full papers presented, together with 13 tool papers and 2 case studies, were carefully reviewed and selected from 258 submissions. The papers were organized in the following topical sections: Part I: automata and timed systems; security and hyperproperties; synthesis; model checking; cyber-physical systems and machine learning; probabilistic systems and runtime techniques; dynamical, hybrid, and reactive systems; Part II: logics, decision procedures, and solvers; numerical programs; verification; distributed systems and networks; verification and invariants; and concurrency.