106 research outputs found

    Efficient Implementation on Low-Cost SoC-FPGAs of TLSv1.2 Protocol with ECC_AES Support for Secure IoT Coordinators

    Get PDF
    Security management for IoT applications is a critical research field, especially when taking into account the performance variation over the very different IoT devices. In this paper, we present high-performance client/server coordinators on low-cost SoC-FPGA devices for secure IoT data collection. Security is ensured by using the Transport Layer Security (TLS) protocol based on the TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 cipher suite. The hardware architecture of the proposed coordinators is based on SW/HW co-design, implementing within the hardware accelerator core Elliptic Curve Scalar Multiplication (ECSM), which is the core operation of Elliptic Curve Cryptosystems (ECC). Meanwhile, the control of the overall TLS scheme is performed in software by an ARM Cortex-A9 microprocessor. In fact, the implementation of the ECC accelerator core around an ARM microprocessor allows not only the improvement of ECSM execution but also the performance enhancement of the overall cryptosystem. The integration of the ARM processor enables to exploit the possibility of embedded Linux features for high system flexibility. As a result, the proposed ECC accelerator requires limited area, with only 3395 LUTs on the Zynq device used to perform high-speed, 233-bit ECSMs in 413 µs, with a 50 MHz clock. Moreover, the generation of a 384-bit TLS handshake secret key between client and server coordinators requires 67.5 ms on a low cost Zynq 7Z007S device

    Cryptographic coprocessors for embedded systems

    Get PDF
    In the field of embedded systems design, coprocessors play an important role as a component to increase performance. Many embedded systems are built around a small General Purpose Processor (GPP). If the GPP cannot meet the performance requirements for a certain operation, a coprocessor can be included in the design. The GPP can then offload the computationally intensive operation to the coprocessor; thus increasing the performance of the overall system. A common application of coprocessors is the acceleration of cryptographic algorithms. The work presented in this thesis discusses coprocessor architectures for various cryptographic algorithms that are found in many cryptographic protocols. Their performance is then analysed on a Field Programmable Gate Array (FPGA) platform. Firstly, the acceleration of Elliptic Curve Cryptography (ECC) algorithms is investigated through the use of instruction set extension of a GPP. The performance of these algorithms in a full hardware implementation is then investigated, and an architecture for the acceleration the ECC based digital signature algorithm is developed. Hash functions are also an important component of a cryptographic system. The FPGA implementation of recent hash function designs from the SHA-3 competition are discussed and a fair comparison methodology for hash functions presented. Many cryptographic protocols involve the generation of random data, for keys or nonces. This requires a True Random Number Generator (TRNG) to be present in the system. Various TRNG designs are discussed and a secure implementation, including post-processing and failure detection, is introduced. Finally, a coprocessor for the acceleration of operations at the protocol level will be discussed, where, a novel aspect of the design is the secure method in which private-key data is handle

    Hardware Design and Implementation of Role-Based Cryptography

    Get PDF
    Traditional public key cryptographic methods provide access control to sensitive data by allowing the message sender to grant a single recipient permission to read the encrypted message. The Need2Know® system (N2K) improves upon these methods by providing role-based access control. N2K defines data access permissions similar to those of a multi-user file system, but N2K strictly enforces access through cryptographic standards. Since custom hardware can efficiently implement many cryptographic algorithms and can provide additional security, N2K stands to benefit greatly from a hardware implementation. To this end, the main N2K algorithm, the Key Protection Module (KPM), is being specified in VHDL. The design is being built and tested incrementally: this first phase implements the core control logic of the KPM without integrating its cryptographic sub-modules. Both RTL simulation and formal verification are used to test the design. This is the first N2K implementation in hardware, and it promises to provide an accelerated and secured alternative to the software-based system. A hardware implementation is a necessary step toward highly secure and flexible deployments of the N2K system

    Virtualized Reconfigurable Resources and Their Secured Provision in an Untrusted Cloud Environment

    Get PDF
    The cloud computing business grows year after year. To keep up with increasing demand and to offer more services, data center providers are always searching for novel architectures. One of them are FPGAs, reconfigurable hardware with high compute power and energy efficiency. But some clients cannot make use of the remote processing capabilities. Not every involved party is trustworthy and the complex management software has potential security flaws. Hence, clients’ sensitive data or algorithms cannot be sufficiently protected. In this thesis state-of-the-art hardware, cloud and security concepts are analyzed and com- bined. On one side are reconfigurable virtual FPGAs. They are a flexible resource and fulfill the cloud characteristics at the price of security. But on the other side is a strong requirement for said security. To provide it, an immutable controller is embedded enabling a direct, confidential and secure transfer of clients’ configurations. This establishes a trustworthy compute space inside an untrusted cloud environment. Clients can securely transfer their sensitive data and algorithms without involving vulnerable software or a data center provider. This concept is implemented as a prototype. Based on it, necessary changes to current FPGAs are analyzed. To fully enable reconfigurable yet secure hardware in the cloud, a new hybrid architecture is required.Das Geschäft mit dem Cloud Computing wächst Jahr für Jahr. Um mit der steigenden Nachfrage mitzuhalten und neue Angebote zu bieten, sind Betreiber von Rechenzentren immer auf der Suche nach neuen Architekturen. Eine davon sind FPGAs, rekonfigurierbare Hardware mit hoher Rechenleistung und Energieeffizienz. Aber manche Kunden können die ausgelagerten Rechenkapazitäten nicht nutzen. Nicht alle Beteiligten sind vertrauenswürdig und die komplexe Verwaltungssoftware ist anfällig für Sicherheitslücken. Daher können die sensiblen Daten dieser Kunden nicht ausreichend geschützt werden. In dieser Arbeit werden modernste Hardware, Cloud und Sicherheitskonzept analysiert und kombiniert. Auf der einen Seite sind virtuelle FPGAs. Sie sind eine flexible Ressource und haben Cloud Charakteristiken zum Preis der Sicherheit. Aber auf der anderen Seite steht ein hohes Sicherheitsbedürfnis. Um dieses zu bieten ist ein unveränderlicher Controller eingebettet und ermöglicht eine direkte, vertrauliche und sichere Übertragung der Konfigurationen der Kunden. Das etabliert eine vertrauenswürdige Rechenumgebung in einer nicht vertrauenswürdigen Cloud Umgebung. Kunden können sicher ihre sensiblen Daten und Algorithmen übertragen ohne verwundbare Software zu nutzen oder den Betreiber des Rechenzentrums einzubeziehen. Dieses Konzept ist als Prototyp implementiert. Darauf basierend werden nötige Änderungen von modernen FPGAs analysiert. Um in vollem Umfang eine rekonfigurierbare aber dennoch sichere Hardware in der Cloud zu ermöglichen, wird eine neue hybride Architektur benötigt

    Simple AEAD Hardware Interface (SÆHI) in a SoC: Implementing an On-Chip Keyak/WhirlBob Coprocessor

    Get PDF
    Simple AEAD Hardware Interface (SÆHI) is a hardware cryptographic interface aimed at CAESAR Authenticated Encryption with Associated Data (AEAD) algorithms. Cryptographic acceleration is typically achieved either with a coprocessor or via instruction set extensions. ISA modifications require re-engineering the CPU core, making the approach inapplicable outside the realm of open source processor cores. At minimum, we suggest implementing CAESAR AEADs as universal memory-mapped cryptographic coprocessors, synthesizable even on low end FPGA platforms. AEADs complying to SÆHI must also include C language API drivers targeting low-end MCUs that directly utilize the memory mapping in a ``bare metal\u27\u27 fashion. This can also be accommodated on MMU-equipped mid-range CPUs. Extended battery life and bandwidth resulting from dedicated cryptographic hardware is vital for currently dominant computing and communication devices: mobile phones, tablets, and Internet-of-Things (IoT) applications. We argue that these should be priority hardware optimization targets for AEAD algorithms with realistic payload profiles. We demonstrate a fully integrated implementation of WhirlBob and Keyak AEADs on the FPGA fabric of Xilinx Zynq 7010. This low-cost System-on-Chip (SoC) also houses a dual-core Cortex-A9 CPU, closely matching the architecture of many embedded devices. The on-chip coprocessor is accessible from user space with a Linux kernel driver. An integration path exists all the way to end-user applications

    Area-throughput trade-offs for SHA-1 and SHA-256 hash functions’ pipelined designs

    Get PDF
    High-throughput designs of hash functions are strongly demanded due to the need for security in every transmitted packet of worldwide e-transactions. Thus, optimized and non-optimized pipelined architectures have been proposed raising, however, important questions. Which is the optimum number of the pipeline stages? Is it worth to develop optimized designs or could the same results be achieved by increasing only the pipeline stages of the non-optimized designs? The paper answers the above questions studying extensively many pipelined architectures of SHA-1 and SHA-256 hashes, implemented in FPGAs, in terms of throughput/area (T/A) factor. Also, guides for developing efficient security schemes designs are provided. Read More: https://www.worldscientific.com/doi/abs/10.1142/S021812661650032

    Hardware Implementations of Scalable and Unified Elliptic Curve Cryptosystem Processors

    Get PDF
    As the amount of information exchanged through the network grows, so does the demand for increased security over the transmission of this information. As the growth of computers increased in the past few decades, more sophisticated methods of cryptography have been developed. One method of transmitting data securely over the network is by using symmetric-key cryptography. However, a drawback of symmetric-key cryptography is the need to exchange the shared key securely. One of the solutions is to use public-key cryptography. One of the modern public-key cryptography algorithms is called Elliptic Curve Cryptography (ECC). The advantage of ECC over some older algorithms is the smaller number of key sizes to provide a similar level of security. As a result, implementations of ECC are much faster and consume fewer resources. In order to achieve better performance, ECC operations are often offloaded onto hardware to alleviate the workload from the servers' processors. The most important and complex operation in ECC schemes is the elliptic curve point multiplication (ECPM). This thesis explores the implementation of hardware accelerators that offload the ECPM operation to hardware. These processors are referred to as ECC processors, or simply ECPs. This thesis targets the efficient hardware implementation of ECPs specifically for the 15 elliptic curves recommended by the National Institute of Standards and Technology (NIST). The main contribution of this thesis is the implementation of highly efficient hardware for scalable and unified finite field arithmetic units that are used in the design of ECPs. In this thesis, scalability refers to the processor's ability to support multiple key sizes without the need to reconfigure the hardware. By doing so, the hardware does not need to be redesigned for the server to handle different levels of security. Unified refers to the ability of the ECP to handle both prime and binary fields. The resultant designs are valuable to the research community and industry, as a single hardware device is able to handle a wide range of ECC operations efficiently and at high speeds. Thus, improving the ability of network servers to handle secure transaction more quickly and improve productivity at lower costs
    corecore