39 research outputs found

    An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics

    Near-sensor data analytics is a promising direction for IoT endpoints, as it minimizes energy spent on communication and reduces network load, but it also poses security concerns, as valuable data is stored or sent over the network at various stages of the analytics pipeline. Using encryption to protect sensitive data at the boundary of the on-chip analytics engine is a way to address data security issues. To cope with the combined workload of analytics and encryption in a tight power envelope, we propose Fulmine, a System-on-Chip based on a tightly-coupled multi-core cluster augmented with specialized blocks for compute-intensive data processing and encryption functions, supporting software programmability for regular computing tasks. The Fulmine SoC, fabricated in 65nm technology, consumes less than 20mW on average at 0.8V achieving an efficiency of up to 70pJ/B in encryption, 50pJ/px in convolution, or up to 25MIPS/mW in software. As a strong argument for real-life flexible application of our platform, we show experimental results for three secure analytics use cases: secure autonomous aerial surveillance with a state-of-the-art deep CNN consuming 3.16pJ per equivalent RISC op; local CNN-based face detection with secured remote recognition in 5.74pJ/op; and seizure detection with encrypted data collection from EEG within 12.7pJ/op.
    Comment: 15 pages, 12 figures, accepted for publication in the IEEE Transactions on Circuits and Systems - I: Regular Papers

    Impact of System-on-Chip Integration of AEAD Ciphers

    Authenticated Encryption has emerged as a high-performance and resource-efficient solution to achieve message authentication in addition to encryption. This has motivated extensive study of algorithms for Authenticated Encryption with Associated Data (AEAD). While there have been significant efforts to benchmark these algorithms on hardware and software platforms, very little work has focused on the integration of these ciphers onto a System-on-Chip (SoC). This work looks at design alternatives for the SoC integration of a few of the finalists of the Competition for Authenticated Encryption: Security, Applicability, and Robustness (CAESAR). We highlight the penalty on area and performance that is incurred during SoC integration, and analyze the impact of design choices on that penalty. Our observations indicate that integration onto a system significantly affects the lightweight and high-performance properties of these ciphers, and that achieving a trade-off requires careful design decisions.

    A Comprehensive Performance Analysis of Hardware Implementations of CAESAR Candidates

    Authenticated Encryption with Associated Data (AEAD) plays a significant role in cryptography because of its ability to provide integrity, confidentiality and authenticity at the same time. Due to the emergence of security at the edge of the computing fabric, such as sensors and smartphone devices, there is a growing need for lightweight AEAD ciphers. Currently, a worldwide contest, titled CAESAR, is being held to decide on a set of AEAD ciphers, which are distinguished by their security, run-time performance, energy-efficiency and low area budget. For accurate evaluation of CAESAR candidates, it is of utmost importance to have independent and thorough optimization for each of the ciphers, both for their hardware and their software implementations. In this paper, we have carried out an evaluation of the optimized hardware implementation of AEAD ciphers selected in the CAESAR third round. We specifically focus on manual optimization of the micro-architecture, evaluations for ASIC technology libraries and the effect of CAESAR APIs on performance. While these have been studied for FPGA platforms and standalone cipher implementations, to the best of our knowledge this is the first detailed ASIC benchmarking of CAESAR candidates including manual optimization. In this regard, we benchmarked all prior reported designs, including the code generated by high-level synthesis flows. Detailed optimization studies are reported for NORX, CLOC and Deoxys-I. Our pre-layout results using a commercial ASIC technology library and synthesis tools show that optimized NORX is 40.81% faster and 18.02% smaller, optimized CLOC is 38.30% more energy efficient and 20.65% faster, and optimized Deoxys-I is 35.16% faster, with respect to the best known results. Similar or better performance results are also achieved for FPGA platforms.
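The AEAD property named in the abstracts above (confidentiality for the message, plus integrity and authenticity for both the message and its unencrypted associated data) can be illustrated with a minimal encrypt-then-MAC sketch. This is not any CAESAR candidate and not a vetted cipher: the function names `seal` and `open_`, the SHA-256 counter-mode keystream, the separate encryption and MAC keys, and the fixed-length nonce are all assumptions chosen so the example runs with the Python standard library alone.

```python
import hashlib
import hmac


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (illustrative only, not a vetted cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _tag(mac_key: bytes, nonce: bytes, ad: bytes, ct: bytes) -> bytes:
    # The nonce is assumed to have a fixed length; the associated data is
    # length-prefixed so the boundary between ad and ciphertext is unambiguous.
    msg = nonce + len(ad).to_bytes(8, "big") + ad + ct
    return hmac.new(mac_key, msg, hashlib.sha256).digest()


def seal(enc_key: bytes, mac_key: bytes, nonce: bytes, ad: bytes, plaintext: bytes):
    """Encrypt the plaintext, then authenticate nonce, associated data and ciphertext."""
    ks = _keystream(enc_key, nonce, len(plaintext))
    ct = bytes(p ^ k for p, k in zip(plaintext, ks))
    return ct, _tag(mac_key, nonce, ad, ct)


def open_(enc_key: bytes, mac_key: bytes, nonce: bytes, ad: bytes, ct: bytes, tag: bytes) -> bytes:
    """Verify the tag first; decrypt only if verification succeeds."""
    if not hmac.compare_digest(tag, _tag(mac_key, nonce, ad, ct)):
        raise ValueError("authentication failed")
    ks = _keystream(enc_key, nonce, len(ct))
    return bytes(c ^ k for c, k in zip(ct, ks))
```

Calling `open_` with a modified ciphertext or modified associated data raises `ValueError`, which is the behaviour that distinguishes an AEAD scheme from plain encryption. A nonce must never be reused under the same keys in a construction of this kind.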

    SPAE un schéma opératoire pour l'AES sur du matériel bas-coût (SPAE: a mode of operation for AES on low-cost hardware)

    We propose SPAE, a single-pass, patent-free, authenticated encryption with associated data (AEAD) scheme for AES. The algorithm has been developed to address the needs of a growing trend in IoT systems: storing code and data on a low-cost flash memory external to the main SoC. Existing AEAD algorithms such as OCB, GCM, CCM, EAX and SIV provide the required functionality; in practice, however, each of them suffers from various drawbacks for this particular use case. Academic contributions such as ASCON and AEGIS-128 are suitable and efficient, but they require the development of new hardware accelerators and they use primitives which are not 'approved' by governmental institutions such as NIST, BSI and ANSSI. From a silicon manufacturer's point of view, an efficient AEAD which uses existing AES hardware is much more enticing: AES is already required by most industry standards involving symmetric encryption (GSMA, EMVco, FIDO, Bluetooth, ZigBee, to name a few). This paper sets out the properties of an ideal AEAD for external memory encryption, presents the SPAE algorithm and analyzes various security aspects. The performance of SPAE on actual hardware is better than that of OCB, GCM and CCM.

    SecDDR: Enabling Low-Cost Secure Memories by Protecting the DDR Interface

    The security goals of cloud providers and users include memory confidentiality and integrity, which requires implementing Replay-Attack protection (RAP). RAP can be achieved using integrity trees or mutually authenticated channels. Integrity trees incur significant performance overheads and are impractical for protecting large memories. Mutually authenticated channels have been proposed only for packetized memory interfaces that address only a very small niche domain and require fundamental changes to memory system architecture. We propose SecDDR, a low-cost RAP that targets direct-attached memories, like DDRx. SecDDR avoids memory-side data authentication, and thus, only adds a small amount of logic to memory components and does not change the underlying DDR protocol, making it practical for widespread adoption. In contrast to prior mutual authentication proposals, which require trusting the entire memory module, SecDDR targets untrusted modules by placing its limited security logic on the DRAM die (or package) of the ECC chip. Our evaluation shows that SecDDR performs within 1% of an encryption-only memory without RAP and that SecDDR provides 18.8% and 7.8% average performance improvements (up to 190.4% and 24.8%) relative to a 64-ary integrity tree and an authenticated channel, respectively
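The integrity-tree baseline mentioned in this abstract can be sketched as a hash tree whose root is kept in trusted on-chip storage. The sketch below is a simplification under stated assumptions: it uses a binary tree rather than the 64-ary tree the abstract evaluates against, it recomputes the whole root instead of a single path, and the name `merkle_root` and the domain-separation bytes are hypothetical. It illustrates the baseline that SecDDR is compared with, not SecDDR itself, which avoids such trees.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(blocks):
    """Root of a binary hash tree over memory blocks.

    Leaves and inner nodes are hashed with different prefixes, and the last
    node of an odd-sized level is duplicated (a common simplification).
    """
    if not blocks:
        raise ValueError("at least one block is required")
    level = [_h(b"\x00" + block) for block in blocks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Untrusted memory holds the blocks; only the root is assumed to be stored on-chip.
memory = [b"balance=100", b"b", b"c", b"d"]
stale_copy = memory[0]                 # attacker records an old value
memory[0] = b"balance=0"               # legitimate write
trusted_root = merkle_root(memory)     # root updated after the write
memory[0] = stale_copy                 # attacker replays the old block
replay_detected = merkle_root(memory) != trusted_root
```

Because the root changes with every write, a replayed stale block no longer matches it, even though the stale block was valid at an earlier time. The cost is the extra hashing and metadata traffic on every access, which is the overhead the abstract attributes to integrity trees.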

    GuardNN: Secure DNN Accelerator for Privacy-Preserving Deep Learning

    This paper proposes GuardNN, a secure deep neural network (DNN) accelerator, which provides strong hardware-based protection for user data and model parameters even in an untrusted environment. GuardNN shows that the architecture and protection can be customized for a specific application to provide strong confidentiality and integrity protection with negligible overhead. The design of the GuardNN instruction set reduces the TCB to just the accelerator and enables confidentiality protection without the overhead of integrity protection. GuardNN also introduces a new application-specific memory protection scheme to minimize the overhead of memory encryption and integrity verification. The scheme shows that most of the off-chip meta-data in today's state-of-the-art memory protection can be removed by exploiting the known memory access patterns of a DNN accelerator. GuardNN is implemented as an FPGA prototype, which demonstrates effective protection with less than 2% performance overhead for inference over a variety of modern DNN models

    A >100 Gbps Inline AES-GCM Hardware Engine and Protected DMA Transfers between SGX Enclave and FPGA Accelerator Device

    This paper proposes a method to protect DMA data transfer that can be used to offload computation to an accelerator. The proposal minimizes changes in the hardware platform and to the application and SW stack. The paper describes the end-to-end scheme to protect communication between an application running inside an SGX enclave and an FPGA accelerator optimized for bandwidth and latency, and details the implementation of AES-GCM hardware engines with high bandwidth and low latency.

    TriviA: A Fast and Secure Authenticated Encryption Scheme

    In this paper, we propose a new hardware friendly authenticated encryption (AE) scheme TriviA based on (i) a stream cipher for generating keys for the ciphertext and the tag, and (ii) a pairwise independent hash to compute the tag. We have adopted one of the ISO-standardized stream ciphers for lightweight cryptography, namely Trivium, to obtain our underlying stream cipher. This new stream cipher has a state that is a little larger than the state of Trivium to accommodate a 128-bit secret key and IV. Our pairwise independent hash is also an adaptation of the EHC or “Encode-Hash-Combine” hash, that requires the optimum number of field multiplications and hence requires small hardware footprint. We have implemented the design in synthesizable RTL. Pre-layout synthesis, using 65 nm standard cell technology under typical operating conditions, reveals that TriviA is able to achieve a high throughput of 91.2 Gbps for an area of 24.4 KGE. We prove that our construction has at least 128-bit security for privacy and 124-bit security of authenticity under the assumption that the underlying stream cipher produces a pseudorandom bit stream

    Near Data Processing for Efficient and Trusted Systems

    We live in a world which constantly produces data at a rate which only increases with time. Conventional processor architectures fail to process this abundant data in an efficient manner as they expend significant energy in instruction processing and moving data over deep memory hierarchies. Furthermore, to process large amounts of data in a cost effective manner, there is increased demand for remote computation. While cloud service providers have come up with innovative solutions to cater to this increased demand, the security concerns users feel for their data remain a strong impediment to their wide scale adoption. An exciting technique in our repertoire to deal with these challenges is near-data processing. Near-data processing (NDP) is a data-centric paradigm which moves computation to where data resides. This dissertation exploits NDP both to process the data deluge we face efficiently and to design low-overhead secure hardware. To this end, we first propose Compute Caches, a novel NDP technique. Simple augmentations to the underlying SRAM design enable caches to perform commonly used operations. In-place computation in caches not only avoids excessive data movement over the memory hierarchy, but also significantly reduces instruction processing energy as independent sub-units inside caches perform computation in parallel. Compute Caches significantly improve the performance and reduce the energy expended for a suite of data intensive applications. Second, this dissertation identifies security advantages of NDP. While the memory bus side channel has received much attention, a low-overhead hardware design which defends against it remains elusive. We observe that smart memory, memory with compute capability, can dramatically simplify this problem. To exploit this observation, we propose InvisiMem, which uses the logic layer in the smart memory to implement cryptographic primitives, which aid in addressing the memory bus side channel efficiently. Our solutions obviate the need for expensive constructs like Oblivious RAM (ORAM) and Merkle trees, and have one to two orders of magnitude lower overheads for performance, space, energy, and memory bandwidth, compared to prior solutions. This dissertation also addresses a related vulnerability, the page fault side channel, in which the Operating System (OS) induces page faults to learn the application's address trace and deduces application secrets from it. To tackle it, we propose Sanctuary, which obfuscates the page fault channel while allowing the OS to manage memory as a resource. To do so, we design a novel construct, Oblivious Page Management (OPAM), which is derived from ORAM but is customized for the page management context. We employ near-memory page moves to reduce OPAM overhead and also propose a novel memory partition to reduce the OPAM transactions required. For a suite of cloud applications which process sensitive data, we show that the page fault channel can be tackled at reasonable overheads.
    PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/144139/1/shaizeen_1.pd