Implementing a protected zone in a reconfigurable processor for isolated execution of cryptographic algorithms
We design and realize a protected zone inside a reconfigurable and extensible embedded RISC processor for isolated execution of cryptographic algorithms. The protected zone is a collection of processor subsystems such as functional units optimized for high-speed execution of integer operations, a small amount of local memory, and general and special-purpose registers. We outline the principles for secure software implementation of cryptographic algorithms in a processor equipped with the protected zone. We also demonstrate the efficiency and effectiveness of the protected zone by implementing major cryptographic algorithms, namely RSA, elliptic curve cryptography, and AES, in the protected zone. In terms of time efficiency, software implementations of these three cryptographic algorithms outperform equivalent software implementations on similar processors reported in the literature. The protected zone is designed in such a modular fashion that it can easily be integrated into any RISC processor; its area overhead is moderate enough that it can be used in the vast majority of embedded processors. The protected zone can also provide the necessary support to implement TPM functionality within the boundary of a processor.
Enhancing an Embedded Processor Core with a Cryptographic Unit for Performance and Security
We present a set of low-cost architectural enhancements to accelerate the execution of certain arithmetic operations common in cryptographic applications on an extensible embedded processor core. The proposed enhancements are generic in the sense that they can be beneficially applied in almost any RISC processor. We implemented the enhancements in the form of a cryptographic unit (CU) that offers the programmer an extended instruction set. The CU features a 128-bit wide register file and datapath, which enables it to process 128-bit words and perform 128-bit loads/stores. We analyze the speed-up factors for some arithmetic operations and public-key cryptographic algorithms obtained through these enhancements. In addition, we evaluate the hardware overhead (i.e., silicon area) of integrating the CU into an embedded RISC processor. Our experimental results show that the proposed architectural enhancements allow for a significant performance gain for both RSA and ECC at the expense of an acceptable increase in silicon area. We also demonstrate that the proposed enhancements facilitate the protection of cryptographic algorithms against certain types of side-channel attacks and present an AES implementation hardened against cache-based attacks as a case study.
An Energy-Efficient Reconfigurable DTLS Cryptographic Engine for End-to-End Security in IoT Applications
This paper presents a reconfigurable cryptographic engine that implements the
DTLS protocol to enable end-to-end security for IoT. This implementation of the
DTLS engine demonstrates 10x reduction in code size and 438x improvement in
energy-efficiency over software. Our ECC primitive is 237x and 9x more
energy-efficient compared to software and state-of-the-art hardware
respectively. Pairing the DTLS engine with an on-chip RISC-V allows us to
demonstrate applications beyond DTLS with up to 2 orders of magnitude energy
savings.
Comment: Published in 2018 IEEE International Solid-State Circuits Conference
(ISSCC)
Analysis on the Possibility of RISC-V Adoption
As the interface between hardware and software, Instruction Set Architectures (ISAs) play a key role in the operation of computers. While both hardware and software have continued to evolve rapidly over time, ISAs have undergone minimal change. Since its release in 2010, RISC-V has begun to erode the industry's aversion to ISA innovation. Established on the principles of the Reduced Instruction Set Computer (RISC), and as an open source ISA, RISC-V offers many benefits over popular ISAs like Intel's x86 and Arm Holdings' Advanced RISC Machine (ARM). In this literature review I evaluate the literature discussing two questions: What makes changing Instruction Set Architectures difficult? Why might the industry choose to implement RISC-V? When researching this topic I visited the IEEE (Institute of Electrical and Electronics Engineers), INSPEC (Engineering Village), and ACM (Association for Computing Machinery) Digital Library databases. I used the search terms “RISC-V”, “Instruction Set Architecture”, “RISC-V” AND “x86”, and “RISC-V” AND “Instruction Set Architecture”. This literature review evaluates 10 papers on the implementation of RISC-V. As this paper was intended to cover recent developments in the field, publication dates were limited to 2015 to the present.
An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics
Near-sensor data analytics is a promising direction for IoT endpoints, as it
minimizes energy spent on communication and reduces network load - but it also
poses security concerns, as valuable data is stored or sent over the network at
various stages of the analytics pipeline. Using encryption to protect sensitive
data at the boundary of the on-chip analytics engine is a way to address data
security issues. To cope with the combined workload of analytics and encryption
in a tight power envelope, we propose Fulmine, a System-on-Chip based on a
tightly-coupled multi-core cluster augmented with specialized blocks for
compute-intensive data processing and encryption functions, supporting software
programmability for regular computing tasks. The Fulmine SoC, fabricated in
65nm technology, consumes less than 20mW on average at 0.8V achieving an
efficiency of up to 70pJ/B in encryption, 50pJ/px in convolution, or up to
25MIPS/mW in software. As a strong argument for real-life flexible application
of our platform, we show experimental results for three secure analytics use
cases: secure autonomous aerial surveillance with a state-of-the-art deep CNN
consuming 3.16pJ per equivalent RISC op; local CNN-based face detection with
secured remote recognition in 5.74pJ/op; and seizure detection with encrypted
data collection from EEG within 12.7pJ/op.
Comment: 15 pages, 12 figures, accepted for publication to the IEEE
Transactions on Circuits and Systems - I: Regular Papers
Secure Silicon: Towards Virtual Prototyping
Evaluating security vulnerabilities of software implementations at the design stage is of
primary importance for application developers, yet it has received little attention from
the scientific community. In this paper, we describe virtual prototyping of an
implementation of elliptic curve cryptography (ECC), aiming to make it secure against
first-order horizontal and vertical side-channel attacks (SCAs). Reproducing information
leakage as close to reality as possible requires bit- and clock-cycle accuracy, which we
obtained with the Mentor Graphics ModelSim tool by simulating the execution of the ECC
software implementations on PULPino, an open-source 32-bit microcontroller based on the
recently released RISC-V instruction set architecture. For each clock cycle, we compute
the number of bit toggles in the microcontroller's registers, which serves as an image of
the power consumption, and watch the program counter to identify the assembly instruction
executed and then the corresponding C function. We start with a naive double-and-add
implementation relying on cryptographic primitives of the mbed TLS library (formerly
PolarSSL, before its acquisition by ARM). The virtual analysis pinpoints differences in
the way the double function on one side and the add function on the other side manage
variables and internal operations, which can be exploited for horizontal SCAs. We propose
some modifications of the C code, hence independent of the considered microcontroller,
with an overhead that is extremely small compared to that of the double-and-add-always
countermeasure. Then, we repeat the analyses, still for the mbed TLS library, but using
the regular Montgomery ladder version, the one most used in practice as it is more
efficient.
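The per-cycle bit-toggle count mentioned in this abstract can be illustrated with a minimal sketch. It assumes the register file is available as one list of integers per clock cycle; the function names, the 32-bit width default, and the sample values are hypothetical and are not taken from the paper or its tooling.

```python
def toggle_count(prev: int, curr: int, width: int = 32) -> int:
    """Count the bits that differ between two values of one register."""
    mask = (1 << width) - 1
    return bin((prev ^ curr) & mask).count("1")


def leakage_trace(snapshots):
    """Sum register toggles for each transition between consecutive cycles.

    snapshots: list of register-file states, one list of ints per cycle.
    Returns one toggle total per cycle transition.
    """
    trace = []
    for before, after in zip(snapshots, snapshots[1:]):
        trace.append(sum(toggle_count(b, a) for b, a in zip(before, after)))
    return trace


# Hypothetical three-cycle dump of a two-register file (made-up values).
snapshots = [
    [0x00000000, 0x0000FFFF],
    [0x0000000F, 0x0000FFFF],
    [0x000000FF, 0x00000000],
]
trace = leakage_trace(snapshots)
print(trace)  # [4, 20]
```

A trace like this is only a proxy for power consumption. In this kind of model, cycles in which many register bits flip appear as peaks, which is the sort of difference that could distinguish a double from an add.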
- …