1,338 research outputs found

    Chameleon: A Hybrid Secure Computation Framework for Machine Learning Applications

    Get PDF
    We present Chameleon, a novel hybrid (mixed-protocol) framework for secure function evaluation (SFE) which enables two parties to jointly compute a function without disclosing their private inputs. Chameleon combines the best aspects of generic SFE protocols with the ones that are based upon additive secret sharing. In particular, the framework performs linear operations in the ring Z2l\mathbb{Z}_{2^l} using additively secret shared values and nonlinear operations using Yao's Garbled Circuits or the Goldreich-Micali-Wigderson protocol. Chameleon departs from the common assumption of additive or linear secret sharing models where three or more parties need to communicate in the online phase: the framework allows two parties with private inputs to communicate in the online phase under the assumption of a third node generating correlated randomness in an offline phase. Almost all of the heavy cryptographic operations are precomputed in an offline phase which substantially reduces the communication overhead. Chameleon is both scalable and significantly more efficient than the ABY framework (NDSS'15) it is based on. Our framework supports signed fixed-point numbers. In particular, Chameleon's vector dot product of signed fixed-point numbers improves the efficiency of mining and classification of encrypted data for algorithms based upon heavy matrix multiplications. Our evaluation of Chameleon on a 5 layer convolutional deep neural network shows 133x and 4.2x faster executions than Microsoft CryptoNets (ICML'16) and MiniONN (CCS'17), respectively

    Multi-Unit Serial Polynomial Multiplier to Accelerate NTRU-Based Cryptographic Schemes in IoT Embedded Systems

    Get PDF
    Concern for the security of embedded systems that implement IoT devices has become a crucial issue, as these devices today support an increasing number of applications and services that store and exchange information whose integrity, privacy, and authenticity must be adequately guaranteed. Modern lattice-based cryptographic schemes have proven to be a good alternative, both to face the security threats that arise as a consequence of the development of quantum computing and to allow efficient implementations of cryptographic primitives in resource-limited embedded systems, such as those used in consumer and industrial applications of the IoT. This article describes the hardware implementation of parameterized multi-unit serial polynomial multipliers to speed up time-consuming operations in NTRU-based cryptographic schemes. The flexibility in selecting the design parameters and the interconnection protocol with a general-purpose processor allow them to be applied both to the standardized variants of NTRU and to the new proposals that are being considered in the post-quantum contest currently held by the National Institute of Standards and Technology, as well as to obtain an adequate cost/performance/security-level trade-off for a target application. The designs are provided as AXI4 bus-compliant intellectual property modules that can be easily incorporated into embedded systems developed with the Vivado design tools. The work provides an extensive set of implementation and characterization results in devices of the Xilinx Zynq-7000 and Zynq UltraScale+ families for the different sets of parameters defined in the NTRUEncrypt standard. It also includes details of their plug and play inclusion as hardware accelerators in the C implementation of this public-key encryption scheme codified in the LibNTRU library, showing that acceleration factors of up to 3.1 are achieved when compared to pure software implementations running on the processing systems included in the programmable devices.European Union 952622Ministerio de Ciencia e Innovación PID2020-116664RB100, 10.13039/50110001103

    Low-Weight Primes for Lightweight Elliptic Curve Cryptography on 8-bit AVR Processors

    Get PDF
    Small 8-bit RISC processors and micro-controllers based on the AVR instruction set architecture are widely used in the embedded domain with applications ranging from smartcards over control systems to wireless sensor nodes. Many of these applications require asymmetric encryption or authentication, which has spurred a body of research into implementation aspects of Elliptic Curve Cryptography (ECC) on the AVR platform. In this paper, we study the suitability of a special class of finite fields, the so-called Optimal Prime Fields (OPFs), for a "lightweight" implementation of ECC with a view towards high performance and security. An OPF is a finite field Fp defined by a prime of the form p = u*2^k + v, whereby both u and v are "small" (in relation to 2^k) so that they fit into one or two registers of an AVR processor. OPFs have a low Hamming weight, which allows for a very efficient implementation of the modular reduction since only the non-zero words of p need to be processed. We describe a special variant of Montgomery multiplication for OPFs that does not execute any input-dependent conditional statements (e.g. branch instructions) and is, hence, resistant against certain side-channel attacks. When executed on an Atmel ATmega processor, a multiplication in a 160-bit OPF takes just 3237 cycles, which compares favorably with other implementations of 160-bit modular multiplication on an 8-bit processor. We also describe a performance-optimized and a security-optimized implementation of elliptic curve scalar multiplication over OPFs. The former uses a GLV curve and executes in 4.19M cycles (over a 160-bit OPF), while the latter is based on a Montgomery curve and has an execution time of approximately 5.93M cycles. Both results improve the state-of-the-art in lightweight ECC on 8-bit processors

    Versatile Montgomery Multiplier Architectures

    Get PDF
    Several algorithms for Public Key Cryptography (PKC), such as RSA, Diffie-Hellman, and Elliptic Curve Cryptography, require modular multiplication of very large operands (sizes from 160 to 4096 bits) as their core arithmetic operation. To perform this operation reasonably fast, general purpose processors are not always the best choice. This is why specialized hardware, in the form of cryptographic co-processors, become more attractive. Based upon the analysis of recent publications on hardware design for modular multiplication, this M.S. thesis presents a new architecture that is scalable with respect to word size and pipelining depth. To our knowledge, this is the first time a word based algorithm for Montgomery\u27s method is realized using high-radix bit-parallel multipliers working with two different types of finite fields (unified architecture for GF(p) and GF(2n)). Previous approaches have relied mostly on bit serial multiplication in combination with massive pipelining, or Radix-8 multiplication with the limitation to a single type of finite field. Our approach is centered around the notion that the optimal delay in bit-parallel multipliers grows with logarithmic complexity with respect to the operand size n, O(log3/2 n), while the delay of bit serial implementations grows with linear complexity O(n). Our design has been implemented in VHDL, simulated and synthesized in 0.5μ CMOS technology. The synthesized net list has been verified in back-annotated timing simulations and analyzed in terms of performance and area consumption

    Transparent code authentication at the processor level

    Get PDF
    The authors present a lightweight authentication mechanism that verifies the authenticity of code and thereby addresses the virus and malicious code problems at the hardware level eliminating the need for trusted extensions in the operating system. The technique proposed tightly integrates the authentication mechanism into the processor core. The authentication latency is hidden behind the memory access latency, thereby allowing seamless on-the-fly authentication of instructions. In addition, the proposed authentication method supports seamless encryption of code (and static data). Consequently, while providing the software users with assurance for authenticity of programs executing on their hardware, the proposed technique also protects the software manufacturers’ intellectual property through encryption. The performance analysis shows that, under mild assumptions, the presented technique introduces negligible overhead for even moderate cache sizes

    Reconfigurable Architecture for Elliptic Curve Cryptography Using FPGA

    Get PDF
    The high performance of an elliptic curve (EC) crypto system depends efficiently on the arithmetic in the underlying finite field. We have to propose and compare three levels of Galois Field , , and . The proposed architecture is based on Lopez-Dahab elliptic curve point multiplication algorithm, which uses Gaussian normal basis for field arithmetic. The proposed is based on an efficient Montgomery add and double algorithm, also the Karatsuba-Ofman multiplier and Itoh-Tsujii algorithm are used as the inverse component. The hardware design is based on optimized finite state machine (FSM), with a single cycle 193 bits multiplier, field adder, and field squarer. The another proposed architecture is based on applications for which compactness is more important than speed. The FPGA’s dedicated multipliers and carry-chain logic are used to obtain the small data path. The different optimization at the hardware level improves the acceleration of the ECC scalar multiplication, increases frequency and the speed of operation such as key generation, encryption, and decryption. Finally, we have to implement our design using Xilinx XC4VLX200 FPGA device
    • …
    corecore