Search CORE

569 research outputs found

Arithmetic Operations in Multi-Valued Logic

Author: Gurumurthy K. S.
Patel Vasundara
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 29/03/2010
Field of study

This paper presents arithmetic operations like addition, subtraction and multiplications in Modulo-4 arithmetic, and also addition, multiplication in Galois field, using multi-valued logic (MVL). Quaternary to binary and binary to quaternary converters are designed using down literal circuits. Negation in modular arithmetic is designed with only one gate. Logic design of each operation is achieved by reducing the terms using Karnaugh diagrams, keeping minimum number of gates and depth of net in to consideration. Quaternary multiplier circuit is proposed to achieve required optimization. Simulation result of each operation is shown separately using Hspice.Comment: 12 Pages, VLSICS Journal 201

arXiv.org e-Print Archive

Crossref

A versatile Montgomery multiplier architecture with characteristic three support

Author: Ozturk Erdinc
Savas Erkay
Savaş Erkay
Sunar Berk
Öztürk Erdinç
Publication venue: 'Elsevier BV'
Publication date: 08/05/2008
Field of study

We present a novel unified core design which is extended to realize Montgomery multiplication in the fields GF(2n), GF(3m), and GF(p). Our unified design supports RSA and elliptic curve schemes, as well as the identity-based encryption which requires a pairing computation on an elliptic curve. The architecture is pipelined and is highly scalable. The unified core utilizes the redundant signed digit representation to reduce the critical path delay. While the carry-save representation used in classical unified architectures is only good for addition and multiplication operations, the redundant signed digit representation also facilitates efficient computation of comparison and subtraction operations besides addition and multiplication. Thus, there is no need for a transformation between the redundant and the non-redundant representations of field elements, which would be required in the classical unified architectures to realize the subtraction and comparison operations. We also quantify the benefits of the unified architectures in terms of area and critical path delay. We provide detailed implementation results. The metric shows that the new unified architecture provides an improvement over a hypothetical non-unified architecture of at least 24.88%, while the improvement over a classical unified architecture is at least 32.07%

Sabanci University Research Database

Area- Efficient VLSI Implementation of Serial-In Parallel-Out Multiplier Using Polynomial Representation in Finite Field GF(2m)

Author: Fatin Gholamreza Zare
Javidan Javad
Nabipour Saeideh
Publication venue
Publication date: 16/07/2020
Field of study

Finite field multiplier is mainly used in elliptic curve cryptography, error-correcting codes and signal processing. Finite field multiplier is regarded as the bottleneck arithmetic unit for such applications and it is the most complicated operation over finite field GF(2m) which requires a huge amount of logic resources. In this paper, a new modified serial-in parallel-out multiplication algorithm with interleaved modular reduction is suggested. The proposed method offers efficient area architecture as compared to proposed algorithms in the literature. The reduced finite field multiplier complexity is achieved by means of utilizing logic NAND gate in a particular architecture. The efficiency of the proposed architecture is evaluated based on criteria such as time (latency, critical path) and space (gate-latch number) complexity. A detailed comparative analysis indicates that, the proposed finite field multiplier based on logic NAND gate outperforms previously known resultsComment: 19 pages, 4 figure

arXiv.org e-Print Archive

Hardware Implementations for Symmetric Key Cryptosystems

Author: El-Razouk Hayssam
Publication venue: Scholarship@Western
Publication date: 05/06/2015
Field of study

The utilization of global communications network for supporting new electronic applications is growing. Many applications provided over the global communications network involve exchange of security-sensitive information between different entities. Often, communicating entities are located at different locations around the globe. This demands deployment of certain mechanisms for providing secure communications channels between these entities. For this purpose, cryptographic algorithms are used by many of today\u27s electronic applications to maintain security. Cryptographic algorithms provide set of primitives for achieving different security goals such as: confidentiality, data integrity, authenticity, and non-repudiation. In general, two main categories of cryptographic algorithms can be used to accomplish any of these security goals, namely, asymmetric key algorithms and symmetric key algorithms. The security of asymmetric key algorithms is based on the hardness of the underlying computational problems, which usually require large overhead of space and time complexities. On the other hand, the security of symmetric key algorithms is based on non-linear transformations and permutations, which provide efficient implementations compared to the asymmetric key ones. Therefore, it is common to use asymmetric key algorithms for key exchange, while symmetric key counterparts are deployed in securing the communications sessions. This thesis focuses on finding efficient hardware implementations for symmetric key cryptosystems targeting mobile communications and resource constrained applications. First, efficient lightweight hardware implementations of two members of the Welch-Gong (WG) family of stream ciphers, the WG

\left(29,11\right)

and WG-

16

, are considered for the mobile communications domain. Optimizations in the WG

\left(29,11\right)

stream cipher are considered when the

GF\left(2^{29}\right)

elements are represented in either the Optimal normal basis type-II (ONB-II) or the Polynomial basis (PB). For WG-

16

, optimizations are considered only for PB representations of the

GF\left(2^{16}\right)

elements. In this regard, optimizations for both ciphers are accomplished mainly at the arithmetic level through reducing the number of field multipliers, based on novel trace properties. In addition, other optimization techniques such as serialization and pipelining, are also considered. After this, the thesis explores efficient hardware implementations for digit-level multiplication over binary extension fields

GF\left(2^{m}\right)

. Efficient digit-level

GF\left(2^{m}\right)

multiplications are advantageous for ultra-lightweight implementations, not only in symmetric key algorithms, but also in asymmetric key algorithms. The thesis introduces new architectures for digit-level

GF\left(2^{m}\right)

multipliers considering the Gaussian normal basis (GNB) and PB representations of the field elements. The new digit-level

GF\left(2^{m}\right)

single multipliers do not require loading of the two input field elements in advance to computations. This feature results in high throughput fast multiplication in resource constrained applications with limited capacity of input data-paths. The new digit-level

GF\left(2^{m}\right)

single multipliers are considered for both the GNB and PB. In addition, for the GNB representation, new architectures for digit-level

GF\left(2^{m}\right)

hybrid-double and hybrid-triple multipliers are introduced. The new digit-level

GF\left(2^{m}\right)

hybrid-double and hybrid-triple GNB multipliers, respectively, accomplish the multiplication of three and four field elements using the latency required for multiplying two field elements. Furthermore, a new hardware architecture for the eight-ary exponentiation scheme is proposed by utilizing the new digit-level

GF\left(2^{m}\right)

hybrid-triple GNB multipliers

Scholarship@Western

Error Detecting Dual Basis Bit Parallel Systolic Multiplication Architecture over GF(2m)

Author: Bera A.
Mathew J.
Pradhan D.
Rahaman H.
Singh Ashutosh Kumar
Publication venue: University of Electronic Science and Technology
Publication date: 01/01/2009
Field of study

An error tolerant hardware efficient very large scale integration (VLSI) architecture for bit parallel systolic multiplication over dual base, which can be pipelined, is presented. Since this architecture has the features of regularity, modularity and unidirectional data flow, this structure is well suited to VLSI implementations. The length of the largest delay path and area of this architecture are less compared to the bit parallel systolic multiplication architectures reported earlier. The architecture is implemented using Austria Micro System's 0.35 m CMOS (complementary metal oxide semiconductor) technology. This architecture can also operate over both the dual-base and polynomial base

espace@Curtin

Efficient Implementation on Low-Cost SoC-FPGAs of TLSv1.2 Protocol with ECC_AES Support for Secure IoT Coordinators

Author: Anane Mohamed
Bellemou Ahmed Mohamed
Benblidia Nadjia
Castillo Encarnación
García Antonio
Parrilla Luis
Álvarez Bermejo José Antonio
Publication venue: 'MDPI AG'
Publication date: 01/01/2019
Field of study

Security management for IoT applications is a critical research field, especially when taking into account the performance variation over the very different IoT devices. In this paper, we present high-performance client/server coordinators on low-cost SoC-FPGA devices for secure IoT data collection. Security is ensured by using the Transport Layer Security (TLS) protocol based on the TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 cipher suite. The hardware architecture of the proposed coordinators is based on SW/HW co-design, implementing within the hardware accelerator core Elliptic Curve Scalar Multiplication (ECSM), which is the core operation of Elliptic Curve Cryptosystems (ECC). Meanwhile, the control of the overall TLS scheme is performed in software by an ARM Cortex-A9 microprocessor. In fact, the implementation of the ECC accelerator core around an ARM microprocessor allows not only the improvement of ECSM execution but also the performance enhancement of the overall cryptosystem. The integration of the ARM processor enables to exploit the possibility of embedded Linux features for high system flexibility. As a result, the proposed ECC accelerator requires limited area, with only 3395 LUTs on the Zynq device used to perform high-speed, 233-bit ECSMs in 413 µs, with a 50 MHz clock. Moreover, the generation of a 384-bit TLS handshake secret key between client and server coordinators requires 67.5 ms on a low cost Zynq 7Z007S device

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Institucional de la Universidad de Almería (Spain)

High speed world level finite field multipliers in F2m

Author: Namin Ashkan Hosseinzadeh
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2009
Field of study

Finite fields have important applications in number theory, algebraic geometry, Galois theory, cryptography, and coding theory. Recently, the use of finite field arithmetic in the area of cryptography has increasingly gained importance. Elliptic curve and El-Gamal cryptosystems are two important examples of public key cryptosystems widely used today based on finite field arithmetic. Research in this area is moving toward finding new architectures to implement the arithmetic operations more efficiently. Two types of finite fields are commonly used in practice, prime field GF(p) and the binary extension field GF(2 m). The binary extension fields are attractive for high speed cryptography applications since they are suitable for hardware implementations. Hardware implementation of finite field multipliers can usually be categorized into three categories: bit-serial, bit-parallel, and word-level architectures. The word-level multipliers provide architectural flexibility and trade-off between the performance and limitations of VLSI implementation and I/O ports, thus it is of more practical significance. In this work, different word level architectures for multiplication using binary field are proposed. It has been shown that the proposed architectures are more efficient compared to similar proposals considering area/delay complexities as a measure of performance. Practical size multipliers for cryptography applications have been realized in hardware using FPGA or standard CMOS technology, to similar proposals considering area/delay complexities as a measure of performance. Practical size multipliers for cryptography applications have been realized in hardware using FPGA or standard CMOS technology. Also different VLSI implementations for multipliers were explored which resulted in more efficient implementations for some of the regular architectures. The new implementations use a simple module designed in domino logic as the main building block for the multiplier. Significant speed improvements was achieved designing practical size multipliers using the proposed methodology

Scholarship at UWindsor

High-Speed Area-Efficient Hardware Architecture for the Efficient Detection of Faults in a Bit-Parallel Multiplier Utilizing the Polynomial Basis of GF(2m)

Author: Javidan Javad
Nabipour Saeideh
Publication venue
Publication date: 23/06/2023
Field of study

The utilization of finite field multipliers is pervasive in contemporary digital systems, with hardware implementation for bit parallel operation often necessitating millions of logic gates. However, various digital design issues, whether natural or stemming from soft errors, can result in gate malfunction, ultimately leading to erroneous multiplier outputs. Thus, to prevent susceptibility to error, it is imperative to employ an effective finite field multiplier implementation that boasts a robust fault detection capability. This study proposes a novel fault detection scheme for a recent bit-parallel polynomial basis multiplier over GF(2m), intended to achieve optimal fault detection performance for finite field multipliers while simultaneously maintaining a low-complexity implementation, a favored attribute in resource-constrained applications like smart cards. The primary concept behind the proposed approach is centered on the implementation of a BCH decoder that utilizes re-encoding technique and FIBM algorithm in its first and second sub-modules, respectively. This approach serves to address hardware complexity concerns while also making use of Berlekamp-Rumsey-Solomon (BRS) algorithm and Chien search method in the third sub-module of the decoder to effectively locate errors with minimal delay. The results of our synthesis indicate that our proposed error detection and correction architecture for a 45-bit multiplier with 5-bit errors achieves a 37% and 49% reduction in critical path delay compared to existing designs. Furthermore, the hardware complexity associated with a 45-bit multiplicand that contains 5 errors is confined to a mere 80%, which is significantly lower than the most exceptional BCH-based fault recognition methodologies, including TMR, Hamming's single error correction, and LDPC-based procedures within the realm of finite field multiplication.Comment: 9 pages, 4 figures. arXiv admin note: substantial text overlap with arXiv:2209.1338

arXiv.org e-Print Archive

Multiple bit error correcting architectures over finite fields

Author: Poolakkaparambil M
Publication venue: 'Oxford Brookes University'
Publication date: 01/01/2012
Field of study

This thesis proposes techniques to mitigate multiple bit errors in GF arithmetic circuits. As GF arithmetic circuits such as multipliers constitute the complex and important functional unit of a crypto-processor, making them fault tolerant will improve the reliability of circuits that are employed in safety applications and the errors may cause catastrophe if not mitigated. Firstly, a thorough literature review has been carried out. The merits of efficient schemes are carefully analyzed to study the space for improvement in error correction, area and power consumption. Proposed error correction schemes include bit parallel ones using optimized BCH codes that are useful in applications where power and area are not prime concerns. The scheme is also extended to dynamically correcting scheme to reduce decoder delay. Other method that suits low power and area applications such as RFIDs and smart cards using cross parity codes is also proposed. The experimental evaluation shows that the proposed techniques can mitigate single and multiple bit errors with wider error coverage compared to existing methods with lesser area and power consumption. The proposed scheme is used to mask the errors appearing at the output of the circuit irrespective of their cause. This thesis also investigates the error mitigation schemes in emerging technologies (QCA, CNTFET) to compare area, power and delay with existing CMOS equivalent. Though the proposed novel multiple error correcting techniques can not ensure 100% error mitigation, inclusion of these techniques to actual design can improve the reliability of the circuits or increase the difficulty in hacking crypto-devices. Proposed schemes can also be extended to non GF digital circuits

Oxford Brookes University: RADAR

Private and Public-Key Side-Channel Threats Against Hardware Accelerated Cryptosystems

Author: Lalonde Dylan Roderick
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2017
Field of study

Modern side-channel attacks (SCA) have the ability to reveal sensitive data from non-protected hardware implementations of cryptographic accelerators whether they be private or public-key systems. These protocols include but are not limited to symmetric, private-key encryption using AES-128, 192, 256, or public-key cryptosystems using elliptic curve cryptography (ECC). Traditionally, scalar point (SP) operations are compelled to be high-speed at any cost to reduce point multiplication latency. The majority of high-speed architectures of contemporary elliptic curve protocols rely on non-secure SP algorithms. This thesis delivers a novel design, analysis, and successful results from a custom differential power analysis attack on AES-128. The resulting SCA can break any 16-byte master key the sophisticated cipher uses and it\u27s direct applications towards public-key cryptosystems will become clear. Further, the architecture of a SCA resistant scalar point algorithm accompanied by an implementation of an optimized serial multiplier will be constructed. The optimized hardware design of the multiplier is highly modular and can use either NIST approved 233 & 283-bit Kobliz curves utilizing a polynomial basis. The proposed architecture will be implemented on Kintex-7 FPGA to later be integrated with the ARM Cortex-A9 processor on the Zynq-7000 AP SoC (XC7Z045) for seamless data transfer and analysis of the vulnerabilities SCAs can exploit

Scholarship at UWindsor