117 research outputs found

    Investigating the VLSI Characterization of Parallel Signed Multipliers for RNS Applications Using FPGAs

    Get PDF
    Signed multiplication is a complex arithmetic operation, which is reflected in its relatively high signal propagation delay, high power dissipation, and large area requirement. High reliability applications such as Cryptography, Residue Number System (RNS) and Digital Signal Processing (DSP)2019;s effective performance is mainly depend on its arithmetic circuit's performance. Trend of using Residue Number System (RNS) instead of Constrain over-whelming Binary representation is promising technique in VLSI Systems and Multiplier is the basic building block of such systems. In this paper we have considered signed Modified Baugh Wooley Multiplier and Modified Booth Encoding (MBE) Multiplier logic for analysis and synthesized on best suited application platform. Analysis has taken account of Delay, Number of Logic Element requirements; Number of Signal Transition for particular sample input and its Power Consumption were analyzed for both Modified Baugh Wooley Multiplier and Modified Booth Encoding Multiplier. Analysis of Multiplier is described in Verilog HDL and Simulated using two different simulators namely Xilinx ISIM and Altera Quartus II. Then for comparative study, both multipliers are synthesized with Xilinx Virtex 7 XCV2000T-2FLG1925 and Altera Cyclone II EP2C35F672C6 and same parameter as discussed above are also evaluated. Booth Recoding provides overall advent of 9.691% in terms of area and approximately 43 % in terms of Delay compared to Modified Baugh Wooley Multiplier implemented using FPGA Technology

    Radix-8 Booth Encoded Modulo Multiplier

    Get PDF
    Abstract To design an efficient integrated circuit in terms of area, power and speed, has become a challenging task in modern VLSI design field. The encryption and decryption of PKC algorithms are performed by repeated modulo multiplications these multiplications differ from those encountered in signal processing and general computing applications. The Residue Number System (RNS) has emerged as a promising alternative number representation for the design of faster and low power multipliers owing to its merit to distribute a long integer multiplication into several shorter and independent modulo multiplications. The multipliers are the essential elements of the digital signal processing such as filtering, convolution, transformations and Inner products. RNS has also been successfully employed to design fault tolerant digital circuits. The modulo multiplier is usually the noncritical data path among all modulo multipliers in such high-DR RNS multiplier. This timing slack can be exploited to reduce the system area and power consumption without compromising the system performance. With this precept, a family of radix-8 Booth encoded modulo multipliers, with delay adaptable to the RNS multiplier delay, is proposed. In this paper, the radix-8 Booth encoded modulo multipliers whose delay can be tuned to match the RNS delay. In the proposed multiplier, the hard multiple is implemented using small word-length ripple carry adders (RCAs) operating in parallel. The carry-out bits from the adders are not propagated but treated as partial product bits to be accumulated in the CSA tree. The delay of the modulo multiplier can be directly controlled by the word-length of the RCAs to equal the delay of the critical modulo multiplier of the RNS. By combining radix-8 Booth encoded modulo multiplier, CSA and prefix architecture of multiplier, for high speed and low-power is achieved

    DESIGN OF RADIX-8 MODIFIED BOOTH ENCODED BASED MODULO 2n+1 MULTIPLIER USING HARD MULTIPLE GENERATOR

    Get PDF
    The method 2n + 1 multiplier is the congestion of a wide drift of applications from silt collection structure arithmetic to cryptanalysis. Recently, with the request for low-power and energy-efficient designs, the radix-8 Booth recoding archaic considered to evolve modulo 2n + 1 multipliers. This temporary presents two different manners to raise the performance and improve the competence of radix-8 modulo 2n + 1 multipliers. The first skill is a manner to far cut down on the part of bias terms that need afterlife organized. The assist routine is a new hard numerous alternator stationed on a parallel-prefix network that computer only for odd positions; it gravitates a lightweight parallel prefix adder for the computing of a third of a portion with significant area-saving and upgraded fan-out. The implementation results positioned on the TSMC 65-nm robotics show improvements of not fully 27% and meantime 57% in the area–time2 stock when compared with the lately planned radix-8 multiplier

    Towards Verifying Nonlinear Integer Arithmetic

    Full text link
    We eliminate a key roadblock to efficient verification of nonlinear integer arithmetic using CDCL SAT solvers, by showing how to construct short resolution proofs for many properties of the most widely used multiplier circuits. Such short proofs were conjectured not to exist. More precisely, we give n^{O(1)} size regular resolution proofs for arbitrary degree 2 identities on array, diagonal, and Booth multipliers and quasipolynomial- n^{O(\log n)} size proofs for these identities on Wallace tree multipliers.Comment: Expanded and simplified with improved result

    Efficient modular arithmetic units for low power cryptographic applications

    Get PDF
    The demand for high security in energy constrained devices such as mobiles and PDAs is growing rapidly. This leads to the need for efficient design of cryptographic algorithms which offer data integrity, authentication, non-repudiation and confidentiality of the encrypted data and communication channels. The public key cryptography is an ideal choice for data integrity, authentication and non-repudiation whereas the private key cryptography ensures the confidentiality of the data transmitted. The latter has an extremely high encryption speed but it has certain limitations which make it unsuitable for use in certain applications. Numerous public key cryptographic algorithms are available in the literature which comprise modular arithmetic modules such as modular addition, multiplication, inversion and exponentiation. Recently, numerous cryptographic algorithms have been proposed based on modular arithmetic which are scalable, do word based operations and efficient in various aspects. The modular arithmetic modules play a crucial role in the overall performance of the cryptographic processor. Hence, better results can be obtained by designing efficient arithmetic modules such as modular addition, multiplication, exponentiation and squaring. This thesis is organized into three papers, describes the efficient implementation of modular arithmetic units, application of these modules in International Data Encryption Algorithm (IDEA). Second paper describes the IDEA algorithm implementation using the existing techniques and using the proposed efficient modular units. The third paper describes the fault tolerant design of a modular unit which has online self-checking capability --Abstract, page iv

    Booth Algorithm with Implementation of UART Module using FPGA

    Get PDF
    FPGA gives high level of flexibility to the user to rapidly construct and test any hardware. It has a lot of gates which are used depending upon the hardware to be implemented. These project aims at designing Booth multiplier using VHDL for signed bit multiplication in FPGA for high speed operations, developed and implemented of UART module required to enable two-way communication between the DE-2 board and computer. It is also designed GUI interface using MATLAB for sending data and enable the output of the process result to be displayed. The Booth multiplier was implemented using the algorithm in both signed and unsigned number and the input and output of the multiplication was successfully achieved and confirmed through simulation. The GUI was implemented and tested, which UART module also performed well for transmitting and receiving of 8-bit width data. In general, the objective of this project was successfully achieved, which, the result of the component part were able to be tested

    Efficient Computation and FPGA implementation of Fully Homomorphic Encryption with Cloud Computing Significance

    Get PDF
    Homomorphic Encryption provides unique security solution for cloud computing. It ensures not only that data in cloud have confidentiality but also that data processing by cloud server does not compromise data privacy. The Fully Homomorphic Encryption (FHE) scheme proposed by Lopez-Alt, Tromer, and Vaikuntanathan (LTV), also known as NTRU(Nth degree truncated polynomial ring) based method, is considered one of the most important FHE methods suitable for practical implementation. In this thesis, an efficient algorithm and architecture for LTV Fully Homomorphic Encryption is proposed. Conventional linear feedback shift register (LFSR) structure is expanded and modified for performing the truncated polynomial ring multiplication in LTV scheme in parallel. Novel and efficient modular multiplier, modular adder and modular subtractor are proposed to support high speed processing of LFSR operations. In addition, a family of special moduli are selected for high speed computation of modular operations. Though the area keeps the complexity of O(Nn^2) with no advantage in circuit level. The proposed architecture effectively reduces the time complexity from O(N log N) to linear time, O(N), compared to the best existing works. An FPGA implementation of the proposed architecture for LTV FHE is achieved and demonstrated. An elaborate comparison of the existing methods and the proposed work is presented, which shows the proposed work gains significant speed up over existing works

    Design And Implementation Of Rsa Cryptosystem Using Partially Interleaved Modular Karatsuba-ofman Multiplier

    Get PDF
    Tez (Yüksek Lisans) -- İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, 2012Thesis (M.Sc.) -- İstanbul Technical University, Institute of Science and Technology, 2012Kriptografinin, yani şifreleme biliminin önemi gün geçtikçe artmaktadır. Kullanılan teknoloji ne olursa olsun güvenli iletişim her zaman en başta gelen ihtiyaçlardan birisi olacaktır. Günümüzde şifreleme, sistemlere kullanıcı hesabıyla giriş yapılıyorken, internetten herhangi bir hizmet ya da ürün satın alınıyorken, iletişim araçları kullanılıyorken, araçların kapıları uzaktan kilitlenip açılıyorken ve daha birçok yerde kullanılmaktadır. Kullanım alanına ve güvenlik ihtiyacının niteliğine göre değişik şifreleme algoritmaları farklı teknolojilerle karşımıza çıkmaktadır. Bu algoritmaların bir tanesi de Rivest-Shamir-Adleman(RSA) şifreleme sistemidir. RSA bankacılık başta olmak üzere birçok sektörde şifreleme ve sayısal imza işlemleri için sıklıkla kullanılmaktadır. RSA algoritması gücünü çok büyük sayıların asal çarpanlarına ayrılmalarındaki zorluktan almaktadır. Özetle çok büyük sayılarla yapılan modüler üs alma işlemlerinden oluşmaktadır. Modüler üs alma işlemleri de özünde modüler çarpma işlemlerinden ibaret olduğu için hızlı bir RSA gerçeklemesi ancak hızlı modüler çarpma işlemi yapan bir tasarımla mümkün olmaktadır. Güvenlik ve hız gibi sebeplerden ötürü RSA kriptosistemi genellikle modüler üs alma işleminin yazılımda gerçeklenmesi ve modüler çarpma işlemlerinin de donanımda tasarlanan özel bloklarla yapılması yoluyla gerçeklenir. Modüler üs alma işlemi için birçok yöntem mevcuttur. Bu yöntemler modüler üs alma işlemi esnasında yapılan modüler çarpma işlemi sayısını değişik yöntemlerle en aza indirmeye çalışırlar. Modüler çarpma işlemi için de bilim dünyasında epeyi çalışmalar yapılmıştır. Modüler çarpma, önce çarpıp sonra indirgeme yapma ya da çarpma ve indirgeme işlemlerini iç-içe yapma gibi iki yöntemle mümkündür. Alan kısıtlamaları sebebiyle çoğunlukla ikinci metot tercih edilmektedir. İndirgeme işleminin uygulanma yönüne göre modüler çarpma algoritmaları soldan sağa doğru işleyenler ve sağdan sola doğru işleyenler olmak üzere ikiye ayrılır. Soldan sağa doğru işleyen yöntemlerin en bilindik örnekleri Blakley ve Barrett algoritmalarıdır. Sağdan sola doğru indirgeme yapan sınıfa ise tek örnek Montgomery yöntemidir. Hızlı çarpma yapan Karatsuba-Ofman, Schönhage-Strassen gibi yöntemler olsa da bu hızlı çarpıcılar indirgeme algoritmalarıyla birleştirilememektedirler. Bu sorun paralel çalışmaya izin vermeyen indirgeme yöntemlerinden kaynaklanmaktadır. Ancak Kaihara ve Takagi tarafından bilim dünyasına sunulan İki Parçalı Modüler çarpma yöntemi bu soruna bir nebze de olsa çözüm bulmaktadır. Bu indirgeme metodu Montgomery algoritmasının sahip olduğu bir özelliği kullanarak modüler çarpmadaki çarpanı ikiye ayırmakta ve böylelikle Blakley ve Montgomery paralel olarak çalışabilmektedir. Hızlı çarpma algoritmalarından birisi olan Karatsuba-Ofman yöntemiyle İki Parçalı indirgemeyi birleştiren ilk çalışma Gökay Saldamlı tarafından ortaya atılmıştır. İki Parçalı Örgü Karatsuba-Ofman çarpıcısı iki parçalı indirgemeyi Karatsuba-Ofman rekürsif çarpma yönteminin en üst katmanında birleştirmektedir. Bu yeni yöntemin gerçeklenmesinde Montgomery çarpıcısına, Blakley modüler çarpıcısına ve standart çarpma yapan bloklara ihtiyaç vardır. Ancak bu algoritma Montgomery ve Blakley modüllerinde literatürdeki daha önceki gerçeklemelerde bulunmayan değerlerin de hesaplanmasına gereksinim duymaktadır. Bu tezde İki Parçalı Modüler Örgü Karatsuba-Ofman çarpıcısının donanımda gerçeklenmesine ait iki çalışma ve bu tasarımlardan birisi kullanılarak gerçeklenen RSA kriptosistemi anlatılmaktadır. Bu çalışmalardan ilki çarpanı birer bit işleyen gerçeklemedir. FPGA teknolojisinde gerçeklenmiştir. İkinci gerçekleme ise ilk tasarımdaki eksikleri kapatan ve daha hızlı bir modüler çarpma için kodlama yöntemleri, daha fazla sayıda bit işleme, donanımın çalışma frekansını arttırmak için en büyük gecikmeye sahip yolu kontrol sinyallerini kullanarak optimize etmeye çalışan bir ASIC gerçeklemesidir. Bu tezdeki donanım tasarımları İki Parçalı Örgü Karatsuba-Ofman çarpma yönteminin ilk gerçeklemeleridirler. Montgomery ve Blakley algoritmaları bu yeni yöntem için yeniden düzenlenmiştir. Tasarımların ikisi için de Maple’da kütüphaneler oluşturulmuş ve iki tasarım da Maple ortamında donanımla aynı yapıda gerçeklenmiş, gerekli testler yapılmış ve simülatörlerden gelen sonuçlarla yazılım gerçeklemesinden gelen sonuçlar karşılaştırılarak tasarımların doğru çalıştığı kanıtlanmıştır. İkinci modüler çarpma gerçeklemesi RSA kriptosistemi içinde modüler çarpıcı olarak kullanılmış ve piyasadaki RSA kırmıklarıyla karşılaştırılabilir sonuçlara ulaşılmıştır.Importance of cryptography is becoming more and more important day by day. Secure communication will always be a crucial need independent from the technology in use. Applications of cryptography can be seen in online banking, purchases of goods and services by means of credit cards, ID cards, remote lock and start systems for cars, telecommunication, and in many more places. Different cryptosystems are employed according to different needs and different security levels. Rivest-Shamir-Adleman(RSA) Algorithm is one of the most popular cryptosystem that is used in many sectors like banking. It takes its strength from factorization of very large integers. Briefly, it consists of modular exponentiations with large integers which are 1024-bit or 2048-bit numbers. As modular exponentiation operations are fundamentally composed of modular multiplications, designing a fast RSA cryptosystem becomes possible only with a fast modular multiplier implementation. Due to security and speed issues, modular exponentiation in RSA is implemented in software and modular multiplications are carried out in modular multipliers which are implemented in hardware. Many methods exist in the literature in order to perform modular exponentiation. These methods try to reduce the total number of multiplications that are required for a modular exponetiation operation. In addition to modular exponentiation algorithms, there are several techniques to do modular multiplication as well. Modular multiplication may be carried out either by multiplying first and performing reduction later on, or by interleaving multiplication and reduction stages. In order to reach compact hardware designs the latter approach is utilized. According to the direction of reduction operation, modular multiplication algorithms may be classified into two groups, which are algorithms reducing from left-to-right and from right-to-left. The most well known examples of left-to-right approach are Blakley and Barrett algorithms, whereas Montgomery multiplication is the only member of the right-to-left modular multiplication. Although fast multiplication algorithms, such as Karatsuba-Ofman(KO), Schönhage-Strassen exist, these multiplication methods can not be interleaved with reduction algorithms. This is caused from the reduction approaches not allowing parallel processing. However, Bipartite Modular Multiplication(BMM) method introduced by Kaihara and Takagi proposes a partial solution to the parallel reduction problem. This reduction method modifies a feature of Montgomery algorithm. This modification separates the multiplier bits into two halves so that a product can be reduced from left and right simultaneously without a dependency issue. The first method which combines Karatsuba-Ofman multiplication with bipartite reduction was proposed by Gökay Saldamlı. His algoritm, namely Partially Interleaved Modular Karatsuba-Ofman Multiplication, interleaves KO multiplier with the bipartite reduction on the uppermost layer of KO’s recursion. Implementation of this new method requires Montgomery multiplier, Blakley multiplier and standard integer multipliers. However, for this approach, both Montgomery and Blakley multipliers are needed to be designed in a way that, they compute not only the modular multiplication result, but also quotient values, what is somewhat different than existing implementations. In this thesis, two hardware implementations of Partially Interleaved Modular KO Multiplication and an RSA implementation utilizing the designed multiplier are proposed. The first design is Radix-2 implementation on FPGA technology. The second hardware implementation explores the design methodologies in order to reach a faster modular multiplier. These design methodologies include employing high radices, optimization of critical path according to the effect of control signals and analyzing parameter dependencies in Partially Interleaved Modular KO Multiplier and scheduling jobs effectively. Improved design was implemented on ASIC technology using 90 nm TSMC standard cell libraries in Design Compiler. Hardware implementations proposed in this thesis are the first hardware implementations of Partially Interleaved Modular KO Multiplication method. Montgomery and Blakley multiplication algorithms were modified in order to produce desired results. Maple libraries which emulate the operation of hardware building blocks were written and both hardware implementations were firstly implemented in Maple. Tests were run with random input vectors and when correct operation of Maple implementations were verified, hardwares were described in VHDL and implemented using FPGA and ASIC design tools respectively. The second implementation of Partially Interleaved Modular KO multiplier was used as a modular multiplier for RSA cryptosystem and very promising results which are comparable with commercial RSA chips were achieved.Yüksek LisansM.Sc

    A high-performance inner-product processor for real and complex numbers.

    Get PDF
    A novel, high-performance fixed-point inner-product processor based on a redundant binary number system is investigated in this dissertation. This scheme decreases the number of partial products to 50%, while achieving better speed and area performance, as well as providing pipeline extension opportunities. When modified Booth coding is used, partial products are reduced by almost 75%, thereby significantly reducing the multiplier addition depth. The design is applicable for digital signal and image processing applications that require real and/or complex numbers inner-product arithmetic, such as digital filters, correlation and convolution. This design is well suited for VLSI implementation and can also be embedded as an inner-product core inside a general purpose or DSP FPGA-based processor. Dynamic control of the computing structure permits different computations, such as a variety of inner-product real and complex number computations, parallel multiplication for real and complex numbers, and real and complex number division. The same structure can also be controlled to accept redundant binary number inputs for multiplication and inner-product computations. An improved 2's-complement to redundant binary converter is also presented
    corecore