10 research outputs found

    Optimizing S-box Implementations for Several Criteria using SAT Solvers

    We explore the feasibility of applying SAT solvers to optimizing implementations of small functions such as S-boxes for multiple optimization criteria, e.g., the number of nonlinear gates and the number of gates. We provide optimized implementations for the S-boxes used in Ascon, ICEPOLE, Joltik/Piccolo, Keccak/Ketje/Keyak, LAC, Minalpher, PRIMATEs, Prøst, and RECTANGLE, most of which are candidates in the second round of the CAESAR competition. We then suggest a new method to optimize for circuit depth and we make tooling publicly available to find efficient implementations for several criteria. Furthermore, we illustrate with the 5-bit S-box of PRIMATEs how multiple optimization criteria can be combined.

    Bitsliced Software Implementation of the Kalyna Cipher Oriented Towards the Use of SIMD Instructions of Microprocessors with x86-64 Architecture

    The article is devoted to a bitsliced software implementation of the Kalyna cipher using the SSE, AVX, and AVX-512 vector instructions for x86-64 processors. The advantages and disadvantages of different approaches to efficient and secure software implementation of block ciphers are analyzed. It is noted that bitslicing combines high speed with resistance to timing and cache attacks, but that it has not yet been applied to the Kalyna cipher. The basic approaches to data representation and to performing cipher operations in bitsliced format are considered, with special attention paid to the efficient implementation of the SubBytes operation, which largely determines the final performance. It is shown that existing methods for minimizing logical functions either fail to produce a result in bitsliced format for 8-bit non-algebraic SBoxes or give results that are far from optimal. A heuristic algorithm is proposed for minimizing the logic functions that describe the Kalyna SBoxes using the AND, OR, XOR, and NOT operations available in the instruction sets of low- and high-end processors. The results show that a bitsliced description of one SBox requires about 520 gates, which is significantly fewer than other methods achieve. Possible ways to increase performance by regrouping data in bitsliced variables before and after the SubBytes operation are indicated, which results in more efficient use of vector registers. The performance of the bitsliced Kalyna implementations was measured using the Microsoft and GCC C++ compilers on an Intel Xeon Skylake-SP processor. The results can also be transferred to processors that do not support SIMD instructions, including low-end ones, to increase resistance to side-channel attacks. They also enable a transition to an ASIC- or FPGA-based bitsliced hardware implementation of Kalyna.

    A Note on 5-bit Quadratic Permutations’ Classification

    Classification of vectorial Boolean functions up to affine equivalence is used widely to analyze various cryptographic and implementation properties of symmetric-key algorithms. We show that there exist 75 affine equivalence classes of 5-bit quadratic permutations. Furthermore, we explore important cryptographic properties of these classes, such as linear and differential properties and degrees of their inverses, together with multiplicative complexity and existence of uniform threshold realizations

    Heuristic Method for Finding a Bitsliced Description of Arbitrary Cryptographic S-Boxes

    The bitsliced approach to implementing block ciphers combines advantages such as potentially high speed, security, and low demands on computing resources. The main problem in moving to a bitsliced description of a cipher is representing the S-Box with a minimum number of logical operations. Known methods of minimizing the logical description of an S-Box have a number of limitations: for example, they work only with small S-Boxes or are slow or inefficient, which generally hinders the use of the bitsliced approach. The paper proposes a new heuristic method for the bitsliced description of arbitrary cryptographic S-Boxes and compares its efficiency with existing methods using the S-Boxes of the DES cipher as an example. The proposed method targets software implementation in the logical basis AND, OR, XOR, NOT, which allows implementation using standard logical instructions on any 8/16/32/64-bit processor. The method uses a number of heuristic techniques, such as fast algorithms for exhaustive search at shallow depth, a flexible procedure for planning the search process, and depth-first search, which together provide high efficiency and speed. This makes it possible to adapt the method to minimizing 8×8 S-Boxes, which is highly relevant today for many block ciphers, including the Ukrainian cipher "Kalyna". The proposed approach to the bitsliced description of arbitrary S-Boxes eliminates the limitations of known methods of such representation, which have restrained the use of the bitsliced approach in improving software implementations of block ciphers for a wide range of processor architectures.
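
To make the idea of a bitsliced S-Box description concrete, here is a minimal sketch. It does not reproduce the heuristic method of the abstract above, and it does not use a DES or Kalyna S-Box: as a stand-in it takes the well-known 5-bit Keccak chi mapping, whose gate-level form is short enough to write by hand. The 64-bit word width and all function names are assumptions chosen for illustration.

```python
WIDTH = 64                 # assumed word size: 64 S-box evaluations per word
MASK = (1 << WIDTH) - 1    # keeps Python integers within WIDTH bits after NOT
N = 5                      # chi operates on 5-bit rows


def chi_bitsliced(x):
    """Evaluate chi on bitsliced input.

    x[i] is a word whose lane j holds bit i of the j-th input value.
    Uses 5 NOT, 5 AND and 5 XOR word operations for all lanes at once.
    """
    return [x[i] ^ ((~x[(i + 1) % N] & MASK) & x[(i + 2) % N]) for i in range(N)]


def chi_reference(v):
    """Bit-by-bit reference evaluation of chi on a single 5-bit value."""
    bits = [(v >> i) & 1 for i in range(N)]
    out = [bits[i] ^ ((bits[(i + 1) % N] ^ 1) & bits[(i + 2) % N]) for i in range(N)]
    return sum(b << i for i, b in enumerate(out))


def to_slices(values):
    """Transpose a list of 5-bit values into 5 bitsliced words."""
    return [sum(((v >> i) & 1) << lane for lane, v in enumerate(values))
            for i in range(N)]


def from_slices(slices, count):
    """Transpose 5 bitsliced words back into `count` 5-bit values."""
    return [sum(((slices[i] >> lane) & 1) << i for i in range(N))
            for lane in range(count)]


if __name__ == "__main__":
    inputs = list(range(32))
    outputs = from_slices(chi_bitsliced(to_slices(inputs)), len(inputs))
    print(outputs == [chi_reference(v) for v in inputs])
```

The sketch evaluates all 32 possible inputs with 15 word-level logical operations, which is the property that makes the number of gates in the S-Box description the dominant cost of a bitsliced cipher.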

    Changing of the Guards: a simple and efficient method for achieving uniformity in threshold sharing

    Since they were first proposed as a countermeasure against differential power analysis (DPA) in 2006, threshold schemes have attracted a lot of attention from the community concentrating on cryptographic implementations. What makes threshold schemes so attractive from an academic point of view is that they come with an information-theoretic proof of resistance against a specific subset of side-channel attacks: first-order DPA. From an industrial point of view they are attractive because a careful threshold implementation forces adversaries to resort to higher-order DPA, with all its problems such as noise amplification. A threshold scheme that offers the mentioned provable security must exhibit three properties: correctness, incompleteness and uniformity. A threshold scheme becomes more expensive with the number of shares that must be implemented, and the required number of shares is lower-bounded by the algebraic degree of the function being shared plus 1. Defining a correct and incomplete sharing of a function of degree d in d+1 shares is straightforward. However, up to now there is no generic method to achieve uniformity, and finding uniform sharings of degree-d functions with d+1 shares is an active research area. In this paper we present a simple and relatively cheap method to find a correct, incomplete and uniform (d+1)-share threshold scheme for any S-box layer consisting of degree-d invertible S-boxes. The uniformity is not implemented in the sharings of the individual S-boxes but rather at the S-box layer level by the use of feed-forward and some expansion of shares. When applied to the Keccak-p nonlinear step Chi, its cost is very small.
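
The claim that a correct and incomplete sharing of a degree-d function in d+1 shares is straightforward can be illustrated for Chi, which has degree 2 and therefore needs 3 shares. The sketch below is the direct three-share construction, not the paper's feed-forward method, and it makes no attempt at uniformity. The function names are assumptions for illustration.

```python
from itertools import product

N = 5  # Chi acts on 5-bit rows


def chi(x):
    """Unshared Chi on a list of 5 bits: y_i = x_i + (x_{i+1} + 1) * x_{i+2}."""
    return [x[i] ^ ((x[(i + 1) % N] ^ 1) & x[(i + 2) % N]) for i in range(N)]


def output_share(p, q):
    """One output share, computed from only two of the three input shares.

    Taking just (p, q) as arguments is what makes the sharing incomplete:
    the third input share cannot influence this output share.
    """
    return [
        p[i]
        ^ ((p[(i + 1) % N] ^ 1) & p[(i + 2) % N])
        ^ (p[(i + 1) % N] & q[(i + 2) % N])
        ^ (q[(i + 1) % N] & p[(i + 2) % N])
        for i in range(N)
    ]


def chi_shared(a, b, c):
    """Three-share Chi: each output share omits exactly one input share."""
    return output_share(b, c), output_share(c, a), output_share(a, b)


def xor3(a, b, c):
    """Recombine three shares into the value they represent."""
    return [u ^ v ^ w for u, v, w in zip(a, b, c)]


def check_correctness():
    """Exhaustively verify that the output shares recombine to Chi(a + b + c)."""
    rows = list(product((0, 1), repeat=N))
    return all(
        xor3(*chi_shared(a, b, c)) == chi(xor3(a, b, c))
        for a in rows for b in rows for c in rows
    )


if __name__ == "__main__":
    print(check_correctness())
```

Correctness is checked over all 32768 input sharings, and incompleteness holds by construction. Uniformity is deliberately left unchecked, since achieving it is the problem the abstract addresses.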

    Optimizing S-Box Implementations for Several Criteria Using SAT Solvers


    Translation of Algorithmic Descriptions of Discrete Functions to SAT with Applications to Cryptanalysis Problems

    In the present paper, we propose a technology for translating algorithmic descriptions of discrete functions to SAT. The proposed technology is aimed at applications in algebraic cryptanalysis. We describe how cryptanalysis problems are reduced to SAT in such a way that it should be perceived as natural by the cryptographic community. In the theoretical part of the paper we justify the main principles of general reduction to SAT for discrete functions from a class containing the majority of functions employed in cryptography. Then, we describe the Transalg software tool developed based on these principles with SAT-based cryptanalysis specifics in mind. We demonstrate the results of applying Transalg to the construction of a number of attacks on various cryptographic functions. Some of the corresponding attacks are state of the art. We compare the functional capabilities of the proposed tool with those of other domain-specific software tools which can be used to reduce cryptanalysis problems to SAT, and also with the CBMC system widely employed in symbolic verification. The paper also presents extensive experimental data, obtained using the SAT solvers that took first place at the SAT competitions in recent years.

    Tornado: Automatic Generation of Probing-Secure Masked Bitsliced Implementations

    Cryptographic implementations deployed in real-world devices often aim at (provable) security against the powerful class of side-channel attacks while keeping reasonable performance. Last year at Asiacrypt, a new formal verification tool named tightPROVE was put forward to exactly determine whether a masked implementation is secure in the well-deployed probing security model for any given security order t. Also recently, a compiler named Usuba was proposed to automatically generate bitsliced implementations of cryptographic primitives. This paper goes one step further in security and performance achievements with a new automatic tool named Tornado. In a nutshell, from the high-level description of a cryptographic primitive, Tornado produces a functionally equivalent bitsliced masked implementation at any desired order, proven secure in the probing model and additionally in the so-called register probing model, which much better fits the reality of software implementations. This framework is obtained by integrating Usuba with tightPROVE+, which extends tightPROVE with the ability to verify the security of implementations in the register probing model and to fix them by inserting refresh gadgets at carefully chosen locations. We demonstrate Tornado on the lightweight cryptographic primitives selected for the second round of the NIST competition that claim in some way to be masking-friendly. It displays the performance of the resulting masked implementations for several masking orders and proves their security in the register probing model.

    Optimizing Implementations of Lightweight Building Blocks

    We study the synthesis of small functions used as building blocks in lightweight cryptographic designs in terms of hardware implementations. This phase most notably appears during the ASIC implementation of cryptographic primitives. The quality of this step directly affects the output circuit, and while general tools exist to carry out this task, most of them belong to proprietary software suites and apply heuristics to functions of any size. In this work, we focus on small functions (4- and 8-bit mappings) and look for their optimal implementations on a specific weighted instruction set which allows fine tuning of the technology. We propose a tool named LIGHTER, based on two related algorithms, that produces optimized implementations of small functions. To demonstrate the validity and usefulness of our tool, we applied it to two practical cases: first, linear permutations that define diffusion in most SPN ciphers; second, non-linear 4-bit permutations that are used in many lightweight block ciphers. For linear permutations, we exhibit several new MDS diffusion matrices lighter than the state of the art, and we also decrease the implementation cost of several already known MDS matrices. As for non-linear permutations, LIGHTER outperforms the area-optimized synthesis of the state-of-the-art academic tool ABC. Smaller circuits can also be reached when ABC and LIGHTER are used jointly.

    Performance Evaluation, Comparison and Improvement of the Hardware Implementations of the Advanced Encryption Standard S-box

    The Advanced Encryption Standard (AES) is the most popular algorithm used in symmetric key cryptography. The efficient computation of AES is essential for many computing platforms. The S-box is the only nonlinear transformation step of the AES algorithm. Efficient implementation of the AES S-box is crucial for AES hardware. The AES S-box can be implemented using a look-up table or using finite field arithmetic. The finite field arithmetic design approach is superior in saving hardware resources. The main objective of this thesis is to evaluate, compare and improve the hardware implementations of the forward, inverse and combined AES S-box in terms of area and/or delay. Both the composite field GF((2^4)^2) and the tower field GF(((2^2)^2)^2) are considered. Our first improvement is the optimization of the input and output linear mappings of the S-box in order to design a more compact circuit. Our second improvement aims at modifying the architecture of the S-box to achieve a higher speed. We used multiplication of the S-box input by an 8-bit binary field element to optimize the input and output transformation matrices of the S-box. A Matlab® search is then conducted to find more compact linear mappings for the S-box. We also modified the fast S-box architecture, in addition to optimizing and searching the extended linear input mappings, to improve the speed of the fast S-box of Reyhani et al. The improved fast S-box, Fast 3, is the fastest and most efficient (measured by area × delay) AES S-box available in the literature, to the best of our knowledge. We also improved the area and delay of the inversion circuit of the lightweight and fast S-boxes in [69], by slightly modifying the exponentiation block and designing a new subfield inverter block. The improved inversion circuit leads to a more compact and faster lightweight S-box, and it yields a lower-area fast S-box. Moreover, we show that the “tech. XORs” concept proposed by Maximov et al. [54] to estimate the delay of the S-box is not accurate. We show how to use the logical effort method [74] instead to estimate and compare the delay of previous and improved S-boxes, regardless of the CMOS technology library used for the implementation. We verified all the code at the RTL level using Mentor Graphics Modelsim®, by comparing against the legitimate S-box outputs. We synthesized the designs using the STM 65nm CMOS standard cell library, and we used VHDL coding as the design entry method to Synopsys Design Compiler®. The synthesis results confirm the lower area and/or delay of the improved S-box designs and match our space and timing analyses.
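
The finite field arithmetic route to the AES S-box mentioned in the abstract above can be sketched in software. This computes the S-box directly in GF(2^8) with the AES reduction polynomial, followed by the standard affine map. It does not model the composite or tower field decompositions studied in the thesis, nor any of its hardware optimizations. The function names are assumptions for illustration.

```python
AES_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1, the AES reduction polynomial


def gf_mul(a, b):
    """Multiply two elements of GF(2^8) modulo the AES polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= AES_POLY
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse as a^254 (square-and-multiply); maps 0 to 0."""
    result, base, exponent = 1, a, 254
    while exponent:
        if exponent & 1:
            result = gf_mul(result, base)
        base = gf_mul(base, base)
        exponent >>= 1
    return result


def rotl8(x, n):
    """Rotate an 8-bit value left by n positions."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def aes_sbox(x):
    """Forward AES S-box: field inversion followed by the affine transformation."""
    b = gf_inv(x)
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63


if __name__ == "__main__":
    table = [aes_sbox(x) for x in range(256)]
    print(hex(table[0x00]), hex(table[0x01]), hex(table[0x53]))
```

A look-up table implementation would simply store the 256 values this function produces. The arithmetic form trades that storage for field operations, which is the trade-off the thesis evaluates in hardware.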