24 research outputs found

    On the Construction of Near-MDS Matrices

    The optimal branch number of MDS matrices makes them a preferred choice for designing diffusion layers in many block ciphers and hash functions. However, in lightweight cryptography, Near-MDS (NMDS) matrices with sub-optimal branch numbers offer a better balance between security and efficiency as a diffusion layer, compared to MDS matrices. In this paper, we study NMDS matrices, exploring their construction in both recursive and nonrecursive settings. We provide several theoretical results and explore the hardware efficiency of the construction of NMDS matrices. Additionally, we make comparisons between the results of NMDS and MDS matrices whenever possible. For the recursive approach, we study the DLS matrices and provide some theoretical results on their use. Some of the results are used to restrict the search space of the DLS matrices. We also show that over a field of characteristic 2, any sparse matrix of order n ≥ 4 with fixed XOR value of 1 cannot be an NMDS matrix when raised to a power k ≤ n. Following that, we use the generalized DLS (GDLS) matrices to provide some lightweight recursive NMDS matrices of several orders that perform better than the existing matrices in terms of hardware cost or the number of iterations. For the nonrecursive construction of NMDS matrices, we study various structures, such as circulant and left-circulant matrices, and their generalizations: Toeplitz and Hankel matrices. In addition, we prove that Toeplitz matrices of order n > 4 cannot be simultaneously NMDS and involutory over a field of characteristic 2. Finally, we use GDLS matrices to provide some lightweight NMDS matrices that can be computed in one clock cycle. The proposed nonrecursive NMDS matrices of orders 4, 5, 6, 7, and 8 can be implemented with 24, 50, 65, 96, and 108 XORs over F_{2^4}, respectively.
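
As a rough illustration of the "sub-optimal branch number" mentioned in this abstract: an n×n matrix is usually called NMDS when both its differential and its linear branch number equal n (an MDS matrix reaches n + 1). The sketch below brute-forces both values for a small example. The matrix circ(0, 1, 1, 1) and the helper names are assumptions chosen for illustration and are not taken from the paper; the entries are 0/1, so each 4-bit word is either dropped or XORed in unchanged.

```python
from itertools import product

# Assumed example matrix (not from the paper): circ(0, 1, 1, 1).
# Entries are 0/1, so a word is either skipped or XORed in as-is.
M = [
    [0, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [1, 1, 1, 0],
]


def apply(matrix, x):
    """Multiply a 0/1 matrix by a vector of words (XOR of selected words)."""
    out = []
    for row in matrix:
        acc = 0
        for bit, value in zip(row, x):
            if bit:
                acc ^= value
        out.append(acc)
    return out


def weight(v):
    """Number of nonzero words in a vector."""
    return sum(1 for value in v if value != 0)


def branch_number(matrix, word_bits=4):
    """Minimum of weight(x) + weight(M x) over all nonzero inputs x."""
    n = len(matrix)
    best = 2 * n
    for x in product(range(1 << word_bits), repeat=n):
        if any(x):
            best = min(best, weight(x) + weight(apply(matrix, x)))
    return best


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


# The linear branch number is the differential branch number of the transpose.
differential = branch_number(M)
linear = branch_number(transpose(M))
print(differential, linear)
```

The exhaustive loop covers 16^4 inputs, which is only practical for small orders and word sizes; it is meant as a definition check, not as a search method.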

    A Framework with Improved Heuristics to Optimize Low-Latency Implementations of Linear Layers

    In recent years, lightweight cryptography has been a hot field in symmetric cryptography. One of the most crucial problems is to find low-latency implementations of linear layers. The current main heuristic search methods include the Boyar-Peralta (BP) algorithm with depth limit and the backward search. In this paper, we first propose two improved BP algorithms with depth limit, mainly by minimizing the Euclidean norm of the new distance vector instead of maximizing it in the tie-breaking process of the BP algorithm. They can significantly increase the potential for finding better results. Furthermore, we give a new framework that combines forward search with backward search to expand the search space of implementations, where the forward search is one of the two improved BP algorithms. In the new framework, we make a minor adjustment to the priority of rules in the backward search process to enable the exploration of a significantly larger search space. As a result, we find better implementations for most of the matrices studied in previous works. For example, we find an implementation of AES MixColumns of depth 3 with 99 XOR gates, a reduction of 3 XOR gates compared to the existing record of 102 XOR gates.

    Improved Heuristics for Low-latency Implementations of Linear Layers

    In many applications, low area and low latency are required for the chip-level implementation of cryptographic primitives. The low-cost implementations of linear layers usually play a crucial role for symmetric ciphers. Some heuristic methods, such as the forward search and the backward search, minimize the number of XOR gates of the linear layer under the minimum latency limitation. For the sake of achieving further optimization for such implementation of the linear layer, we put forward a new general search framework attaching the division optimization and extending base techniques in this paper. In terms of the number of XOR gates and the searching time, our new search algorithm is better than the previous heuristics, including the forward search and the backward search when testing matrices provided by them. We obtain an improved implementation of AES MixColumns requiring only 102 XORs under minimum latency, which outdoes the previous best record provided by the forward search

    Design of Lightweight Linear Diffusion Layers from Near-MDS Matrices

    Near-MDS matrices provide better trade-offs between security and efficiency compared to constructions based on MDS matrices, which are favored for hardware-oriented designs. We present new designs of lightweight linear diffusion layers by constructing lightweight near-MDS matrices. First, generic n×n near-MDS circulant matrices are found for 5 ≤ n ≤ 9. Second, the implementation cost of instantiations of the generic near-MDS matrices is examined. Surprisingly, for n = 7, 8, it turns out that some proposed near-MDS circulant matrices of order n have the lowest XOR count among all near-MDS matrices of the same order. Further, for n = 5, 6, we present near-MDS matrices of order n having the lowest XOR count as well. The proposed matrices, together with previous constructions of order less than five, lead to solutions of n×n near-MDS matrices with the lowest XOR count over finite fields F_{2^m} for 2 ≤ n ≤ 8 and 4 ≤ m ≤ 2048. Moreover, we present some involutory near-MDS matrices of order 8 constructed from Hadamard matrices. Lastly, the security of the proposed linear layers is studied by calculating lower bounds on the number of active S-boxes. It is shown that our linear layers with a well-chosen nonlinear layer can provide sufficient security against differential and linear cryptanalysis.

    Shorter Linear Straight-Line Programs for MDS Matrices

    Recently, a lot of attention has been paid to the search for efficiently implementable MDS matrices for lightweight symmetric primitives. Previous work concentrated on locally optimizing the multiplication with single matrix elements. Separate from this line of work, several heuristics were developed to find shortest linear straight-line programs. Solving this problem actually corresponds to globally optimizing multiplications by matrices. In this work we combine these two, so far largely independent, lines of work. As a result, we achieve implementations of known, locally optimized, and new MDS matrices that significantly outperform all implementations from the literature. Interestingly, almost all previous locally optimized constructions behave very similarly with respect to the globally optimized implementation. As a side effect, our work reveals the so far best implementation of the AES MixColumns operation with respect to the number of XOR operations needed.
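
To make the idea of a linear straight-line program concrete, the following toy sketch compares a row-by-row XOR count with a program that reuses an intermediate value. The 3×4 binary matrix, the program, and all names are assumptions made up for illustration; they do not come from the paper, and no heuristic search is performed here.

```python
# Assumed toy matrix: one bitmask per output, bit i set means x_i is included.
#   y0 = x0 ^ x1 ^ x2
#   y1 = x1 ^ x2 ^ x3
#   y2 = x0 ^ x1 ^ x2 ^ x3
ROWS = [0b0111, 0b1110, 0b1111]
N_INPUTS = 4

# Computing each row independently costs (row weight - 1) XORs per output.
naive_xor_count = sum(bin(row).count("1") - 1 for row in ROWS)

# A hand-written straight-line program. Signals 0..3 are the inputs; each
# step XORs two earlier signals and appends the result as a new signal.
PROGRAM = [
    (1, 2),  # signal 4: t0 = x1 ^ x2
    (0, 4),  # signal 5: y0 = x0 ^ t0
    (4, 3),  # signal 6: y1 = t0 ^ x3
    (5, 3),  # signal 7: y2 = y0 ^ x3
]
OUTPUTS = [5, 6, 7]


def run_symbolically(n_inputs, program):
    """Track which inputs each signal depends on, as a bitmask."""
    signals = [1 << i for i in range(n_inputs)]
    for a, b in program:
        signals.append(signals[a] ^ signals[b])
    return signals


signals = run_symbolically(N_INPUTS, PROGRAM)
computed_rows = [signals[i] for i in OUTPUTS]
print(naive_xor_count, len(PROGRAM), computed_rows == ROWS)
```

The symbolic run only checks that the program computes the intended matrix; finding a shortest such program is the hard problem the abstract refers to.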

    The QARMA Block Cipher Family. Almost MDS Matrices Over Rings With Zero Divisors, Nearly Symmetric Even-Mansour Constructions With Non-Involutory Central Rounds, and Search Heuristics for Low-Latency S-Boxes

    This paper introduces QARMA, a new family of lightweight tweakable block ciphers targeted at applications such as memory encryption, the generation of very short tags for hardware-assisted prevention of software exploitation, and the construction of keyed hash functions. QARMA is inspired by reflection ciphers such as PRINCE, to which it adds a tweaking input, and MANTIS. However, QARMA differs from previous reflector constructions in that it is a three-round Even-Mansour scheme instead of an FX-construction, and its middle permutation is non-involutory and keyed. We introduce and analyse a family of Almost MDS matrices defined over a ring with zero divisors that allows us to encode rotations in its operation while maintaining the minimal latency associated with {0, 1}-matrices. The purpose of all these design choices is to harden the cipher against various classes of attacks. We also describe new S-Box search heuristics aimed at minimising the critical path. QARMA exists in 64- and 128-bit block sizes, where block and tweak size are equal, and keys are twice as long as the blocks. We argue that QARMA provides sufficient security margins within the constraints determined by the mentioned applications, while still achieving best-in-class latency. Implementation results on a state-of-the-art manufacturing process are reported. Finally, we propose a technique to extend the length of the tweak by using, for instance, a universal hash function, which can also be used to strengthen the security of QARMA.

    Direct Construction of Lightweight Rotational-XOR MDS Diffusion Layers

    As a core component of Substitution-Permutation Networks, the diffusion layer is mainly built from matrices derived from maximum distance separable (MDS) codes. Surprisingly, up to now, most constructions of MDS matrices require performing an equivalent or even exhaustive search. In particular, not many MDS proposals are known that obtain an excellent hardware efficiency and simultaneously guarantee a remarkable software implementation. In this paper, we study the cyclic structure of the rotational-XOR diffusion layer, one of the commonly used linear layers over (F_2^b)^n, which consists of only rotation and XOR operations. First, we provide novel properties of this class of matrices, prove a lower bound on the number of rotations for n ≥ 4, and show the tightness of the bound for n = 4. Next, by precisely characterizing the relation among sub-matrices for each possible form, we can eliminate all the other non-optimal cases. Finally, we present a direct construction of such MDS matrices, which allows generating 4×4 perfect instances for arbitrary b ≥ 4. Every example contains the fewest possible rotations, so under this construction strategy, our proposal costs the minimum gate equivalents (resp. cyclic shift instructions) in the hardware (resp. software) implementation. To the best of our knowledge, it is the first time that rotational-XOR MDS diffusion layers have been constructed without any auxiliary search.

    Construction of MDS Matrices from Generalized Feistel Structures

    This paper investigates the construction of MDS matrices with generalized Feistel structures (GFS). The approach developed in this paper consists in deriving MDS matrices from the product of several sparser ones. This can be seen as a generalization to several matrices of the recursive construction which derives MDS matrices as the powers of a single companion matrix. The first part of this paper gives some theoretical results on the iteration of GFS. In the second part, using GFS and primitive matrices, we propose some types of sparse matrices that are called extended primitive GFS (EGFS) matrices. Then, by applying binary linear functions to several rounds of EGFS matrices, lightweight 4×4, 6×6 and 8×8 MDS matrices are proposed which are implemented with 67, 156 and 260 XORs for 8-bit input, respectively. The results match the best known lightweight 4×4 MDS matrix and improve the best known 6×6 and 8×8 MDS matrices. Moreover, we propose 8×8 Near-MDS matrices such that the implementation costs of the proposed matrices are 108 and 204 XORs for 4- and 8-bit input, respectively. Although none of the presented matrices are involutions, the implementation cost of the inverses of the proposed matrices is equal to the implementation cost of the given matrices. Furthermore, the construction presented in this paper is relatively general and can be applied to other matrix dimensions and finite fields as well.

    Quantum Implementation of ASCON Linear Layer

    In this paper, we show an in-place implementation of the ASCON linear layer. An in-place implementation is important in the context of quantum computing, and we expect our work will be useful in the quantum implementation of ASCON. To obtain the implementation, we first write the ASCON linear layer as a binary matrix, then apply two legacy algorithms (Gauss-Jordan elimination and PLU factorization) as well as our modified version of Xiang et al.'s algorithm/source code (published in ToSC/FSE'20). Our in-place implementation uses 1595 CNOT gates and has quantum depth 119; to the best of our knowledge, this is the first in-place implementation of the ASCON linear layer.
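
The Gauss-Jordan route mentioned in this abstract can be sketched on a toy example: reducing an invertible binary matrix to the identity with row additions, then replaying those additions in reverse order, gives an in-place sequence of CNOT-style updates (target ^= control). The 3×3 matrix and the function names below are assumptions for illustration; this is not the ASCON matrix, and it is not the authors' modified algorithm.

```python
from itertools import product

# Assumed toy invertible matrix over F_2 (not the ASCON linear layer).
A = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
]


def synthesize_cnots(matrix):
    """Return (control, target) pairs that compute x -> A x in place."""
    work = [row[:] for row in matrix]
    n = len(work)
    ops = []

    def add_row(control, target):
        # Row addition over F_2; corresponds to x[target] ^= x[control].
        work[target] = [a ^ b for a, b in zip(work[target], work[control])]
        ops.append((control, target))

    for col in range(n):
        if work[col][col] == 0:
            # Fix a zero pivot by adding a lower row instead of swapping,
            # so no extra wires or swap gates are needed.
            for r in range(col + 1, n):
                if work[r][col]:
                    add_row(r, col)
                    break
            else:
                raise ValueError("matrix is singular over F_2")
        for r in range(n):
            if r != col and work[r][col]:
                add_row(col, r)

    # The recorded operations reduce A to the identity; since each one is
    # its own inverse, applying them in reverse order rebuilds A.
    return list(reversed(ops))


def run_circuit(circuit, x):
    """Apply the CNOT-style updates to a list of bits, in place on a copy."""
    x = list(x)
    for control, target in circuit:
        x[target] ^= x[control]
    return x


def multiply(matrix, x):
    return [sum(a & b for a, b in zip(row, x)) % 2 for row in matrix]


CIRCUIT = synthesize_cnots(A)
ok = all(
    run_circuit(CIRCUIT, x) == multiply(A, list(x))
    for x in product([0, 1], repeat=len(A))
)
print(CIRCUIT, ok)
```

Plain elimination like this makes no attempt to minimize gate count or depth, which is where the more refined algorithms named in the abstract come in.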