53,645 research outputs found
Easy scalar decompositions for efficient scalar multiplication on elliptic curves and genus 2 Jacobians
The first step in elliptic curve scalar multiplication algorithms based on
scalar decompositions using efficient endomorphisms-including
Gallant-Lambert-Vanstone (GLV) and Galbraith-Lin-Scott (GLS) multiplication, as
well as higher-dimensional and higher-genus constructions-is to produce a short
basis of a certain integer lattice involving the eigenvalues of the
endomorphisms. The shorter the basis vectors, the shorter the decomposed scalar
coefficients, and the faster the resulting scalar multiplication. Typically,
knowledge of the eigenvalues allows us to write down a long basis, which we
then reduce using the Euclidean algorithm, Gauss reduction, LLL, or even a more
specialized algorithm. In this work, we use elementary facts about quadratic
rings to immediately write down a short basis of the lattice for the GLV, GLS,
GLV+GLS, and Q-curve constructions on elliptic curves, and for genus 2 real
multiplication constructions. We do not pretend that this represents a
significant optimization in scalar multiplication, since the lattice reduction
step is always an offline precomputation---but it does give a better insight
into the structure of scalar decompositions. In any case, it is always more
convenient to use a ready-made short basis than it is to compute a new one
Easy scalar decompositions for efficient scalar multiplication on elliptic curves and genus 2 Jacobians
International audienceThe first step in elliptic curve scalar multiplication algorithms based on scalar decompositions using efficient endomorphisms---including Gallant--Lambert--Vanstone (GLV) and Galbraith--Lin--Scott (GLS) multiplication, as well as higher-dimensional and higher-genus constructions---is to produce a short basis of a certain integer lattice involving the eigenvalues of the endomorphisms. The shorter the basis vectors, the shorter the decomposed scalar coefficients, and the faster the resulting scalar multiplication. Typically, knowledge of the eigenvalues allows us to write down a long basis, which we then reduce using the Euclidean algorithm, Gauss reduction, LLL, or even a more specialized algorithm. In this work, we use elementary facts about quadratic rings to immediately write down a short basis of the lattice for the GLV, GLS, GLV+GLS, and Q-curve constructions on elliptic curves, and for genus 2 real multiplication constructions. We do not pretend that this represents a significant optimization in scalar multiplication, since the lattice reduction step is always an offline precomputation---but it does give a better insight into the structure of scalar decompositions. In any case, it is always more convenient to use a ready-made short basis than it is to compute a new one
High performance lattice reduction on heterogeneous computing platform
The final publication is available at Springer via http://dx.doi.org/10.1007/s11227-014-1201-2The lattice reduction (LR) technique has become very important in many engineering fields. However, its high complexity makes difficult its use in real-time applications, especially in applications that deal with large matrices. As a solution, the modified block LLL (MB-LLL) algorithm was introduced, where several levels of parallelism were exploited: (a) fine-grained parallelism was achieved through the cost-reduced all-swap LLL (CR-AS-LLL) algorithm introduced together with the MB-LLL by Jzsa et al. (Proceedings of the tenth international symposium on wireless communication systems, 2013) and (b) coarse-grained parallelism was achieved by applying the block-reduction concept presented by Wetzel (Algorithmic number theory. Springer, New York, pp 323-337, 1998). In this paper, we present the cost-reduced MB-LLL (CR-MB-LLL) algorithm, which allows to significantly reduce the computational complexity of the MB-LLL by allowing the relaxation of the first LLL condition while executing the LR of submatrices, resulting in the delay of the Gram-Schmidt coefficients update and by using less costly procedures during the boundary checks. The effects of complexity reduction and implementation details are analyzed and discussed for several architectures. A mapping of the CR-MB-LLL on a heterogeneous platform is proposed and it is compared with implementations running on a dynamic parallelism enabled GPU and a multi-core CPU. The mapping on the architecture proposed allows a dynamic scheduling of kernels where the overhead introduced is hidden by the use of several CUDA streams. Results show that the execution time of the CR-MB-LLL algorithm on the heterogeneous platform outperforms the multi-core CPU and it is more efficient than the CR-AS-LLL algorithm in case of large matrices.Financial support for this study was provided by grants TAMOP-4.2.1./B-11/2/KMR-2011-0002, TAMOP-4.2.2/B-10/1-2010-0014 from the Pazmany Peter Catholic University, European Union ERDF, Spanish Government through TEC2012-38142-C04-01 project and Generalitat Valenciana through PROMETEO/2009/013 project.Jozsa, CM.; Domene Oltra, F.; Vidal Maciá, AM.; Piñero Sipán, MG.; González Salvador, A. (2014). High performance lattice reduction on heterogeneous computing platform. Journal of Supercomputing. 70(2):772-785. https://doi.org/10.1007/s11227-014-1201-2S772785702Józsa CM, Domene F, Piñero G, González A, Vidal AM (2013) Efficient GPU implementation of lattice-reduction-aided multiuser precoding. In: Proceedings of the tenth international symposium on wireless communication systems (ISWCS 2013)Wetzel S (1998) An efficient parallel block-reduction algorithm. In: Buhler JP (ed) Algorithmic number theory. Lecture notes in computer science, vol 1423. Springer, Berlin, Heidelberg, pp 323–337Wubben D, Seethaler D, Jaldén J, Matz G (2011) Lattice reduction. Signal Process Mag IEEE 28(3):70–91Lenstra AK, Lenstra HW, Lovász L (1982) Factoring polynomials with rational coefficients. Math Ann 261(4):515–534Bremner MR (2012) Lattice basis reduction: an introduction to the LLL algorithm and its applications. CRC Press, USAWu D, Eilert J, Liu D (2008) A programmable lattice-reduction aided detector for MIMO-OFDMA. In: 4th IEEE international conference on circuits and systems for communications (ICCSC 2008), pp 293–297Barbero LG, Milliner DL, Ratnarajah T, Barry JR, Cowan C (2009) Rapid prototyping of Clarkson’s lattice reduction for MIMO detection. In: IEEE international conference on communications (ICC’09), pp 1–5Gestner B, Zhang W, Ma X, Anderson D (2011) Lattice reduction for MIMO detection: from theoretical analysis to hardware realization. IEEE Trans Circ Syst I Regul Pap 58(4):813–826Shabany M, Youssef A, Gulak G (2013) High-throughput 0.13- \upmu μ m CMOS lattice reduction core supporting 880 Mb/s detection. IEEE Trans Very Large Scale Integr (VLSI) Syst 21(5):848–861Luo Y, Qiao S (2011) A parallel LLL algorithm. In: Proceedings of the fourth international C* conference on computer science and software engineering, pp 93–101Backes W, Wetzel S (2011) Parallel lattice basis reduction—the road to many-core. In: IEEE 13th international conference on high performance computing and communications (HPCC)Ahmad U, Amin A, Li M, Pollin S, Van der Perre L, Catthoor F (2011) Scalable block-based parallel lattice reduction algorithm for an SDR baseband processor. In: 2011 IEEE international conference on communications (ICC)Villard G (1992) Parallel lattice basis reduction. In: Papers from the international symposium on symbolic and algebraic computation (ISSAC’92). ACM, New YorkDomene F, Józsa CM, Vidal AM, Piñero G, Gonzalez A (2013) Performance analysis of a parallel lattice reduction algorithm on many-core architectures. In: Proceedings of the 13th international conference on computational and mathematical methods in science and engineeringGestner B, Zhang W, Ma X, Anderson DV (2008) VLSI implementation of a lattice reduction algorithm for low-complexity equalization. In: 4th IEEE international conference on circuits and systems for communications (ICCSC 2008), pp 643–647Burg A, Seethaler D, Matz G (2007) VLSI implementation of a lattice-reduction algorithm for multi-antenna broadcast precoding. In: IEEE international symposium on circuits and systems (ISCAS 2007), pp 673–676Bruderer L, Studer C, Wenk M, Seethaler D, Burg A (2010) VLSI implementation of a low-complexity LLL lattice reduction algorithm for MIMO detection. In: Proceedings of 2010 IEEE international symposium on circuits and systems (ISCAS
Estimation of the Success Probability of Random Sampling by the Gram-Charlier Approximation
The lattice basis reduction algorithm is a method for solving the
Shortest Vector Problem (SVP) on lattices. There are many variants of
the lattice basis reduction algorithm such as LLL, BKZ, and RSR. Though
BKZ has been used most widely, it is shown recently that some variants
of RSR are quite efficient for solving a high-dimensional SVP (they
achieved many best scores in TU Darmstadt SVP challenge). RSR repeats
alternately the generation of new very short lattice vectors from the
current basis (we call this procedure ``random sampling\u27\u27) and the
improvement of the current basis by utilizing the generated very short
lattice vectors. Therefore, it is important for investigating and
ameliorating RSR to estimate the success probability of finding very
short lattice vectors by combining the current basis. In this paper,
we propose a new method for estimating the success probability by the
Gram-Charlier approximation, which is a basic asymptotic expansion of
any probability distribution by utilizing the higher order cumulants
such as the skewness and the kurtosis. The proposed method uses a
``parametric\u27\u27 model for estimating the probability, which gives a
closed-form expression with a few parameters. Therefore, the proposed
method is much more efficient than the previous methods using the
non-parametric estimation. This enables us to investigate the lattice
basis reduction algorithm intensively in various situations and clarify
its properties. Numerical experiments verified that the Gram-Charlier
approximation can estimate the actual distribution quite accurately.
In addition, we investigated RSR and its variants by the proposed
method. Consequently, the results showed that the weighted random
sampling is useful for generating shorter lattice vectors. They also
showed that it is crucial for solving the SVP to improve the current
basis periodically
Search-to-Decision Reductions for Lattice Problems with Approximation Factors (Slightly) Greater Than One
We show the first dimension-preserving search-to-decision reductions for
approximate SVP and CVP. In particular, for any ,
we obtain an efficient dimension-preserving reduction from -SVP to -GapSVP and an efficient dimension-preserving reduction
from -CVP to -GapCVP. These results generalize the known
equivalences of the search and decision versions of these problems in the exact
case when . For SVP, we actually obtain something slightly stronger
than a search-to-decision reduction---we reduce -SVP to
-unique SVP, a potentially easier problem than -GapSVP.Comment: Updated to acknowledge additional prior wor
A Modified KZ Reduction Algorithm
The Korkine-Zolotareff (KZ) reduction has been used in communications and
cryptography. In this paper, we modify a very recent KZ reduction algorithm
proposed by Zhang et al., resulting in a new algorithm, which can be much
faster and more numerically reliable, especially when the basis matrix is ill
conditioned.Comment: has been accepted by IEEE ISIT 201
Decoding by Sampling: A Randomized Lattice Algorithm for Bounded Distance Decoding
Despite its reduced complexity, lattice reduction-aided decoding exhibits a
widening gap to maximum-likelihood (ML) performance as the dimension increases.
To improve its performance, this paper presents randomized lattice decoding
based on Klein's sampling technique, which is a randomized version of Babai's
nearest plane algorithm (i.e., successive interference cancelation (SIC)). To
find the closest lattice point, Klein's algorithm is used to sample some
lattice points and the closest among those samples is chosen. Lattice reduction
increases the probability of finding the closest lattice point, and only needs
to be run once during pre-processing. Further, the sampling can operate very
efficiently in parallel. The technical contribution of this paper is two-fold:
we analyze and optimize the decoding radius of sampling decoding resulting in
better error performance than Klein's original algorithm, and propose a very
efficient implementation of random rounding. Of particular interest is that a
fixed gain in the decoding radius compared to Babai's decoding can be achieved
at polynomial complexity. The proposed decoder is useful for moderate
dimensions where sphere decoding becomes computationally intensive, while
lattice reduction-aided decoding starts to suffer considerable loss. Simulation
results demonstrate near-ML performance is achieved by a moderate number of
samples, even if the dimension is as high as 32
- …