3,172 research outputs found

    Solving Quadratic Equations with XL on Parallel Architectures - extended version

    Solving a system of multivariate quadratic equations (MQ) is an NP-complete problem whose complexity estimates are relevant to many cryptographic scenarios. In some cases it is required in the best known attack; sometimes it is a generic attack (such as for the multivariate PKCs), and sometimes it determines a provable level of security (such as for the QUAD stream ciphers). Under reasonable assumptions, the best way to solve generic MQ systems is the XL algorithm implemented with a sparse matrix solver such as Wiedemann's algorithm. Knowing how much time an implementation of this attack requires gives us a good idea of how future cryptosystems related to MQ can be broken, similar to how implementations of the General Number Field Sieve that factors smaller RSA numbers give us more insight into the security of actual RSA-based cryptosystems. This paper describes such an implementation of XL using the block Wiedemann algorithm. In 5 days we are able to solve a system with 32 variables and 64 equations over $\mathbb{F}_{16}$ (a computation of about $2^{60.3}$ bit operations) on a small cluster of 8 nodes, with 8 CPU cores and 36 GB of RAM in each node. We do not expect system solvers of the F$_4$/F$_5$ family to accomplish this due to their much higher memory demand. Our software also offers implementations for $\mathbb{F}_{2}$ and $\mathbb{F}_{31}$ and can be easily adapted to other small fields. More importantly, it scales nicely for small clusters, NUMA machines, and a combination of both.

    Fast, Exact and Multi-Scale Inference for Semantic Image Segmentation with Deep Gaussian CRFs

    In this work we propose a structured prediction technique that combines the virtues of Gaussian Conditional Random Fields (G-CRF) with Deep Learning: (a) our structured prediction task has a unique global optimum that is obtained exactly from the solution of a linear system, (b) the gradients of our model parameters are analytically computed using closed form expressions, in contrast to the memory-demanding contemporary deep structured prediction approaches that rely on back-propagation-through-time, (c) our pairwise terms do not have to be simple hand-crafted expressions, as in the line of works building on the DenseCRF, but can rather be "discovered" from data through deep architectures, and (d) our system can be trained in an end-to-end manner. Building on standard tools from numerical analysis we develop very efficient algorithms for inference and learning, as well as a customized technique adapted to the semantic segmentation task. This efficiency allows us to explore more sophisticated architectures for structured prediction in deep learning: we introduce multi-resolution architectures to couple information across scales in a joint optimization framework, yielding systematic improvements. We demonstrate the utility of our approach on the challenging VOC PASCAL 2012 image segmentation benchmark, showing substantial improvements over strong baselines. We make all of our code and experiments available at https://github.com/siddharthachandra/gcrf.
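Point (a) of the abstract above says that inference reduces to solving a linear system. The following is a minimal sketch of that idea, assuming a quadratic energy E(x) = ½ xᵀAx − bᵀx with a symmetric positive definite matrix A. The solver shown is plain conjugate gradients, one standard numerical-analysis tool; the abstract does not state which solver the authors use. The matrix `A`, the vector `b`, and the function name `conjugate_gradient` are hypothetical and chosen only for illustration.

```python
def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for a symmetric positive definite A (list of lists)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # residual b - A x for the start x = 0
    p = list(r)                      # first search direction
    rs_old = sum(v * v for v in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:      # residual norm small enough: done
            break
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs_old / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_new = sum(v * v for v in r)
        p = [r[i] + (rs_new / rs_old) * p[i] for i in range(n)]
        rs_old = rs_new
    return x


# Hypothetical 3-node chain: off-diagonal entries couple neighbouring nodes,
# the diagonal keeps A positive definite so the minimiser is unique.
A = [[2.0, -1.0, 0.0],
     [-1.0, 3.0, -1.0],
     [0.0, -1.0, 2.0]]
b = [1.0, 0.0, 1.0]

x = conjugate_gradient(A, b)
print(x)
```

Because A is positive definite, the energy has exactly one minimiser, and it is the solution of A x = b; no iterative approximate inference is needed beyond the linear solve itself.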

    An Alternating Trust Region Algorithm for Distributed Linearly Constrained Nonlinear Programs, Application to the AC Optimal Power Flow

    A novel trust region method for solving linearly constrained nonlinear programs is presented. The proposed technique is amenable to a distributed implementation, as its salient ingredient is an alternating projected gradient sweep in place of the Cauchy point computation. It is proven that the algorithm yields a sequence that globally converges to a critical point. As a result of some changes to the standard trust region method, namely a proximal regularisation of the trust region subproblem, it is shown that the local convergence rate is linear with an arbitrarily small ratio. Thus, convergence is locally almost superlinear, under standard regularity assumptions. The proposed method is successfully applied to compute local solutions to alternating current optimal power flow problems in transmission and distribution networks. Moreover, the new mechanism for computing a Cauchy point compares favourably against the standard projected search with respect to its activity detection properties.

    Parallel cryptanalysis

    Most of today’s cryptographic primitives are based on computations that are hard to perform for a potential attacker but easy to perform for somebody who is in possession of some secret information, the key, that opens a back door in these hard computations and allows them to be solved in a small amount of time. To estimate the strength of a cryptographic primitive it is important to know how hard it is to perform the computation without knowledge of the secret back door and to get an understanding of how much money or time the attacker has to spend. Usually a cryptographic primitive allows the cryptographer to choose parameters that make an attack harder at the cost of making the computations using the secret key harder as well. Therefore designing a cryptographic primitive imposes the dilemma of choosing the parameters strong enough to resist an attack up to a certain cost while choosing them small enough to allow usage of the primitive in the real world, e.g. on small computing devices like smart phones. This thesis investigates three different attacks on particular cryptographic systems: Wagner’s generalized birthday attack is applied to the compression function of the hash function FSB. Pollard’s rho algorithm is used for attacking Certicom’s ECC Challenge ECC2K-130. The implementation of the XL algorithm has not been specialized for an attack on a specific cryptographic primitive but can be used for attacking some cryptographic primitives by solving multivariate quadratic systems. All three attacks are general attacks, i.e. they apply to various cryptographic systems; the implementations of Wagner’s generalized birthday attack and Pollard’s rho algorithm can be adapted for attacking other primitives than those given in this thesis. The three attacks have been implemented on different parallel architectures. XL has been parallelized using the Block Wiedemann algorithm on a NUMA system using OpenMP and on an Infiniband cluster using MPI. 
Wagner’s attack was performed on a distributed system of 8 multi-core nodes connected by an Ethernet network. The work on Pollard’s rho algorithm is part of a large research collaboration with several research groups; the computations are embarrassingly parallel and are executed in a distributed fashion in several facilities with almost negligible communication cost. This dissertation presents implementations of the iteration function of Pollard’s rho algorithm on Graphics Processing Units and on the Cell Broadband Engine.

    An Efficient Interior-Point Decomposition Algorithm for Parallel Solution of Large-Scale Nonlinear Problems with Significant Variable Coupling

    In this dissertation we develop multiple algorithms for efficient parallel solution of structured nonlinear programming problems by decomposition of the linear augmented system solved at each iteration of a nonlinear interior-point approach. In particular, we address large-scale, block-structured problems with a significant number of complicating, or coupling, variables. This structure arises in many important problem classes including multi-scenario optimization, parameter estimation, two-stage stochastic programming, optimal control and power network problems. The structure of these problems induces a block-angular structure in the augmented system, and parallel solution is possible using a Schur-complement decomposition. Three major variants are implemented: a serial, full-space interior-point method, serial and parallel versions of an explicit Schur-complement decomposition, and serial and parallel versions of an implicit PCG-based Schur-complement decomposition. All of these algorithms have been implemented in C++ in an extensible software framework for nonlinear optimization. The explicit Schur-complement decomposition is typically effective for problems with a few hundred coupling variables. We demonstrate the performance of our implementation on an important problem in optimal power grid operation, the contingency-constrained AC optimal power flow problem. In this dissertation, we present a rectangular IV formulation for the contingency-constrained ACOPF problem and demonstrate that the explicit Schur-complement decomposition can dramatically reduce solution times for a problem with a large number of contingency scenarios. Moreover, we compare the explicit Schur-complement decomposition implementation with the Progressive Hedging approach provided by Pyomo, showing that the internal decomposition approach is computationally favorable to the external approach. 
However, the explicit Schur-complement decomposition approach is not appropriate for problems with a large number of coupling variables because of the high computational cost associated with forming and solving the dense Schur-complement. We show that this bottleneck can be overcome by solving the Schur-complement equations implicitly using a quasi-Newton preconditioned conjugate gradient method. This new algorithm avoids explicit formation and factorization of the Schur-complement. The computational efficiency of the serial and parallel versions of this algorithm is compared with the serial full-space approach, and with the serial and parallel explicit Schur-complement approaches, on a set of quadratic parameter estimation problems and nonlinear optimization problems. These results show that the PCG implicit Schur-complement approach dramatically reduces the computational expense for problems with many coupling variables.
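The explicit Schur-complement decomposition described in this abstract can be illustrated on a tiny block-angular system. The sketch below is not the dissertation's C++ implementation: it assumes a system with diagonal blocks K_i, coupling blocks B_i and a coupling block K0, and every name in it (`solve`, `schur_solve`, the example data) is hypothetical. It forms S = K0 − Σ B_iᵀ K_i⁻¹ B_i, solves for the coupling variables y, then recovers each block's variables.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def column(B, j):
    return [row[j] for row in B]


def schur_solve(K_blocks, B_blocks, K0, r_blocks, r0):
    """Solve the block-angular system
         K_i x_i + B_i y           = r_i   for every block i
         sum_i B_i^T x_i + K0 y    = r0
    by forming the dense Schur complement explicitly."""
    m = len(r0)                       # number of coupling variables
    S = [row[:] for row in K0]
    rhs = list(r0)
    for K, B, r in zip(K_blocks, B_blocks, r_blocks):
        # Each block's contribution is independent of the others, which is
        # what makes this loop a candidate for parallel execution.
        Kinv_r = solve(K, r)
        Kinv_B = [solve(K, column(B, j)) for j in range(m)]
        for i in range(m):
            bi = column(B, i)
            rhs[i] -= dot(bi, Kinv_r)
            for j in range(m):
                S[i][j] -= dot(bi, Kinv_B[j])
    y = solve(S, rhs)                 # dense solve: cost grows with m
    xs = []
    for K, B, r in zip(K_blocks, B_blocks, r_blocks):
        By = [dot(row, y) for row in B]
        xs.append(solve(K, [ri - v for ri, v in zip(r, By)]))
    return xs, y


# Hypothetical data: two 2x2 blocks and one coupling variable.
K_blocks = [[[4.0, 1.0], [1.0, 3.0]], [[2.0, 0.0], [0.0, 5.0]]]
B_blocks = [[[1.0], [0.0]], [[0.0], [1.0]]]
K0 = [[6.0]]
r_blocks = [[1.0, 2.0], [3.0, 4.0]]
r0 = [5.0]

xs, y = schur_solve(K_blocks, B_blocks, K0, r_blocks, r0)
print(xs, y)
```

The dense solve for `y` is the step whose cost grows with the number of coupling variables, which is the bottleneck the abstract attributes to the explicit variant and the motivation it gives for the implicit PCG-based alternative.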