
    The geometry of quantum learning

    Concept learning provides a natural framework in which to place the problems solved by the quantum algorithms of Bernstein-Vazirani and Grover. By combining the tools used in these algorithms (quantum fast transforms and amplitude amplification) with a tool that is novel in this context (a solution method for geometrical optimization problems), we derive a general technique for quantum concept learning. We name this technique "Amplified Impatient Learning" and apply it to construct quantum algorithms for two new problems, BATTLESHIP and MAJORITY, solving them more efficiently than is possible classically.
    Comment: 20 pages, plain TeX with amssym.tex; related work at http://www.math.uga.edu/~hunziker/ and http://math.ucsd.edu/~dmeyer
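    Amplitude amplification, one of the building blocks named above, is simple to simulate classically on toy instances. The sketch below illustrates only that primitive, not the thesis's Amplified Impatient Learning technique; the function names, the 5-qubit search space, and the iteration count are all our own illustrative choices.

```python
# Minimal classical simulation of amplitude amplification, the Grover-style
# primitive named in the abstract. This illustrates only that building
# block, not "Amplified Impatient Learning"; all parameter choices are ours.
import numpy as np

def amplitude_amplify(n_qubits, oracle, iterations):
    """Simulate amplitude amplification over the 2**n_qubits basis states.

    oracle(i) -> True marks the target state(s).
    """
    N = 2 ** n_qubits
    state = np.full(N, 1 / np.sqrt(N))        # uniform superposition
    marked = np.array([oracle(i) for i in range(N)])
    for _ in range(iterations):
        state[marked] *= -1                   # oracle: flip phase of marked states
        state = 2 * state.mean() - state      # diffusion: inversion about the mean
    return state

# Amplify the hidden target 22 among 2**5 = 32 candidates.
state = amplitude_amplify(5, lambda i: i == 22, iterations=4)
print(np.argmax(state ** 2), state[22] ** 2)  # -> 22, success prob ~0.999
```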

    New cryptanalysis of LFSR-based stream ciphers and decoders for p-ary QC-MDPC codes

    The security of modern cryptography is based on the hardness of solving certain problems. In this context, a problem is considered hard if there is no known polynomial-time algorithm to solve it. Initially, the security assessment of cryptographic systems only considered adversaries with classical computational resources, i.e., digital computers. It is now known that there exist polynomial-time quantum algorithms that would render certain cryptosystems insecure if large-scale quantum computers were available. Thus, adversaries with access to such computers should also be considered. In particular, cryptosystems based on the hardness of integer factorisation or the discrete logarithm problem would be broken. For others, such as symmetric-key cryptosystems, the impact seems not to be as serious; it is recommended to at least double the key size of currently used systems to preserve their security level. The potential threat posed by sufficiently powerful quantum computers motivates the continued study and development of post-quantum cryptography, that is, cryptographic systems that are secure against adversaries with access to quantum computers.
    It is believed that symmetric-key cryptosystems should be secure from quantum attacks. In this manuscript, we study the security of one such family of systems, namely stream ciphers. They are mainly used in applications where high throughput is required in software or low resource usage is required in hardware. Our focus is on the cryptanalysis of stream ciphers employing linear feedback shift registers (LFSRs). This is modelled as the problem of finding solutions to systems of linear equations with associated probability distributions on the set of right-hand sides. To solve this problem, we first present a multivariate version of the correlation attack introduced by Siegenthaler. Building on the ideas of the multivariate attack, we propose a new cryptanalytic method with lower time complexity. Alongside this, we introduce the notion of relations modulo a matrix B, which may be seen as a generalisation of the parity-checks used in fast correlation attacks, one of the most important classes of attacks against LFSR-based stream ciphers. Our new method is successfully applied to hard instances of the filter generator and requires less keystream than other attacks in the literature. We also perform a theoretical attack against the Grain-v1 cipher and an experimental attack against a toy Grain-like cipher. Compared to the best previous attack, our technique requires fewer keystream bits but has a higher time complexity. This is the result of joint work with Semaev.
    Public-key cryptosystems based on error-correcting codes are also believed to be secure against quantum attacks. In this direction, we develop a new technique in code-based cryptography. Specifically, we propose new decoders for quasi-cyclic moderate-density parity-check (QC-MDPC) codes. These codes were proposed by Misoczki et al. for use in the McEliece scheme. The use of QC-MDPC codes avoids attacks applicable when using low-density parity-check (LDPC) codes and also allows for short keys. Although we focus on decoding for a particular instance of the p-ary QC-MDPC scheme, our new decoding algorithm is also a general decoding method for p-ary MDPC-like schemes. The algorithm is a bit-flipping decoder, and its performance is improved by varying the thresholds across iterations. Experimental results demonstrate that our decoders enjoy a very low decoding failure rate for the chosen p-ary QC-MDPC instance. This is the result of joint work with Guo and Johansson.
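    To make the decoder family concrete, here is a minimal bit-flipping sketch with thresholds that vary across iterations, as described above. It shows only the binary special case (the thesis works over p-ary codes), and the parity-check matrix, noisy word, and threshold schedule are toy values of ours.

```python
# A minimal binary bit-flipping decoder with per-iteration thresholds,
# in the spirit of the decoder described above. The thesis treats the
# general p-ary case; this sketch shows only the binary special case.
import numpy as np

def bit_flip_decode(H, y, thresholds):
    """Decode noisy word y against parity-check matrix H over GF(2).

    thresholds[t]: minimum number of unsatisfied checks a bit must touch
    in iteration t before it is flipped (varied across iterations).
    """
    x = y.copy()
    for thr in thresholds:
        syndrome = H @ x % 2                  # which parity checks fail
        if not syndrome.any():
            break                             # valid codeword reached
        counters = syndrome @ H               # failing checks touching each bit
        x = (x + (counters >= thr)) % 2       # flip bits over the threshold
    return x

# Length-5 repetition code (codewords 00000 and 11111), one bit in error.
H = np.array([[1, 1, 0, 0, 0],
              [0, 1, 1, 0, 0],
              [0, 0, 1, 1, 0],
              [0, 0, 0, 1, 1]])
print(bit_flip_decode(H, np.array([0, 0, 1, 0, 0]), thresholds=[2, 1]))  # -> 00000
```

    Lowering the threshold in later iterations, as in the `thresholds=[2, 1]` schedule here, mirrors the abstract's point about varying thresholds: early iterations flip only high-confidence bits, later iterations mop up the rest.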

    The Nested Periodic Subspaces: Extensions of Ramanujan Sums for Period Estimation

    In the year 1918, the Indian mathematician Srinivasa Ramanujan proposed a set of sequences called Ramanujan Sums as bases to expand arithmetic functions in number theory. Today, exactly 100 years later, we show that these sequences re-emerge as exciting tools in a completely different context: the extraction of periodic patterns in data. Combined with state-of-the-art DSP techniques, Ramanujan Sums can serve as the starting point for developing powerful algorithms for periodicity applications.
    The primary inspiration for this thesis comes from a recent extension of Ramanujan sums to subspaces known as the Ramanujan subspaces. These subspaces were designed to span any sequence with integer periodicity, and they have many interesting properties. Starting with Ramanujan subspaces, this thesis first develops an entire family of such subspace representations for periodic sequences. This family, called Nested Periodic Subspaces due to its unique structure, turns out to yield the least redundant sets of subspaces that can span periodic sequences. Three classes of new algorithms are proposed using the Nested Periodic Subspaces: dictionaries, filter banks, and eigen-space methods based on the auto-correlation matrix of the signal. These methods are shown to be especially advantageous when the data-length is short or when the signal is a mixture of multiple hidden periods.
    The dictionary techniques were inspired by recent advances in sparsity-based compressed sensing. Apart from the l1-norm-based convex programs currently used in other applications, our dictionaries admit l2-norm formulations that have linear, closed-form solutions, even when the system is under-determined. A new filter bank is also proposed using the Ramanujan sums. This filter bank, named the Ramanujan Filter Bank, can accurately track the instantaneous period of signals whose periodicity varies over time. The filters in the Ramanujan Filter Bank have simple integer-valued coefficients and directly tile the period-vs-time plane, unlike the classical STFT (Short Time Fourier Transform) and wavelets, which tile the time-frequency plane. The third family of techniques developed here is a generalization of the classic MUSIC (MUltiple SIgnal Classification) algorithm for periodic signals. MUSIC is one of the most popular techniques today for line spectral estimation. However, periodic signals are not unstructured line spectral signals: there is a harmonic spacing between the lines that plain MUSIC does not exploit. We show that one can design much more accurate adaptations of MUSIC using Nested Periodic Subspaces. Compared to prior variants of MUSIC for the periodicity problem, our approach is much faster and yields much more accurate results for signals with integer periods. This work is also the first extension of MUSIC that uses simple integer-valued basis vectors instead of traditional complex exponentials to span the signal subspace. The advantages of the new methods are demonstrated both in simulations and in real-world applications such as DNA micro-satellites, protein repeats, and absence seizures.
    Apart from practical contributions, the theory of Nested Periodic Subspaces offers answers to a number of fundamental questions that were previously unanswered. For example, what is the minimum contiguous data-length needed to identify the period of a signal unambiguously? Notice that the answer we seek is a fundamental identifiability bound, independent of any particular period estimation technique. Surprisingly, this basic question had never been answered before. In this thesis, we derive precise expressions for the minimum necessary and sufficient data-lengths for this question. We also extend these bounds to the context of mixtures of periodic signals. Once again, even though mixtures of periodic signals occur in many applications, aspects such as the unique identifiability of the component periods were never rigorously analyzed before; we present such an analysis as well.
    While the above question deals with the minimum contiguous data-length required for period estimation, one may ask a slightly different question: if we are allowed to pick the samples of a signal in a non-contiguous fashion, how should we pick them so that we can estimate the period using the fewest samples? This question turns out to be quite difficult to answer in general. In this thesis, we analyze a smaller case in this regard, namely that of resolving between two periods. The analysis is quite involved even in this case, and the optimal sampling pattern takes the interesting form of sparsely located bunches. This result can also be extended to the case of multi-dimensional periodic signals. We very briefly address multi-dimensional periodicity in this thesis. Most prior DSP literature on multi-dimensional discrete-time periodic signals assumes the periods to be parallelepipeds. But as shown by the artist M. C. Escher, one can tile the space using a much more diverse variety of shapes. Is it always possible to account for such other periodic shapes using the traditional notion of parallelepiped periods? An interesting analysis in this regard is presented towards the end of the thesis.
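    As a concrete taste of why Ramanujan sums suit period estimation, the sketch below builds a toy detector that projects a signal onto the span of circular shifts of each Ramanujan sum c_q and picks the q with the most energy. This is a generic illustration of the dictionary/subspace idea, not one of the thesis's algorithms; the projection-energy measure and all parameters are our own choices.

```python
# A toy period detector built from Ramanujan sums. Generic illustration
# only; the projection-energy measure and parameters are ours.
import numpy as np
from math import gcd

def ramanujan_sum(q, N):
    """c_q(n) = sum over k coprime to q of cos(2*pi*k*n/q), n = 0..N-1."""
    n = np.arange(N)
    k = np.array([j for j in range(1, q + 1) if gcd(j, q) == 1])
    return np.cos(2 * np.pi * np.outer(k, n) / q).sum(axis=0)

def period_strengths(x, max_q):
    """Energy of x projected onto the span of circular shifts of each c_q."""
    strengths = {}
    for q in range(1, max_q + 1):
        cq = ramanujan_sum(q, len(x))
        basis = np.array([np.roll(cq, s) for s in range(q)]).T   # N x q
        coef, *_ = np.linalg.lstsq(basis, x, rcond=None)
        strengths[q] = np.linalg.norm(basis @ coef) ** 2
    return strengths

rng = np.random.default_rng(0)
x = np.tile(rng.standard_normal(7), 10)   # period-7 signal, 70 samples
x -= x.mean()                             # remove DC so q = 1 stays quiet
s = period_strengths(x, max_q=10)
print(max(s, key=s.get))                  # -> 7
```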

    Algorithms for Large-Scale Sparse Tensor Factorization

    University of Minnesota Ph.D. dissertation. April 2019. Major: Computer Science. Advisor: George Karypis. 1 computer file (PDF); xiv, 153 pages.
    Tensor factorization is a technique for analyzing data that features interactions of data along three or more axes, or modes. Many fields such as retail, health analytics, and cybersecurity utilize tensor factorization to gain useful insights and make better decisions. The tensors that arise in these domains are increasingly large, sparse, and high dimensional. Factoring these tensors is computationally expensive, if not infeasible. The ubiquity of multi-core processors and large-scale clusters motivates the development of scalable parallel algorithms to facilitate these computations. However, sparse tensor factorizations often achieve only a small fraction of potential performance due to challenges including data-dependent parallelism and memory accesses, high memory consumption, and frequent fine-grained synchronizations among compute cores. This thesis presents a collection of algorithms for factoring sparse tensors on modern parallel architectures. This work is focused on developing algorithms that are scalable while being memory- and operation-efficient. We address a number of challenges across various forms of tensor factorization and emphasize results on large, real-world datasets.
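    The computational core of the CP factorizations studied in this space is the MTTKRP kernel, whose sparse form makes the data-dependent memory accesses mentioned above easy to see: each nonzero scatters an update into one row of the output. The sketch below is a generic single-threaded illustration, not code from the dissertation; the tiny tensor and factor matrices are made up.

```python
# A minimal MTTKRP (matricized tensor times Khatri-Rao product) on a
# COO-format sparse tensor -- the kernel that dominates sparse CP
# factorization. Generic sketch; the tiny tensor below is illustrative.
import numpy as np

def mttkrp_mode0(coords, vals, B, C, dim0):
    """Compute X_(0) * (C khatri-rao B) without forming X_(0) densely.

    coords: (nnz, 3) indices of nonzeros; vals: (nnz,) their values.
    B, C: factor matrices of modes 1 and 2, each with `rank` columns.
    """
    out = np.zeros((dim0, B.shape[1]))
    for (i, j, k), v in zip(coords, vals):
        out[i] += v * B[j] * C[k]   # each nonzero updates one output row
    return out

rng = np.random.default_rng(0)
coords = np.array([[0, 0, 0], [1, 2, 1], [2, 1, 0], [3, 0, 1], [1, 1, 1]])
vals = np.array([1.0, 2.0, -1.0, 0.5, 3.0])      # 4x3x2 tensor, 5 nonzeros
B, C = rng.standard_normal((3, 2)), rng.standard_normal((2, 2))
print(mttkrp_mode0(coords, vals, B, C, dim0=4))  # 4x2 result
```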

    Large Scale Kernel Methods for Fun and Profit

    Kernel methods are among the most flexible classes of machine learning models with strong theoretical guarantees. Wide classes of functions can be approximated arbitrarily well with kernels, and fast convergence and learning rates have been formally shown to hold. Exact kernel methods are known to scale poorly with increasing dataset size, and we believe that one of the factors limiting their usage in modern machine learning is the lack of scalable and easy-to-use algorithms and software. The main goal of this thesis is to study kernel methods from the point of view of efficient learning, with particular emphasis on large-scale data, but also on low-latency training and user efficiency. We improve the state of the art in scaling kernel solvers to datasets with billions of points using the Falkon algorithm, which combines random projections with fast optimization. Running it on GPUs, we show how to fully utilize available computing power for training kernel machines. To boost the ease of use of approximate kernel solvers, we propose an algorithm for automated hyperparameter tuning: by minimizing a penalized loss function, a model can be learned together with its hyperparameters, reducing the time needed for user-driven experimentation. In the setting of multi-class learning, we show that, under stringent but realistic assumptions on the separation between classes, a wide set of algorithms needs far fewer data points than in the more general setting (without assumptions on class separation) to reach the same accuracy.
    The first part of the thesis develops a framework for efficient and scalable kernel machines. This raises the question of whether our approaches can be used successfully in real-world applications, especially compared to alternatives based on deep learning, which are often deemed hard to beat. The second part investigates this question on two main applications, chosen because an efficient algorithm is of paramount importance in both. First, we consider the problem of instance segmentation of images taken from the iCub robot. Here Falkon is used as part of a larger pipeline, but the efficiency afforded by our solver is essential to ensure smooth human-robot interactions. Second, we consider time-series forecasting of wind speed, analysing the relevance of different physical variables to the predictions themselves, and investigate different schemes to adapt i.i.d. learning to the time-series setting. Overall, this work aims to demonstrate, through novel algorithms and examples, that kernel methods are up to computationally demanding tasks, and that there are concrete applications in which their use is warranted and more efficient than that of other, more complex, and less theoretically grounded models.
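    Falkon's starting point, kernel ridge regression restricted to a random subset of centers (the Nystroem approximation), can be sketched in a few lines. The code below shows that plain estimator only; Falkon's preconditioned conjugate-gradient solver and GPU kernels are omitted, and every hyperparameter is an illustrative choice of ours.

```python
# A plain Nystroem kernel ridge regression sketch. Falkon builds on this
# estimator but adds preconditioning, conjugate gradients, and GPU
# kernels; none of that is shown here.
import numpy as np

def gaussian_kernel(A, B, sigma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def nystroem_krr_fit(X, y, m, sigma, lam, rng):
    """Fit f(x) = sum_i alpha_i k(x, c_i) over m randomly chosen centers."""
    centers = X[rng.choice(len(X), size=m, replace=False)]
    Knm = gaussian_kernel(X, centers, sigma)          # n x m
    Kmm = gaussian_kernel(centers, centers, sigma)    # m x m
    # Normal equations: (Knm^T Knm + lam * n * Kmm) alpha = Knm^T y
    alpha = np.linalg.solve(Knm.T @ Knm + lam * len(X) * Kmm, Knm.T @ y)
    return centers, alpha

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(2000, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(2000)
centers, alpha = nystroem_krr_fit(X, y, m=100, sigma=0.5, lam=1e-6, rng=rng)
X_test = np.linspace(-3, 3, 5)[:, None]
print(gaussian_kernel(X_test, centers, 0.5) @ alpha)  # roughly sin(x_test)
```

    The m random centers play the role of the random projection mentioned above: the solve costs O(n m^2 + m^3) instead of the O(n^3) of exact kernel ridge regression.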

    Performance evaluation of T-transform based OFDM in underwater acoustic channels

    PhD Thesis
    Recently there has been an increasing trend towards the implementation of orthogonal frequency division multiplexing (OFDM) based multicarrier communication systems in underwater acoustic (UWA) communications. By dividing the available bandwidth into multiple sub-bands, OFDM systems enable reliable transmission over long-range dispersive channels. However, OFDM is prone to impairments such as severe frequency-selective fading, motion-induced Doppler shift, and a high peak-to-average power ratio (PAPR). In order to fully exploit the potential of OFDM in UWA channels, these issues have received a great deal of attention in recent research. With the aim of improving OFDM's performance in UWA channels, a T-transform based OFDM system is introduced, using a low-complexity T-transform that combines the Walsh-Hadamard transform (WHT) and the discrete Fourier transform (DFT) into a single fast orthonormal unitary transform. Through real-world experiments, a performance comparison between the proposed T-OFDM system and a conventional OFDM system revealed that T-OFDM performs better than OFDM at high code rates in frequency-selective fading channels. Furthermore, an investigation of different equalizer techniques showed that the limitations of zero-forcing (ZF) equalizers affect T-OFDM more severely (one bad equalizer coefficient affects all symbols), which motivated a modified ZF equalizer with outlier detection that provides a major performance gain without excessive computational load. Lastly, an investigation of PAPR reduction methods showed that T-OFDM has an inherently lower PAPR and is also far more tolerant of the distortion introduced by simple clipping. As a result, a lower PAPR can be achieved with minimal overhead, allowing T-OFDM to outperform OFDM for a given power limit at the transmitter.
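    The PAPR claim above is easy to probe numerically: precode each OFDM symbol with a WHT before the IFFT and compare peak-to-average power ratios. The sketch below is a generic simulation, not the thesis's fused fast T-transform (the WHT and IFFT are applied as two separate steps here), and the QPSK modulation, block length, and symbol count are our own choices.

```python
# A generic PAPR comparison between plain OFDM and WHT-precoded OFDM.
# The thesis fuses the WHT and DFT into a single fast T-transform; here
# the precoding is shown as an explicit WHT followed by an IFFT.
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform (length must be a power of two)."""
    x = x.copy()
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):
            a, b = x[i:i + h].copy(), x[i + h:i + 2 * h].copy()
            x[i:i + h], x[i + h:i + 2 * h] = a + b, a - b
        h *= 2
    return x / np.sqrt(len(x))

def papr_db(s):
    return 10 * np.log10(np.max(np.abs(s) ** 2) / np.mean(np.abs(s) ** 2))

rng = np.random.default_rng(1)
N, papr_ofdm, papr_t = 256, [], []
for _ in range(200):
    qpsk = (rng.choice([-1, 1], N) + 1j * rng.choice([-1, 1], N)) / np.sqrt(2)
    papr_ofdm.append(papr_db(np.fft.ifft(qpsk)))        # conventional OFDM
    papr_t.append(papr_db(np.fft.ifft(fwht(qpsk))))     # WHT precoding first
print(f"mean OFDM PAPR:   {np.mean(papr_ofdm):.1f} dB")
print(f"mean T-OFDM PAPR: {np.mean(papr_t):.1f} dB")    # thesis reports lower
```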