
    Cryptanalysis of Classic Ciphers Using Hidden Markov Models

    Cryptanalysis is the study of identifying weaknesses in the implementation of cryptographic algorithms. This process would improve the complexity of such algorithms, making the system secure. In this research, we apply Hidden Markov Models (HMMs) to classic cryptanalysis problems. We show that with sufficient ciphertext, an HMM can be used to break a simple substitution cipher. We also show that when limited ciphertext is available, using multiple random restarts for the HMM increases our chance of successful decryption.

    Cryptanalysis of the Purple Cipher using Random Restarts

    Cryptanalysis is the process of analyzing ciphers, ciphertext, and cryptosystems in order to exploit any loopholes or weaknesses in them, leading to an understanding of the key used to encrypt the data. This project uses an Expectation Maximization (EM) approach with numerous restarts to attack decipherment problems such as the Purple cipher. In this research, we perform cryptanalysis of the Purple cipher using genetic algorithms and hidden Markov models (HMMs). If the Purple cipher has a fixed plugboard, we show that genetic algorithms are successful in retrieving the plaintext from ciphertext with high accuracy. On the other hand, if the cipher has a plugboard that is not fixed, we can decrypt the ciphertext with increasing accuracy given an increase in population size and restarts. Using HMMs, we also performed cryptanalysis of PseudoPurple, which is less complex but more powerful than Purple. Though we could not decrypt ciphertext produced by PseudoPurple with good accuracy, the accuracy of the decrypted plaintext increases with the number of restarts.

    Hidden Markov Models with Random Restarts vs Boosting for Malware Detection

    Effective and efficient malware detection is at the forefront of research into building secure digital systems. As with many other fields, malware detection research has seen a dramatic increase in the application of machine learning algorithms. One machine learning technique that has been used widely in the field of pattern matching in general, and malware detection in particular, is hidden Markov models (HMMs). HMM training is based on a hill climb, and hence we can often improve a model by training multiple times with different initial values. In this research, we compare boosted HMMs (using AdaBoost) to HMMs trained with multiple random restarts, in the context of malware detection. These techniques are applied to a variety of challenging malware datasets. We find that random restarts perform surprisingly well in comparison to boosting. Only in the most difficult "cold start" cases (where training data is severely limited) does boosting appear to offer sufficient improvement to justify its higher computational cost in the scoring phase.
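The random-restart idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a small discrete HMM trained with a scaled Baum-Welch update in pure Python, and the names `TEXT`, `encode`, `random_rows`, `baum_welch_step`, `train`, and `best_of_restarts` are hypothetical. Because each training run only climbs to a local maximum of the likelihood, the sketch trains from several random starting points and keeps the model with the highest log-likelihood.

```python
import math
import random

# Toy observation text (an assumption for illustration, not data from the papers).
TEXT = "the quick brown fox jumps over the lazy dog while the cat sleeps in the sun"


def encode(text):
    """Map letters a-z to 0-25 and drop everything else."""
    return [ord(ch) - ord("a") for ch in text.lower() if "a" <= ch <= "z"]


def random_rows(rows, cols, rng):
    """Return a rows x cols matrix whose rows are random probability distributions."""
    matrix = []
    for _ in range(rows):
        weights = [rng.random() + 0.1 for _ in range(cols)]
        total = sum(weights)
        matrix.append([w / total for w in weights])
    return matrix


def baum_welch_step(obs, pi, A, B):
    """One scaled Baum-Welch update. Returns (log P(obs | input model), pi, A, B)."""
    N, M, T = len(A), len(B[0]), len(obs)
    # Forward pass, rescaling each step so the values do not underflow.
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            row = [pi[i] * B[i][obs[0]] for i in range(N)]
        else:
            prev = alpha[-1]
            row = [sum(prev[i] * A[i][j] for i in range(N)) * B[j][obs[t]]
                   for j in range(N)]
        c = 1.0 / sum(row)
        alpha.append([c * x for x in row])
        scale.append(c)
    # Backward pass, reusing the same scale factors.
    beta = [None] * T
    beta[T - 1] = [scale[T - 1]] * N
    for t in range(T - 2, -1, -1):
        beta[t] = [scale[t] * sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                                  for j in range(N)) for i in range(N)]
    # xi_sum[i][j]: expected i->j transitions; gamma[t][i]: P(state i at time t | obs).
    xi_sum = [[0.0] * N for _ in range(N)]
    gamma = [[0.0] * N for _ in range(T)]
    for t in range(T - 1):
        for i in range(N):
            for j in range(N):
                x = alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                xi_sum[i][j] += x
                gamma[t][i] += x
    gamma[T - 1] = list(alpha[T - 1])
    # Re-estimate transition and emission matrices from the expected counts.
    new_A = []
    for i in range(N):
        denom = sum(gamma[t][i] for t in range(T - 1))
        new_A.append([xi_sum[i][j] / denom for j in range(N)])
    new_B = []
    for j in range(N):
        denom = sum(gamma[t][j] for t in range(T))
        counts = [0.0] * M
        for t in range(T):
            counts[obs[t]] += gamma[t][j]
        new_B.append([x / denom for x in counts])
    log_likelihood = -sum(math.log(c) for c in scale)
    return log_likelihood, list(gamma[0]), new_A, new_B


def train(obs, n_states, n_symbols, seed, iterations=15):
    """Run Baum-Welch from one random start; return (log-likelihood, pi, A, B)."""
    rng = random.Random(seed)
    pi = random_rows(1, n_states, rng)[0]
    A = random_rows(n_states, n_states, rng)
    B = random_rows(n_states, n_symbols, rng)
    for _ in range(iterations):
        _, pi, A, B = baum_welch_step(obs, pi, A, B)
    score, _, _, _ = baum_welch_step(obs, pi, A, B)  # score the final model
    return score, pi, A, B


def best_of_restarts(obs, n_states, n_symbols, restarts, iterations=15):
    """Train from `restarts` different seeds and keep the highest-scoring model."""
    runs = [train(obs, n_states, n_symbols, seed, iterations)
            for seed in range(restarts)]
    return max(runs, key=lambda run: run[0])


if __name__ == "__main__":
    observations = encode(TEXT)
    for seed in range(4):
        print("seed", seed, "log-likelihood", train(observations, 2, 26, seed)[0])
    print("best of 4:", best_of_restarts(observations, 2, 26, restarts=4)[0])
```

Each restart costs one full training run, so the trade-off is extra training time in exchange for a better chance of landing on a high-likelihood model; scoring still uses a single model.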

    Cryptanalysis of Homophonic Substitution Cipher Using Hidden Markov Models

    We investigate the effectiveness of a Hidden Markov Model (HMM) with random restarts as a means of breaking a homophonic substitution cipher. Based on extensive experiments, we find that such an HMM-based attack outperforms a previously developed nested hill climb approach, particularly when the ciphertext message is short. We then consider a combination cipher, consisting of a homophonic substitution and a column transposition. We develop and analyze an attack on such a cipher. This attack employs an HMM (with random restarts), together with a hill climb to recover the column permutation. We show that this attack can succeed on relatively short ciphertext messages. Finally, we test this combined attack on the unsolved Zodiac 340 cipher.

    Generative Adversarial Networks for Classic Cryptanalysis

    The necessity of protecting critical information has been understood for millennia. Although classic ciphers have inherent weaknesses in comparison to modern ciphers, many classic ciphers are extremely challenging to break in practice. Machine learning techniques, such as hidden Markov models (HMM), have recently been applied with success to various classic cryptanalysis problems. In this research, we consider the effectiveness of the deep learning technique CipherGAN, which is based on the well-established generative adversarial network (GAN) architecture, for classic cipher cryptanalysis. We experiment extensively with CipherGAN on a number of classic ciphers, and we compare our results to those obtained using HMMs.

    Cryptanalysis of Homophonic Substitution-Transposition Cipher

    Homophonic substitution ciphers employ a one-to-many key to encrypt plaintext. This is in contrast to a simple substitution cipher where a one-to-one mapping is used. The advantage of a homophonic substitution cipher is that it makes frequency analysis more difficult, due to a more even distribution of plaintext statistics. Classic transposition ciphers apply diffusion to the ciphertext by swapping the order of letters. Combined transposition-substitution ciphers can be more challenging to cryptanalyze than either cipher type separately. In this research, we propose a technique to break a combined simple substitution-column transposition cipher. We also consider the related problem of breaking a combination homophonic substitution-column transposition cipher. These attacks extend previous work on substitution ciphers. We thoroughly analyze our attacks and we apply the homophonic substitution-columnar transposition attack to the unsolved Zodiac-340 cipher.
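To make the cipher described in this abstract concrete, here is a minimal sketch of a homophonic substitution followed by a column permutation. It is an illustration under stated assumptions, not the construction used in the paper: cipher symbols are plain integers, the transposition variant shown permutes the columns within each row and reads the result back row by row, and every function name (`build_homophonic_key`, `homophonic_encrypt`, `homophonic_decrypt`, `permute_columns`, `invert_permutation`) is hypothetical.

```python
import random


def build_homophonic_key(symbols_per_letter, rng):
    """Give each plaintext letter its own disjoint list of integer cipher symbols."""
    pool = list(range(sum(symbols_per_letter.values())))
    rng.shuffle(pool)
    key, start = {}, 0
    for letter in sorted(symbols_per_letter):
        count = symbols_per_letter[letter]
        key[letter] = pool[start:start + count]
        start += count
    return key


def homophonic_encrypt(plaintext, key, rng):
    """Replace each letter with one of its homophones, chosen at random."""
    return [rng.choice(key[letter]) for letter in plaintext]


def homophonic_decrypt(ciphertext, key):
    """Invert the one-to-many key; every symbol belongs to exactly one letter."""
    inverse = {symbol: letter
               for letter, symbols in key.items() for symbol in symbols}
    return "".join(inverse[symbol] for symbol in ciphertext)


def permute_columns(sequence, permutation):
    """Lay the sequence out in rows of len(permutation) and reorder the columns."""
    width = len(permutation)
    if len(sequence) % width != 0:
        raise ValueError("sequence length must be a multiple of the column count")
    result = []
    for start in range(0, len(sequence), width):
        row = sequence[start:start + width]
        result.extend(row[column] for column in permutation)
    return result


def invert_permutation(permutation):
    """Return the permutation that undoes `permutation`."""
    inverse = [0] * len(permutation)
    for new_position, old_position in enumerate(permutation):
        inverse[old_position] = new_position
    return inverse


if __name__ == "__main__":
    rng = random.Random(7)
    # Frequent letters get more homophones, which flattens symbol frequencies.
    counts = {"a": 3, "t": 2, "c": 1, "k": 1, "d": 1, "w": 1, "n": 1}
    key = build_homophonic_key(counts, rng)
    plaintext = "attackatdawn"
    columns = [2, 0, 3, 1]
    ciphertext = permute_columns(homophonic_encrypt(plaintext, key, rng), columns)
    print(ciphertext)
    unscrambled = permute_columns(ciphertext, invert_permutation(columns))
    print(homophonic_decrypt(unscrambled, key))
```

An attacker sees only the final symbol sequence, so both the column order and the many-to-one symbol mapping have to be recovered, which is why the combined cipher is harder than either layer alone.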

    A new look at old numbers, and what it reveals about numeration

    In this study, the archaic counting systems of Mesopotamia as understood through the Neolithic tokens, numerical impressions, and proto-cuneiform notations were compared to the traditional number-words and counting methods of Polynesia as understood through contemporary and historical descriptions of vocabulary and behaviors. The comparison and associated analyses capitalized on the ability to understand well-known characteristics of Uruk-period numbers like object-specific counting, polyvalence, and context-dependence through historical observations of Polynesian counting methods and numerical language, evidence unavailable for ancient numbers. Similarities between the two number systems were then used to argue that archaic Mesopotamian numbers, like those of Polynesia, were highly elaborated and would have served as cognitively efficient tools for mental calculation. Their differences also show the importance of material technologies like tokens, impressions, and notations to developing mathematics. This project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No. 785793.

    Generative Non-Markov Models for Information Extraction

    Learning from unlabeled data is a long-standing challenge in machine learning. A principled solution involves modeling the full joint distribution over inputs and the latent structure of interest, and imputing the missing data via marginalization. Unfortunately, such marginalization is expensive for most non-trivial problems, which places practical limits on the expressiveness of generative models. As a result, joint models often encode strict assumptions about the underlying process such as fixed-order Markovian assumptions and employ simple count-based features of the inputs. In contrast, conditional models, which do not directly model the observed data, are free to incorporate rich overlapping features of the input in order to predict the latent structure of interest. It would be desirable to develop expressive generative models that retain tractable inference. This is the topic of this thesis. In particular, we explore joint models which relax fixed-order Markov assumptions, and investigate the use of recurrent neural networks for automatic feature induction in the generative process. We focus on two structured prediction problems: (1) imputing labeled segmentations of input character sequences, and (2) imputing directed spanning trees relating strings in text corpora. These problems arise in many applications of practical interest, but we are primarily concerned with named-entity recognition and cross-document coreference resolution in this work. For named-entity recognition, we propose a generative model in which the observed characters originate from a latent non-Markov process over words, and where the characters are themselves produced via a non-Markov process: a recurrent neural network (RNN). We propose a sampler for the proposed model in which sequential Monte Carlo is used as a transition kernel for a Gibbs sampler. The kernel is amenable to a fast parallel implementation, and results in fast mixing in practice.
For cross-document coreference resolution, we move beyond sequence modeling to consider string-to-string transduction. We stipulate a generative process for a corpus of documents in which entity names arise from copying, and optionally transforming, previous names of the same entity. Our proposed model is sensitive to both the context in which the names occur as well as their spelling. The string-to-string transformations correspond to systematic linguistic processes such as abbreviation, typos, and nicknaming, and by analogy to biology, we think of them as mutations along the edges of a phylogeny. We propose a novel block Gibbs sampler for this problem that alternates between sampling an ordering of the mentions and a spanning tree relating all mentions in the corpus.

    Critique of Fantasy, Vol. 2

    In The Contest between B-Genres, the “Space Trilogy” by J.R.R. Tolkien’s friend and colleague C.S. Lewis and the roster of American science fictions that Gotthard Günther selected and glossed for the German readership in 1952 demarcate the ring in which the contestants face off. In carrying out in fiction the joust that Tolkien proclaimed in his manifesto essay “On Fairy-Stories,” Lewis challenged the visions of travel through time and space that were the mainstays of modern science fiction. In the facing corner, Günther recognized in American science fiction the first stirrings of a new mythic storytelling that would supplant the staple of an expiring metaphysics, the fairy-story basic to Tolkien and Lewis’s fantasy genre. The B-genres science fiction and fantasy were contemporaries of cinema’s emergence out of the scientific and experimental study and recording of motion made visible. In an early work like H.G. Wells’s The Time Machine, which Tolkien credited as a work of fantasy, the transport through time – the ununderstood crux of this literary experiment – is conveyed through a cinematic–fantastic component in the narrative, reflecting optical innovations and forecasting the movies to come. Although the historical onset of the rivalry between the B-genres is packed with literary examples, adaptation (acknowledged or not) followed out the rebound of wish fantasy between literary descriptions of the ununderstood and their cinematic counterparts, visual and special effects. The arrival of the digital relation out of the crucible of the unknown and the special effect seemed at last to award the fantasy genre the trophy in its contest with science fiction. And yet, although science fiction indeed failed to predict the digital future, fantasy did not so much succeed as draw benefit from the mere resemblance of fantasying to the new relation.
While it follows that digitization is the fantasy that is true (and not, as Tolkien had hoped, the Christian Gospel), the newly renewed B-genre without borders found support in another revaluation that was underway in the other B-genre. Once its future orientation was “history,” science fiction began indwelling the ruins of its faulty forecasts. By its new allegorical momentum, science fiction supplied captions of legibility and history to the reconfigured borderlands it cohabited with fantasy. The second volume also attends, then, to the hybrids that owed their formation to these changes, both anticipated and realized. Extending through the topography of the borderlands, works by J.G. Ballard, Ursula Le Guin, and John Boorman, among others, occupy and cathect a context of speculative fiction that suspended and blended the strict contest requirements constitutive of the separate B-genres.

    The interpreter as intercultural mediator

    This thesis looks at the role of the Slovak-English interpreter working in the consecutive mode in the business environment, especially with regard to rendering cultural references from source texts, whether these are (British) English or Slovak. Since culture in this thesis is taken in the broad sense of the whole way of life, cultural references can also be wide-ranging. The strategy an interpreter will opt for when interpreting cultural references depends on the circumstances under which he or she operates. Interpreting puts constraints on interpreters which make their activity distinct from translation of written texts, where in cases of unknown cultural references, translators can resort to the use of notes. Interpreters are engaged in mediating communication between (two) clients who do not share the same language and who come from differing cultural backgrounds. Due to differences between the (British) English and the Slovak cultures - in their material, spiritual and behavioural aspects - as well as due to lack of knowledge of cultural references which the clients of English-Slovak interpreting have and which was caused historically, some intercultural mediation is needed. Its particular forms are the outcome of the weighing of the circumstances under which the English-Slovak consecutive interpreter works. Moreover, business interpreting contains challenges in the form of the vocabulary of business, a relatively new area for Slovak interpreters. An interpreter, under all the above-mentioned constraints, has to fulfil his or her role: to establish and maintain communication between the two parties. Therefore some of the strategies used will try to prevent miscommunication, while others will try to deal with miscommunication once it has occurred.