175 research outputs found

    Guesswork, large deviations and Shannon entropy

    Get PDF
    How hard is it to guess a password? Massey showed that a simple function of the Shannon entropy of the distribution from which the password is selected is a lower bound on the expected number of guesses, but one which is not tight in general. In a series of subsequent papers under ever less restrictive stochastic assumptions, an asymptotic relationship as password length grows between scaled moments of the guesswork and specific R´enyi entropy was identified. Here we show that, when appropriately scaled, as the password length grows the logarithm of the guesswork satisfies a Large Deviation Principle (LDP), providing direct estimates of the guesswork distribution when passwords are long. The rate function governing the LDP possesses a specific, restrictive form that encapsulates underlying structure in the nature of guesswork. Returning to Massey’s original observation, a corollary to the LDP shows that expectation of the logarithm of the guesswork is the specific Shannon entropy of the password selection process

    Guessing a password over a wireless channel (on the effect of noise non-uniformity)

    Get PDF
    A string is sent over a noisy channel that erases some of its characters. Knowing the statistical properties of the string's source and which characters were erased, a listener that is equipped with an ability to test the veracity of a string, one string at a time, wishes to fill in the missing pieces. Here we characterize the influence of the stochastic properties of both the string's source and the noise on the channel on the distribution of the number of attempts required to identify the string, its guesswork. In particular, we establish that the average noise on the channel is not a determining factor for the average guesswork and illustrate simple settings where one recipient with, on average, a better channel than another recipient, has higher average guesswork. These results stand in contrast to those for the capacity of wiretap channels and suggest the use of techniques such as friendly jamming with pseudo-random sequences to exploit this guesswork behavior.Comment: Asilomar Conference on Signals, Systems & Computers, 201

    Centralized vs Decentralized Multi-Agent Guesswork

    Full text link
    We study a notion of guesswork, where multiple agents intend to launch a coordinated brute-force attack to find a single binary secret string, and each agent has access to side information generated through either a BEC or a BSC. The average number of trials required to find the secret string grows exponentially with the length of the string, and the rate of the growth is called the guesswork exponent. We compute the guesswork exponent for several multi-agent attacks. We show that a multi-agent attack reduces the guesswork exponent compared to a single agent, even when the agents do not exchange information to coordinate their attack, and try to individually guess the secret string using a predetermined scheme in a decentralized fashion. Further, we show that the guesswork exponent of two agents who do coordinate their attack is strictly smaller than that of any finite number of agents individually performing decentralized guesswork.Comment: Accepted at IEEE International Symposium on Information Theory (ISIT) 201

    Why Botnets Work: Distributed Brute-Force Attacks Need No Synchronization

    Full text link
    In September 2017, McAffee Labs quarterly report estimated that brute force attacks represent 20\% of total network attacks, making them the most prevalent type of attack ex-aequo with browser based vulnerabilities. These attacks have sometimes catastrophic consequences, and understanding their fundamental limits may play an important role in the risk assessment of password-secured systems, and in the design of better security protocols. While some solutions exist to prevent online brute-force attacks that arise from one single IP address, attacks performed by botnets are more challenging. In this paper, we analyze these distributed attacks by using a simplified model. Our aim is to understand the impact of distribution and asynchronization on the overall computational effort necessary to breach a system. Our result is based on Guesswork, a measure of the number of queries (guesses) required of an adversary before a correct sequence, such as a password, is found in an optimal attack. Guesswork is a direct surrogate for time and computational effort of guessing a sequence from a set of sequences with associated likelihoods. We model the lack of synchronization by a worst-case optimization in which the queries made by multiple adversarial agents are received in the worst possible order for the adversary, resulting in a min-max formulation. We show that, even without synchronization, and for sequences of growing length, the asymptotic optimal performance is achievable by using randomized guesses drawn from an appropriate distribution. Therefore, randomization is key for distributed asynchronous attacks. In other words, asynchronous guessers can asymptotically perform brute-force attacks as efficiently as synchronized guessers.Comment: Accepted to IEEE Transactions on Information Forensics and Securit

    Computational Security Subject to Source Constraints, Guesswork and Inscrutability

    Get PDF
    Guesswork forms the mathematical framework for quantifying computational security subject to brute-force determination by query. In this paper, we consider guesswork subject to a per-symbol Shannon entropy budget. We introduce inscrutability rate to quantify the asymptotic difficulty of guessing U out of V secret strings drawn from the string-source and prove that the inscrutability rate of any string-source supported on a finite alphabet X, if it exists, lies between the per-symbol Shannon entropy constraint and log |X|. We show that for a stationary string-source, the inscrutability rate of guessing any fraction (1 - ϵ) of the V strings for any fixed ϵ > 0, as V grows, approaches the per-symbol Shannon entropy constraint (which is equal to the Shannon entropy rate for the stationary string-source). This corresponds to the minimum inscrutability rate among all string-sources with the same per-symbol Shannon entropy. We further prove that the inscrutability rate of any finite-order Markov string-source with hidden statistics remains the same as the unhidden case, i.e., the asymptotic value of hiding the statistics per each symbol is vanishing. On the other hand, we show that there exists a string-source that achieves the upper limit on the inscrutability rate, i.e., log |X|, under the same Shannon entropy budget

    Tight Bounds on the R\'enyi Entropy via Majorization with Applications to Guessing and Compression

    Full text link
    This paper provides tight bounds on the R\'enyi entropy of a function of a discrete random variable with a finite number of possible values, where the considered function is not one-to-one. To that end, a tight lower bound on the R\'enyi entropy of a discrete random variable with a finite support is derived as a function of the size of the support, and the ratio of the maximal to minimal probability masses. This work was inspired by the recently published paper by Cicalese et al., which is focused on the Shannon entropy, and it strengthens and generalizes the results of that paper to R\'enyi entropies of arbitrary positive orders. In view of these generalized bounds and the works by Arikan and Campbell, non-asymptotic bounds are derived for guessing moments and lossless data compression of discrete memoryless sources.Comment: The paper was published in the Entropy journal (special issue on Probabilistic Methods in Information Theory, Hypothesis Testing, and Coding), vol. 20, no. 12, paper no. 896, November 22, 2018. Online available at https://www.mdpi.com/1099-4300/20/12/89

    Guessing Revisited: A Large Deviations Approach

    Full text link
    The problem of guessing a random string is revisited. A close relation between guessing and compression is first established. Then it is shown that if the sequence of distributions of the information spectrum satisfies the large deviation property with a certain rate function, then the limiting guessing exponent exists and is a scalar multiple of the Legendre-Fenchel dual of the rate function. Other sufficient conditions related to certain continuity properties of the information spectrum are briefly discussed. This approach highlights the importance of the information spectrum in determining the limiting guessing exponent. All known prior results are then re-derived as example applications of our unifying approach.Comment: 16 pages, to appear in IEEE Transaction on Information Theor
    corecore