Why Botnets Work: Distributed Brute-Force Attacks Need No Synchronization
In September 2017, the McAfee Labs quarterly report estimated that brute-force
attacks represent 20% of total network attacks, making them the most prevalent
type of attack, ex aequo with browser-based vulnerabilities. These attacks have
sometimes catastrophic consequences, and understanding their fundamental limits
may play an important role in the risk assessment of password-secured systems,
and in the design of better security protocols. While some solutions exist to
prevent online brute-force attacks that arise from one single IP address,
attacks performed by botnets are more challenging. In this paper, we analyze
these distributed attacks by using a simplified model. Our aim is to understand
the impact of distribution and asynchronization on the overall computational
effort necessary to breach a system. Our result is based on Guesswork, a
measure of the number of queries (guesses) required of an adversary before a
correct sequence, such as a password, is found in an optimal attack. Guesswork
is a direct surrogate for time and computational effort of guessing a sequence
from a set of sequences with associated likelihoods. We model the lack of
synchronization by a worst-case optimization in which the queries made by
multiple adversarial agents are received in the worst possible order for the
adversary, resulting in a min-max formulation. We show that, even without
synchronization, and for sequences of growing length, the asymptotic optimal
performance is achievable by using randomized guesses drawn from an appropriate
distribution. Therefore, randomization is key for distributed asynchronous
attacks. In other words, asynchronous guessers can asymptotically perform
brute-force attacks as efficiently as synchronized guessers.
Comment: Accepted to IEEE Transactions on Information Forensics and Security
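The Guesswork measure described in the abstract above can be illustrated with a small sketch. It assumes a single guesser who knows the password distribution and queries candidates in decreasing order of probability; the function name `expected_guesswork` and the toy distribution are illustrative assumptions, not taken from the paper.

```python
def expected_guesswork(probs):
    """Expected number of guesses when candidates are tried in
    decreasing order of probability (the optimal single-guesser order)."""
    # Most likely candidate is guessed first, so it gets rank 1.
    ordered = sorted(probs, reverse=True)
    return sum(rank * p for rank, p in enumerate(ordered, start=1))


# Toy distribution over three candidate passwords (hypothetical values).
skewed = [0.5, 0.25, 0.25]
uniform = [0.25, 0.25, 0.25, 0.25]

print(expected_guesswork(skewed))
print(expected_guesswork(uniform))
```

A skewed distribution needs fewer guesses on average than a uniform one over a similar number of candidates, which is why Guesswork serves as a proxy for attack effort.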
Why Botnets Work: Distributed Brute-Force Attacks Need No Synchronization
In September 2017, the McAfee Labs quarterly report estimated that brute-force
attacks represent 20% of total network attacks, making them the most prevalent
type of attack, ex aequo with browser-based vulnerabilities. These attacks have
sometimes catastrophic consequences, and understanding their fundamental limits
may play an important role in the risk assessment of password-secured systems,
and in the design of better security protocols. While some solutions exist to
prevent online brute-force attacks that arise from one single IP address,
attacks performed by botnets are more challenging. In this paper, we analyze
these distributed attacks by using a simplified model. Our aim is to understand
the impact of distribution and asynchronization on the overall computational
effort necessary to breach a system. Our result is based on Guesswork, a
measure of the number of password queries (guesses) before the correct one is
found in an optimal attack, which is a direct surrogate for the time and the
computational effort. We model the lack of synchronization by a worst-case
optimization in which the queries are received in the worst possible order,
resulting in a min-max formulation. We show that even without synchronization
and for sequences of growing length, the asymptotic optimal performance is
achievable by using randomized guesses drawn from an appropriate distribution.
Therefore, randomization is key for distributed asynchronous attacks. In other
words, asynchronous guessers can asymptotically perform brute-force attacks as
efficiently as synchronized guessers.
Comment: 13 pages, 4 figures
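The role of randomized guesses can be sketched in a much simpler setting than the paper's multi-agent model. The sketch assumes a single memoryless guesser who draws each guess independently (with replacement) from a distribution q, so the number of guesses needed to hit password x is geometric with mean 1/q(x). Under that assumption, the expected number of guesses is the sum of p(x)/q(x), which is minimized by taking q proportional to the square root of p. The function names are hypothetical.

```python
import math


def randomized_expected_guesses(p, q):
    """Expected number of i.i.d. guesses drawn from q needed to hit a
    password drawn from p (geometric waiting time with mean 1/q(x))."""
    return sum(pi / qi for pi, qi in zip(p, q) if pi > 0)


def sqrt_tilt(p):
    """Guessing distribution proportional to sqrt(p); this minimizes
    sum(p/q) over all q by the Cauchy-Schwarz inequality."""
    weights = [math.sqrt(x) for x in p]
    total = sum(weights)
    return [w / total for w in weights]


p = [0.5, 0.25, 0.25]  # hypothetical password distribution
print(randomized_expected_guesses(p, p))             # guessing from p itself
print(randomized_expected_guesses(p, sqrt_tilt(p)))  # guessing from tilted q
```

Guessing from p itself costs one expected guess per support element, while the square-root tilt does strictly better whenever p is not uniform. Because each draw is independent, such guessers need no shared state, which is the intuition behind asynchronous attacks.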
Centralized vs Decentralized Targeted Brute-Force Attacks: Guessing with Side-Information
According to recent empirical studies, a majority of users have the same, or
very similar, passwords across multiple password-secured online services. This
practice can have disastrous consequences, as one password being compromised
puts all the other accounts at much higher risk. Generally, an adversary may
use any side-information he/she possesses about the user, be it demographic
information, password reuse on a previously compromised account, or any other
relevant information, to devise a better brute-force strategy (a so-called
targeted attack). In this work, we consider a distributed brute-force attack
scenario in which adversaries, each observing some side-information,
attempt to breach a password-secured system. We compare two strategies: an
uncoordinated attack in which the adversaries query the system based on their
own side-information until they find the correct password, and a fully
coordinated attack in which the adversaries pool their side-information and
query the system together. For passwords generated independently and
identically from a given distribution, we establish asymptotic closed-form
expressions for the uncoordinated and coordinated strategies when each
adversary's side-information is generated independently by passing the
password through a memoryless channel, as the length of the password goes to
infinity. We illustrate our results
for binary symmetric channels and binary erasure channels, two families of
side-information channels which model password reuse. We demonstrate that two
coordinated agents perform asymptotically better than any finite number of
uncoordinated agents for these channels, meaning that sharing side-information
is very valuable in distributed attacks.
On Tilted Losses in Machine Learning: Theory and Applications
Exponential tilting is a technique commonly used in fields such as
statistics, probability, information theory, and optimization to create
parametric distribution shifts. Despite its prevalence in related fields,
tilting has not seen widespread use in machine learning. In this work, we aim
to bridge this gap by exploring the use of tilting in risk minimization. We
study a simple extension to ERM -- tilted empirical risk minimization (TERM) --
which uses exponential tilting to flexibly tune the impact of individual
losses. The resulting framework has several useful properties: We show that
TERM can increase or decrease the influence of outliers, respectively, to
enable fairness or robustness; has variance-reduction properties that can
benefit generalization; and can be viewed as a smooth approximation to the tail
probability of losses. Our work makes rigorous connections between TERM and
related objectives, such as Value-at-Risk, Conditional Value-at-Risk, and
distributionally robust optimization (DRO). We develop batch and stochastic
first-order optimization methods for solving TERM, provide convergence
guarantees for the solvers, and show that the framework can be efficiently
solved relative to common alternatives. Finally, we demonstrate that TERM can
be used for a multitude of applications in machine learning, such as enforcing
fairness between subgroups, mitigating the effect of outliers, and handling
class imbalance. Despite the straightforward modification TERM makes to
traditional ERM objectives, we find that the framework can consistently
outperform ERM and deliver competitive performance with state-of-the-art,
problem-specific approaches.
Comment: arXiv admin note: substantial text overlap with arXiv:2007.0116
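The tilting idea in the abstract above can be made concrete with a small sketch. It assumes the tilted objective takes the standard exponential-tilting form (1/t) * log((1/N) * sum(exp(t * loss_i))), with t a real-valued tilt parameter; the function name `tilted_loss` and the sample losses are illustrative, and the log-sum-exp shift is only a numerical-stability choice.

```python
import math


def tilted_loss(losses, t):
    """Exponentially tilted average of per-sample losses.
    t > 0 emphasizes large losses, t < 0 suppresses them,
    and t == 0 falls back to the ordinary mean (plain ERM)."""
    n = len(losses)
    if t == 0:
        return sum(losses) / n
    scaled = [t * loss for loss in losses]
    shift = max(scaled)  # subtract the max before exponentiating
    log_sum = shift + math.log(sum(math.exp(s - shift) for s in scaled))
    return (log_sum - math.log(n)) / t


losses = [0.1, 0.2, 5.0]  # hypothetical batch with one outlier
for t in (-50.0, 0.0, 50.0):
    print(t, tilted_loss(losses, t))
```

With a large positive t the value approaches the maximum loss, which matches the fairness use case of focusing on the worst-off samples. With a large negative t it approaches the minimum loss, which matches the robustness use case of discounting outliers.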