165 research outputs found

    Website Fingerprinting: Attacks and Defenses

    Website fingerprinting attacks allow a local, passive eavesdropper to determine a client's web activity by leveraging features from her packet sequence. These attacks break the privacy expected by users of privacy technologies, including low-latency anonymity networks such as proxies, VPNs, or Tor. As a discipline, website fingerprinting is an application of machine learning techniques to the diverse field of privacy. To perform a website fingerprinting attack, the eavesdropping attacker passively records the time, direction, and size of the client's packets. Then, he uses a machine learning algorithm to classify the packet sequence so as to determine the web page it came from. In this work we construct and evaluate three new website fingerprinting attacks: Wa-OSAD, an attack using a modified edit distance as the kernel of a Support Vector Machine, achieving greater accuracy than attacks before it; Wa-FLev, an attack that quickly approximates an edit distance computation, allowing a low-resource attacker to deanonymize many clients at once; and Wa-kNN, the current state-of-the-art attack, which is effective and fast, with a very low false positive rate in the open-world scenario. While our new attacks perform well in theoretical scenarios, there are significant differences between the situation in the wild and in the laboratory. Specifically, we tackle concerns regarding the freshness of the training set, splitting packet sequences so that each part corresponds to one web page access (for easy classification), and removing misleading noise from the packet sequence. To defend ourselves against such attacks, we need defenses that are both efficient and provable. 
    We rigorously define and motivate the notion of a provable defense in this work, and we present three new provable defenses: Tamaraw, which is a relatively efficient way to flood the channel with fixed-rate packet scheduling; Supersequence, which uses smallest common supersequences to save on bandwidth overhead; and Walkie-Talkie, which uses half-duplex communication to significantly reduce both bandwidth and time overhead, allowing a truly efficient yet provable defense.
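
    The fixed-rate packet scheduling mentioned in the abstract above can be illustrated with a small sketch. It is not the Tamaraw implementation: the function name `fixed_rate_schedule`, the 20 ms interval, and the padding multiple are hypothetical values chosen for illustration. The idea shown is only that one packet leaves in every time slot, real if one is waiting and dummy otherwise, and that the total is padded up to a multiple of a fixed length, so an observer learns little from timing or packet count.

```python
def fixed_rate_schedule(ready_times_ms, interval_ms=20, pad_multiple=10):
    """Return a list of (send_time_ms, is_real) slots for a fixed-rate stream.

    ready_times_ms: times (integer milliseconds) at which real packets
    become available to send. interval_ms and pad_multiple are illustrative
    assumptions, not parameters taken from any published defense.
    """
    queue = sorted(ready_times_ms)
    slots = []
    next_real = 0  # index of the next real packet waiting to be sent
    slot = 0       # index of the current time slot

    # Emit exactly one packet per slot until every real packet has been sent.
    while next_real < len(queue):
        send_time = slot * interval_ms
        if queue[next_real] <= send_time:
            slots.append((send_time, True))   # a real packet is ready
            next_real += 1
        else:
            slots.append((send_time, False))  # nothing ready: send a dummy
        slot += 1

    # Pad with dummies so the total length is a multiple of pad_multiple.
    while len(slots) % pad_multiple != 0:
        slots.append((slot * interval_ms, False))
        slot += 1

    return slots


if __name__ == "__main__":
    for send_time, is_real in fixed_rate_schedule([0, 0, 50], pad_multiple=5):
        print(send_time, "real" if is_real else "dummy")
```

    In this sketch the send times depend only on the slot index, which is why two different pages that pad to the same length look alike on the wire; the cost is the bandwidth spent on dummy packets.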

    How Unique is Your .onion? An Analysis of the Fingerprintability of Tor Onion Services

    Recent studies have shown that Tor onion (hidden) service websites are particularly vulnerable to website fingerprinting attacks due to their limited number and sensitive nature. In this work we present a multi-level feature analysis of onion site fingerprintability, considering three state-of-the-art website fingerprinting methods and 482 Tor onion services, making this the largest analysis of this kind completed on onion services to date. Prior studies typically report average performance results for a given website fingerprinting method or countermeasure. We investigate which sites are more or less vulnerable to fingerprinting and which features make them so. We find high variability in the rate at which sites are classified (and misclassified) by these attacks, implying that average performance figures may not be informative of the risks that website fingerprinting attacks pose to particular sites. We analyze the features exploited by the different website fingerprinting methods and discuss what makes onion service sites more or less easily identifiable, in terms of both their traffic traces and their webpage design. We study misclassifications to understand how onion service sites can be redesigned to be less vulnerable to website fingerprinting attacks. Our results also inform the design of website fingerprinting countermeasures and their evaluation considering disparate impact across sites.
    Comment: Accepted by ACM CCS 201

    k-fingerprinting: a Robust Scalable Website Fingerprinting Technique

    Website fingerprinting enables an attacker to infer which web page a client is browsing through encrypted or anonymized network connections. We present a new website fingerprinting technique based on random decision forests and evaluate performance over standard web pages as well as Tor hidden services, on a larger scale than previous works. Our technique, k-fingerprinting, performs better than current state-of-the-art attacks even against website fingerprinting defenses, and we show that it is possible to launch a website fingerprinting attack in the face of a large amount of noisy data. We can correctly determine which of 30 monitored hidden services a client is visiting with 85% true positive rate (TPR), a false positive rate (FPR) as low as 0.02%, from a world size of 100,000 unmonitored web pages. We further show that error rates vary widely between web resources, and thus some patterns of use will be predictably more vulnerable to attack than others.
    Comment: 17 page
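
    The open-world TPR and FPR figures quoted above can be made concrete with a short sketch. Conventions differ between papers, so the definitions here are an assumption rather than the authors' exact metric: a true positive is a monitored page labeled with its correct site, and a false positive is an unmonitored page labeled as any monitored site. The function name `open_world_rates` and the `"unmonitored"` label are hypothetical.

```python
def open_world_rates(true_labels, predicted_labels, unmonitored="unmonitored"):
    """Compute (TPR, FPR) for an open-world classification run.

    Assumed convention (one of several in use):
      TPR = monitored traces given their correct label / all monitored traces
      FPR = unmonitored traces given any monitored label / all unmonitored traces
    """
    true_positives = false_positives = 0
    monitored_total = unmonitored_total = 0

    for truth, guess in zip(true_labels, predicted_labels):
        if truth == unmonitored:
            unmonitored_total += 1
            if guess != unmonitored:
                false_positives += 1
        else:
            monitored_total += 1
            if guess == truth:
                true_positives += 1

    tpr = true_positives / monitored_total if monitored_total else 0.0
    fpr = false_positives / unmonitored_total if unmonitored_total else 0.0
    return tpr, fpr


if __name__ == "__main__":
    u = "unmonitored"
    truth = ["a", "b", "a", "b", u, u, u, u]
    guess = ["a", "b", "a", "a", u, "b", u, u]
    print(open_world_rates(truth, guess))
```

    Because the unmonitored world is far larger than the monitored set, even a small FPR can translate into many false alarms in absolute terms, which is why both rates are reported together.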

    Mockingbird: Defending Against Deep-Learning-Based Website Fingerprinting Attacks with Adversarial Traces

    Website Fingerprinting (WF) is a type of traffic analysis attack that enables a local passive eavesdropper to infer the victim's activity, even when the traffic is protected by a VPN or an anonymity system like Tor. Leveraging a deep-learning classifier, a WF attacker can gain over 98% accuracy on Tor traffic. In this paper, we explore a novel defense, Mockingbird, based on the idea of adversarial examples that have been shown to undermine machine-learning classifiers in other domains. Since the attacker gets to design and train his attack classifier based on the defense, we first demonstrate that a straightforward technique for generating adversarial-example-based traces fails to protect against an attacker using adversarial training for robust classification. We then propose Mockingbird, a technique for generating traces that resists adversarial training by moving randomly in the space of viable traces and not following more predictable gradients. The technique drops the accuracy of the state-of-the-art attack hardened with adversarial training from 98% to 42-58% while incurring only 58% bandwidth overhead. The attack accuracy is generally lower than that against state-of-the-art defenses, and much lower when considering Top-2 accuracy, while incurring lower bandwidth overheads.
    Comment: 18 pages, 13 figures and 8 Tables. Accepted in IEEE Transactions on Information Forensics and Security (TIFS
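
    As a reading aid for the bandwidth overhead figure in the abstract above, the following sketch uses the common definition of overhead as extra traffic divided by original traffic. That definition is an assumption here, not one taken from the paper, and the function name `bandwidth_overhead` and the byte counts are hypothetical.

```python
def bandwidth_overhead(original_bytes, defended_bytes):
    """Return overhead as a fraction of the original traffic volume.

    Assumed definition: (defended - original) / original, so a value of
    0.58 means the defended trace carries 58% more data than the original.
    """
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return (defended_bytes - original_bytes) / original_bytes


if __name__ == "__main__":
    # Illustrative numbers only: a 1000-byte trace padded to 1580 bytes.
    print(f"{bandwidth_overhead(1000, 1580):.0%}")
```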