    Padding Ain't Enough: Assessing the Privacy Guarantees of Encrypted DNS

    DNS over TLS (DoT) and DNS over HTTPS (DoH) encrypt DNS to guard user privacy by hiding DNS resolutions from passive adversaries. Yet, past attacks have shown that encrypted DNS is still sensitive to traffic analysis. As a consequence, RFC 8467 proposes to pad messages prior to encryption, which obscures the size characteristics of the encrypted traffic. In this paper, we show that padding alone is insufficient to counter DNS traffic analysis. We propose a novel traffic analysis method that combines size and timing information to infer the websites a user visits purely from encrypted and padded DNS traces. To this end, we model DNS sequences that capture the complexity of websites, which usually trigger dozens of DNS resolutions rather than a single DNS transaction. A closed-world evaluation based on the Alexa top-10k websites reveals that attackers can deanonymize at least half of the test traces for 80.2% of all websites, and even correctly label all traces for 32.0% of the websites. Our findings undermine the privacy goals of state-of-the-art message padding strategies in DoT/DoH. We conclude by showing that successful mitigations of such attacks have to remove the entropy of the inter-arrival timings between query responses.
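
    As a rough illustration of how size and timing can be combined (the paper's actual model is not reproduced here; the feature layout and the nearest-neighbour rule below are assumptions), a trace of padded DNS responses can be turned into a fixed-length vector of message sizes and inter-arrival gaps and matched against labeled training traces:

    # Hypothetical sketch, not the paper's classifier: fingerprint
    # padded DNS traces by message sizes plus inter-arrival timings.
    import math

    def features(trace, k=20):
        """trace: list of (timestamp_s, padded_size) per DNS response.
        Returns a fixed-length vector of sizes and inter-arrival gaps."""
        sizes = [s for _, s in trace][:k]
        gaps = [t2 - t1 for (t1, _), (t2, _) in zip(trace, trace[1:])][:k - 1]
        sizes += [0] * (k - len(sizes))            # zero-pad short traces
        gaps += [0.0] * ((k - 1) - len(gaps))
        return sizes + gaps

    def classify(train, unknown, k=20):
        """train: list of (website_label, trace). 1-nearest-neighbour rule."""
        x = features(unknown, k)
        label, _ = min(train, key=lambda lt: math.dist(features(lt[1], k), x))
        return label

    Zeroing the gap features mirrors the mitigation the abstract points to: once inter-arrival timings carry no entropy, only the padded sizes remain for the attacker.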

    A distributed alerting service for open digital library software

    Alerting for Digital Libraries (DL) is an important and useful feature for library users. To date, two independent services and a few publisher-hosted proprietary services have been developed. Here, we address the problem of integrating alerting functionality into open-source software for distributed digital libraries. DL software is one of many applications that constitute so-called meta-software: software whose installation determines the properties of the actual running system (here: the Digital Library system). For this type of application, existing alerting solutions are insufficient; new ways have to be found to support a fragmented network of distributed digital library servers. We propose the design and usage of a distributed Directory Service. This paper also introduces our hybrid approach, which uses two networks and a combination of different distributed routing strategies for event filtering.
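
    The event-filtering step such an alerting service relies on can be sketched as content-based matching of subscription profiles against publication events; the field names and matching rule below are illustrative assumptions, not the paper's interface:

    # Minimal content-based event filter for DL alerting (illustrative).
    def matches(profile, event):
        """profile: field -> list of required keywords;
        event: field -> text of a newly published document."""
        return all(
            all(kw.lower() in event.get(field, "").lower() for kw in kws)
            for field, kws in profile.items()
        )

    subscriptions = {
        "alice": {"title": ["alerting"], "subject": ["digital libraries"]},
    }
    event = {"title": "Alerting in Digital Libraries",
             "subject": "Digital Libraries"}
    to_notify = [user for user, p in subscriptions.items() if matches(p, event)]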

    IPv6-specific misconfigurations in the DNS

    With the Internet transitioning from IPv4 to IPv6, the number of IPv6-specific DNS records (AAAA) increases. Misconfigurations in these records often go unnoticed, as most systems are provided with connectivity over both IPv4 and IPv6 and automatically fall back to IPv4 in case of connection problems. With IPv6-only networks on the rise, such misconfigurations render servers or services unreachable. Using long-term active DNS measurements over multiple zones, we qualify and quantify these IPv6-specific misconfigurations. Applying pattern matching to AAAA records reveals which configuration mistakes occur most often, the distribution of faulty records per DNS operator, and how these numbers have evolved over time. We show that more than 97% of invalid records can be categorized into one of our ten defined main configuration mistakes. Furthermore, we show that while the number and ratio of invalid records decreased over the last two years, the number of DNS operators with at least one faulty AAAA record increased. This emphasizes the need for easily applicable checks in DNS management systems, for which we provide recommendations in the conclusions of this work.
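
    To illustrate the kind of pattern matching involved (the paper's ten categories are not reproduced here; the checks below are a plausible subset built on Python's ipaddress module), a AAAA record value can be sorted into coarse misconfiguration buckets:

    # Classify a AAAA record value into a coarse misconfiguration
    # category; a plausible subset, not the paper's full taxonomy.
    import ipaddress

    def classify_aaaa(value):
        try:
            addr = ipaddress.IPv6Address(value)
        except ipaddress.AddressValueError:
            return "not-an-ipv6-address"   # e.g. an IPv4 address in a AAAA record
        if addr.ipv4_mapped is not None:
            return "ipv4-mapped"           # ::ffff:192.0.2.1
        if addr.is_unspecified:
            return "unspecified"           # ::
        if addr.is_loopback:
            return "loopback"              # ::1
        if addr.is_link_local or addr.is_private:
            return "non-routable-scope"    # fe80::/10, fc00::/7, ...
        return "ok"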

    Rusty Clusters? Dusting an IPv6 Research Foundation

    The long-running IPv6 Hitlist service is an important foundation for IPv6 measurement studies. It helps to overcome infeasible, complete address space scans by collecting valuable, unbiased IPv6 address candidates and regularly testing their responsiveness. However, the Internet itself is a quickly changing ecosystem that can affect long-running services, potentially introducing biases and blind spots into ongoing data collection. Frequent analyses as well as updates are necessary to keep the service valuable to the community. In this paper, we show that the existing hitlist is heavily affected by the Great Firewall of China, and we offer a cleaned view of the development of responsive addresses. While the accumulated input shows an increasing bias towards some networks, the cleaned set of responsive addresses is well distributed and shows a steady increase. Although it is best practice to remove aliased prefixes from IPv6 hitlists, we show that this also removes major content delivery networks: more than 98% of all IPv6 addresses announced by Fastly were labeled as aliased, and Cloudflare prefixes hosting more than 10M domains were excluded. Depending on the hitlist usage, e.g., higher-layer protocol scans, including addresses from these providers can be valuable. Lastly, we evaluate different new address candidate sources, including target generation algorithms, to improve the coverage of the current IPv6 Hitlist. We show that a combination of different methodologies is able to identify 5.6M new, responsive addresses. This is an increase of 174%, and combined with the current IPv6 Hitlist, we identify 8.8M responsive addresses.
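
    The trade-off around aliased prefixes can be sketched as follows (the prefixes and the allowlist flag are placeholders for illustration, not the hitlist's actual data): dropping every address inside an aliased prefix also drops CDN space, so a usage-dependent allowlist can retain those targets for higher-layer scans.

    # Filter hitlist candidates against aliased prefixes, optionally
    # keeping CDN space that would otherwise be discarded (illustrative).
    import ipaddress

    ALIASED = [ipaddress.ip_network("2001:db8:a::/48")]        # placeholder prefixes
    CDN_ALLOWLIST = [ipaddress.ip_network("2001:db8:a::/48")]

    def filter_candidates(candidates, keep_cdns=False):
        kept = []
        for c in candidates:
            addr = ipaddress.ip_address(c)
            in_alias = any(addr in net for net in ALIASED)
            allowed = keep_cdns and any(addr in net for net in CDN_ALLOWLIST)
            if not in_alias or allowed:
                kept.append(c)
        return kept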

    DNS weighted footprints for web browsing analytics

    The monetization of the large amount of data that ISPs hold about their users is still in its early stages. Specifically, knowledge of the websites that individual users or aggregates of users visit opens new business opportunities after suitable sanitization. However, constructing accurate DNS-based web-user profiles on large networks is challenging, not only because of the requirements that traffic capture entails, but also because of DNS caches, the proliferation of botnets, and the complexity of current websites (i.e., when a user visits a website, a set of self-triggered DNS queries is issued for banners from both the same company and third-party services, as well as for preloaded and prefetched content). We therefore propose to count the intentional visits users make to websites by means of DNS weighted footprints. This novel approach considers that a website was actively visited if an empirically estimated fraction of the DNS queries of both the website itself and its set of self-triggered websites is observed. The approach has been implemented in a system named DNSprints. After its parameterization (i.e., balancing the importance of a website in a footprint with respect to the total set of footprints), we measured that our proposal identifies visits and their durations with false positive rates between 2 and 9% and true positive rates over 90%, at throughputs between 800,000 and 1.4 million DNS packets per second in diverse scenarios, proving both its refinement and its applicability. The authors would like to acknowledge funding received through the TRAFICA (TEC2015-69417-C2-1-R) grant from the Spanish R&D programme. The authors thank Víctor Uceda for his collaboration in the early stages of this work.
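
    The weighted-footprint decision can be sketched as follows (the example footprint, weights, and threshold are invented for illustration; DNSprints' tuned parameters are not reproduced here): a site counts as an intentional visit when the summed weight of its footprint domains observed in a time window reaches a threshold.

    # Toy version of the weighted-footprint decision (illustrative values).
    FOOTPRINTS = {
        "example.com": {"example.com": 0.5,
                        "cdn.example.net": 0.3,
                        "ads.tracker.example": 0.2},
    }

    def intentional_visits(observed_qnames, threshold=0.6):
        """observed_qnames: set of DNS query names seen in one time window."""
        visits = []
        for site, weights in FOOTPRINTS.items():
            score = sum(w for dom, w in weights.items() if dom in observed_qnames)
            if score >= threshold:
                visits.append(site)
        return visits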

    Inline detection of DGA domains using side information

    Malware applications typically use a command and control (C&C) server to manage bots that perform malicious activities. Domain Generation Algorithms (DGAs) are popular methods for generating pseudo-random domain names that can be used to establish communication between an infected bot and the C&C server. In recent years, machine learning based systems have been widely used to detect DGAs. Several well-known state-of-the-art classifiers in the literature can detect DGA domain names in real-time applications with high predictive performance. However, these DGA classifiers are highly vulnerable to adversarial attacks in which adversaries purposely craft domain names to evade DGA detection classifiers. In our work, we focus on hardening DGA classifiers against adversarial attacks. To this end, we train and evaluate state-of-the-art deep learning and random forest (RF) classifiers for DGA detection using side information that is harder for adversaries to manipulate than the domain name itself. Additionally, the side information features are selected such that they are easily obtainable in practice for inline DGA detection. The performance and robustness of these models are assessed by exposing them to one day of real-traffic data as well as to domains generated by adversarial attack algorithms. We found that DGA classifiers that rely on both the domain name and side information have high performance and are more robust against adversaries.
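
    A hedged sketch of the RF variant (the lexical and side-information features below are illustrative guesses, not the paper's feature set): side information such as the number of resolved IPs or the record TTL is appended to features derived from the domain name itself.

    # Train a random forest on domain-name features plus side information.
    import math
    from collections import Counter
    from sklearn.ensemble import RandomForestClassifier

    def lexical(domain):
        counts, n = Counter(domain), len(domain)
        entropy = -sum(c / n * math.log2(c / n) for c in counts.values())
        return [n, entropy, sum(ch.isdigit() for ch in domain) / n]

    def featurize(domain, side):
        # side: assumed dict, e.g. resolved-IP count and DNS record TTL
        return lexical(domain) + [side["num_ips"], side["ttl"]]

    X = [featurize("kq3v9z1x0pw.net", {"num_ips": 1, "ttl": 60}),
         featurize("example.com", {"num_ips": 4, "ttl": 3600})]
    y = [1, 0]  # toy labels: 1 = DGA, 0 = benign
    clf = RandomForestClassifier(n_estimators=100).fit(X, y)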

    Latency-Based Anycast Geolocation: Algorithms, Software, and Datasets

    Use of IP-layer anycast has increased in the last few years beyond the DNS realm. Yet, existing measurement techniques to identify and enumerate anycast replicas exploit specifics of the DNS protocol, which limits their applicability to this particular service. With this paper, we not only propose and thoroughly validate a protocol-agnostic technique for anycast replica discovery and geolocation, but also provide the community with open-source software and datasets to replicate our experimental results and to facilitate the development of new techniques such as ours. In particular, our proposed method achieves thorough enumeration and city-level geolocation of anycast instances from a set of known vantage points. The algorithm features an iterative workflow, pipelining enumeration (an optimization problem using latency as input) and geolocation (a classification problem using side-channel information such as city population) of anycast replicas. Results of a thorough validation campaign show our algorithm to be robust to measurement noise and very lightweight, as it requires only a handful of latency measurements.
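
    The latency-based core of such techniques can be sketched with a speed-of-light test (the vantage-point data and propagation constant below are illustrative): if the latency disks of two vantage points cannot intersect given their geographic distance, the target IP must be served by at least two distinct anycast replicas.

    # Detect anycast from two vantage points via disjoint latency disks.
    import math

    KM_PER_MS = 100.0          # ~2/3 c in fiber, a common rule of thumb

    def haversine_km(lat1, lon1, lat2, lon2):
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dl = math.radians(lon2 - lon1)
        a = math.sin((p2 - p1) / 2) ** 2 \
            + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
        return 2 * 6371 * math.asin(math.sqrt(a))

    def is_anycast(vp_a, vp_b):
        """vp: (lat, lon, rtt_ms). True if no single replica location
        can explain both round-trip times."""
        (lat_a, lon_a, rtt_a), (lat_b, lon_b, rtt_b) = vp_a, vp_b
        reach_a = rtt_a / 2 * KM_PER_MS    # one-way distance bound
        reach_b = rtt_b / 2 * KM_PER_MS
        return haversine_km(lat_a, lon_a, lat_b, lon_b) > reach_a + reach_b

    Geolocation then works within each latency disk, which is where side-channel information such as city population enters as a classification signal.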