5 research outputs found

Learning regexes to extract router names from hostnames

Author: Bartoli Alberto
Huffaker Bradley
Keys Ken
Luckie Matthew
Willinger Walter
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2019

We present the design, implementation, evaluation, and validation of a system that automatically learns to extract router names (router identifiers) from hostnames stored by network operators in different DNS zones, which we represent by regular expressions (regexes). Our supervised-learning approach evaluates automatically generated candidate regexes against sets of hostnames for IP addresses that other alias resolution techniques previously inferred to identify interfaces on the same router. Conceptually, if three conditions hold: (1) a regex extracts the same value from a set of hostnames associated with IP addresses on the same router; (2) the value is unique to that router; and (3) the regex extracts names for multiple routers in the suffix, then we conclude the regex accurately represents the naming convention for the suffix. We train our system using router aliases inferred from active probing to learn regexes for 2550 different suffixes. We then demonstrate the utility of this system by using the regexes to find 105% additional aliases for these suffixes. Regexes inferred in IPv4 perfectly predict aliases for ≈85% of suffixes with IPv6 aliases, i.e., IPv4 and IPv6 addresses representing the same underlying router, and find 9.0 times more routers in IPv6 than found by prior techniques.
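The three conditions in this abstract can be illustrated with a small sketch. This is not the authors' system: the function name `regex_fits_suffix`, the `min_routers` parameter, the strict all-or-nothing acceptance rule, and the `example.net` hostnames are all assumptions made for illustration. It assumes the training data is a mapping from a router identifier to the hostnames of that router's interfaces.

```python
import re
from collections import defaultdict


def regex_fits_suffix(pattern, routers, min_routers=2):
    """Check a candidate regex against training routers (hypothetical helper).

    routers maps a router id to the hostnames of its interfaces.
    The regex must have one capture group that yields the router name.
    """
    rx = re.compile(pattern)
    owners = defaultdict(set)  # extracted name -> router ids it was seen on
    for router_id, hostnames in routers.items():
        names = set()
        for hostname in hostnames:
            match = rx.search(hostname)
            if match is None:
                return False  # strict assumption: every hostname must match
            names.add(match.group(1))
        if len(names) != 1:
            return False  # condition 1: one value per router
        owners[names.pop()].add(router_id)
    if any(len(ids) > 1 for ids in owners.values()):
        return False  # condition 2: the value is unique to one router
    return len(owners) >= min_routers  # condition 3: names multiple routers


# Made-up training data for a made-up suffix.
TRAINING = {
    "r1": ["xe-0-0-1.core1.akl.example.net", "ae2.core1.akl.example.net"],
    "r2": ["xe-1-0-0.core2.wlg.example.net", "ae0.core2.wlg.example.net"],
}
GOOD = r"^[^.]+\.([a-z]+\d+\.[a-z]+)\.example\.net$"
PER_INTERFACE = r"^([^.]+)\."
TOO_COARSE = r"\.([a-z]+)\d+\.[a-z]+\.example\.net$"
```

Here `GOOD` captures `core1.akl` and `core2.wlg`, `PER_INTERFACE` captures a different value for each interface of a router, and `TOO_COARSE` captures `core` for both routers. A practical system would presumably score candidates rather than reject them on the first violation, since training aliases and hostnames can be noisy.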

Crossref

Research Commons@Waikato

Saving Brian's Privacy: the Perils of Privacy Exposure through Reverse DNS

Author: Jonker Mattijs
Sommese Raffaele
Sperotto Anna
van der Toorn Olivier
van Rijswijk-Deij Roland
Publication date: 20/09/2022

Given the importance of privacy, many Internet protocols are nowadays designed with privacy in mind (e.g., using TLS for confidentiality). Foreseeing all privacy issues at the time of protocol design is, however, challenging and may become near impossible when interaction out of protocol bounds occurs. One demonstrably not well understood interaction occurs when DHCP exchanges are accompanied by automated changes to the global DNS (e.g., to dynamically add hostnames for allocated IP addresses). As we will substantiate, this is a privacy risk: one may be able to infer device presence and network dynamics from virtually anywhere on the Internet, and even identify and track individuals, even if other mechanisms to limit tracking by outsiders (e.g., blocking pings) are in place. We present a first-of-its-kind study into this risk. We identify networks that expose client identifiers in reverse DNS records and study the relation between the presence of clients and said records. Our results show a strong link: in 9 out of 10 cases, records linger for at most an hour, for a selection of academic, enterprise and ISP networks alike. We also demonstrate how client patterns and network dynamics can be learned, by tracking devices owned by persons named Brian over time, revealing shifts in work patterns caused by COVID-19 related work-from-home measures, and by determining a good time to stage a heist.
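To make the reverse DNS mechanism concrete, the sketch below shows how the PTR query name for an address is formed and how a returned hostname might be checked for a given name. It performs no network lookups. The helper names and the token-matching rule are assumptions for illustration, not the study's methodology, and the hostnames are made up.

```python
import ipaddress
import re


def ptr_query_name(address):
    """Return the reverse DNS name that would be queried for a PTR record."""
    return ipaddress.ip_address(address).reverse_pointer


def first_label_tokens(hostname):
    """Split the leftmost DNS label into lowercase alphabetic tokens."""
    first_label = hostname.split(".", 1)[0].lower()
    return [token for token in re.split(r"[^a-z]+", first_label) if token]


def mentions_given_name(hostname, given_name):
    """Hypothetical heuristic: a token equals the name or its possessive."""
    name = given_name.lower()
    return any(token in (name, name + "s")
               for token in first_label_tokens(hostname))
```

For example, `ptr_query_name("192.0.2.10")` yields `10.2.0.192.in-addr.arpa`, and `mentions_given_name("brians-iphone.dyn.example.edu", "Brian")` is true under this heuristic, while a router-style name such as `core1.akl.example.net` is not flagged.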

arXiv.org e-Print Archive

University of Twente Research Information

vrfinder: Finding outbound addresses in traceroute

Author: claffy kc
Huffaker Bradley
Luckie Matthew John
Marder Alexander
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2020

Current methods to analyze the Internet's router-level topology with paths collected using traceroute assume that the source address for each router in the path is either an inbound or off-path address on each router. In this work, we show that outbound addresses are common in our Internet-wide traceroute dataset collected by CAIDA's Ark vantage points in January 2020, accounting for 1.7% - 5.8% of the addresses seen at some point before the end of a traceroute. This phenomenon can lead to mistakes in Internet topology analysis, such as inferring router ownership and identifying interdomain links. We hypothesize that the primary contributor to outbound addresses is Layer 3 Virtual Private Networks (L3VPNs), and propose vrfinder, a technique for identifying L3VPN outbound addresses in traceroute collections. We validate vrfinder against ground truth from two large research and education networks, demonstrating high precision (100.0%) and recall (82.1% - 95.3%). We also show the benefit of accounting for L3VPNs in traceroute analysis through extensions to bdrmapIT, increasing the accuracy of its router ownership inferences for L3VPN outbound addresses from 61.5% - 79.4% to 88.9% - 95.5%.

Research Commons@Waikato

Characterizing the IoT ecosystem at scale

Author: Saidi Said Jawad
Publication venue: Saarländische Universitäts- und Landesbibliothek
Publication date: 01/01/2022

Internet of Things (IoT) devices are extremely popular with home, business, and industrial users. To provide their services, they typically rely on a backend server infrastructure on the Internet, which collectively form the IoT Ecosystem. This ecosystem is rapidly growing and offers users an increasing number of services. It also has been a source and target of significant security and privacy risks. One notable example is the recent large-scale coordinated global attacks, like Mirai, which disrupted large service providers. Thus, characterizing this ecosystem yields insights that help end-users, network operators, policymakers, and researchers better understand it, obtain a detailed view, and keep track of its evolution. In addition, they can use these insights to inform their decision-making process for mitigating this ecosystem’s security and privacy risks. In this dissertation, we characterize the IoT ecosystem at scale by (i) detecting the IoT devices in the wild, (ii) conducting a case study to measure how deployed IoT devices can affect users’ privacy, and (iii) detecting and measuring the IoT backend infrastructure. To conduct our studies, we collaborated with a large European Internet Service Provider (ISP) and a major European Internet eXchange Point (IXP). They routinely collect large volumes of passive, sampled data, e.g., NetFlow and IPFIX, for their operational purposes. These data sources help providers obtain insights about their networks, and we used them to characterize the IoT ecosystem at scale. We start with IoT devices and study how to track and trace their activity in the wild. We developed and evaluated a scalable methodology to accurately detect and monitor IoT devices with limited, sparsely sampled data in the ISP and IXP. Next, we conduct a case study to measure how a myriad of deployed devices can affect the privacy of ISP subscribers. 
Unfortunately, we found that the privacy of a substantial fraction of IPv6 end-users is at risk. We noticed that a single device at home that encodes its MAC address into the IPv6 address could be utilized as a tracking identifier for the entire end-user prefix, even if other devices use IPv6 privacy extensions. Our results showed that IoT devices contribute the most to this privacy leakage. Finally, we focus on the backend server infrastructure and propose a methodology to identify and locate IoT backend servers operated by cloud services and IoT vendors. We analyzed their IoT traffic patterns as observed in the ISP. Our analysis sheds light on their diverse operational and deployment strategies. The need for issuing a priori unknown network-wide queries against large volumes of network flow capture data, which we used in our studies, motivated us to develop Flowyager. It is a system built on top of existing traffic capture utilities, and it relies on flow summarization techniques to reduce (i) the storage and transfer cost of flow captures and (ii) query response time. We deployed a prototype of Flowyager at both the IXP and ISP.

Internet of Things (IoT) devices have become an indispensable part of many households, offices, and industrial plants. To provide their services, IoT devices typically rely on a backend server infrastructure on the Internet, which as a whole forms the IoT ecosystem. This ecosystem is growing rapidly and offers users more and more services. However, the IoT ecosystem is both a source and a target of significant security and privacy risks. One notable example is the recent large-scale, coordinated global attacks such as Mirai, which disrupted large service providers. It is therefore important to characterize this ecosystem, obtain a holistic view of it, and track its evolution, so that researchers, policymakers, end users, and network operators gain insight and a better understanding. In addition, all participants in the ecosystem can use these findings to improve their decision-making processes for preventing security and privacy risks. In this dissertation, we characterize the IoT ecosystem as a whole by (i) detecting IoT devices on the Internet, (ii) conducting a case study on the effect of deployed IoT devices on users' privacy, and (iii) uncovering and measuring the IoT backend infrastructure. To conduct our studies, we collaborate with a large European Internet Service Provider (ISP) and a large European Internet Exchange Point (IXP). They routinely collect large volumes of passive, sampled data (e.g., NetFlow or IPFIX) for operational purposes. These data sources help network operators gain insight into their networks, and we used them to characterize the IoT ecosystem holistically. We begin our analyses with IoT devices and investigate how they can be detected and tracked on the Internet. To this end, we developed and evaluated a scalable methodology to accurately detect and monitor IoT devices using limited, sampled data from the ISP and IXP. Next, we conduct a case study in which we measure how a myriad of deployed devices can affect the privacy of ISP users. Unfortunately, we found that the privacy of a substantial fraction of IPv6 end users is at risk. We discovered that even a single device in the home that encodes its MAC address into the IPv6 address can be misused as a tracking identifier for the entire end-user prefix, even if other devices use IPv6 privacy extensions. Our results showed that IoT devices cause the majority of this privacy loss. Finally, we focus on the backend server infrastructure and propose a methodology for identifying and locating IoT backend servers operated by cloud services and IoT vendors. We analyzed patterns in the IoT traffic observed by the ISP. Our analysis sheds light on the different strategies by which IoT backend servers are operated and deployed. The need to issue a priori unknown network-wide queries against large volumes of network flow data, which we use in our studies, motivated us to develop Flowyager. It is a system built on existing network traffic tools, and it relies on flow summarization to reduce (i) the cost of archiving and transferring flow data and (ii) query response time. We deployed a prototype of Flowyager at both the IXP and the ISP.
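The point about a MAC address encoded in an IPv6 address can be illustrated with the modified EUI-64 interface identifier format, in which `ff:fe` is inserted in the middle of the 48-bit MAC and the universal/local bit is flipped. The sketch below reverses that encoding. The function name `mac_from_eui64` and the documentation-prefix addresses are assumptions for illustration, and the abstract does not state that the dissertation detects such addresses in exactly this way.

```python
import ipaddress


def mac_from_eui64(address):
    """Recover the MAC address from a modified EUI-64 interface identifier.

    Returns None when the low 64 bits do not carry the ff:fe marker,
    as is typically the case for privacy-extension addresses.
    """
    interface_id = int(ipaddress.IPv6Address(address)) & ((1 << 64) - 1)
    octets = interface_id.to_bytes(8, "big")
    if octets[3:5] != b"\xff\xfe":
        return None
    # Flip the universal/local bit back and drop the inserted ff:fe.
    mac = bytes([octets[0] ^ 0x02]) + octets[1:3] + octets[5:8]
    return ":".join(f"{octet:02x}" for octet in mac)


# The same device seen under two different end-user prefixes (made-up data).
BEFORE = "2001:db8:0:1:211:22ff:fe33:4455"
AFTER = "2001:db8:0:2:211:22ff:fe33:4455"
PRIVACY = "2001:db8:0:2:1c2a:3b4d:5e6f:7081"
```

Both `BEFORE` and `AFTER` decode to the MAC `00:11:22:33:44:55`, so the interface identifier links the two prefixes, while `PRIVACY` yields no MAC. This is why one such device can act as a stable identifier for a whole household prefix even when every other device randomizes its address.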

Universaar

Improving Salience Retention and Identification in the Automated Filtering of Event Log Messages

Author: Radford Paul
Publication venue: Victoria University of Wellington Library
Publication date: 01/01/2011

Event log messages are currently the only genuine interface through which computer systems administrators can effectively monitor their systems and assemble a mental perception of system state. The popularisation of the Internet and the accompanying meteoric growth of business-critical systems has resulted in an overwhelming volume of event log messages, channeled through mechanisms whose designers could not have envisaged the scale of the problem. Messages regarding intrusion detection, hardware status, operating system status changes, database tablespaces, and so on, are being produced at the rate of many gigabytes per day for a significant computing environment. Filtering technologies have not been able to keep up. Most messages go unnoticed; no filtering whatsoever is performed on them, at least in part due to the difficulty of implementing and maintaining an effective filtering solution. The most commonly deployed filtering alternatives rely on regular expressions to match pre-defined strings, with 100% accuracy, which can then become ineffective as the code base for the software producing the messages 'drifts' away from those strings. The exactness requirement means all possible failure scenarios must be accurately anticipated and their events catered for with regular expressions, in order to make full use of this technique. Alternatives to regular expressions remain largely academic. Data mining, automated corpus construction, and neural networks, to name the highest-profile ones, only produce probabilistic results and are either difficult or impossible to alter in any deterministic way. Policies are therefore not supported under these alternatives. This thesis explores a new architecture which utilises rich metadata in order to avoid the burden of message interpretation. The metadata itself is based on an intention to improve end-to-end communication and reduce ambiguity. 
A simple yet effective filtering scheme is also presented which filters log messages through a short and easily-customisable set of rules. With such an architecture, it is envisaged that systems administrators could significantly improve their awareness of their systems while avoiding many of the false positives and false negatives which plague today's filtering solutions.

Victoria University of Wellington

ResearchArchive at Victoria University of Wellington