9 research outputs found

    Characterizing the IoT ecosystem at scale

    Get PDF
    Internet of Things (IoT) devices are extremely popular with home, business, and industrial users. To provide their services, they typically rely on a backend server in- frastructure on the Internet, which collectively form the IoT Ecosystem. This ecosys- tem is rapidly growing and offers users an increasing number of services. It also has been a source and target of significant security and privacy risks. One notable exam- ple is the recent large-scale coordinated global attacks, like Mirai, which disrupted large service providers. Thus, characterizing this ecosystem yields insights that help end-users, network operators, policymakers, and researchers better understand it, obtain a detailed view, and keep track of its evolution. In addition, they can use these insights to inform their decision-making process for mitigating this ecosystem’s security and privacy risks. In this dissertation, we characterize the IoT ecosystem at scale by (i) detecting the IoT devices in the wild, (ii) conducting a case study to measure how deployed IoT devices can affect users’ privacy, and (iii) detecting and measuring the IoT backend infrastructure. To conduct our studies, we collaborated with a large European Internet Service Provider (ISP) and a major European Internet eXchange Point (IXP). They rou- tinely collect large volumes of passive, sampled data, e.g., NetFlow and IPFIX, for their operational purposes. These data sources help providers obtain insights about their networks, and we used them to characterize the IoT ecosystem at scale. We start with IoT devices and study how to track and trace their activity in the wild. We developed and evaluated a scalable methodology to accurately detect and monitor IoT devices with limited, sparsely sampled data in the ISP and IXP. Next, we conduct a case study to measure how a myriad of deployed devices can affect the privacy of ISP subscribers. Unfortunately, we found that the privacy of a substantial fraction of IPv6 end-users is at risk. We noticed that a single device at home that encodes its MAC address into the IPv6 address could be utilized as a tracking identifier for the entire end-user prefix—even if other devices use IPv6 privacy extensions. Our results showed that IoT devices contribute the most to this privacy leakage. Finally, we focus on the backend server infrastructure and propose a methodology to identify and locate IoT backend servers operated by cloud services and IoT vendors. We analyzed their IoT traffic patterns as observed in the ISP. Our analysis sheds light on their diverse operational and deployment strategies. The need for issuing a priori unknown network-wide queries against large volumes of network flow capture data, which we used in our studies, motivated us to develop Flowyager. It is a system built on top of existing traffic capture utilities, and it relies on flow summarization techniques to reduce (i) the storage and transfer cost of flow captures and (ii) query response time. We deployed a prototype of Flowyager at both the IXP and ISP.Internet-of-Things-GerĂ€te (IoT) sind aus vielen Haushalten, BĂŒrorĂ€umen und In- dustrieanlagen nicht mehr wegzudenken. Um ihre Dienste zu erbringen, nutzen IoT- GerĂ€te typischerweise auf eine Backend-Server-Infrastruktur im Internet, welche als Gesamtheit das IoT-Ökosystem bildet. Dieses Ökosystem wĂ€chst rapide an und bie- tet den Nutzern immer mehr Dienste an. Das IoT-Ökosystem ist jedoch sowohl eine Quelle als auch ein Ziel von signifikanten Risiken fĂŒr die Sicherheit und PrivatsphĂ€re. Ein bemerkenswertes Beispiel sind die jĂŒngsten groß angelegten, koordinierten globa- len Angriffe wie Mirai, durch die große Diensteanbieter gestört haben. Deshalb ist es wichtig, dieses Ökosystem zu charakterisieren, eine ganzheitliche Sicht zu bekommen und die Entwicklung zu verfolgen, damit Forscher, EntscheidungstrĂ€ger, Endnutzer und Netzwerkbetreibern Einblicke und ein besseres VerstĂ€ndnis erlangen. Außerdem können alle Teilnehmer des Ökosystems diese Erkenntnisse nutzen, um ihre Entschei- dungsprozesse zur Verhinderung von Sicherheits- und PrivatsphĂ€rerisiken zu verbes- sern. In dieser Dissertation charakterisieren wir die Gesamtheit des IoT-Ökosystems indem wir (i) IoT-GerĂ€te im Internet detektieren, (ii) eine Fallstudie zum Einfluss von benutzten IoT-GerĂ€ten auf die PrivatsphĂ€re von Nutzern durchfĂŒhren und (iii) die IoT-Backend-Infrastruktur aufdecken und vermessen. Um unsere Studien durchzufĂŒhren, arbeiten wir mit einem großen europĂ€ischen Internet- Service-Provider (ISP) und einem großen europĂ€ischen Internet-Exchange-Point (IXP) zusammen. Diese sammeln routinemĂ€ĂŸig fĂŒr operative Zwecke große Mengen an pas- siven gesampelten Daten (z.B. als NetFlow oder IPFIX). Diese Datenquellen helfen Netzwerkbetreibern Einblicke in ihre Netzwerke zu erlangen und wir verwendeten sie, um das IoT-Ökosystem ganzheitlich zu charakterisieren. Wir beginnen unsere Analysen mit IoT-GerĂ€ten und untersuchen, wie diese im Inter- net aufgespĂŒrt und verfolgt werden können. Dazu entwickelten und evaluierten wir eine skalierbare Methodik, um IoT-GerĂ€te mit Hilfe von eingeschrĂ€nkten gesampelten Daten des ISPs und IXPs prĂ€zise erkennen und beobachten können. Als NĂ€chstes fĂŒhren wir eine Fallstudie durch, in der wir messen, wie eine Unzahl von eingesetzten GerĂ€ten die PrivatsphĂ€re von ISP-Nutzern beeinflussen kann. Lei- der fanden wir heraus, dass die PrivatsphĂ€re eines substantiellen Teils von IPv6- Endnutzern bedroht ist. Wir entdeckten, dass bereits ein einzelnes GerĂ€t im Haus, welches seine MAC-Adresse in die IPv6-Adresse kodiert, als Tracking-Identifikator fĂŒr das gesamte Endnutzer-PrĂ€fix missbraucht werden kann — auch wenn andere GerĂ€te IPv6-Privacy-Extensions verwenden. Unsere Ergebnisse zeigten, dass IoT-GerĂ€te den Großteil dieses PrivatsphĂ€re-Verlusts verursachen. Abschließend fokussieren wir uns auf die Backend-Server-Infrastruktur und wir schla- gen eine Methodik zur Identifizierung und Lokalisierung von IoT-Backend-Servern vor, welche von Cloud-Diensten und IoT-Herstellern betrieben wird. Wir analysier- ten Muster im IoT-Verkehr, der vom ISP beobachtet wird. Unsere Analyse gibt Auf- schluss ĂŒber die unterschiedlichen Strategien, wie IoT-Backend-Server betrieben und eingesetzt werden. Die Notwendigkeit a-priori unbekannte netzwerkweite Anfragen an große Mengen von Netzwerk-Flow-Daten zu stellen, welche wir in in unseren Studien verwenden, moti- vierte uns zur Entwicklung von Flowyager. Dies ist ein auf bestehenden Netzwerkverkehrs- Tools aufbauendes System und es stĂŒtzt sich auf die Zusammenfassung von Verkehrs- flĂŒssen, um (i) die Kosten fĂŒr Archivierung und Transfer von Flow-Daten und (ii) die Antwortzeit von Anfragen zu reduzieren. Wir setzten einen Prototypen von Flowyager sowohl im IXP als auch im ISP ein

    Consistent SDNs through Network State Fuzzing

    Full text link
    The conventional wisdom is that a software-defined network (SDN) operates under the premise that the logically centralized control plane has an accurate representation of the actual data plane state. Unfortunately, bugs, misconfigurations, faults or attacks can introduce inconsistencies that undermine correct operation. Previous work in this area, however, lacks a holistic methodology to tackle this problem and thus, addresses only certain parts of the problem. Yet, the consistency of the overall system is only as good as its least consistent part. Motivated by an analogy of network consistency checking with program testing, we propose to add active probe-based network state fuzzing to our consistency check repertoire. Hereby, our system, PAZZ, combines production traffic with active probes to periodically test if the actual forwarding path and decision elements (on the data plane) correspond to the expected ones (on the control plane). Our insight is that active traffic covers the inconsistency cases beyond the ones identified by passive traffic. PAZZ prototype was built and evaluated on topologies of varying scale and complexity. Our results show that PAZZ requires minimal network resources to detect persistent data plane faults through fuzzing and localize them quickly while outperforming baseline approaches.Comment: Added three extra relevant references, the arXiv later was accepted in IEEE Transactions of Network and Service Management (TNSM), 2019 with the title "Towards Consistent SDNs: A Case for Network State Fuzzing

    Flowtree: Enabling Distributed Flow Summarization at Scale

    Get PDF
    © ACM 2018. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ACM SIGCOMM 2018, http://dx.doi.org/10.1145/3234200.3234225.NetFlow and IPFIX raw ow captures are insightful yet, due to their large volume, challenging to timely analyze and query. In particular, if these captures span long time periods or are collected at remote locations, storing or transferring them for analysis becomes increasingly expensive. Enabling efficient execution of a large range of queries over ow captures while reducing storage and transfer volume requires working with mergeable succinct summaries that capture the most essential features of flows dynamically. However, the problem of building such structures is yet unmet. In this work, we introduce a self-adjusting data structure of generalized flows, called Flowtree, that (1) re- duces the storage requirements by more than 95% while providing highly accurate answers for popular hierarchical flows, (2) minimizes transfer cost of ow summaries, and (3) supports several operators with distributed execution and summarization across time and multiple sites. The evaluation of our solution on different network traces confirms that Flowtree can accurately and promptly answer questions about flows using different feature sets.EC/H2020/679158/EU/Resolving the Tussle in the Internet: Mapping, Architecture, and Policy Making/ResolutioNetDFG, FE 570/4-1, Gottfried Wilhelm Leibniz-Preis 2011BMBF, 01IS14013A, BBDC - Berliner Kompetenzzentrum fĂŒr Big Dat

    Deep Dive into the IoT Backend Ecosystem

    Get PDF
    Internet of Things (IoT) devices are becoming increasingly ubiquitous, e.g., at home, in enterprise environments, and in production lines. To support the advanced functionalities of IoT devices, IoT vendors as well as service and cloud companies operate IoT backends -- the focus of this paper. We propose a methodology to identify and locate them by (a) compiling a list of domains used exclusively by major IoT backend providers and (b) then identifying their server IP addresses. We rely on multiple sources, including IoT backend provider documentation, passive DNS data, and active scanning. For analyzing IoT traffic patterns, we rely on passive network flows from a major European ISP. Our analysis focuses on the top IoT backends and unveils diverse operational strategies -- from operating their own infrastructure to utilizing the public cloud. We find that the majority of the top IoT backend providers are located in multiple locations and countries. Still, a handful are located only in one country, which could raise regulatory scrutiny as the client IoT devices are located in other regions. Indeed, our analysis shows that up to 35% of IoT traffic is exchanged with IoT backend servers located in other continents. We also find that at least six of the top IoT backends rely on other IoT backend providers. We also evaluate if cascading effects among the IoT backend providers are possible in the event of an outage, a misconfiguration, or an attack

    Position paper : A systematic framework for categorising IoT device fingerprinting mechanisms

    Get PDF
    The popularity of the Internet of Things (IoT) devices makes it increasingly important to be able to fingerprint them, for example in order to detect if there are misbehaving or even malicious IoT devices in one's network. However, there are many challenges faced in the task of fingerprinting IoT devices, mainly due to the huge variety of the devices involved. At the same time, the task can potentially be improved by applying machine learning techniques for better accuracy and efficiency. The aim of this paper is to provide a systematic categorisation of machine learning augmented techniques that can be used for fingerprinting IoT devices. This can serve as a baseline for comparing various IoT fingerprinting mechanisms, so that network administrators can choose one or more mechanisms that are appropriate for monitoring and maintaining their network. We carried out an extensive literature review of existing papers on fingerprinting IoT devices -- paying close attention to those with machine learning features. This is followed by an extraction of important and comparable features among the mechanisms outlined in those papers. As a result, we came up with a key set of terminologies that are relevant both in the fingerprinting context and in the IoT domain. This enabled us to construct a framework called IDWork, which can be used for categorising existing IoT fingerprinting mechanisms in a way that will facilitate a coherent and fair comparison of these mechanisms. We found that the majority of the IoT fingerprinting mechanisms take a passive approach -- mainly through network sniffing -- instead of being intrusive and interactive with the device of interest. Additionally, a significant number of the surveyed mechanisms employ both static and dynamic approaches, in order to benefit from complementary features that can be more robust against certain attacks such as spoofing and replay attacks

    One bad apple can spoil your IPv6 privacy

    No full text
    IPv6 is being more and more adopted, in part to facilitate the millions of smart devices that have already been installed at home. Unfortunately, we find that the privacy of a substantial fraction of end-users is still at risk, despite the efforts by ISPs and electronic vendors to improve end-user security, e.g., by adopting prefix rotation and IPv6 privacy extensions. By analyzing passive data from a large ISP, we find that around 19% of end-users' privacy can be at risk. When we investigate the root causes, we notice that a single device at home that encodes its MAC address into the IPv6 address can be utilized as a tracking identifier for the entire end-user prefix-even if other devices use IPv6 privacy extensions. Our results show that IoT devices contribute the most to this privacy leakage and, to a lesser extent, personal computers and mobile devices. To our surprise, some of the most popular IoT manufacturers have not yet adopted privacy extensions that could otherwise mitigate this privacy risk. Finally, we show that third-party providers, e.g., hypergiants, can track up to 17% of subscriber lines in our study.Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.Cyber Securit

    Exploring Network-Wide Flow Data with Flowyager

    Get PDF
    Many network operations, ranging from attack investigation and mitigation to traffic management, require answering network-wide flow queries in seconds. Although flow records are collected at each router, using available traffic capture utilities, querying the resulting datasets from hundreds of routers across sites and over time, remains a significant challenge due to the sheer traffic volume and distributed nature of flow records. In this paper, we investigate how to improve the response time for a priori unknown network-wide queries. We present Flowyager, a system that is built on top of existing traffic capture utilities. Flowyager generates and analyzes tree data structures, that we call Flowtrees, which are succinct summaries of the raw flow data available by capture utilities. Flowtrees are self-adjusted data structures that drastically reduce space and transfer requirements, by 75% to 95%, compared to raw flow records. Flowyager manages the storage and transfers of Flowtrees, supports Flowtree operators, and provides a structured query language for answering flow queries across sites and time periods. By deploying a Flowyager prototype at both a large Internet Exchange Point and a Tier-1 Internet Service Provider, we showcase its capabilities for networks with hundreds of router interfaces. Our results show that the query response time can be reduced by an order of magnitude when compared with alternative data analytics platforms. Thus, Flowyager enables interactive network-wide queries and offers unprecedented drill-down capabilities to, e.g., identify DDoS culprits, pinpoint the involved sites, and determine the length of the attack.EC/H2020/679158/EU/Resolving the Tussle in the Internet: Mapping, Architecture, and Policy Making/ResolutioNetBMBF, 01IS18025A, Verbundprojekt BIFOLD-BBDC: Berlin Institute for the Foundations of Learning and DataBMBF, 01IS18037A, Verbundprojekt BIFOLD-BZML: Berlin Institute for the Foundations of Learning and Dat

    Fifteen Months in the Life of a Honeyfarm

    No full text
    Honeypots have been used for decades to detect, monitor, and understand attempts of unauthorized use of information systems. Previous studies focused on characterizing the spread of malware, e.g., Mirai and other attacks, or proposed stealthy and interactive architectures to improve honeypot efficiency.In this paper, we present insights and benefits gained from collaborating with an operational honeyfarm, i.e., a set of honeypots distributed around the globe with centralized data collection. We analyze data of about 400 million sessions over a 15-month period, gathered from a globally distributed honeyfarm consisting of 221 honeypots deployed in 55 countries. Our analysis unveils stark differences among the activity seen by the honeypots-some are contacted millions of times while others only observe a few thousand sessions. We also analyze the behavior of scouters and intruders of these honeypots. Again, some honeypots report orders of magnitude more interactions with command execution than others. Still, diversity is needed since even if we focus on the honeypots with the highest visibility, they see only a small fraction of the intrusions, including only 5% of the files. Thus, although around 2% of intrusions are visible by most of the honeypots in our honeyfarm, the rest are only visible to a few. We conclude with a discussion of the findings of work.Cyber Securit

    A systematic framework for categorising IoT device fingerprinting mechanisms

    Get PDF
    The popularity of the Internet of Things (IoT) devices makes it increasingly important to be able to fingerprint them, for example in order to detect if there are misbehaving or even malicious IoT devices in one’s network. However, there are many challenges faced in the task of fingerprinting IoT devices, mainly due to the huge variety of the devices involved. At the same time, the task can potentially be improved by applying machine learning techniques for better accuracy and efficiency. The aim of this paper is to provide a systematic categorisation of machine learning augmented techniques that can be used for fingerprinting IoT devices. This can serve as a baseline for comparing various IoT fingerprinting mechanisms, so that network administrators can choose one or more mechanisms that are appropriate for monitoring and maintaining their network. We carried out an extensive literature review of existing papers on fingerprinting IoT devices – paying close attention to those with machine learning features. This is followed by an extraction of important and comparable features among the mechanisms outlined in those papers. As a result, we came up with a key set of terminologies that are relevant both in the fingerprinting context and in the IoT domain. This enabled us to construct a framework called IDWork, which can be used for categorising existing IoT fingerprinting mechanisms in a way that will facilitate a coherent and fair comparison of these mechanisms. We found that the majority of the IoT fingerprinting mechanisms take a passive approach – mainly through network sniffing – instead of being intrusive and interactive with the device of interest. Additionally, a significant number of the surveyed mechanisms employ both static and dynamic approaches, in order to benefit from complementary features that can be more robust against certain attacks such as spoofing and replay attacks
    corecore