56 research outputs found

    Towards Informative Statistical Flow Inversion

    Get PDF
    This is the accepted version of 'Towards Informative Statistical Flow Inversion', archived originally at arXiv:0705.1939v1 [cs.NI] 14 May 2007.A problem which has recently attracted research attention is that of estimating the distribution of flow sizes in internet traffic. On high traffic links it is sometimes impossible to record every packet. Researchers have approached the problem of estimating flow lengths from sampled packet data in two separate ways. Firstly, different sampling methodologies can be tried to more accurately measure the desired system parameters. One such method is the sample-and-hold method where, if a packet is sampled, all subsequent packets in that flow are sampled. Secondly, statistical methods can be used to ``invert'' the sampled data and produce an estimate of flow lengths from a sample. In this paper we propose, implement and test two variants on the sample-and-hold method. In addition we show how the sample-and-hold method can be inverted to get an estimation of the genuine distribution of flow sizes. Experiments are carried out on real network traces to compare standard packet sampling with three variants of sample-and-hold. The methods are compared for their ability to reconstruct the genuine distribution of flow sizes in the traffic

    Small degree BitTorrent

    Get PDF
    It is well-known that the BitTorrent file sharing protocol is responsible for a significant portion of the Internet traffic. A large amount of work has been devoted to reducing the footprint of the protocol in terms of the amount of traffic, however, its flow level footprint has not been studied in depth. We argue in this paper that the large amount of flows that a BitTorrent client maintains will not scale over a certain point. To solve this problem, we first examine the flow structure through realistic simulations. We find that only a few TCP connections are used frequently for data transfer, while most of the connections are used mostly for signaling. This makes it possible to separate the data and signaling paths. We propose that, as the signaling traffic provides little overhead, it should be transferred on a separate dedicated small degree overlay while the data traffic should utilize temporal TCP sockets active only during the data transfer. Through simulation we show that this separation has no significant effect on the performance of the BitTorrent protocol while we can drastically reduce the number of actual flows

    Characterizing the IoT ecosystem at scale

    Get PDF
    Internet of Things (IoT) devices are extremely popular with home, business, and industrial users. To provide their services, they typically rely on a backend server in- frastructure on the Internet, which collectively form the IoT Ecosystem. This ecosys- tem is rapidly growing and offers users an increasing number of services. It also has been a source and target of significant security and privacy risks. One notable exam- ple is the recent large-scale coordinated global attacks, like Mirai, which disrupted large service providers. Thus, characterizing this ecosystem yields insights that help end-users, network operators, policymakers, and researchers better understand it, obtain a detailed view, and keep track of its evolution. In addition, they can use these insights to inform their decision-making process for mitigating this ecosystem’s security and privacy risks. In this dissertation, we characterize the IoT ecosystem at scale by (i) detecting the IoT devices in the wild, (ii) conducting a case study to measure how deployed IoT devices can affect users’ privacy, and (iii) detecting and measuring the IoT backend infrastructure. To conduct our studies, we collaborated with a large European Internet Service Provider (ISP) and a major European Internet eXchange Point (IXP). They rou- tinely collect large volumes of passive, sampled data, e.g., NetFlow and IPFIX, for their operational purposes. These data sources help providers obtain insights about their networks, and we used them to characterize the IoT ecosystem at scale. We start with IoT devices and study how to track and trace their activity in the wild. We developed and evaluated a scalable methodology to accurately detect and monitor IoT devices with limited, sparsely sampled data in the ISP and IXP. Next, we conduct a case study to measure how a myriad of deployed devices can affect the privacy of ISP subscribers. Unfortunately, we found that the privacy of a substantial fraction of IPv6 end-users is at risk. We noticed that a single device at home that encodes its MAC address into the IPv6 address could be utilized as a tracking identifier for the entire end-user prefix—even if other devices use IPv6 privacy extensions. Our results showed that IoT devices contribute the most to this privacy leakage. Finally, we focus on the backend server infrastructure and propose a methodology to identify and locate IoT backend servers operated by cloud services and IoT vendors. We analyzed their IoT traffic patterns as observed in the ISP. Our analysis sheds light on their diverse operational and deployment strategies. The need for issuing a priori unknown network-wide queries against large volumes of network flow capture data, which we used in our studies, motivated us to develop Flowyager. It is a system built on top of existing traffic capture utilities, and it relies on flow summarization techniques to reduce (i) the storage and transfer cost of flow captures and (ii) query response time. We deployed a prototype of Flowyager at both the IXP and ISP.Internet-of-Things-Geräte (IoT) sind aus vielen Haushalten, Büroräumen und In- dustrieanlagen nicht mehr wegzudenken. Um ihre Dienste zu erbringen, nutzen IoT- Geräte typischerweise auf eine Backend-Server-Infrastruktur im Internet, welche als Gesamtheit das IoT-Ökosystem bildet. Dieses Ökosystem wächst rapide an und bie- tet den Nutzern immer mehr Dienste an. Das IoT-Ökosystem ist jedoch sowohl eine Quelle als auch ein Ziel von signifikanten Risiken für die Sicherheit und Privatsphäre. Ein bemerkenswertes Beispiel sind die jüngsten groß angelegten, koordinierten globa- len Angriffe wie Mirai, durch die große Diensteanbieter gestört haben. Deshalb ist es wichtig, dieses Ökosystem zu charakterisieren, eine ganzheitliche Sicht zu bekommen und die Entwicklung zu verfolgen, damit Forscher, Entscheidungsträger, Endnutzer und Netzwerkbetreibern Einblicke und ein besseres Verständnis erlangen. Außerdem können alle Teilnehmer des Ökosystems diese Erkenntnisse nutzen, um ihre Entschei- dungsprozesse zur Verhinderung von Sicherheits- und Privatsphärerisiken zu verbes- sern. In dieser Dissertation charakterisieren wir die Gesamtheit des IoT-Ökosystems indem wir (i) IoT-Geräte im Internet detektieren, (ii) eine Fallstudie zum Einfluss von benutzten IoT-Geräten auf die Privatsphäre von Nutzern durchführen und (iii) die IoT-Backend-Infrastruktur aufdecken und vermessen. Um unsere Studien durchzuführen, arbeiten wir mit einem großen europäischen Internet- Service-Provider (ISP) und einem großen europäischen Internet-Exchange-Point (IXP) zusammen. Diese sammeln routinemäßig für operative Zwecke große Mengen an pas- siven gesampelten Daten (z.B. als NetFlow oder IPFIX). Diese Datenquellen helfen Netzwerkbetreibern Einblicke in ihre Netzwerke zu erlangen und wir verwendeten sie, um das IoT-Ökosystem ganzheitlich zu charakterisieren. Wir beginnen unsere Analysen mit IoT-Geräten und untersuchen, wie diese im Inter- net aufgespürt und verfolgt werden können. Dazu entwickelten und evaluierten wir eine skalierbare Methodik, um IoT-Geräte mit Hilfe von eingeschränkten gesampelten Daten des ISPs und IXPs präzise erkennen und beobachten können. Als Nächstes führen wir eine Fallstudie durch, in der wir messen, wie eine Unzahl von eingesetzten Geräten die Privatsphäre von ISP-Nutzern beeinflussen kann. Lei- der fanden wir heraus, dass die Privatsphäre eines substantiellen Teils von IPv6- Endnutzern bedroht ist. Wir entdeckten, dass bereits ein einzelnes Gerät im Haus, welches seine MAC-Adresse in die IPv6-Adresse kodiert, als Tracking-Identifikator für das gesamte Endnutzer-Präfix missbraucht werden kann — auch wenn andere Geräte IPv6-Privacy-Extensions verwenden. Unsere Ergebnisse zeigten, dass IoT-Geräte den Großteil dieses Privatsphäre-Verlusts verursachen. Abschließend fokussieren wir uns auf die Backend-Server-Infrastruktur und wir schla- gen eine Methodik zur Identifizierung und Lokalisierung von IoT-Backend-Servern vor, welche von Cloud-Diensten und IoT-Herstellern betrieben wird. Wir analysier- ten Muster im IoT-Verkehr, der vom ISP beobachtet wird. Unsere Analyse gibt Auf- schluss über die unterschiedlichen Strategien, wie IoT-Backend-Server betrieben und eingesetzt werden. Die Notwendigkeit a-priori unbekannte netzwerkweite Anfragen an große Mengen von Netzwerk-Flow-Daten zu stellen, welche wir in in unseren Studien verwenden, moti- vierte uns zur Entwicklung von Flowyager. Dies ist ein auf bestehenden Netzwerkverkehrs- Tools aufbauendes System und es stützt sich auf die Zusammenfassung von Verkehrs- flüssen, um (i) die Kosten für Archivierung und Transfer von Flow-Daten und (ii) die Antwortzeit von Anfragen zu reduzieren. Wir setzten einen Prototypen von Flowyager sowohl im IXP als auch im ISP ein

    Autonomic Parameter Tuning of Anomaly-Based IDSs: an SSH Case Study

    Get PDF
    Anomaly-based intrusion detection systems classify network traffic instances by comparing them with a model of the normal network behavior. To be effective, such systems are expected to precisely detect intrusions (high true positive rate) while limiting the number of false alarms (low false positive rate). However, there exists a natural trade-off between detecting all anomalies (at the expense of raising alarms too often), and missing anomalies (but not issuing any false alarms). The parameters of a detection system play a central role in this trade-off, since they determine how responsive the system is to an intrusion attempt. Despite the importance of properly tuning the system parameters, the literature has put little emphasis on the topic, and the task of adjusting such parameters is usually left to the expertise of the system manager or expert IT personnel. In this paper, we present an autonomic approach for tuning the parameters of anomaly-based intrusion detection systems in case of SSH traffic. We propose a procedure that aims to automatically tune the system parameters and, by doing so, to optimize the system performance. We validate our approach by testing it on a flow-based probabilistic detection system for the detection of SSH attacks

    Botnet detection : a numerical and heuristic analysis

    Get PDF
    Dissertação de mestrado em Engenharia de InformáticaInternet security has been targeted in innumerous ways throughout the ages and Internet cyber criminality has been changing its ways since the old days where attacks were greatly motivated by recognition and glory. A new era of cyber criminals are on the move. Real armies of robots (bots) swarm the internet perpetrating precise, objective and coordinated attacks on individuals and organizations. Many of these bots are now coordinated by real cybercrime organizations in an almost open-source driven development resulting in the fast proliferation of many bot variants with refined capabilities and increased detection complexity. One example of such open-source development could be found during the year 2011 in the Russian criminal underground. The release of the Zeus botnet framework source-code led to the development of, at least, a new and improved botnet framework: Ice IX. Concerning attack tools, the combination of many well-known techniques has been making botnets an untraceable, effective, dynamic and powerful mean to perpetrate all kinds of malicious activities such as Distributed Denial of Service (DDoS) attacks, espionage, email spam, malware spreading, data theft, click and identity frauds, among others. Economical and reputation damages are difficult to quantify but the scale is widening. It’s up to one’s own imagination to figure out how much was lost in April of 2007 when Estonia suffered a well-known distributed attack on its internet country-wide infrastructure. Among the techniques available to mitigate the botnet threat, detection plays an important role. Despite recent year’s evolution in botnet detection technology, a definitive solution is far from being found. New constantly appearing bot and worm developments in areas such as host infection, deployment, maintenance, control and dissimulation of bots are permanently changing the detection vectors thought and developed. In that way, research and implementation of anomaly-based botnet detection systems are fundamental to pinpoint and track all the continuously changing polymorphic botnets variants, which are impossible to identify by simple signature-based systems

    Linkages between U.S Cross-border Portfolio Equity Flows and Equity Markets

    Get PDF
    There is an ongoing debate over the role that equity markets play in determining and influencing international equity flows. The first chapter of this dissertation describes the large portfolio equity flows into China and India, in order to understand the buying behavior of US investors. The rapid growth of the Chinese and Indian economies, coupled with the recent development and liberalization of their financial markets has attracted significant portfolio investment from U.S. investors. It is commonly assumed that domestic investors have an informational advantage over foreign investors; however, some recent empirical literature has questioned this assumption. Essay one dissects the nature of the relationship between foreign equity flows, equity returns, and related variables. The results of my empirical investigation provides evidence that U.S. institutional investors are making investment decisions based on long-run determinants of value rather than responding to price signals or ‘chasing returns\u27. I anticipate that the strong relationship between equity flows and fundamentals will strengthen as information asymmetries decline and US investors continue to develop more sophisticated methods of assessing underlying value in China and India. The second essay of this dissertation explores a new panel data set based on US gross cross-border equity flows to 20 industrialized nations combined with measures of market valuation for the period of 1977-2005. Empirical evidence of imperfect integration across world equity markets indicates that valuation matters. Consistent with relative value trading as a determinant of equity flow patterns, I find that equity flows decrease sharply with host-country market valuations—in particular the component of valuation that is forecasted to revert the following year. I also find that equity flows increase sharply with US equity market valuations. These results suggest the existence of a valuation channel for cross-border equity flows. The findings of this chapter show that US investors are informed about both domestic markets and foreign markets. Peripheral findings of this essay confirm the findings of other researches, but with a longer sample period. Consistent with existing literature, I find a negative influence of interest rates spreads, and information asymmetries on cross-border trade in equities

    User-Centric Traffic Engineering in Software Defined Networks

    Get PDF
    Software defined networking (SDN) is a relatively new paradigm that decouples individual network elements from the control logic, offering real-time network programmability, translating high level policy abstractions into low level device configurations. The framework comprises of the data (forwarding) plane incorporating network devices, while the control logic and network services reside in the control and application planes respectively. Operators can optimize the network fabric to yield performance gains for individual applications and services utilizing flow metering and application-awareness, the default traffic management method in SDN. Existing approaches to traffic optimization, however, do not explicitly consider user application trends. Recent SDN traffic engineering designs either offer improvements for typical time-critical applications or focus on devising monitoring solutions aimed at measuring performance metrics of the respective services. The performance caveats of isolated service differentiation on the end users may be substantial considering the growth in Internet and network applications on offer and the resulting diversity in user activities. Application-level flow metering schemes therefore, fall short of fully exploiting the real-time network provisioning capability offered by SDN instead relying on rather static traffic control primitives frequent in legacy networking. For individual users, SDN may lead to substantial improvements if the framework allows operators to allocate resources while accounting for a user-centric mix of applications. This thesis explores the user traffic application trends in different network environments and proposes a novel user traffic profiling framework to aid the SDN control plane (controller) in accurately configuring network elements for a broad spectrum of users without impeding specific application requirements. This thesis starts with a critical review of existing traffic engineering solutions in SDN and highlights recent and ongoing work in network optimization studies. Predominant existing segregated application policy based controls in SDN do not consider the cost of isolated application gains on parallel SDN services and resulting consequence for users having varying application usage. Therefore, attention is given to investigating techniques which may capture the user behaviour for possible integration in SDN traffic controls. To this end, profiling of user application traffic trends is identified as a technique which may offer insight into the inherent diversity in user activities and offer possible incorporation in SDN based traffic engineering. A series of subsequent user traffic profiling studies are carried out in this regard employing network flow statistics collected from residential and enterprise network environments. Utilizing machine learning techniques including the prominent unsupervised k-means cluster analysis, user generated traffic flows are cluster analysed and the derived profiles in each networking environment are benchmarked for stability before integration in SDN control solutions. In parallel, a novel flow-based traffic classifier is designed to yield high accuracy in identifying user application flows and the traffic profiling mechanism is automated. The core functions of the novel user-centric traffic engineering solution are validated by the implementation of traffic profiling based SDN network control applications in residential, data center and campus based SDN environments. A series of simulations highlighting varying traffic conditions and profile based policy controls are designed and evaluated in each network setting using the traffic profiles derived from realistic environments to demonstrate the effectiveness of the traffic management solution. The overall network performance metrics per profile show substantive gains, proportional to operator defined user profile prioritization policies despite high traffic load conditions. The proposed user-centric SDN traffic engineering framework therefore, dynamically provisions data plane resources among different user traffic classes (profiles), capturing user behaviour to define and implement network policy controls, going beyond isolated application management

    Measuring named data networks

    Get PDF
    2020 Spring.Includes bibliographical references.Named Data Networking (NDN) is a promising information-centric networking (ICN) Internet architecture that addresses the content directly rather than addressing servers. NDN provides new features, such as content-centric security, stateful forwarding, and in-network caches, to better satisfy the needs of today's applications. After many years of technological research and experimentation, the community has started to explore the deployment path for NDN. One NDN deployment challenge is measurement. Unlike IP, which has a suite of measurement approaches and tools, NDN only has a few achievements. NDN routing and forwarding are based on name prefixes that do not refer to individual endpoints. While rich NDN functionalities facilitate data distribution, they also break the traditional end-to-end probing based measurement methods. In this dissertation, we present our work to investigate NDN measurements and fill some research gaps in the field. Our thesis of this dissertation states that we can capture a substantial amount of useful and actionable measurements of NDN networks from end hosts. We start by comparing IP and NDN to propose a conceptual framework for NDN measurements. We claim that NDN can be seen as a superset of IP. NDN supports similar functionalities provided by IP, but it has unique features to facilitate data retrieval. The framework helps identify that NDN lacks measurements in various aspects. This dissertation focuses on investigating the active measurements from end hosts. We present our studies in two directions to support the thesis statement. We first present the study to leverage the similarities to replicate IP approaches in NDN networks. We show the first work to measure the NDN-DPDK forwarder, a high-speed NDN forwarder designed and implemented by the National Institute of Standards and Technology (NIST), in a real testbed. The results demonstrate that Data payload sizes dominate the forwarding performance, and efficiently using every fragment to improve the goodput. We then present the first work to replicate packet dispersion techniques in NDN networks. Based on the findings in the NDN-DPDK forwarder benchmark, we devise the techniques to measure interarrivals for Data packets. The results show that the techniques successfully estimate the capacity on end hosts when 1Gbps network cards are used. Our measurements also indicate the NDN-DPDK forwarder introduces variance in Data packet interarrivals. We identify the potential bottlenecks and the possible causes of the variance. We then address the NDN specific measurements, measuring the caching state in NDN networks from end hosts. We propose a novel method to extract fingerprints for various caching decision mechanisms. Our simulation results demonstrate that the method can detect caching decisions in a few rounds. We also show that the method is not sensitive to cross-traffic and can be deployed on real topologies for caching policy detection

    Engineering Enterprise Networks with SDN

    Get PDF
    Today’s networks are growing in terms of bandwidth, number of devices, variety of applications, and various front-end and back-end technologies. Current network architecture is not sufficient for scaling, managing and monitoring them. In this thesis, we explore SDN to address scalability and monitoring issue in growing networks such as IITH campus network. SDN architecture separates the control plane and data plane of a networking device. SDN provides a single control plane (or centralized way) to configure, manage and monitor them more effectively. Scalability of Ethernet is a known issue where communication is disturbed by a large number of nodes in a single broadcast domain. This thesis proposes Extensible Transparent Filter (ETF) for Ethernet using SDN. ETF suppresses broadcast traffic in a broadcast domain by forwarding the broadcast packet to only selected port of a switch through which the target host of that packet is reachable. ETF maintains both consistent functionality and backward compatibility with existing protocols that work with broadcast of a packet. Nowadays, flow-level details of network traffic are the major requirements of many network monitoring applications such as anomaly detection, traffic accounting etc. Packet sampling based solutions (such as NetFlow) provide flow-level details of network traffic. However, they are inad- equate for several monitoring applications. This thesis proposes Network Monitor (NetMon) for OpenFlow networks, which includes the implementation of a few flow-based metrics to determine the state of the network and a Device Logger. NetMon uses a push-based approach to achieve its goals with complete flow-level details. NetMon determines the fraction of useful flows for each host in the network. It calculates out-degree and in-degree based on the IP address, for each hosts in the network. NetMon classifies the host as a client, server or peer-to-peer node, based on the number of source ports and active flows. Device Logger records the device (MAC address and IP address) and its location (Switch DPID and Port No). Device Logger helps to identify owners (devices) of an IP address within a particular time period. This thesis also discusses the practical deployment and operation of SDN. A small SDN network has been deployed in IIT Hyderabad campus. Both, ETF and NetMon are functional in the SDN network. ETF and NetMon were developed using Floodlight which is an open source SDN controller. ETF and NetMon improve scalability and monitoring of enterprise networks as an enhancement to existing networks using SDN
    corecore