7 research outputs found

Beyond Counting: New Perspectives on the Active IPv4 Address Space

Author: Adrian D.
Antonakakis M.
Durumeric Z.
Hao S.
Katz-Bassett E.
Moura G. C. M.
Wong B.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 09/09/2016

In this study, we report on techniques and analyses that enable us to capture Internet-wide activity at individual IP address-level granularity by relying on server logs of a large commercial content delivery network (CDN) that serves close to 3 trillion HTTP requests on a daily basis. Across the whole of 2015, these logs recorded client activity involving 1.2 billion unique IPv4 addresses, the highest ever measured, in agreement with recent estimates. Monthly client IPv4 address counts showed constant growth for years prior, but since 2014, the IPv4 count has stagnated while IPv6 counts have grown. Thus, it seems we have entered an era marked by increased complexity, one in which the sole enumeration of active IPv4 addresses is of little use to characterize recent growth of the Internet as a whole. With this observation in mind, we consider new points of view in the study of global IPv4 address activity. First, our analysis shows significant churn in active IPv4 addresses: the set of active IPv4 addresses varies by as much as 25% over the course of a year. Second, by looking across the active addresses in a prefix, we are able to identify and attribute activity patterns to network restructurings, user behaviors, and, in particular, various address assignment practices. Third, by combining spatio-temporal measures of address utilization with measures of traffic volume, and sampling-based estimates of relative host counts, we present novel perspectives on worldwide IPv4 address activity, including empirical observation of under-utilization in some areas, and complete utilization, or exhaustion, in others. Comment: in Proceedings of ACM IMC 201
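The churn figure in the abstract above can be made concrete with a small sketch. It compares two snapshots of active addresses using plain set operations. The function name `churn`, the three metrics, and the sample addresses (taken from documentation-only ranges) are assumptions for illustration; the abstract does not say how the authors define or compute churn.

```python
def churn(before: set[str], after: set[str]) -> dict[str, float]:
    """Compare two snapshots of active addresses (hypothetical metric definitions)."""
    disappeared = before - after   # active earlier, inactive later
    appeared = after - before      # inactive earlier, active later
    union = before | after
    return {
        # Share of the earlier snapshot that is no longer active.
        "disappeared_frac": len(disappeared) / len(before) if before else 0.0,
        # Share of the later snapshot that is newly active.
        "appeared_frac": len(appeared) / len(after) if after else 0.0,
        # 1 - |intersection| / |union|: overall dissimilarity of the two sets.
        "jaccard_distance": 1 - len(before & after) / len(union) if union else 0.0,
    }


# Made-up snapshots using documentation address ranges, not measured data.
january = {"192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"}
december = {"192.0.2.2", "192.0.2.3", "192.0.2.4", "198.51.100.7"}
result = churn(january, december)
```

Note that the two snapshots have the same size, so a plain count of active addresses would show no change even though one address in four was replaced.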

arXiv.org e-Print Archive

Crossref

Detection of Dynamic IP Address Blocks Considering Consecutive PTR Record Configurations

Author: 中森朋郁
Publication venue: 後藤, 滋樹
Publication date: 01/02/2019

Waseda University Repository

Measuring the Internet during Covid-19 to Evaluate Work-from-Home

Author: Heidemann John
Song Xiao
Publication date: 16/02/2021

The Covid-19 pandemic has radically changed our lives. Under different circumstances, people react to it in various ways. One response is to work from home, since lockdowns have been announced in many regions around the world. For some places, however, we do not know whether people really work from home because information is lacking. Given these uncertainties, detecting the reaction to the Covid-19 pandemic would help us understand what really happens in these places. Working from home implies that people have changed the way they interact with the Internet. People used to access the Internet at work or at school during the day. Now they are more likely to access the Internet at home in the daytime. Therefore, changes in network usage in a given place can indicate whether people there actually work from home. In this work, we reuse and analyze Trinocular outage data (around 5.1M responsive /24 blocks) over 6 months to find network usage changes with a newly designed algorithm. We apply the algorithm to sets of /24 blocks in several cities and compare the detected network usage changes with real-world Covid-19 events to verify whether the algorithm can capture changes made in reaction to the Covid-19 pandemic. By applying the algorithm to all measurable /24 blocks to detect network usage changes, we conclude that network usage can be an indicator of the reaction to the Covid-19 pandemic.

arXiv.org e-Print Archive

The End of the Canonical IoT Botnet: A Measurement Study of Mirai's Descendants

Author: Böck Leon
Fusari Isabella
Karuppayah Shankar
Levin Dave
Mühlhäuser Max
Sundermann Valentin
Publication date: 03/09/2023

Since the burgeoning days of IoT, Mirai has been established as the canonical IoT botnet. Not long after the public release of its code, researchers found many Mirai variants competing with one another for many of the same vulnerable hosts. Over time, the myriad Mirai variants evolved to incorporate unique vulnerabilities, defenses, and regional concentrations. In this paper, we ask: have Mirai variants evolved to the point that they are fundamentally distinct? We answer this question by measuring two of the most popular Mirai descendants: Hajime and Mozi. To actively scan both botnets simultaneously, we developed a robust measurement infrastructure, BMS, and ran it for more than eight months. The resulting datasets show that these two popular botnets have diverged in their evolutions from their common ancestor in multiple ways: they have virtually no overlapping IP addresses, they respond differently to network events such as diurnal rate limiting in China, and more. Collectively, our results show that there is no longer one canonical IoT botnet. We discuss the implications of this finding for researchers and practitioners.

arXiv.org e-Print Archive

Efficient Latent Semantic Extraction from Cross Domain Data with Declarative Language

Author: Li Mingda
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

With large amounts of data continuously generated by intelligent devices, efficient analysis of huge data collections to unearth valuable insights has become one of the most elusive challenges for both academia and industry. The key elements of a scalable analysis framework should include (1) an intuitive interface to describe the desired outcome, (2) a well-crafted model that integrates all available information sources to derive the optimal outcome, and (3) an efficient algorithm that performs the data integration and extraction within a reasonable amount of time. In this dissertation, we address these challenges by proposing (1) a cross-language interface for a succinct expression of recursive queries, (2) a domain-specific neural network model that can incorporate information of multiple modalities, and (3) a sample-efficient training method that can be used even for classifiers with an extremely large number of output classes. Our contributions in this thesis are thus threefold: First, for the ubiquitous recursive queries in advanced data analytics, on top of BigDatalog and Apache Spark, we design a succinct and expressive analytics tool encapsulating the functionality and classical algorithms of Datalog, a quintessential logic programming language. We provide the Logical Library (LLib), a Spark MLlib-like high-level API supporting a wide range of recursive algorithms, and the Logical DataFrame (LFrame), an extension to Spark DataFrame supporting both relational and logical operations. LLib and LFrame enable smooth collaboration between logical applications and other Spark libraries, as well as cross-language logical programming in Scala, Java, or Python. Second, we utilize variants of the recurrent neural network (RNN) to incorporate sequential information overlooked by conventional work in two different domains: Spoken Language Understanding (SLU) and Internet Embedding (IE). 
In SLU, we address the problem caused by relying solely on the first-best interpretation (hypothesis) of an audio command through a series of new architectures comprising bidirectional LSTM and pooling layers. These architectures jointly utilize the texts or embedding vectors of the other hypotheses, which are usually neglected but carry valuable information missed by the first-best hypothesis. In IE, we propose DIP, an extension of the RNN, to build the Internet coordinate system from IP address sequences, which conventional distance-based Internet embedding algorithms also overlook even though they encode structural information about the network. Both DIP and the integration of all hypotheses bring significant performance improvements for the corresponding downstream tasks. Finally, we investigate the training algorithm for multi-class classifiers with a large number of output classes, which are common in deep neural networks and typically implemented as a softmax final layer with one output neuron per class. To avoid the expensive computation of the intractable normalizing constant of softmax for each training data point, we analyze the well-known negative sampling method and improve it into the amplified negative sampling algorithm, which achieves much higher performance at lower training cost.

eScholarship - University of California

Analyzing Internet reliability remotely with probing-based techniques

Author: Padmanabhan Ramakrishna
Publication date: 01/01/2018

Internet reliability for home users is increasingly important as a variety of services that we use migrate to the Internet. Yet, we lack authoritative measures of residential Internet reliability. Measuring reliability requires the detection of Internet outage events experienced by home users. But residential Internet outages are rare events. Further, they can affect relatively few users. Thus, detecting residential Internet outages requires broad and longitudinal measurements of individual users' Internet connections. However, such measurements of Internet reliability are challenging to obtain accurately and at scale. Probing-based remote outage detection techniques can scale but their accuracy is questionable. These techniques detect Internet outages across time as well as across the IPv4 address space by sending active probes, such as pings and traceroutes, to users' IP addresses and use probe responses to infer Internet connectivity. However, they can infer false outages since their foundational assumption can sometimes be invalid: that the lack of response to an active probe is indicative of failure. In this dissertation, I show how to use probing-based techniques to measure residential Internet reliability by defending the following thesis: It is possible to remotely and accurately detect substantial outages experienced by any device with a stable public IP address that typically responds to active probes and use these outages to compare reliability across ISPs, media-types, geographical areas, and weather conditions. In the first part of the dissertation, I address the inaccuracy of probing-based techniques' detected outages and show how to use probe responses to correctly detect outages. I illustrate two scenarios where the lack of response to an active probe is not indicative of failure. In the first scenario, responses are delayed beyond the prober's timeout, leading these techniques to infer packet-loss instead of delay. 
In the second scenario, these techniques can falsely infer packet-loss when the address they are probing gets dynamically reassigned. I examine how often delayed responses and dynamic reassignment occur across ISPs to quantify the inaccuracy of these techniques. I show how outages can be inferred correctly even in networks with dynamic reassignment using complementary datasets that can reveal whether an address was dynamically reassigned before, during, and after a detected outage for that address. In the second part of the dissertation, I motivate why the detection of individual addresses' outages is necessary for analyzing residential reliability. An individual address typically represents one residential customer; therefore, detecting outages for individual addresses can allow capturing even small outages. Prior probing-based techniques focus upon the detection of edge network outages affecting a substantial set of addresses belonging to a BGP prefix or to a /24 address block. Here, I quantitatively demonstrate the extent to which prior techniques can miss residential outages. I show that even individual address outages occur rarely in most networks. When multiple simultaneous outages of related individual addresses occur, there is likely a common underlying cause. With this insight, I develop and evaluate an approach to find outage events that are statistically unlikely to have occurred independently. I show that the majority of such events do not affect entire /24 address blocks or BGP prefixes, and are therefore not likely to be detected by existing techniques which look for outages at these granularities. In the final part of the dissertation, I show how to use individual addresses' outages detected by probing-based techniques to assess Internet reliability across media-types, geographical areas, and weather conditions. 
Individual outages are not direct measures of reliability: they can occur independently because users disable equipment or can be observed falsely due to dynamic address renumbering. I use the insight that the statistical change in outage rate in different challenging environments (e.g., a thunderstorm) can quantitatively expose actual outage “inflation”. I show how to study the effect of a challenging environment upon the reliability of a group of addresses by analyzing the inflation in outage rate for that group while the environment is present. This dissertation's contributions will help achieve comprehensive measurements of Internet reliability that can be used to identify vulnerable networks and their challenges, inform which enhancements can help networks improve reliability, and evaluate the efficacy of deployed enhancements over time.
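The idea of finding outage events that are "statistically unlikely to have occurred independently" can be illustrated with a binomial tail probability. This is only a sketch under stated assumptions: each of `n` related addresses is assumed to fail independently with the same probability `p` in a given time bin, and the threshold `alpha` is a made-up value. The function names are hypothetical, and the abstract does not state that the dissertation uses this exact test.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more independent outages."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_unlikely_event(k: int, n: int, p: float, alpha: float = 1e-6) -> bool:
    """Flag k simultaneous outages among n addresses as a likely common-cause event.

    alpha is an assumed significance threshold, not a value from the source.
    """
    return prob_at_least(k, n, p) < alpha


# Hypothetical numbers: 200 related addresses, each with a 0.1% chance of an
# independent outage in one time bin.
single = is_unlikely_event(k=1, n=200, p=0.001)   # one outage alone is unremarkable
burst = is_unlikely_event(k=10, n=200, p=0.001)   # ten at once is hard to explain by chance
```

Under these assumptions a burst of simultaneous outages is flagged even when it covers only a small part of the group, which matches the abstract's point that such events need not affect an entire /24 block or BGP prefix.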

Digital Repository at the University of Maryland