
    Uncovering the big players of the web

    In this paper we aim at observing how today's large Internet organizations deliver web content to end users. Using one-week-long data sets collected at three vantage points aggregating more than 30,000 Internet customers, we characterize the offered services, precisely quantifying and comparing the performance of the different players. Results show that today 65% of web traffic is handled by the top 10 organizations. We observe that, while all of them serve the same type of content, they have adopted different server architectures in terms of load-balancing schemes, number of servers and their location: some organizations operate thousands of servers, with the closest being a few milliseconds away from the end user, while others manage a few data centers. Despite this, the bulk transfer rates offered to end users are typically good, but impairments can arise when content is not readily available at the server and has to be retrieved from the CDN back-end.
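    The 65% headline figure follows from aggregating per-flow byte counts by the serving organization. A minimal sketch of that aggregation is given below; the flow-record keys `organization` and `bytes` are hypothetical names, not the fields used in the paper.

```python
from collections import Counter

def top_organizations(flows, k=10):
    """Aggregate per-flow byte counts by serving organization and
    return the top-k organizations with their share of total traffic.

    `flows` is an iterable of dicts with hypothetical keys
    'organization' and 'bytes'; an illustrative sketch, not the
    methodology used in the paper.
    """
    bytes_per_org = Counter()
    for flow in flows:
        bytes_per_org[flow["organization"]] += flow["bytes"]

    total = sum(bytes_per_org.values())
    return [(org, b / total) for org, b in bytes_per_org.most_common(k)]

# Example usage with toy flow records.
flows = [
    {"organization": "OrgA", "bytes": 700},
    {"organization": "OrgB", "bytes": 200},
    {"organization": "OrgC", "bytes": 100},
]
for org, share in top_organizations(flows, k=2):
    print(f"{org}: {share:.0%} of observed web traffic")
```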

    Web User Session Characterization via Clustering Techniques

    We focus on the identification and definition of "Web user-sessions", an aggregation of several TCP connections generated by the same source host on the basis of TCP connection opening times. The identification of a user session is non-trivial; traditional approaches rely on threshold-based mechanisms, which are very sensitive to the value assumed for the threshold and may be difficult to set correctly. By applying clustering techniques, we define a novel methodology to identify Web user-sessions without requiring an a priori definition of threshold values. We analyze the characteristics of user sessions extracted from real traces, studying the statistical properties of the identified sessions. From the study it emerges that Web user-sessions tend to be Poisson, but correlation may arise during periods of anomalous network/host functioning.
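    To illustrate the idea of replacing a hand-picked threshold with a data-driven one, the sketch below clusters the inter-arrival gaps between one host's TCP connection openings into "within-session" and "between-session" groups. It is a simplified assumption-laden illustration, not the clustering methodology of the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def sessions_from_openings(opening_times):
    """Group TCP connection opening times (seconds, one host) into
    user-sessions by clustering inter-arrival gaps into two groups,
    instead of fixing a session timeout a priori.
    """
    times = np.sort(np.asarray(opening_times, dtype=float))
    if len(times) < 3:
        return [list(times)]

    gaps = np.diff(times)
    # Cluster log-gaps into two groups; the split acts as a
    # data-driven session boundary instead of a hand-picked threshold.
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(
        np.log1p(gaps).reshape(-1, 1)
    )
    long_cluster = labels[np.argmax(gaps)]  # cluster holding the largest gap

    sessions, current = [], [times[0]]
    for label, t in zip(labels, times[1:]):
        if label == long_cluster:
            sessions.append(current)   # a "between-session" gap closes the session
            current = [t]
        else:
            current.append(t)
    sessions.append(current)
    return sessions

# Toy example: two bursts of connections separated by a long idle period.
print(sessions_from_openings([0, 1, 2, 3, 600, 601, 602]))
```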

    DoWitcher: Effective Worm Detection and Containment in the Internet Core

    Enterprise networks are increasingly offloading the responsibility for worm detection and containment to carrier networks. However, current approaches to the zero-day worm detection problem, such as those based on content similarity of packet payloads, are not scalable to carrier link speeds (OC-48 and upwards). In this paper, we introduce a new system, namely DoWitcher, which in contrast to previous approaches is scalable as well as able to detect the stealthiest worms that employ low propagation rates or polymorphism to evade detection. DoWitcher uses an incremental approach toward worm detection: first, it examines layer-4 traffic features to discern the presence of a worm anomaly; next, it determines a flow-filter mask that can be applied to isolate the suspect worm flows; and finally, it enables full-packet capture of only those flows that match the mask, which are then processed by a longest common subsequence algorithm to extract the worm content signature. Via a proof-of-concept implementation on a commercially available network analyzer processing raw packets from an OC-48 link, we demonstrate the capability of DoWitcher to detect low-rate worms and extract signatures even for polymorphic worms.
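    The final signature-extraction step names a classic algorithm, so a textbook dynamic-programming longest common subsequence over two payloads is sketched below; DoWitcher itself would apply a far more scalable variant, and only to the small set of flows matching the filter mask.

```python
def longest_common_subsequence(a: bytes, b: bytes) -> bytes:
    """Classic dynamic-programming LCS over two payloads (O(n*m))."""
    n, m = len(a), len(b)
    # dp[i][j] = length of the LCS of a[:i] and b[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Backtrack to recover one LCS.
    out, i, j = bytearray(), n, m
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return bytes(reversed(out))

# Two suspect payloads sharing a common core despite differing padding.
print(longest_common_subsequence(b"xxGET /worm.bin zz", b"yyGET /worm.bin qq"))
```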

    Inferring undesirable behavior from P2P traffic analysis

    While peer-to-peer (P2P) systems have surged in popularity in recent years, their large scale and complexity make them difficult to reason about. In this paper, we argue that systematic analysis of the traffic characteristics of P2P systems can reveal a wealth of information about their behavior, and highlight potential undesirable activities that such systems may exhibit. As a first step to this end, we present an offline and semi-automated approach to detect undesirable behavior. Our analysis is applied to real traffic traces collected from a Point-of-Presence (PoP) of a nation-wide ISP in which over 70% of the total traffic is due to eMule, a popular P2P file-sharing system. Flow-level measurements are aggregated into "samples" referring to the activity of each host during a time interval. We then employ a clustering technique to automatically and coarsely identify similar behavior across samples, and extensively use domain knowledge to interpret and analyze the resulting clusters. Our analysis shows several examples of undesirable behavior, including evidence of DDoS attacks exploiting live P2P clients, significant amounts of unwanted traffic that may harm network performance, and instances where the performance of participating peers may be subverted due to maliciously deployed servers. Identification of such patterns can benefit network operators, P2P system developers, and actual end users.
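    The aggregation-then-clustering step could be prototyped as below: flows are folded into per-host, per-interval samples and grouped with k-means. The flow-record keys (`src`, `ts`, `bytes`, `dst`), the feature set and the choice of k-means are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np
from collections import defaultdict
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def cluster_host_samples(flows, interval=300, n_clusters=8):
    """Aggregate flow records into per-host, per-interval "samples"
    (flow count, byte count, distinct destinations) and coarsely group
    similar behaviour with k-means. Illustrative sketch only.
    """
    samples = defaultdict(lambda: {"flows": 0, "bytes": 0, "dsts": set()})
    for f in flows:
        key = (f["src"], int(f["ts"] // interval))  # (host, time bin)
        s = samples[key]
        s["flows"] += 1
        s["bytes"] += f["bytes"]
        s["dsts"].add(f["dst"])

    keys = list(samples)
    X = np.array([[samples[k]["flows"],
                   samples[k]["bytes"],
                   len(samples[k]["dsts"])] for k in keys], dtype=float)
    labels = KMeans(n_clusters=min(n_clusters, len(keys)),
                    n_init=10).fit_predict(StandardScaler().fit_transform(X))
    return dict(zip(keys, labels))
```

    Samples landing in the same cluster (e.g. many flows, few bytes, thousands of distinct destinations) can then be inspected with domain knowledge to decide whether the behavior is benign churn or something undesirable such as a scan or an attack.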

    Automatic parsing of binary-based application protocols using network traffic

    A method for analyzing a binary-based application protocol of a network. The method includes obtaining conversations from the network, extracting the content of a candidate field from a message in each conversation, calculating a randomness measure of the content to represent its level of randomness across all conversations, calculating a correlation measure of the content to represent the level of correlation, across all conversations, between the content and an attribute of the conversation in which the message containing the candidate field is located, and selecting, based on the randomness measure and the correlation measure and using a pre-determined field selection criterion, the candidate offset from a set of candidate offsets as the offset defined by the protocol.
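    A rough rendering of the two measures is sketched below, using Shannon entropy for randomness and a simple equality match against a conversation attribute for correlation. The patented method defines its own metrics and selection criterion, so this is only an illustration of the underlying idea.

```python
import math
from collections import Counter

def randomness_measure(values):
    """Shannon entropy (bits) of the candidate-field values observed across
    conversations: high entropy suggests random content (cookies, checksums),
    low entropy suggests constants or protocol keywords."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def correlation_measure(values, attributes):
    """Fraction of conversations where the candidate-field value equals an
    attribute of the same conversation (e.g. payload length); an
    equality-based stand-in for the patent's correlation measure."""
    matches = sum(1 for v, a in zip(values, attributes) if v == a)
    return matches / len(values)

# Candidate field extracted at one offset from the first message of each
# conversation, together with that conversation's payload length.
field_values = [64, 128, 64, 256]
payload_lengths = [64, 128, 64, 256]
print(randomness_measure(field_values))                     # 1.5 bits
print(correlation_measure(field_values, payload_lengths))   # 1.0 -> likely a length field
```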

    Editorial: Selected papers from the Second International Workshop on QoS in Multiservice IP Networks (QoS-IP 2003)

    This is the editorial for a special issue of Computer Networks devoted to selected papers originally presented at the 2nd International Workshop on QoS in Multiservice IP Networks (QoS-IP 2003), held in Milan, Italy, in February 2003. The papers were significantly revised and extended with respect to the versions presented at the workshop, and subjected to a new peer-review process.

    Towards web service classification using addresses and DNS

    The identification of the services that generate traffic is crucial for ISPs and companies to plan and monitor the network. The widespread deployment of encryption and the convergence of web services towards HTTP/HTTPS challenge traditional classification techniques. Algorithms to classify traffic are left with little information, such as server IP addresses, flow characteristics and queries performed at the DNS. Moreover, due to the use of Content Delivery Networks and cloud infrastructure, it is unclear whether such coarse metadata is sufficient to differentiate the traffic. This paper studies to what extent basic information visible in flow-level measurements is useful for traffic classification on the web. By analyzing a large dataset of flow measurements, we quantify how often the same server IP address is used by different services, and how services use hostnames. Our results show that a very simple classifier that relies only on server IP addresses and on lists of hostnames can distinguish up to 55% of the traffic volume. Yet collisions of names and addresses are common among popular services, calling for more ingenuity. This paper is a preliminary step in the evaluation of classification algorithms suitable for the modern Internet, where only minimal metadata collection will be possible in the network.
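    A classifier of the kind evaluated here can be as small as a pair of lookups, as in the sketch below; the `ip_to_service` and `hostname_suffixes` tables are hypothetical inputs that would be built offline from labelled traffic, and the suffix-matching rule is an assumption for the example.

```python
def classify_flow(server_ip, hostname, ip_to_service, hostname_suffixes):
    """Label a flow using only its server IP address and, when available,
    the hostname learned from DNS. Illustrative minimal-metadata classifier."""
    # Hostnames are more specific than addresses shared by CDNs and clouds,
    # so try suffix matching on the hostname first.
    if hostname:
        for suffix, service in hostname_suffixes.items():
            if hostname == suffix or hostname.endswith("." + suffix):
                return service
    # Fall back to the server IP address; collisions across services hosted
    # on the same infrastructure make this lookup ambiguous.
    return ip_to_service.get(server_ip, "unknown")

# Toy lookup tables and flows.
ip_to_service = {"203.0.113.10": "video-service"}
hostname_suffixes = {"example-cdn.com": "video-service", "mail.example.org": "webmail"}
print(classify_flow("203.0.113.10", "edge1.example-cdn.com",
                    ip_to_service, hostname_suffixes))   # video-service
print(classify_flow("198.51.100.7", None,
                    ip_to_service, hostname_suffixes))   # unknown
```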

    Sleep Mode at the Edge: How Much Room is There?

    Access networks contribute a large portion of the energy consumed in telecommunication networks. Several approaches have been proposed to reduce their energy consumption; for example, the ADSL2 standards define a low-power state that modems can enter to save energy. Yet these technologies have seen little or no deployment. The aim of this work is to understand how much energy they would save if deployed in real scenarios. To this goal we consider a large data set of ADSL line traffic profiles, from which we evaluate the efficiency of sleep-mode policies. Results show that, on average, users are inactive for long periods of time. This makes it possible to achieve significant savings even with very simple, non-aggressive policies, with little or marginal impact on the QoS perceived by users.
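    The kind of back-of-the-envelope saving such an evaluation produces can be illustrated as below; the power figures and the zero wake-up penalty are assumptions for the example, not measurements from the paper.

```python
def sleep_mode_savings(active_seconds, total_seconds,
                       p_active=2.0, p_sleep=0.5, wake_penalty=0.0):
    """Rough estimate of the energy saved by putting an ADSL line into a
    low-power state whenever it carries no traffic.

    `active_seconds` is the time the line is busy over `total_seconds`;
    the power figures (watts) and the zero wake-up penalty (joules) are
    illustrative assumptions.
    """
    idle = total_seconds - active_seconds
    baseline = p_active * total_seconds
    with_sleep = p_active * active_seconds + p_sleep * idle + wake_penalty
    return 1.0 - with_sleep / baseline

# A line active 2 hours out of 24 saves roughly 69% under these assumptions.
print(f"{sleep_mode_savings(2 * 3600, 24 * 3600):.0%}")
```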