
    Traffic measurement and analysis

    Measurement and analysis of real traffic is important for gaining knowledge about its characteristics; without measurement, it is impossible to build realistic traffic models. Only recently was data traffic found to have self-similar properties. In this thesis work, traffic captured on the network at SICS and on the Supernet is shown to have this fractal-like behaviour. The traffic is also examined with respect to which protocols and packet sizes are present and in what proportions. In the SICS trace most packets are small, TCP is shown to be the predominant transport protocol and NNTP the most common application. In contrast, large UDP packets sent between non-well-known ports dominate the Supernet traffic. Finally, characteristics of the client side of the WWW traffic are examined more closely. In order to extract useful information from the packet trace, web browsers' use of TCP and HTTP is investigated, including new features in HTTP/1.1 such as persistent connections and pipelining. Empirical probability distributions are derived describing session lengths, time between user clicks and the amount of data transferred due to a single user click. These probability distributions make up a simple model of WWW sessions.
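    As an illustration of the self-similarity check described above, the sketch below estimates the Hurst parameter of a packet-count time series with the aggregated-variance method. The series `counts` and the aggregation levels are placeholders, not data from the SICS or Supernet traces; the actual thesis may use different estimators.

```python
import numpy as np

def hurst_aggregated_variance(counts, block_sizes):
    """Estimate the Hurst parameter H via the aggregated-variance method:
    for self-similar traffic, Var(X^(m)) ~ m^(2H - 2), so the slope of
    log Var versus log m is 2H - 2."""
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(counts) // m
        if n_blocks < 2:
            continue
        block_means = np.reshape(counts[:n_blocks * m], (n_blocks, m)).mean(axis=1)
        log_m.append(np.log(m))
        log_var.append(np.log(block_means.var()))
    slope, _ = np.polyfit(log_m, log_var, 1)
    return 1.0 + slope / 2.0

# Example with synthetic data: white noise should give H close to 0.5,
# while genuinely self-similar traffic yields H between 0.5 and 1.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=100, size=100_000)
print(hurst_aggregated_variance(counts, block_sizes=[1, 2, 4, 8, 16, 32, 64, 128]))
```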

    Systems with Session-based Workloads: Assessing Performance and Reliability

    Many systems, including the Web and Software as a Service (SaaS), are best characterized by session-based workloads. Empirical studies have shown that Web session arrivals exhibit long-range dependence and that the number of requests in a session is well modeled with skewed or heavy-tailed distributions. However, models that account for session workloads characterized by these empirically observed phenomena, and studies of their impact on performance and reliability metrics, are lacking. For assessing performance, we use a feedback queue to account for session-based workloads in a physically meaningful way and use simulation to analyze the behavior of a Web system under a long-range dependent (LRD) session arrival process and a skewed distribution for the number of requests in a session. Our results show that the percentage of dropped sessions, mean queue length, mean waiting time, and useful server utilization are all affected by the LRD session arrivals and by the statistics of the number of requests within a session. The impact is higher when the long-range dependence is more prominent. Interestingly, both the request arrival process and the request departure process are long-range dependent even when session arrivals are Poisson, which indicates that LRD at the request level can be a result of the existence of sessions. For assessing reliability, we propose a framework which integrates (1) the Web workloads defined in terms of user sessions, (2) the user navigation patterns through the Web site, and (3) the reliability estimates of the Web requests based on the system architecture; we then give a detailed reliability model of a Web system based on the proposed framework. We recognize the difficulty of solving the proposed model and use simulation to obtain the results. Finally, we use statistical design of experiments to quantify the results and to determine which factors have the highest impact on the system's reliability. Our results show that some two-way and three-way interactions are very important for the session reliability of Web systems.
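    A minimal sketch of the kind of session-based workload described above: Poisson session arrivals, each session carrying a heavy-tailed (Pareto) number of requests separated by exponential think times. The parameter values are illustrative placeholders, not those used in the study, and the session arrival process here is Poisson rather than LRD.

```python
import numpy as np

rng = np.random.default_rng(42)

def generate_session_workload(n_sessions, session_rate, pareto_shape, mean_think):
    """Return sorted request arrival times produced by n_sessions sessions:
    Poisson session arrivals, a Pareto-distributed number of requests per
    session, and exponential think times between requests."""
    session_starts = np.cumsum(rng.exponential(1.0 / session_rate, n_sessions))
    request_times = []
    for start in session_starts:
        n_requests = int(np.ceil(rng.pareto(pareto_shape) + 1))  # heavy-tailed session length
        thinks = rng.exponential(mean_think, n_requests - 1)
        request_times.extend(start + np.concatenate(([0.0], np.cumsum(thinks))))
    return np.sort(np.asarray(request_times))

requests = generate_session_workload(n_sessions=10_000, session_rate=5.0,
                                      pareto_shape=1.5, mean_think=2.0)
# Requests per unit time; the burstiness in this series is induced by sessions.
counts, _ = np.histogram(requests, bins=np.arange(0.0, requests.max(), 1.0))
print(counts.mean(), counts.var())
```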

    Catalog Dynamics: Impact of Content Publishing and Perishing on the Performance of a LRU Cache

    The Internet heavily relies on Content Distribution Networks and transparent caches to cope with the ever-increasing traffic demand of users. Content, however, is essentially versatile: once published at a given time, its popularity vanishes over time. All requests for a given document are then concentrated between the publishing time and an effective perishing time. In this paper, we propose a new model for the arrival of content requests, which takes into account the dynamical nature of the content catalog. Based on two large traffic traces collected on the Orange network, we use the semi-experimental method and determine invariants of the content request process. This allows us to define a simple mathematical model for content requests; by extending the so-called "Che approximation", we then compute the performance of an LRU cache fed with such a request process, expressed by its hit ratio. We numerically validate the good accuracy of our model by comparison to trace-based simulation. Comment: 13 pages, 9 figures. Full version of the article submitted to the ITC 2014 conference. Small corrections in the appendix from the previous version.
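    For reference, the classical Che approximation mentioned above can be computed numerically as sketched below. This is the basic form for an LRU cache under the independent reference model, not the extension to dynamic catalogs developed in the paper; the Zipf popularity profile and cache size are illustrative choices.

```python
import numpy as np
from scipy.optimize import brentq

def che_hit_ratio(popularities, cache_size):
    """Che approximation for an LRU cache under IRM: solve
    sum_i (1 - exp(-lambda_i * T_C)) = C for the characteristic time T_C,
    then the per-content hit probability is h_i = 1 - exp(-lambda_i * T_C)."""
    lam = np.asarray(popularities, dtype=float)

    def filled(t):
        return np.sum(1.0 - np.exp(-lam * t)) - cache_size

    t_c = brentq(filled, 1e-12, 1e12)          # characteristic time
    hit_i = 1.0 - np.exp(-lam * t_c)           # per-content hit probability
    return np.sum(lam * hit_i) / np.sum(lam)   # request-averaged hit ratio

# Illustrative Zipf(0.8) catalog of 10,000 documents, cache holding 100 of them.
ranks = np.arange(1, 10_001)
popularities = ranks ** -0.8
print(che_hit_ratio(popularities, cache_size=100))
```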

    Realistic Traffic Generation for Web Robots

    Critical to evaluating the capacity, scalability, and availability of web systems are realistic web traffic generators. Although web traffic generation is a classic research problem, no generator accounts for the characteristics of web robots or crawlers, which are now the dominant source of traffic to a web server. Administrators are thus unable to test, stress, and evaluate how their systems perform in the face of ever-increasing levels of web robot traffic. To resolve this problem, this paper introduces a novel approach to generate synthetic web robot traffic with high fidelity. It generates traffic that accounts for both the temporal and behavioral qualities of robot traffic by means of statistical and Bayesian models fitted to the properties of robot traffic seen in web logs from North America and Europe. We evaluate our traffic generator by comparing the characteristics of generated traffic to those of the original data. We look at session arrival rates, inter-arrival times and session lengths, comparing and contrasting them between generated and real traffic. Finally, we show that our generated traffic affects cache performance similarly to actual traffic under the common LRU and LFU eviction policies. Comment: 8 pages.
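    The following is a deliberately simplified sketch of the fit-then-generate idea: fit a distribution to observed inter-arrival times and sample synthetic ones from it. The lognormal choice and the placeholder "observed" data are assumptions for illustration; the paper itself uses richer statistical and Bayesian models fitted to real web logs.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def fit_and_generate(observed_interarrivals, n_synthetic):
    """Fit a lognormal distribution to observed inter-arrival times and
    draw synthetic inter-arrivals from the fitted distribution."""
    shape, loc, scale = stats.lognorm.fit(observed_interarrivals, floc=0.0)
    return stats.lognorm.rvs(shape, loc=loc, scale=scale,
                             size=n_synthetic, random_state=rng)

# Placeholder "observed" data; in practice these would come from web-server logs.
observed = rng.lognormal(mean=0.5, sigma=1.2, size=5_000)
synthetic = fit_and_generate(observed, n_synthetic=10_000)
print(observed.mean(), synthetic.mean())
```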

    AWAIT: Efficient Overload Management for Busy Multi-tier Web Services under Bursty Workloads

    The problem of service differentiation and admission control in web services that utilize a multi-tier architecture is more challenging than in a single-tiered one, especially in the presence of bursty conditions, i.e., when arrivals of user web sessions to the system are characterized by temporal surges in their arrival intensities and demands. We demonstrate that classic techniques for session-based admission control that are triggered by threshold violations are ineffective under bursty workload conditions: user-perceived performance metrics rapidly and dramatically deteriorate, inadvertently leading the system to reject requests from already accepted user sessions and resulting in business loss. Here, as a solution for service differentiation of accepted user sessions, we promote a methodology based on blocking: when the system operates in overload, requests from accepted sessions are not rejected but are instead stored in a blocking queue that effectively acts as a waiting room. The requests in the blocking queue implicitly become of higher priority and are served immediately after load subsides. Residence in the blocking queue comes with a performance cost, as blocking time adds to the perceived end-to-end user response time. We present a novel autonomic session-based admission control policy, called AWAIT, that adaptively adjusts the capacity of the blocking queue as a function of workload burstiness in order to meet predefined user service level objectives while keeping the portion of aborted accepted sessions to a minimum. Detailed simulations illustrate the effectiveness of AWAIT under different workload burstiness profiles and therefore strongly argue for its effectiveness.
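    A toy sketch of the blocking idea described above: when the server queue is full, requests from already-admitted sessions wait in a blocking queue instead of being dropped, while new sessions are rejected. The class name, fixed capacities, and the simple service loop are illustrative assumptions; AWAIT itself adapts the blocking-queue capacity to workload burstiness.

```python
from collections import deque

class BlockingAdmissionController:
    """Illustrative blocking-queue admission control (not the AWAIT policy)."""

    def __init__(self, server_capacity, blocking_capacity):
        self.server_queue = deque()
        self.blocking_queue = deque()
        self.server_capacity = server_capacity
        self.blocking_capacity = blocking_capacity
        self.admitted_sessions = set()

    def on_request(self, session_id):
        new_session = session_id not in self.admitted_sessions
        if len(self.server_queue) < self.server_capacity:
            self.admitted_sessions.add(session_id)
            self.server_queue.append(session_id)
            return "served"
        if new_session:
            return "rejected"    # overload: do not admit new sessions
        if len(self.blocking_queue) < self.blocking_capacity:
            self.blocking_queue.append(session_id)
            return "blocked"     # waits, served once load subsides
        return "aborted"         # accepted session lost: the case AWAIT minimizes

    def on_service_complete(self):
        if self.server_queue:
            self.server_queue.popleft()
        if self.blocking_queue and len(self.server_queue) < self.server_capacity:
            self.server_queue.append(self.blocking_queue.popleft())
```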

    Towards Autonomic Service Provisioning Systems

    This paper discusses our experience in building SPIRE, an autonomic system for service provisioning. The architecture consists of a set of hosted Web Services subject to QoS constraints, and a certain number of servers used to run session-based traffic. Customers pay for having their jobs run, but require in turn certain quality guarantees: there are different SLAs specifying charges for running jobs and penalties for failing to meet promised performance metrics. The system is driven by a utility function aiming at optimizing the average earned revenue per unit time. Demand and performance statistics are collected, and traffic parameters are estimated in order to make dynamic decisions concerning server allocation and admission control. Different utility functions are introduced, and a number of experiments testing their performance are discussed. Results show that revenues can be dramatically improved by imposing suitable conditions for accepting incoming traffic; the proposed system performs well under different traffic settings, and it successfully adapts to changes in the operating environment. Comment: 11 pages, 9 figures, http://www.wipo.int/pctdb/en/wo.jsp?WO=201002636
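    A minimal sketch of a revenue-per-unit-time utility of the kind described above, used here to decide whether admitting extra traffic pays off once SLA penalties are accounted for. The function names, the single service class, and all numeric values are assumptions for illustration, not SPIRE's actual utility functions or SLA terms.

```python
def expected_revenue_rate(throughput, p_sla_met, charge, penalty):
    """Expected revenue per unit time for one service class: completed jobs
    earn `charge` when the SLA is met and pay `penalty` when it is not."""
    return throughput * (p_sla_met * charge - (1.0 - p_sla_met) * penalty)

def admit_more_traffic(current_rate, extra_rate, p_sla_now, p_sla_after,
                       charge, penalty):
    """Admit extra traffic only if the predicted revenue rate increases,
    accounting for the drop in SLA compliance the extra load would cause."""
    now = expected_revenue_rate(current_rate, p_sla_now, charge, penalty)
    after = expected_revenue_rate(current_rate + extra_rate, p_sla_after,
                                  charge, penalty)
    return after > now

# Example: admitting 5 extra jobs/s lowers SLA compliance from 0.98 to 0.90.
print(admit_more_traffic(50.0, 5.0, 0.98, 0.90, charge=1.0, penalty=2.0))
```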

    Four Months in DailyMotion: Dissecting User Video Requests

    The growth of User-Generated Content (UGC) traffic makes understanding its nature a priority for network operators, content providers and equipment suppliers. In this paper, we study a four-month dataset that logs all video requests to DailyMotion made by a fixed subset of users. We were able to infer user sessions from the raw data, to propose a Markovian model of these sessions, and to study video popularity and its evolution over time. The presented results are a first step toward synthesizing artificial (but realistic) traffic that could be used in simulations or experimental testbeds.
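    To make the Markovian session model concrete, the sketch below walks a small Markov chain to generate one synthetic session. The state space and transition probabilities are hypothetical placeholders; the actual states and values would come from fitting the DailyMotion logs.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical session states and transition probabilities (rows sum to 1).
states = ["request_video", "request_related", "end_session"]
transition = np.array([
    [0.30, 0.50, 0.20],   # after a fresh video request
    [0.10, 0.60, 0.30],   # after requesting a related video
    [0.00, 0.00, 1.00],   # end of session is absorbing
])

def generate_session(max_steps=100):
    """Walk the Markov chain from an initial video request until the
    session ends, returning the sequence of visited states."""
    path, state = [], 0                      # start with a video request
    for _ in range(max_steps):
        path.append(states[state])
        if states[state] == "end_session":
            break
        state = rng.choice(len(states), p=transition[state])
    return path

print(generate_session())
```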