Traffic measurement and analysis
Measurement and analysis of real traffic is important to gain knowledge
about its characteristics. Without measurement, it is impossible to build
realistic traffic models. Only recently was data traffic found to have
self-similar properties. In this thesis, traffic captured on the network
at SICS and on the Supernet is shown to
have this fractal-like behaviour. The traffic is also examined with
respect to which protocols and packet sizes are present and in what
proportions. In the SICS trace most packets are small; TCP is shown to be
the predominant transport protocol and NNTP the most common application.
In contrast, large UDP packets sent between ports outside the well-known
range dominate the Supernet traffic. Finally, characteristics of the client
side of the WWW traffic are examined more closely. In order to extract
useful information from the packet trace, web browsers' use of TCP and HTTP
is investigated, including new features in HTTP/1.1 such as persistent
connections and pipelining. Empirical probability distributions are
derived describing session lengths, time between user clicks and the
amount of data transferred due to a single user click. These probability
distributions make up a simple model of WWW sessions.
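To make the session model concrete, the following sketch resamples such
empirical distributions to generate a synthetic WWW session. The sample
values and the sample_session helper are hypothetical placeholders, not the
thesis's fitted data.

```python
import random

# Hypothetical empirical samples; real values would come from the
# distributions fitted in the thesis.
think_times_s = [1.2, 3.5, 8.0, 15.0, 42.0]        # time between user clicks
bytes_per_click = [4_000, 12_000, 55_000, 300_000]  # data moved per click
session_lengths = [3, 5, 8, 12, 20]                 # clicks per session

def sample_session(rng=random):
    """Draw one synthetic WWW session by resampling the empirical data."""
    n_clicks = rng.choice(session_lengths)
    t = 0.0
    events = []
    for _ in range(n_clicks):
        events.append((t, rng.choice(bytes_per_click)))
        t += rng.choice(think_times_s)
    return events  # list of (click time in seconds, bytes transferred)

print(sample_session())
```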
Systems with Session-based Workloads: Assessing Performance and Reliability
Many systems, including the Web and Software as a Service (SaaS), are best characterized by session-based workloads. Empirical studies have shown that Web session arrivals exhibit long-range dependence and that the number of requests in a session is well modeled by skewed or heavy-tailed distributions. However, models that account for session workloads with these empirically observed phenomena, and studies of their impact on performance and reliability metrics, are lacking.

For assessing performance, we use a feedback queue to account for session-based workloads in a physically meaningful way and use simulation to analyze the behavior of the Web system under a Long Range Dependent (LRD) session arrival process and a skewed distribution for the number of requests in a session. Our results show that the percentage of dropped sessions, mean queue length, mean waiting time, and useful server utilization are all affected by the LRD session arrivals and the statistics of the number of requests within a session; the impact is higher when the long-range dependence is more prominent. Interestingly, both the request arrival process and the request departure process are long-range dependent even when session arrivals are Poisson, which indicates that LRD at the request level can result from the existence of sessions.

For assessing reliability, we propose a framework that integrates (1) Web workloads defined in terms of user sessions, (2) user navigation patterns through the Web site, and (3) reliability estimates of Web requests based on the system architecture; we then give a detailed reliability model of a Web system based on this framework. We recognize the difficulty of solving the proposed model and use simulation to obtain results. Finally, we use statistical design of experiments to quantify the results and to determine which factors have the highest impact on the system's reliability. Our results show that some two-way and three-way interactions are very important for the session reliability of Web systems.
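As a minimal illustration of the performance setup, the sketch below
generates Poisson session arrivals where each session issues a heavy-tailed
(Pareto) number of requests; all parameters (session_rate, think_mean,
alpha) are hypothetical. Estimating the resulting request-level long-range
dependence, e.g. via the Hurst parameter, is omitted here.

```python
import math
import random

def pareto_requests(alpha=1.5, xmin=1):
    """Heavy-tailed number of requests in a session (Pareto, rounded up)."""
    return max(1, math.ceil(xmin / random.random() ** (1 / alpha)))

def simulate(horizon=10_000.0, session_rate=1.0, think_mean=2.0):
    """Poisson session arrivals; each session issues Pareto-many requests."""
    request_times = []
    t = 0.0
    while t < horizon:
        t += random.expovariate(session_rate)        # next session arrival
        r = t
        for _ in range(pareto_requests()):
            request_times.append(r)
            r += random.expovariate(1 / think_mean)  # think time in session
    return sorted(request_times)

reqs = simulate()
print(len(reqs), "requests generated from Poisson session arrivals")
```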
Catalog Dynamics: Impact of Content Publishing and Perishing on the Performance of a LRU Cache
The Internet heavily relies on Content Distribution Networks and transparent
caches to cope with the ever-increasing traffic demand of users. Content,
however, is essentially volatile: once published at a given time, its
popularity vanishes over time. All requests for a given document are then
concentrated between the publishing time and an effective perishing time.
In this paper, we propose a new model for the arrival of content requests,
which takes into account the dynamical nature of the content catalog. Based on
two large traffic traces collected on the Orange network, we use the
semi-experimental method and determine invariants of the content request
process. This allows us to define a simple mathematical model for content
requests; by extending the so-called "Che approximation", we then compute the
performance of an LRU cache fed with such a request process, expressed by its
hit ratio. We numerically validate the good accuracy of our model by comparison
to trace-based simulation. Comment: 13 pages, 9 figures. Full version of the
article submitted to the ITC 2014 conference. Small corrections in the
appendix from the previous version.
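The classical Che approximation the paper extends can be sketched as
follows for a static catalog under the independent reference model: solve
sum_i (1 - exp(-lambda_i T)) = C for the characteristic time T, then take
h_i = 1 - exp(-lambda_i T) as the hit probability of document i. The Zipf
popularity profile below is an illustrative assumption; the paper's
extension to a dynamic catalog is not reproduced here.

```python
import math

def che_hit_ratio(rates, cache_size):
    """LRU hit ratio under IRM via the Che approximation."""
    occupancy = lambda T: sum(1 - math.exp(-lam * T) for lam in rates)
    # Bracket the characteristic time T, then bisect occupancy(T) = C.
    lo, hi = 0.0, 1.0
    while occupancy(hi) < cache_size:
        hi *= 2
    for _ in range(100):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if occupancy(mid) < cache_size else (lo, mid)
    T = (lo + hi) / 2
    # Overall hit ratio: request-rate-weighted per-document hit probability.
    hits = sum(lam * (1 - math.exp(-lam * T)) for lam in rates)
    return hits / sum(rates)

# Illustrative Zipf(0.8) popularity over 10,000 documents, cache holding 100.
rates = [1 / (i + 1) ** 0.8 for i in range(10_000)]
print(f"hit ratio = {che_hit_ratio(rates, 100):.3f}")
```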
Realistic Traffic Generation for Web Robots
Critical to evaluating the capacity, scalability, and availability of web
systems are realistic web traffic generators. Although web traffic generation
is a classic research problem, no existing generator accounts for the
characteristics of web robots or crawlers, which are now the dominant source
of traffic to a web server.
Administrators are thus unable to test, stress, and evaluate how their systems
perform in the face of ever increasing levels of web robot traffic. To resolve
this problem, this paper introduces a novel approach to generate synthetic web
robot traffic with high fidelity. It generates traffic that accounts for both
the temporal and behavioral qualities of robot traffic using statistical and
Bayesian models fitted to the properties of robot traffic seen in web
logs from North America and Europe. We evaluate our traffic generator by
comparing the characteristics of generated traffic to those of the original
data. We look at session arrival rates, inter-arrival times and session
lengths, comparing and contrasting them between generated and real traffic.
Finally, we show that our generated traffic affects cache performance similarly
to actual traffic, using the common LRU and LFU eviction policies. Comment: 8 pages.
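The cache-replay step of such an evaluation can be sketched as follows:
feed the generated and the real request traces through small LRU and LFU
simulators and compare hit counts. This is an illustrative
reimplementation, not the authors' code; the LFU variant here counts
frequencies over the whole trace.

```python
from collections import Counter, OrderedDict

def lru_hits(trace, capacity):
    """Replay a request trace through an LRU cache; return the hit count."""
    cache, hits = OrderedDict(), 0
    for doc in trace:
        if doc in cache:
            hits += 1
            cache.move_to_end(doc)          # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)   # evict least recently used
            cache[doc] = True
    return hits

def lfu_hits(trace, capacity):
    """Same replay with LFU eviction (least frequently used)."""
    cache, freq, hits = set(), Counter(), 0
    for doc in trace:
        freq[doc] += 1
        if doc in cache:
            hits += 1
        else:
            if len(cache) >= capacity:
                cache.remove(min(cache, key=freq.__getitem__))
            cache.add(doc)
    return hits

# Usage: compare policies on a synthetic and a real trace of document IDs.
trace = [1, 2, 1, 3, 1, 2, 4, 1, 5, 2]
print(lru_hits(trace, 3), lfu_hits(trace, 3))
```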
AWAIT: Efficient Overload Management for Busy Multi-tier Web Services under Bursty Workloads
The problem of service differentiation and admission control in web services that use a multi-tier architecture is more challenging than in a single-tiered one, especially under bursty conditions, i.e., when arrivals of user web sessions are characterized by temporal surges in their intensities and demands. We demonstrate that classic session-based admission control techniques triggered by threshold violations are ineffective under bursty workloads: user-perceived performance metrics rapidly and dramatically deteriorate, inadvertently leading the system to reject requests from already accepted user sessions and resulting in business loss. As a solution for service differentiation of accepted user sessions, we promote a methodology based on blocking: when the system operates in overload, requests from accepted sessions are not rejected but are instead stored in a blocking queue that effectively acts as a waiting room. The requests in the blocking queue implicitly become of higher priority and are served immediately after the load subsides. Residence in the blocking queue comes at a performance cost, as blocking time adds to the perceived end-to-end user response time. We present a novel autonomic session-based admission control policy, called AWAIT, that adaptively adjusts the capacity of the blocking queue as a function of workload burstiness in order to meet predefined user service level objectives while keeping the portion of aborted accepted sessions to a minimum. Detailed simulations under different workload burstiness profiles strongly argue for the effectiveness of AWAIT.
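A minimal sketch of the blocking idea follows, assuming a fixed-capacity
waiting room; the actual AWAIT policy adaptively adjusts this capacity from
measured burstiness, which is not modeled here. All names and parameters
are hypothetical.

```python
import collections

class BlockingAdmission:
    """Session-based admission with a blocking 'waiting room'.

    Under overload, requests from already-admitted sessions are parked in a
    bounded blocking queue instead of being rejected; brand-new sessions
    are refused until load subsides.
    """
    def __init__(self, servers, blocking_capacity):
        self.busy = 0
        self.servers = servers
        self.block = collections.deque(maxlen=blocking_capacity)
        self.admitted = set()

    def on_request(self, session_id):
        if self.busy < self.servers:
            self.busy += 1
            self.admitted.add(session_id)
            return "serve"
        if session_id in self.admitted and len(self.block) < self.block.maxlen:
            self.block.append(session_id)   # park; served when load subsides
            return "block"
        return "reject"                     # new session during overload

    def on_departure(self):
        if self.block:
            self.block.popleft()            # a blocked request takes the slot
        else:
            self.busy -= 1
```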
Towards Autonomic Service Provisioning Systems
This paper discusses our experience in building SPIRE, an autonomic system
for service provision. The architecture consists of a set of hosted Web
Services subject to QoS constraints, and a certain number of servers used to
run session-based traffic. Customers pay for having their jobs run, but require
in turn certain quality guarantees: there are different SLAs specifying charges
for running jobs and penalties for failing to meet promised performance
metrics. The system is driven by a utility function, aiming at optimizing the
average earned revenue per unit time. Demand and performance statistics are
collected, while traffic parameters are estimated in order to make dynamic
decisions concerning server allocation and admission control. Different utility
functions are introduced and a number of experiments aiming at testing their
performance are discussed. Results show that revenues can be dramatically
improved by imposing suitable conditions for accepting incoming traffic; the
proposed system performs well under different traffic settings, and it
successfully adapts to changes in the operating environment. Comment: 11 pages, 9 figures.
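The revenue trade-off driving such a utility function can be illustrated
with a toy admission rule: accept a job only if its expected earnings,
charge times the probability of meeting the SLA minus penalty times the
probability of missing it, are positive. The numbers below are
hypothetical, and SPIRE's actual utility functions are more elaborate.

```python
def expected_revenue(charge, penalty, p_meet_sla):
    """Expected earnings for one job under a charge/penalty SLA."""
    return charge * p_meet_sla - penalty * (1 - p_meet_sla)

def admit(charge, penalty, p_meet_sla_if_accepted):
    """Accept a job only if it is expected to earn money."""
    return expected_revenue(charge, penalty, p_meet_sla_if_accepted) > 0

# With charge 10 and penalty 25, accept only if P(meet SLA) > 25/35 = 0.714.
print(admit(10, 25, 0.8))   # True
print(admit(10, 25, 0.6))   # False
```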
Four Months in DailyMotion: Dissecting User Video Requests
The growth of User-Generated Content (UGC) traffic makes understanding its nature a priority for network operators, content providers, and equipment suppliers. In this paper, we study a four-month dataset that logs all video requests to DailyMotion made by a fixed subset of users. We were able to infer user sessions from the raw data, to propose a Markovian model of these sessions, and to study video popularity and its evolution over time. The presented results are a first step towards synthesizing artificial (but realistic) traffic that could be used in simulations or experimental testbeds.
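A Markovian session model of this kind can be sketched as a small absorbing
chain; the states and transition probabilities below are hypothetical
placeholders, not the values fitted to the DailyMotion logs.

```python
import random

# Hypothetical session states and transition probabilities; the paper fits
# a Markov model of this shape to real video-request logs.
TRANSITIONS = {
    "start":  [("watch", 1.0)],
    "watch":  [("watch", 0.55), ("browse", 0.25), ("end", 0.20)],
    "browse": [("watch", 0.60), ("browse", 0.15), ("end", 0.25)],
}

def simulate_session(rng=random):
    """Walk the chain from 'start' until absorption in 'end'."""
    state, path = "start", []
    while state != "end":
        states, probs = zip(*TRANSITIONS[state])
        state = rng.choices(states, weights=probs)[0]
        if state != "end":
            path.append(state)
    return path   # sequence of watch/browse steps in one session

print(simulate_session())
```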