
    Traffic Model and Performance Evaluation of Web Servers

    In this paper we present a new model of Web traffic and its applications in the performance evaluation of Web servers. We consider the typical behavior of a user's hypertext navigation within a Web server. We propose a traffic model at the session level, formulated as a stochastic marked point process, which describes when users arrive and how they browse the server. We provide results of statistical analyses and goodness-of-fit tests for various simple parametric distributions and for their mixtures. We developed a Web server benchmark, WAGON (Web trAffic GeneratOr and beNchmark), and validated the traffic model by comparing various characteristics of the synthetic traffic generated by WAGON against measurements. We then report benchmark and analysis results on the Apache server, currently the most widely used Web server software. We analyze the impact of the traffic parameters on the HTTP request arrival process and the packet arrival process. We also show that the aggregate traffic is self-similar in most cases and, more importantly, that the Hurst parameter increases with the traffic intensity. We further provide performance comparisons between HTTP1.0 and HTTP1.1 and show that HTTP1.1 can be much worse for users as well as for servers if some control parameters of the server and of browsers are set incorrectly. Indeed, when the server load is relatively high, the page response time under HTTP1.1 increases exponentially fast with the number of parallel persistent connections and with the timeout value used in Apache for persistent connections. We investigate the impact of user network conditions on server performance. We also propose a queueing model to analyze the workload that persistent connections impose on the Apache server, and we establish the optimal timeout value that minimizes this workload. Based on our analyses, we suggest the following practical guidelines. It is usually beneficial for both Web servers and Web clients to use HTTP1.1 instead of HTTP1.0. When HTTP1.1 is used, it should be used with pipelining. In terms of the management of persistent connections, it is useful for browsers to implement an Early Close policy, which combines the advantages of both HTTP1.0 and HTTP1.1. Browsers should in general avoid establishing multiple parallel persistent connections from one browser window to the same Web server, except for users with low-bandwidth network connections (such as modem links). On the server side, servers should set small timeout values for persistent connections if a fixed timeout control mechanism is used (as in Apache), or if a dynamic timeout control mechanism is used and the measured workload is high.
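    As a rough illustration of the session-level marked point process described in this abstract, the following Python sketch generates session start times as a Poisson process and marks each session with a browsing pattern (pages visited, think times, inlined objects). All rates, distributions, and names are illustrative placeholders, not the fitted parameters or the WAGON implementation from the paper.

    ```python
    import random

    def generate_sessions(duration_s, session_rate=0.5, mean_pages=6.0,
                          mean_think_s=10.0, mean_inline_objects=8.0):
        """Generate a synthetic session-level trace as a marked point process:
        session start times form a Poisson process, and each session is marked
        with a browsing pattern (number of pages, think times between page
        requests, inlined objects per page). Rates and distributions here are
        illustrative placeholders, not the fitted parameters of the paper."""
        requests = []                      # (time, session_id, page_idx, n_inline)
        t, session_id = 0.0, 0
        while True:
            t += random.expovariate(session_rate)          # next session arrival
            if t > duration_s:
                break
            session_id += 1
            n_pages = 1 + int(random.expovariate(1.0 / mean_pages))
            page_time = t
            for page_idx in range(n_pages):
                n_inline = 1 + int(random.expovariate(1.0 / mean_inline_objects))
                requests.append((page_time, session_id, page_idx, n_inline))
                page_time += random.expovariate(1.0 / mean_think_s)  # think time
        return sorted(requests)

    if __name__ == "__main__":
        trace = generate_sessions(duration_s=3600)
        sessions = {r[1] for r in trace}
        print(f"{len(trace)} page requests from {len(sessions)} sessions")
    ```

    A trace produced this way can then be replayed against a server to study the resulting HTTP request and packet arrival processes, which is the kind of experiment the abstract reports with WAGON.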

    Realistic Traffic Generation for Web Robots

    Realistic web traffic generators are critical to evaluating the capacity, scalability, and availability of web systems. Although web traffic generation is a classic research problem, no existing generator accounts for the characteristics of web robots or crawlers, which are now the dominant source of traffic to a web server. Administrators are thus unable to test, stress, and evaluate how their systems perform in the face of ever-increasing levels of web robot traffic. To resolve this problem, this paper introduces a novel approach to generate synthetic web robot traffic with high fidelity. It generates traffic that accounts for both the temporal and behavioral qualities of robot traffic using statistical and Bayesian models fitted to the properties of robot traffic seen in web logs from North America and Europe. We evaluate our traffic generator by comparing the characteristics of generated traffic to those of the original data. We look at session arrival rates, inter-arrival times, and session lengths, comparing and contrasting them between generated and real traffic. Finally, we show that our generated traffic affects cache performance similarly to actual traffic under the common LRU and LFU eviction policies.
    Comment: 8 pages
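    One way to reproduce the cache comparison mentioned in the abstract is to replay both a real log and a generated trace through the same cache and compare hit rates. The sketch below (plain Python, with hypothetical traces and cache size) does this for an LRU policy only; an LFU replay would be analogous.

    ```python
    from collections import OrderedDict

    def lru_hit_rate(requests, capacity):
        """Replay a sequence of requested resource IDs through an LRU cache of
        the given capacity and return the hit rate."""
        cache, hits = OrderedDict(), 0
        for resource in requests:
            if resource in cache:
                hits += 1
                cache.move_to_end(resource)        # mark as most recently used
            else:
                cache[resource] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)      # evict least recently used
        return hits / len(requests) if requests else 0.0

    # Hypothetical example: compare a real and a generated trace at one cache size.
    real_trace = ["/robots.txt", "/index.html", "/a", "/index.html", "/b", "/a"]
    synthetic_trace = ["/robots.txt", "/c", "/index.html", "/c", "/a", "/index.html"]
    print(lru_hit_rate(real_trace, 3), lru_hit_rate(synthetic_trace, 3))
    ```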

    Separation of timescales in a two-layered network

    We investigate a computer network consisting of two layers, occurring in, for example, application servers. The first layer incorporates the arrival of jobs at a network of multi-server nodes, which we model as a many-server Jackson network. At the second layer, the active servers at these nodes act as customers who are served by a common CPU. Our main result shows a separation of timescales in heavy traffic: the main source of randomness occurs at the (aggregate) CPU layer, while the interactions between different types of nodes at the other layer are shown to converge to a fixed point at a faster timescale; this also yields a state-space collapse property. Apart from these fundamental insights, we also obtain an explicit approximation for the joint law of the number of jobs in the system, which is provably accurate for heavily loaded systems and performs numerically well for moderately loaded systems. The results obtained for this model can be applied to thread-pool dimensioning in application servers, while the technique seems applicable to other layered systems too.
    Comment: 8 pages, 2 figures, 1 table, ITC 24 (2012)
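    To make the layered structure concrete, here is a crude time-stepped simulation of such a system: jobs arrive at two multi-server nodes, and every busy server then competes for a shared CPU under processor sharing. The discretization, parameters, and function names are illustrative only and are not taken from the paper or its heavy-traffic analysis.

    ```python
    import random

    def simulate_two_layer(T=2000.0, dt=0.01, arrival_rates=(4.0, 2.0),
                           servers=(10, 10), service_req=(1.0, 2.0), cpu_speed=10.0):
        """Time-stepped simulation of a two-layered system: layer 1 is a pair of
        multi-server nodes with Poisson arrivals; layer 2 is a common CPU shared
        equally (processor sharing) by all busy servers. Illustrative only."""
        queues = [0, 0]                              # jobs present at each node
        samples = []
        for _ in range(int(T / dt)):
            for i in range(2):                       # layer 1: Poisson arrivals
                if random.random() < arrival_rates[i] * dt:
                    queues[i] += 1
            busy = [min(queues[i], servers[i]) for i in range(2)]
            total_busy = sum(busy)
            if total_busy:
                share = cpu_speed / total_busy       # layer 2: CPU share per busy server
                for i in range(2):
                    for _ in range(busy[i]):         # each busy server may finish its job
                        if random.random() < (share / service_req[i]) * dt:
                            queues[i] -= 1
            samples.append(tuple(queues))
        return samples

    if __name__ == "__main__":
        data = simulate_two_layer()
        print([sum(s[i] for s in data) / len(data) for i in range(2)])
    ```

    With the placeholder values above the offered work (4·1 + 2·2 = 8) stays below the CPU speed (10), so the simulated system is stable but fairly heavily loaded, the regime in which the paper's approximation is claimed to be accurate.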

    Towards Autonomic Service Provisioning Systems

    This paper discusses our experience in building SPIRE, an autonomic system for service provisioning. The architecture consists of a set of hosted Web Services subject to QoS constraints and a certain number of servers used to run session-based traffic. Customers pay to have their jobs run, but require in turn certain quality guarantees: different SLAs specify charges for running jobs and penalties for failing to meet promised performance metrics. The system is driven by a utility function aimed at optimizing the average earned revenue per unit time. Demand and performance statistics are collected, and traffic parameters are estimated, in order to make dynamic decisions concerning server allocation and admission control. Different utility functions are introduced and a number of experiments testing their performance are discussed. Results show that revenues can be dramatically improved by imposing suitable conditions for accepting incoming traffic; the proposed system performs well under different traffic settings, and it successfully adapts to changes in the operating environment.
    Comment: 11 pages, 9 figures, http://www.wipo.int/pctdb/en/wo.jsp?WO=201002636
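    As a rough sketch of the kind of revenue-driven admission decision described above (not the actual SPIRE policy), one can compare the SLA charge against the expected penalty given the currently predicted response time. The exponential tail assumption and all parameter names below are hypothetical.

    ```python
    import math

    def expected_revenue(charge, penalty, deadline_s, predicted_response_s):
        """Expected revenue from admitting one job under a simple SLA: the
        customer pays `charge` if the response-time deadline is met and the
        provider pays `penalty` otherwise. Response times are assumed (purely
        for illustration) to be exponential with the predicted mean."""
        p_violate = math.exp(-deadline_s / predicted_response_s)
        return (1.0 - p_violate) * charge - p_violate * penalty

    def admit(charge, penalty, deadline_s, predicted_response_s):
        # Accept the job only if it is expected to increase revenue.
        return expected_revenue(charge, penalty, deadline_s, predicted_response_s) > 0.0

    # Example: a job paying 1.0 with a 5x penalty, 2 s deadline, 0.5 s predicted mean.
    print(admit(charge=1.0, penalty=5.0, deadline_s=2.0, predicted_response_s=0.5))
    ```

    In a full system the predicted response time would itself depend on the current server allocation, which is what couples admission control to the allocation decisions mentioned in the abstract.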

    Survey of End-to-End Mobile Network Measurement Testbeds, Tools, and Services

    Mobile (cellular) networks enable innovation, but can also stifle it and lead to user frustration when network performance falls below expectations. As mobile networks become the predominant method of Internet access, developer, research, network operator, and regulatory communities have taken an increased interest in measuring end-to-end mobile network performance to, among other goals, minimize negative impact on application responsiveness. In this survey we examine current approaches to end-to-end mobile network performance measurement, diagnosis, and application prototyping. We compare available tools and their shortcomings with respect to the needs of researchers, developers, regulators, and the public. We intend for this survey to provide a comprehensive view of currently active efforts and some auspicious directions for future work in mobile network measurement and mobile application performance evaluation.
    Comment: Submitted to IEEE Communications Surveys and Tutorials. arXiv does not format the URL references correctly. For a correctly formatted version of this paper go to http://www.cs.montana.edu/mwittie/publications/Goel14Survey.pd

    A self-adapting latency/power tradeoff model for replicated search engines

    For many search settings, distributed/replicated search engines deploy a large number of machines to ensure efficient retrieval. This paper investigates how the power consumption of a replicated search engine can be automatically reduced when the system has low contention, without compromising its efficiency. We propose a novel self-adapting model to analyse the trade-off between latency and power consumption for distributed search engines. When query volumes are high and there is contention for the resources, the model automatically increases the number of active machines in the system to maintain acceptable query response times. On the other hand, when the load of the system is low and the queries can be served easily, the model reduces the number of active machines, leading to power savings. The model bases its decisions on the current and historical query loads of the search engine. Our proposal is formulated as a general dynamic decision problem, which can be quickly solved by dynamic programming in response to changing query loads. Thorough experiments are conducted to validate the usefulness of the proposed adaptive model using historical Web search traffic submitted to a commercial search engine. Our results show that our proposed self-adapting model can achieve an energy saving of 33% while only degrading mean query completion time by 10 ms, compared to a baseline that provisions replicas based on the previous day's traffic.
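    The decision problem can be illustrated with a small dynamic program over discrete time slots: in each slot the controller picks a number of active machines, paying a power cost per machine, a latency penalty that grows with per-machine load, and a cost for switching machines on or off. The cost terms, parameters, and function names below are illustrative stand-ins, not the model actually used in the paper.

    ```python
    def slot_cost(load, m, power_cost, latency_penalty):
        # Power grows with the number of active machines; the latency penalty grows
        # sharply as per-machine load approaches capacity (an M/M/1-style 1/(1-rho) term).
        rho = min(load / m, 0.999)
        return power_cost * m + latency_penalty * rho / (1.0 - rho)

    def plan_active_machines(loads, max_machines, power_cost=1.0,
                             latency_penalty=5.0, switch_cost=2.0):
        """Choose the number of active machines per time slot by dynamic
        programming, trading power against latency and switching costs."""
        INF = float("inf")
        best = [0.0] * (max_machines + 1)     # best[k]: min cost so far with k active
        prev = []                             # per-slot backpointers
        for load in loads:
            new_best = [INF] * (max_machines + 1)
            choice = [0] * (max_machines + 1)
            for m in range(1, max_machines + 1):
                c = slot_cost(load, m, power_cost, latency_penalty)
                for k in range(1, max_machines + 1):
                    total = best[k] + c + switch_cost * abs(m - k)
                    if total < new_best[m]:
                        new_best[m], choice[m] = total, k
            prev.append(choice)
            best = new_best
        m = min(range(1, max_machines + 1), key=lambda i: best[i])
        plan = []
        for choice in reversed(prev):          # backtrack the cheapest plan
            plan.append(m)
            m = choice[m]
        return list(reversed(plan))

    # Example: hourly query loads expressed in machine-capacity units.
    print(plan_active_machines([2.0, 6.0, 9.0, 3.0], max_machines=12))
    ```

    The switching cost keeps the plan from oscillating between slots, which is the same intuition behind examining historical as well as current query loads before powering machines down.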