279 research outputs found

    Characterization of Web server workload

    A realistic and formal mathematical description of web-server workload is a fundamental step in the design of synthetic workload generators, capacity planning, and accurate prediction of performance measures. In this thesis we perform a detailed empirical analysis of web workload by analyzing the access logs of nine web servers. Unlike most previous work, which focused on request-based workload characterization, we analyze both request and session characteristics. We perform rigorous statistical analysis to determine the self-similarity of web traffic and the heavy-tailedness of the distributions of different session parameters. Our analysis shows that web traffic is self-similar and that the degree of self-similarity is proportional to the workload intensity. To increase confidence in our analysis, we use several methods for estimating the degree of self-similarity and heavy-tailedness. Additionally, we point out specific problems associated with these methods. Finally, we analyze the impact of robot sessions on the heavy-tailedness of the distribution
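The abstract does not name the estimators it uses. As an illustration only, the sketch below shows one standard way to estimate the degree of self-similarity (the Hurst parameter H): the aggregated-variance, or variance-time, method. The function names and block sizes are hypothetical choices for this example, not taken from the thesis.

```python
import math
import random


def aggregate(series, m):
    """Average the series over non-overlapping blocks of size m."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def variance(xs):
    """Population variance of a list of numbers."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def hurst_variance_time(series, block_sizes):
    """Estimate H from the slope of log Var(X^(m)) against log m.

    For a self-similar process Var(X^(m)) scales like m^(2H - 2),
    so the least-squares slope gives H = 1 + slope / 2.
    """
    xs = [math.log(m) for m in block_sizes]
    ys = [math.log(variance(aggregate(series, m))) for m in block_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return 1 + slope / 2


if __name__ == "__main__":
    # Assumption: independent Gaussian noise stands in for a request-count
    # series. It has no long-range dependence, so H should be near 0.5.
    rng = random.Random(0)
    noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
    print(hurst_variance_time(noise, [1, 2, 4, 8, 16, 32, 64]))
```

Values of H well above 0.5 would suggest self-similar traffic. A single estimator can be misleading, which is consistent with the abstract's point about using several methods.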

    Identifying Long-range Dependent Network Traffic through Autocorrelation Functions

    For over a decade, researchers have been reporting the impact of self-similar, long-range dependent network traffic. Long-range dependence (LRD) is of great significance in traffic engineering problems such as measurement, queuing strategy, buffer sizing, and admission and congestion control. In this research, to determine the existence of LRD, we apply three robust versions of the autocorrelation function (ACF), namely the weighted ACF (WACF), the trimmed ACF (TACF), and the variance-ratio of differences and sums, known as the D/S variance estimator (DACF), in conjunction with the moment-based sample ACF, which we refer to as MACF. In telecommunications, LRD traffic means that a similar pattern of traffic persists over a long span of time. Through the ACF, it is possible to detect how long such a pattern lasts. The aim of this research is to investigate the performance of the ACF in identifying the existence of LRD traffic.
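As a minimal sketch of the moment-based sample ACF (the MACF in the abstract's terminology), the function below computes the standard estimator from a plain list of numbers. The robust variants (WACF, TACF, DACF) are not reproduced because the abstract does not give their definitions. The function name is a hypothetical choice for this example.

```python
def sample_acf(series, max_lag):
    """Moment-based sample autocorrelation for lags 0..max_lag.

    Uses the usual estimator: the lag-k autocovariance divided by the
    lag-0 autocovariance, both computed around the sample mean.
    """
    n = len(series)
    mean = sum(series) / n
    c0 = sum((x - mean) ** 2 for x in series)
    acf = []
    for k in range(max_lag + 1):
        ck = sum((series[t] - mean) * (series[t + k] - mean)
                 for t in range(n - k))
        acf.append(ck / c0)
    return acf


if __name__ == "__main__":
    # Assumption: a toy alternating series, chosen only because its
    # autocorrelation is easy to check by hand.
    print(sample_acf([1, -1] * 50, 3))
```

For LRD traffic the sample ACF decays slowly with the lag, whereas for short-range dependent traffic it drops toward zero quickly.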

    Workload characterization, modeling, and prediction in grid Computing

    Workloads play an important role in experimental performance studies of computer systems. This thesis presents a comprehensive characterization of real workloads on production clusters and Grids. A variety of correlation structures and rich scaling behavior are identified in workload attributes such as job arrivals and run times, including pseudo-periodicity, long-range dependence, and strong temporal locality. Based on the analytic results, workload models are developed to fit the real data. For job arrivals, three different kinds of autocorrelations are investigated. For short- to middle-range dependent data, Markov modulated Poisson processes (MMPP) are good models because they can capture correlations between interarrival times while remaining analytically tractable. For long-range dependent and multifractal processes, the multifractal wavelet model (MWM) is able to reconstruct the scaling behavior, and it provides a coherent wavelet framework for analysis and synthesis. Pseudo-periodicity is a special kind of autocorrelation, and it can be modeled by a matching pursuit approach. For workload attributes such as run time, a new model is proposed that can fit not only the marginal distribution but also second-order statistics such as the autocorrelation function (ACF). The development of workload models enables simulation studies of Grid scheduling strategies. Using the synthetic traces, the performance impact of workload correlations on Grid scheduling is quantitatively evaluated. The results indicate that autocorrelations in workload attributes can cause performance degradation; in some situations the difference can be up to several orders of magnitude. The larger the autocorrelation, the worse the performance, and this is demonstrated at both the cluster and Grid level. This study shows the importance of realistic workload models in performance evaluation studies.
    Regarding performance predictions, this thesis treats the targeted resources as a "black box" and takes a statistical approach. It is shown that statistical-learning-based methods, after a well-thought-out and fine-tuned design, are able to deliver good accuracy and performance.
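To illustrate how an MMPP produces correlated interarrival times, here is a minimal two-state simulation sketch. It is not the thesis's fitted model: the two-state structure, the rates, and the function name are all assumptions chosen for this example.

```python
import random


def simulate_mmpp2(rates, switch_rates, horizon, rng):
    """Simulate arrival times of a two-state MMPP on [0, horizon).

    rates[i]        : Poisson arrival rate while in state i (assumed values)
    switch_rates[i] : rate of leaving state i for the other state
    In each state the next event is the earlier of two competing
    exponential clocks, so the time to the next event is exponential
    with the summed rate, and it is an arrival with probability
    rates[i] / (rates[i] + switch_rates[i]).
    """
    t = 0.0
    state = 0
    arrivals = []
    while True:
        total = rates[state] + switch_rates[state]
        t += rng.expovariate(total)
        if t >= horizon:
            return arrivals
        if rng.random() < rates[state] / total:
            arrivals.append(t)   # job arrival in the current state
        else:
            state = 1 - state    # modulating chain switches state


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical parameters: a busy state (10 jobs per time unit) and a
    # quiet state (1 job per time unit), each lasting 10 time units on average.
    arrivals = simulate_mmpp2((10.0, 1.0), (0.1, 0.1), 1000.0, rng)
    print(len(arrivals))
```

Because consecutive arrivals tend to fall in the same state, short gaps cluster with short gaps and long gaps with long ones, which is the kind of interarrival correlation a plain Poisson process cannot reproduce.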

    SLA Translation in Multi-Layered Service Oriented Architectures: Status and Challenges


    Workload Modeling for Computer Systems Performance Evaluation


    Application of the Empirical Mode Decomposition On the Characterization and Forecasting of the Arrival Data of an Enterprise Cluster

    Characterization and forecasting are two important processes in capacity planning. While they are closely related, their approaches have been different. In this research, a decomposition method called Empirical Mode Decomposition (EMD) is applied as a preprocessing tool to provide a common input for both the characterization and the forecasting of the job arrivals of an enterprise cluster. Based on the facts that an enterprise cluster follows a standard preset working schedule and that EMD can extract hidden patterns within a data stream, we have developed a set of procedures that preprocess the data for characterization as well as for forecasting. This comprehensive empirical study demonstrates that adding the preprocessing step improves on the standard approaches in both characterization and forecasting. In addition, it is shown that EMD is better than the popular wavelet-based decomposition in terms of extracting different patterns from within a data stream.

    Highly-cited papers in software engineering: The top 100

    Context: According to the search reported in this paper, as of this writing (May 2015), a very large number of papers (more than 70,000) have been published in the area of Software Engineering (SE) since its inception in 1968. Citations are crucial in any research area to position the work and to build on the work of others. Identification and characterization of highly-cited papers are common and are regularly reported in various disciplines. Objective: The objective of this study is to identify the papers in the area of SE that have influenced others the most, as measured by citation count. Studying highly-cited SE papers helps researchers see the types of approaches and research methods presented and applied in such papers, so that they can learn from them to write higher-quality papers that are likely to receive high citations. Method: To achieve the above objective, we conducted a study, comprising five research questions, to identify and classify the top-100 highly-cited SE papers in terms of two metrics: total number of citations and average annual number of citations. Results: By total number of citations, the top paper is "A metrics suite for object-oriented design", cited 1,817 times and published in 1994. By average annual number of citations, the top paper is "QoS-aware middleware for Web services composition", cited 154.2 times on average annually and published in 2004. Conclusion: It is concluded that it is important to identify the highly-cited SE papers and also to characterize the overall citation landscape in the SE field. We hope that this paper will encourage further discussions in the SE community towards further analysis and formal characterization of the highly-cited SE papers. Vahid Garousi was partially supported by several internal grants provided by Hacettepe University. The authors would like to thank the anonymous reviewers for their insightful comments.
