3,029 research outputs found

    Database server workload characterization in an e-commerce environment

    Get PDF
    A typical E-commerce system that is deployed on the Internet has multiple layers that include Web users, Web servers, application servers, and a database server. As the system use and user request frequency increase, Web/application servers can be scaled up by replication. A load balancing proxy can be used to route user requests to individual machines that perform the same functionality. To address the increasing workload while avoiding replicating the database server, various dynamic caching policies have been proposed to reduce the database workload in E-commerce systems. However, the nature of the changes seen by the database server as a result of dynamic caching remains unknown. A good understanding of this change is fundamental for tuning a database server to get better performance. In this study, the TPC-W (a transactional Web E-commerce benchmark) workloads on a database server are characterized under two different dynamic caching mechanisms, which are generalized and implemented as query-result cache and table cache. The characterization focuses on response time, CPU computation, buffer pool references, disk I/O references, and workload classification. This thesis combines a variety of analysis techniques: simulation, real time measurement and data mining. The experimental results in this thesis reveal some interesting effects that the dynamic caching has on the database server workload characteristics. The main observations include: (a) dynamic cache can considerably reduce the CPU usage of the database server and the number of database page references when it is heavily loaded; (b) dynamic cache can also reduce the database reference locality, but to a smaller degree than that reported in file servers. The data classification results in this thesis show that with dynamic cache, the database server sees TPC-W profiles more like on-line transaction processing workloads

    Inside Dropbox: Understanding Personal Cloud Storage Services

    Get PDF
    Personal cloud storage services are gaining popularity. With a rush of providers to enter the market and an increasing of- fer of cheap storage space, it is to be expected that cloud storage will soon generate a high amount of Internet traffic. Very little is known about the architecture and the perfor- mance of such systems, and the workload they have to face. This understanding is essential for designing efficient cloud storage systems and predicting their impact on the network. This paper presents a characterization of Dropbox, the leading solution in personal cloud storage in our datasets. By means of passive measurements, we analyze data from four vantage points in Europe, collected during 42 consecu- tive days. Our contributions are threefold: Firstly, we are the first to study Dropbox, which we show to be the most widely-used cloud storage system, already accounting for a volume equivalent to around one third of the YouTube traffic at campus networks on some days. Secondly, we characterize the workload typical users in different environments gener- ate to the system, highlighting how this reflects on network traffic. Lastly, our results show possible performance bot- tlenecks caused by both the current system architecture and the storage protocol. This is exacerbated for users connected far from control and storage data-center

    A Scalable Cluster-based Infrastructure for Edge-computing Services

    Get PDF
    In this paper we present a scalable and dynamic intermediary infrastruc- ture, SEcS (acronym of BScalable Edge computing Services’’), for developing and deploying advanced Edge computing services, by using a cluster of heterogeneous machines. Our goal is to address the challenges of the next-generation Internet services: scalability, high availability, fault-tolerance and robustness, as well as programmability and quick prototyping. The system is written in Java and is based on IBM’s Web Based Intermediaries (WBI) [71] developed at IBM Almaden Research Center

    Experimental Performance Evaluation of Cloud-Based Analytics-as-a-Service

    Full text link
    An increasing number of Analytics-as-a-Service solutions has recently seen the light, in the landscape of cloud-based services. These services allow flexible composition of compute and storage components, that create powerful data ingestion and processing pipelines. This work is a first attempt at an experimental evaluation of analytic application performance executed using a wide range of storage service configurations. We present an intuitive notion of data locality, that we use as a proxy to rank different service compositions in terms of expected performance. Through an empirical analysis, we dissect the performance achieved by analytic workloads and unveil problems due to the impedance mismatch that arise in some configurations. Our work paves the way to a better understanding of modern cloud-based analytic services and their performance, both for its end-users and their providers.Comment: Longer version of the paper in Submission at IEEE CLOUD'1

    Empirical study of error behavior in Web servers

    Get PDF
    The World Wide Web has been a huge success, bringing the Internet to widespread popularity. For Web based systems to deal effectively with increasing number of Web clients, it is very important to understand the basic fundamentals of Web workload and error characteristics. In this thesis we focus on detailed empirical analysis of Web server error characteristics and reliability based on the data extracted from eleven different web servers. First, we address the data collection process and describe the methods for extraction of workload and error data from Web logs. Then, we analyze the Web error characteristics which include unique errors, frequency of occurrence of unique errors and top files causing errors. Furthermore, we analyze the relationship between errors among Web workload and estimate request-based and session-based reliability. The discussion presented in this thesis shows the sessions-based reliability is better indicator of user perception of Web quality than request-based reliability. Finally, we analyze and develop heuristic search criteria to identify sessions which indicate unusual server behavior, such as extremely long sessions and sessions with large number of server errors. The results of our study provide valuable measures for tuning and maintaining of Web servers

    Internet performance modeling: the state of the art at the turn of the century

    Get PDF
    Seemingly overnight, the Internet has gone from an academic experiment to a worldwide information matrix. Along the way, computer scientists have come to realize that understanding the performance of the Internet is a remarkably challenging and subtle problem. This challenge is all the more important because of the increasingly significant role the Internet has come to play in society. To take stock of the field of Internet performance modeling, the authors organized a workshop at Schloß Dagstuhl. This paper summarizes the results of discussions, both plenary and in small groups, that took place during the four-day workshop. It identifies successes, points to areas where more work is needed, and poses “Grand Challenges” for the performance evaluation community with respect to the Internet

    Web workload analysis and session characterization using clustering

    Get PDF
    Web servers have a significant presence in today\u27s Internet. Corporations want to achieve high availability, scalability, and consistent performance for respective Web systems, maintaining high customer service standards. Web Workload characterization and the analysis of Web log files are the basis on which Web server modeling for efficiency, scalability and availability can be planned. This thesis analyzes the Web access logs of six public Web sites: Department of Computer Science and Electrical Engineering at West Virginia University, West Virginia University, three NASA IVV servers, and Clarknet server. In addition, three private NASA IVV servers are also analyzed.;We characterize sessions using several attributes such as number of request per session, session length in time units, number of bytes transferred per session, and number of erroneous requests per session. We use clustering, as unsupervised learning methods, to classify Web server sessions. Unlike most other studies which were focused on building user profiles based on their navigational patterns, we use session attributes as basis for clustering. We also study the effectiveness of the Principal Component Analysis on session classification based on clustering
    • 

    corecore