43 research outputs found

    Resource Utilization Prediction: A Proposal for Information Technology Research

    Research into predicting long-term resource needs has struggled to extend the period of accurate prediction beyond the immediate future. Business forecasting has overcome this limitation by incorporating human interaction as the basis of prediction patterns at hourly, daily, weekly, monthly, and yearly time frames. Computer resource utilization is also affected by human interaction, which motivates research into predicting resource usage from human access patterns. In a feasibility study, emulated human web server access data was captured and time series analysis was used to predict future resource usage. For predictions beyond several minutes, the results indicate that the majority of projected resource usage fell within an 80% confidence level, supporting the foundation for future resource prediction work in this area.

    Datacenter Traffic Control: Understanding Techniques and Trade-offs

    Datacenters provide cost-effective and flexible access to the scalable compute and storage resources necessary for today's cloud computing needs. A typical datacenter is made up of thousands of servers connected by a large network and is usually managed by one operator. To provide quality access to the variety of applications and services hosted on datacenters and to maximize performance, it is necessary to use datacenter networks effectively and efficiently. Datacenter traffic is often a mix of several classes with different priorities and requirements, including user-generated interactive traffic, traffic with deadlines, and long-running traffic. To this end, custom transport protocols and traffic management techniques have been developed to improve datacenter network performance. In this tutorial paper, we review the general architecture of datacenter networks, the various topologies proposed for them, their traffic properties, and the general challenges and objectives of traffic control in datacenters. The purpose of this paper is to bring out the important characteristics of traffic control in datacenters, not to survey all existing solutions (which is virtually impossible given the massive body of existing research). We hope to provide readers with a wide range of options and factors to consider when weighing a variety of traffic control mechanisms. We discuss various characteristics of datacenter traffic control, including management schemes, transmission control, traffic shaping, prioritization, load balancing, multipathing, and traffic scheduling. Next, we point to several open challenges as well as new and interesting networking paradigms. At the end of the paper, we briefly review inter-datacenter networks that connect geographically dispersed datacenters, which have been receiving increasing attention recently and pose interesting and novel research problems. Comment: Accepted for Publication in IEEE Communications Surveys and Tutorial

    Some topics in web performance analysis

    This thesis consists of four papers on web performance analysis. In the first paper we investigate the performance of overload control through queue length for two different web server architectures. The simulation results suggest that the benefit of request prioritization is noticeable only when the capacities of the sub-systems match each other. In the second paper we present an M/G/1/K*PS queueing model of a web server. We obtain closed-form expressions for web server performance metrics such as average response time, throughput, and blocking probability. The model is validated through real measurements. The third paper studies a queueing system with a load balancer and a pool of identical FCFS queues in parallel. By taking the number of servers to infinity, we show that the average waiting time for the system is not always minimized by routing each customer to the expected shortest queue when the information used for the decision is stale. In the last paper we consider the problem of admission control to an M/M/1 queue under periodic observations with an average cost criterion. The problem is formulated as a discrete-time Markov decision process whose states are fully observable. A proof of the existence of the average optimal policy by the vanishing discount approach is provided. We also show that the optimal policy is nonincreasing with respect to the observed number of customers in the system.
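The closed-form metrics mentioned for the second paper can be illustrated with a small sketch. The abstract does not give the thesis's own expressions, so this example assumes the standard insensitivity property of a finite-capacity processor-sharing queue: the stationary number of jobs in an M/G/1/K-PS queue depends on the service time only through its mean and coincides with that of an M/M/1/K queue. The function name and parameters are hypothetical.

```python
def mg1k_ps_metrics(arrival_rate, mean_service_time, capacity):
    """Return (blocking_probability, throughput, mean_response_time).

    Assumption (not taken from the abstract): the number of jobs in an
    M/G/1/K-PS queue has the same stationary distribution as in M/M/1/K,
    i.e. P(n) is proportional to rho**n for n = 0..K.
    """
    if arrival_rate <= 0 or mean_service_time <= 0 or capacity < 1:
        raise ValueError("rates and times must be positive, capacity >= 1")

    rho = arrival_rate * mean_service_time  # offered load

    # Unnormalized weights rho**n; normalizing this way also covers rho == 1.
    weights = [rho ** n for n in range(capacity + 1)]
    total = sum(weights)
    probs = [w / total for w in weights]

    # An arrival is blocked when it finds the system full (PASTA).
    blocking = probs[capacity]
    # Rate of accepted (and eventually completed) requests.
    throughput = arrival_rate * (1.0 - blocking)
    # Little's law applied to accepted requests.
    mean_jobs = sum(n * p for n, p in enumerate(probs))
    mean_response = mean_jobs / throughput
    return blocking, throughput, mean_response


if __name__ == "__main__":
    b, h, t = mg1k_ps_metrics(arrival_rate=8.0, mean_service_time=0.1, capacity=20)
    print(f"blocking={b:.4f} throughput={h:.3f} response={t:.3f}")
```

With a capacity of 1 the mean response time reduces to the mean service time, which is a quick sanity check on the formulas.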

    Heavy-Traffic Optimal Size- and State-Aware Dispatching

    Dispatching systems, where arriving jobs are immediately assigned to one of multiple queues, are ubiquitous in computer systems and service systems. A natural and practically relevant model is one in which each queue serves jobs in FCFS (First-Come First-Served) order. We consider the case where the dispatcher is size-aware, meaning it learns the size (i.e. service time) of each job as it arrives; and state-aware, meaning it always knows the amount of work (i.e. total remaining service time) at each queue. While size- and state-aware dispatching to FCFS queues has been extensively studied, little is known about optimal dispatching for the objective of minimizing mean delay. A major obstacle is that no nontrivial lower bound on mean delay is known, even in heavy traffic (i.e. the limit as load approaches capacity). This makes it difficult to prove that any given policy is optimal, or even heavy-traffic optimal. In this work, we propose the first size- and state-aware dispatching policy that provably minimizes mean delay in heavy traffic. Our policy, called CARD (Controlled Asymmetry Reduces Delay), keeps all but one of the queues short, then routes as few jobs as possible to the one long queue. We prove an upper bound on CARD's mean delay, and we prove the first nontrivial lower bound on the mean delay of any size- and state-aware dispatching policy. Both results apply to any number of servers. Our bounds match in heavy traffic, implying CARD's heavy-traffic optimality. In particular, CARD's heavy-traffic performance improves upon that of LWL (Least Work Left), SITA (Size Interval Task Assignment), and other policies from the literature whose heavy-traffic performance is known. Comment: ACM SIGMETRICS / IFIP Performance 202
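To make the size- and state-aware setting concrete, here is a minimal sketch of the LWL (Least Work Left) baseline named in the abstract. It is not CARD, whose thresholds and routing rules are not specified here. The function name, the `(arrival_time, size)` job format, and the unit service speed are assumptions made for illustration.

```python
def simulate_lwl(jobs, num_queues):
    """Dispatch jobs to FCFS queues with Least Work Left.

    jobs: list of (arrival_time, size) pairs sorted by arrival_time.
    Each queue is assumed to serve work at unit speed.
    Returns the response time (waiting plus service) of every job.
    """
    work = [0.0] * num_queues  # remaining work at each queue (state-aware)
    last_time = 0.0
    responses = []
    for arrival_time, size in jobs:
        # Drain every queue for the time elapsed since the last arrival.
        elapsed = arrival_time - last_time
        work = [max(0.0, w - elapsed) for w in work]
        last_time = arrival_time

        # LWL: pick the queue with the least remaining work
        # (ties go to the lowest index).
        target = min(range(num_queues), key=lambda i: work[i])

        # Under FCFS the job waits for the work ahead of it, then is served,
        # so its response time equals the queue's work after it joins.
        work[target] += size
        responses.append(work[target])
    return responses


if __name__ == "__main__":
    print(simulate_lwl([(0.0, 4.0), (0.0, 1.0), (1.0, 2.0)], num_queues=2))
```

Because the dispatcher knows each job's size on arrival, it can update the work vector exactly, which is what distinguishes this setting from dispatching on queue lengths alone.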

    Resource allocation in wireless access network : A queueing theoretic approach

    To meet their performance targets, future 5G networks need to greatly optimize the Radio Access Networks (RANs), which connect the end users to the core network. In this thesis, we develop mathematical models to study three aspects of the operation of the RAN in modern wireless systems. The models are analyzed using techniques borrowed mainly from queueing theory and stochastic control. Simulations are also used extensively to gain further insights. First, we provide a detailed Markov model of the random access process in LTE. From this, we observe that the bottleneck in the signaling channel causes congestion in the access when a large number of M2M devices attempt to enter the network. Then, in the context of the so-called Heterogeneous networks (HetNets), we suggest dynamic load balancing schemes that alleviate this congestion and reduce the overall access delay. We then use flow-level models for elastic data traffic to study the problem of coordinating the activities of neighboring base stations. We seek to minimize the flow-level delay when there are various classes of users. We classify the users based on their locations or, in dynamic TDD systems, on the direction of service the network is providing to them. Using interacting queues and different policies for operating such queues, we study the gain that dynamic policies can provide over static probabilistic policies. Our results show that simple dynamic policies can provide very good performance in the cases considered. Finally, we consider the problem of opportunistically scheduling the flows of users with time-varying channels, taking into account the size of the data they need to transfer. Using flow-level models in a system with homogeneous channels, we provide the optimal scheduling policy when there are no new job arrivals. We also suggest a method to implement such a policy in a time-slotted system.
With heterogeneous channels, the problem is intractable for the flow-level techniques. Therefore, we utilize the framework of restless multi-armed bandit (RMAB) problems, employing the so-called Whittle index approach. The Whittle index approach, by relaxing the scheduling constraints, makes the problem separable and thereby provides an exact solution to the modified problem. Our simulations suggest that when this solution is applied as a heuristic to the original problem, it gives good performance, even with dynamic job arrivals.

    Aggregate matrix-analytic techniques and their applications

    The complexity of computer systems affects the complexity of modeling techniques that can be used for their performance analysis. In this dissertation, we develop a set of techniques that are based on tractable analytic models and enable efficient performance analysis of computer systems. Our approach is three pronged: first, we propose new techniques to parameterize measurement data with Markovian-based stochastic processes that can be further used as input into queueing systems; second, we propose new methods to efficiently solve complex queueing models; and third, we use the proposed methods to evaluate the performance of clustered Web servers and propose new load balancing policies based on this analysis. We devise two new techniques for fitting measurement data that exhibit high variability into Phase-type (PH) distributions. These techniques apply known fitting algorithms in a divide-and-conquer fashion. We evaluate the accuracy of our methods from both the statistics and the queueing systems perspective. In addition, we propose a new methodology for fitting measurement data that exhibit long-range dependence into Markovian Arrival Processes (MAPs). We propose a new methodology, ETAQA, for the exact solution of M/G/1-type processes, GI/M/1-type processes, and their intersection, i.e., quasi birth-death (QBD) processes. ETAQA computes an aggregate steady state probability distribution and a set of measures of interest. ETAQA is numerically stable and computationally superior to alternative solution methods. Apart from ETAQA, we propose a new methodology for the exact solution of a class of GI/G/1-type processes based on aggregation/decomposition. Finally, we demonstrate the applicability of the proposed techniques by evaluating load balancing policies in clustered Web servers.
We address the high variability in the service process of Web servers by dedicating the servers of a cluster to requests of similar sizes and propose new, content-aware load balancing policies. Detailed analysis shows that the proposed policies achieve high user-perceived performance and, by continuously adapting their scheduling parameters to the current workload characteristics, provide good performance under conditions of transient overload.
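The idea of dedicating servers to requests of similar sizes can be sketched as a size-interval assignment. This is only an illustration with fixed, hypothetical cutoffs; the dissertation's policies adapt their parameters to the workload, which is not reproduced here. The function names and the byte-valued cutoffs are assumptions.

```python
from bisect import bisect_left


def server_for_request(size, cutoffs):
    """Map a request size to a server index using sorted size cutoffs.

    With cutoffs [c0, c1, ...], server 0 takes sizes <= c0, server 1 takes
    sizes in (c0, c1], and the last server takes everything above the
    largest cutoff. So len(cutoffs) + 1 servers are assumed.
    """
    return bisect_left(cutoffs, size)


def partition_requests(sizes, cutoffs):
    """Group request sizes by the server that would handle them."""
    buckets = [[] for _ in range(len(cutoffs) + 1)]
    for size in sizes:
        buckets[server_for_request(size, cutoffs)].append(size)
    return buckets


if __name__ == "__main__":
    # Hypothetical cutoffs in bytes: small, medium, and large requests.
    cutoffs = [10_000, 1_000_000]
    print(partition_requests([500, 10_000, 20_000, 5_000_000], cutoffs))
```

Keeping small requests away from servers that handle very large ones is what limits the effect of highly variable service times on user-perceived delay.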

    Resource Allocation for Cellular/WLAN Integrated Networks

    Next-generation wireless communications are envisioned to be supported by heterogeneous networks using various wireless access technologies. The popular cellular networks and wireless local area networks (WLANs) have complementary characteristics in terms of service capacity, mobility support, and quality-of-service (QoS) provisioning. Cellular/WLAN interworking is thus an effective way to promote the evolution of wireless networks. As an essential aspect of the interworking, resource allocation is vital for efficient utilization of the overall resources. Specifically, multi-service provisioning can be enhanced with cellular/WLAN interworking by taking advantage of the complementary network strengths and an overlay structure. Call assignment/reassignment strategies and admission control policies are effective resource allocation mechanisms for the cellular/WLAN integrated network. Initially, incoming calls are distributed to the overlay cell or WLAN according to call assignment strategies, which are enhanced with admission control policies in the target network. Further, call reassignment can be enabled to dynamically transfer traffic load between the overlay cell and WLAN via vertical handoff. By these means, the multi-service traffic load can be properly shared between the interworked systems. In this thesis, we investigate the load sharing problem for this heterogeneous wireless overlay network. Three load sharing schemes with different call assignment/reassignment strategies and admission control policies are proposed and analyzed. Effective analytical models are developed to evaluate the QoS performance and determine the call admission and assignment parameters. First, an admission control scheme with service-differentiated call assignment is studied to gain insight into the effects of load sharing on interworking effectiveness.
Then, the admission scheme is extended by using randomized call assignment to enable distributed implementation. We also analyze the impact of user mobility and data traffic variability. Further, an enhanced call assignment strategy is developed to exploit the heavy-tailedness of data call size. Last, the study is extended to a multi-service scenario. The overall resource utilization and QoS satisfaction are improved substantially by taking into account the multi-service traffic characteristics, such as the delay-sensitivity of voice traffic, the elasticity and heavy-tailedness of data traffic, and the rate-adaptiveness of video streaming traffic.