461 research outputs found

    A Framework for QoS-aware Execution of Workflows over the Cloud

    Full text link
    The Cloud Computing paradigm is providing system architects with a new powerful tool for building scalable applications. Clouds allow allocation of resources on a "pay-as-you-go" model, so that additional resources can be requested during peak loads and released after that. However, this flexibility asks for appropriate dynamic reconfiguration strategies. In this paper we describe SAVER (qoS-Aware workflows oVER the Cloud), a QoS-aware algorithm for executing workflows involving Web Services hosted in a Cloud environment. SAVER allows execution of arbitrary workflows subject to response time constraints. SAVER uses a passive monitor to identify workload fluctuations based on the observed system response time. The information collected by the monitor is used by a planner component to identify the minimum number of instances of each Web Service which should be allocated in order to satisfy the response time constraint. SAVER uses a simple Queueing Network (QN) model to identify the optimal resource allocation. Specifically, the QN model is used to identify bottlenecks, and predict the system performance as Cloud resources are allocated or released. The parameters used to evaluate the model are those collected by the monitor, which means that SAVER does not require any particular knowledge of the Web Services and workflows being executed. Our approach has been validated through numerical simulations, whose results are reported in this paper

    A Study of Very Short Intermittent DDoS Attacks on the Performance of Web Services in Clouds

    Get PDF
    Distributed Denial-of-Service (DDoS) attacks for web applications such as e-commerce are increasing in size, scale, and frequency. The emerging elastic cloud computing cannot defend against ever-evolving new types of DDoS attacks, since they exploit various newly discovered network or system vulnerabilities even in the cloud platform, bypassing not only the state-of-the-art defense mechanisms but also the elasticity mechanisms of cloud computing. In this dissertation, we focus on a new type of low-volume DDoS attack, Very Short Intermittent DDoS Attacks, which can hurt the performance of web applications deployed in the cloud via transiently saturating the critical bottleneck resource of the target systems by means of external attack HTTP requests outside the cloud or internal resource contention inside the cloud. We have explored external attacks by modeling the n-tier web applications with queuing network theory and implementing the attacking framework based-on feedback control theory. We have explored internal attacks by investigating and exploiting resource contention and performance interference to locate a target VM (virtual machine) and degrade its performance

    Maximum Likelihood Estimation of Closed Queueing Network Demands from Queue Length Data

    Get PDF
    Resource demand estimation is essential for the application of analyical models, such as queueing networks, to real-world systems. In this paper, we investigate maximum likelihood (ML) estimators for service demands in closed queueing networks with load-independent and load-dependent service times. Stemming from a characterization of necessary conditions for ML estimation, we propose new estimators that infer demands from queue-length measurements, which are inexpensive metrics to collect in real systems. One advantage of focusing on queue-length data compared to response times or utilizations is that confidence intervals can be rigorously derived from the equilibrium distribution of the queueing network model. Our estimators and their confidence intervals are validated against simulation and real system measurements for a multi-tier application

    Effective Resource and Workload Management in Data Centers

    Get PDF
    The increasing demand for storage, computation, and business continuity has driven the growth of data centers. Managing data centers efficiently is a difficult task because of the wide variety of datacenter applications, their ever-changing intensities, and the fact that application performance targets may differ widely. Server virtualization has been a game-changing technology for IT, providing the possibility to support multiple virtual machines (VMs) simultaneously. This dissertation focuses on how virtualization technologies can be utilized to develop new tools for maintaining high resource utilization, for achieving high application performance, and for reducing the cost of data center management.;For multi-tiered applications, bursty workload traffic can significantly deteriorate performance. This dissertation proposes an admission control algorithm AWAIT, for handling overloading conditions in multi-tier web services. AWAIT places on hold requests of accepted sessions and refuses to admit new sessions when the system is in a sudden workload surge. to meet the service-level objective, AWAIT serves the requests in the blocking queue with high priority. The size of the queue is dynamically determined according to the workload burstiness.;Many admission control policies are triggered by instantaneous measurements of system resource usage, e.g., CPU utilization. This dissertation first demonstrates that directly measuring virtual machine resource utilizations with standard tools cannot always lead to accurate estimates. A directed factor graph (DFG) model is defined to model the dependencies among multiple types of resources across physical and virtual layers.;Virtualized data centers always enable sharing of resources among hosted applications for achieving high resource utilization. However, it is difficult to satisfy application SLOs on a shared infrastructure, as application workloads patterns change over time. AppRM, an automated management system not only allocates right amount of resources to applications for their performance target but also adjusts to dynamic workloads using an adaptive model.;Server consolidation is one of the key applications of server virtualization. This dissertation proposes a VM consolidation mechanism, first by extending the fair load balancing scheme for multi-dimensional vector scheduling, and then by using a queueing network model to capture the service contentions for a particular virtual machine placement

    Optimization and Regulation of Performance for Computing Systems

    Get PDF
    The current demands of computing applications, the advent of technological advances related to hardware and software, the contractual relationship between users and cloud service providers and current ecological demands, require the re\ufb01nement of performance regulation on computing systems. Powerful mathematical tools such as control systems theory, discrete event systems (DES) and randomized algorithms (RAs) have o\ufb00ered improvements in e\ufb03ciency and performance in computer scenarios where the traditional approach has been the application of well founded common sense and heuristics. The comprehensive concept of computing systems is equally related to a microprocessor unit, a set of microprocessor units in a server, a set of servers interconnected in a data center or even a network of data centers forming a cloud of virtual resources. In this dissertation, we explore theoretical approaches in order to optimize and regulate performance measures in di\ufb00erent computing systems. In several cases, such as cloud services, this optimization would allow the fair negotiation of service level agreements (SLAs) between a user and a cloud service provider, that may be objectively measured for the bene\ufb01t of both negotiators. Although DES are known to be suitable for modeling computing systems, we still \ufb01nd that traditional control theory approaches, such as passivity analysis, may o\ufb00er solutions that are worth being explored. Moreover, as the size of the problem increases, so does its complexity. RAs o\ufb00er good alternatives to make decisions on the design of the solutions of such complex problems based on given values of con\ufb01dence and accuracy. In this dissertation, we propose the development of: a) a methodology to optimize performance on a many-core processor system, b) a methodology to optimize and regulate performance on a multitier server, c) some corrections to a previously proposed passivity analysis of a market-oriented cloud model, and d) a decentralized methodology to optimize cloud performance. In all the aforementioned systems, we are interested in developing optimization methods strongly supported on DES theory, speci\ufb01cally In\ufb01nitesimal Perturbation Analysis (IPA) and RAs based on sample complexity to guarantee that these computing systems will satisfy the required optimal performance on the average

    End-to-End Application Cloning for Distributed Cloud Microservices with Ditto

    Full text link
    We present Ditto, an automated framework for cloning end-to-end cloud applications, both monolithic and microservices, which captures I/O and network activity, as well as kernel operations, in addition to application logic. Ditto takes a hierarchical approach to application cloning, starting with capturing the dependency graph across distributed services, to recreating each tier's control/data flow, and finally generating system calls and assembly that mimics the individual applications. Ditto does not reveal the logic of the original application, facilitating publicly sharing clones of production services with hardware vendors, cloud providers, and the research community. We show that across a diverse set of single- and multi-tier applications, Ditto accurately captures their CPU and memory characteristics as well as their high-level performance metrics, is portable across platforms, and facilitates a wide range of system studies

    QMLE: a methodology for statistical inference of service demands from queueing data

    Get PDF
    Estimating the demands placed by services on physical resources is an essential step for the definition of performance models. For example, scalability analysis relies on these parameters to predict queueing delays under increasing loads. In this paper, we investigate maximum likelihood (ML) estimators for demands at load-independent and load-dependent resources in systems with parallelism constraints. We define a likelihood function based on state measurements and derive necessary conditions for its maximization. We then obtain novel estimators that accurately and inexpensively obtain service demands using only aggregate state data. With our approach, and also thanks to approximation methods for computing marginal and joint distributions for the load-dependent case, confidence intervals can be rigorously derived, explicitly taking into account both topology and concurrency levels of the services. Our estimators and their confidence intervals are validated against simulations and real system measurements for two multi-tier applications, showing high accuracy also in the presence of load-dependent resources
    • …
    corecore