461 research outputs found
A Framework for QoS-aware Execution of Workflows over the Cloud
The Cloud Computing paradigm is providing system architects with a new
powerful tool for building scalable applications. Clouds allow allocation of
resources on a "pay-as-you-go" model, so that additional resources can be
requested during peak loads and released afterwards. However, this flexibility
calls for appropriate dynamic reconfiguration strategies. In this paper we
describe SAVER (qoS-Aware workflows oVER the Cloud), a QoS-aware algorithm for
executing workflows involving Web Services hosted in a Cloud environment. SAVER
allows execution of arbitrary workflows subject to response time constraints.
SAVER uses a passive monitor to identify workload fluctuations based on the
observed system response time. The information collected by the monitor is used
by a planner component to identify the minimum number of instances of each Web
Service which should be allocated in order to satisfy the response time
constraint. SAVER uses a simple Queueing Network (QN) model to identify the
optimal resource allocation. Specifically, the QN model is used to identify
bottlenecks, and predict the system performance as Cloud resources are
allocated or released. The parameters used to evaluate the model are those
collected by the monitor, which means that SAVER does not require any
particular knowledge of the Web Services and workflows being executed. Our
approach has been validated through numerical simulations, whose results are
reported in this paper.
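The planning step described in this abstract can be illustrated with a minimal sketch. It is not SAVER's actual algorithm: it assumes an open network in which each Web Service instance behaves like an M/M/1 queue, load is split evenly across instances, and a greedy rule adds one instance to the current bottleneck until the response time constraint holds. The names `response_time` and `plan_instances`, and the `max_total` cap, are hypothetical.

```python
def response_time(arrival_rate, demands, instances):
    """Predicted end-to-end response time (assumed M/M/1 per instance)."""
    total = 0.0
    for demand, count in zip(demands, instances):
        utilization = arrival_rate * demand / count  # even load split assumed
        if utilization >= 1.0:
            return float("inf")  # saturated service: unbounded queueing delay
        total += demand / (1.0 - utilization)
    return total


def plan_instances(arrival_rate, demands, max_response_time, max_total=1000):
    """Greedy heuristic: grow the bottleneck service until the constraint holds.

    The result satisfies the constraint but is not guaranteed to be minimal.
    """
    if sum(demands) >= max_response_time:
        raise ValueError("constraint is below the zero-load response time")
    instances = [1] * len(demands)
    while response_time(arrival_rate, demands, instances) > max_response_time:
        if sum(instances) >= max_total:
            raise RuntimeError("instance cap reached before meeting constraint")
        # Bottleneck: the service with the highest per-instance utilization.
        k = max(range(len(demands)), key=lambda i: demands[i] / instances[i])
        instances[k] += 1
    return instances


if __name__ == "__main__":
    # Two services with monitored demands of 0.05 s and 0.2 s, 10 requests/s.
    print(plan_instances(10.0, [0.05, 0.2], max_response_time=0.6))
```

In this sketch the demands and arrival rate stand in for the values a monitor would collect, so no prior knowledge of the services is needed beyond those measurements.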
A Study of Very Short Intermittent DDoS Attacks on the Performance of Web Services in Clouds
Distributed Denial-of-Service (DDoS) attacks against web applications such as e-commerce are increasing in size, scale, and frequency. Emerging elastic cloud computing cannot defend against ever-evolving new types of DDoS attacks, since they exploit newly discovered network or system vulnerabilities even in the cloud platform, bypassing not only state-of-the-art defense mechanisms but also the elasticity mechanisms of cloud computing.
In this dissertation, we focus on a new type of low-volume DDoS attack, Very Short Intermittent DDoS Attacks, which can hurt the performance of web applications deployed in the cloud by transiently saturating the critical bottleneck resource of the target systems, either through external attack HTTP requests from outside the cloud or through internal resource contention inside the cloud. We have explored external attacks by modeling n-tier web applications with queueing network theory and implementing the attack framework based on feedback control theory. We have explored internal attacks by investigating and exploiting resource contention and performance interference to locate a target VM (virtual machine) and degrade its performance.
Maximum Likelihood Estimation of Closed Queueing Network Demands from Queue Length Data
Resource demand estimation is essential for the application of analytical models, such as queueing networks, to real-world systems. In this paper, we investigate maximum likelihood (ML) estimators for service demands in closed queueing networks with load-independent and load-dependent service times. Stemming from a characterization of necessary conditions for ML estimation, we propose new estimators that infer demands from queue-length measurements, which are inexpensive metrics to collect in real systems. One advantage of focusing on queue-length data compared to response times or utilizations is that confidence intervals can be rigorously derived from the equilibrium distribution of the queueing network model. Our estimators and their confidence intervals are validated against simulation and real system measurements for a multi-tier application.
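To make the idea of inferring a demand from queue-length samples concrete, here is a deliberately simplified sketch. It does not implement the paper's estimators for closed networks. Instead it assumes a single open M/M/1 queue with a known arrival rate and independent queue-length samples, for which the equilibrium distribution is geometric, P(n) = (1 - rho) * rho^n, and the ML estimate of the utilization is rho = mean(n) / (1 + mean(n)). The function name `estimate_demand` is hypothetical.

```python
def estimate_demand(queue_lengths, arrival_rate):
    """ML estimate of the service demand for an assumed M/M/1 queue.

    queue_lengths: sampled numbers of jobs in the system (assumed i.i.d.).
    arrival_rate:  known arrival rate in jobs per second.
    """
    if not queue_lengths or arrival_rate <= 0:
        raise ValueError("need at least one sample and a positive arrival rate")
    mean_n = sum(queue_lengths) / len(queue_lengths)
    # Setting the derivative of the geometric log-likelihood to zero gives
    # rho = mean_n / (1 + mean_n).
    utilization = mean_n / (1.0 + mean_n)
    # Utilization law: rho = arrival_rate * demand.
    return utilization / arrival_rate


if __name__ == "__main__":
    samples = [0, 1, 2, 1]  # illustrative values, not measured data
    print(estimate_demand(samples, arrival_rate=5.0))
```

The sample values are made up for illustration. With a mean queue length of 1 and 5 jobs per second, the estimated utilization is 0.5 and the estimated demand is 0.1 seconds per job.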
Effective Resource and Workload Management in Data Centers
The increasing demand for storage, computation, and business continuity has driven the growth of data centers. Managing data centers efficiently is a difficult task because of the wide variety of datacenter applications, their ever-changing intensities, and the fact that application performance targets may differ widely. Server virtualization has been a game-changing technology for IT, providing the possibility to support multiple virtual machines (VMs) simultaneously. This dissertation focuses on how virtualization technologies can be utilized to develop new tools for maintaining high resource utilization, for achieving high application performance, and for reducing the cost of data center management.
For multi-tiered applications, bursty workload traffic can significantly deteriorate performance. This dissertation proposes an admission control algorithm, AWAIT, for handling overload conditions in multi-tier web services. AWAIT places requests of accepted sessions on hold and refuses to admit new sessions when the system experiences a sudden workload surge. To meet the service-level objective, AWAIT serves the requests in the blocking queue with high priority. The size of the queue is dynamically determined according to the workload burstiness.
Many admission control policies are triggered by instantaneous measurements of system resource usage, e.g., CPU utilization. This dissertation first demonstrates that directly measuring virtual machine resource utilizations with standard tools cannot always lead to accurate estimates. A directed factor graph (DFG) model is defined to model the dependencies among multiple types of resources across physical and virtual layers.
Virtualized data centers always enable sharing of resources among hosted applications for achieving high resource utilization. However, it is difficult to satisfy application SLOs on a shared infrastructure, as application workload patterns change over time. AppRM, an automated management system, not only allocates the right amount of resources to applications to meet their performance targets but also adjusts to dynamic workloads using an adaptive model.
Server consolidation is one of the key applications of server virtualization. This dissertation proposes a VM consolidation mechanism, first by extending the fair load balancing scheme for multi-dimensional vector scheduling, and then by using a queueing network model to capture the service contentions for a particular virtual machine placement.
Optimization and Regulation of Performance for Computing Systems
The current demands of computing applications, the advent of technological advances related to hardware and software, the contractual relationship between users and cloud service providers and current ecological demands, require the refinement of performance regulation on computing systems. Powerful mathematical tools such as control systems theory, discrete event systems (DES) and randomized algorithms (RAs) have offered improvements in efficiency and performance in computer scenarios where the traditional approach has been the application of well founded common sense and heuristics. The comprehensive concept of computing systems is equally related to a microprocessor unit, a set of microprocessor units in a server, a set of servers interconnected in a data center or even a network of data centers forming a cloud of virtual resources. In this dissertation, we explore theoretical approaches in order to optimize and regulate performance measures in different computing systems. In several cases, such as cloud services, this optimization would allow the fair negotiation of service level agreements (SLAs) between a user and a cloud service provider, that may be objectively measured for the benefit of both negotiators. Although DES are known to be suitable for modeling computing systems, we still find that traditional control theory approaches, such as passivity analysis, may offer solutions that are worth being explored. Moreover, as the size of the problem increases, so does its complexity. RAs offer good alternatives to make decisions on the design of the solutions of such complex problems based on given values of confidence and accuracy.
In this dissertation, we propose the development of: a) a methodology to optimize performance on a many-core processor system, b) a methodology to optimize and regulate performance on a multitier server, c) some corrections to a previously proposed passivity analysis of a market-oriented cloud model, and d) a decentralized methodology to optimize cloud performance. In all the aforementioned systems, we are interested in developing optimization methods strongly supported on DES theory, specifically Infinitesimal Perturbation Analysis (IPA) and RAs based on sample complexity to guarantee that these computing systems will satisfy the required optimal performance on the average.
End-to-End Application Cloning for Distributed Cloud Microservices with Ditto
We present Ditto, an automated framework for cloning end-to-end cloud
applications, both monolithic and microservices, which captures I/O and network
activity, as well as kernel operations, in addition to application logic. Ditto
takes a hierarchical approach to application cloning, starting with capturing
the dependency graph across distributed services, to recreating each tier's
control/data flow, and finally generating system calls and assembly that mimics
the individual applications. Ditto does not reveal the logic of the original
application, facilitating publicly sharing clones of production services with
hardware vendors, cloud providers, and the research community.
We show that across a diverse set of single- and multi-tier applications,
Ditto accurately captures their CPU and memory characteristics as well as their
high-level performance metrics, is portable across platforms, and facilitates a
wide range of system studies.
QMLE: a methodology for statistical inference of service demands from queueing data
Estimating the demands placed by services on physical resources is an essential step for the definition of performance models. For example, scalability analysis relies on these parameters to predict queueing delays under increasing loads. In this paper, we investigate maximum likelihood (ML) estimators for demands at load-independent and load-dependent resources in systems with parallelism constraints. We define a likelihood function based on state measurements and derive necessary conditions for its maximization. We then derive novel estimators that accurately and inexpensively infer service demands using only aggregate state data. With our approach, and also thanks to approximation methods for computing marginal and joint distributions for the load-dependent case, confidence intervals can be rigorously derived, explicitly taking into account both topology and concurrency levels of the services. Our estimators and their confidence intervals are validated against simulations and real system measurements for two multi-tier applications, showing high accuracy also in the presence of load-dependent resources.
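As an illustration of how estimated demands feed a scalability analysis, the sketch below applies the standard exact Mean Value Analysis (MVA) recursion for a single-class closed queueing network with load-independent stations. It is a textbook technique used here as an example, not the QMLE method itself, and the function name `mva` and its parameters are hypothetical.

```python
def mva(demands, population, think_time=0.0):
    """Exact MVA for a single-class closed network of load-independent queues.

    demands:    service demand at each station, in seconds per job.
    population: number of circulating jobs (N >= 1).
    think_time: delay between a job's completion and its next submission.
    Returns (throughput, response_time) at the given population.
    """
    if population < 1 or not demands or any(d <= 0 for d in demands):
        raise ValueError("need N >= 1 and strictly positive demands")
    queue = [0.0] * len(demands)  # mean queue lengths with zero jobs
    for n in range(1, population + 1):
        # An arriving job sees the queue lengths of the network with n-1 jobs.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        throughput = n / (think_time + response)
        queue = [throughput * r for r in residence]  # Little's law per station
    return throughput, response


if __name__ == "__main__":
    # Response time growth as the load increases, for two illustrative demands.
    for n in (1, 5, 10, 20):
        x, r = mva([0.05, 0.2], n, think_time=1.0)
        print(n, round(x, 3), round(r, 3))
```

The demand values in the usage loop are placeholders. In practice they would come from an estimator such as the ones discussed in the abstract.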