
    Automatic Scaling of Internet Applications for Cloud Computing Services

    Abstract: Many Internet applications can benefit from automatic scaling, where their resource usage is scaled up and down automatically by the cloud service provider. We present a system that provides automatic scaling for Internet applications in the cloud environment. We encapsulate each application instance inside a virtual machine (VM) and use virtualization technology to provide fault isolation. We model the scaling problem as the Class Constrained Bin Packing (CCBP) problem, where each server is a bin and each class represents an application. The class constraint reflects the practical limit on the number of applications a server can run simultaneously. We develop an efficient semi-online color set algorithm that achieves a good demand satisfaction ratio and saves energy by reducing the number of servers used when the load is low. Experimental results demonstrate that our system can improve throughput by 180% over an open-source implementation of Amazon EC2 and restore normal QoS five times as fast during flash crowds. Large-scale simulations demonstrate that our algorithm is extremely scalable: the decision time remains under 4 s for a system with 10,000 servers and 10,000 applications. This is an order-of-magnitude improvement over traditional application placement algorithms in enterprise environments.
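The class-constrained packing idea in this abstract can be illustrated with a small sketch. The greedy first-fit routine below is a hypothetical stand-in, not the paper's actual semi-online color set algorithm: each server (bin) has a load capacity and may host at most `max_classes` distinct applications (classes). All names and parameters are illustrative.

```python
# Hypothetical class-constrained first-fit packing: place each demand item on
# the first server that has spare capacity AND either already runs the app or
# still has a free "class slot". Not the paper's color set algorithm.

def pack(demands, capacity, max_classes):
    """demands: list of (app_id, load) items.
    Returns a list of servers, each a dict mapping app_id -> load placed."""
    servers = []
    for app, load in demands:
        placed = False
        for srv in servers:
            used = sum(srv.values())
            # class constraint: new app only if a class slot is free
            if used + load <= capacity and (app in srv or len(srv) < max_classes):
                srv[app] = srv.get(app, 0) + load
                placed = True
                break
        if not placed:
            servers.append({app: load})  # open a new server (bin)
    return servers

demands = [("web", 3), ("db", 4), ("web", 2), ("cache", 5), ("db", 1)]
servers = pack(demands, capacity=8, max_classes=2)
# With these numbers the demands fit on two servers, each within both limits.
```

The class constraint is what distinguishes CCBP from plain bin packing: even a lightly loaded server may reject an item because it already runs its maximum number of distinct applications.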

    Machine learning based Model for Cloud Load Prediction and Resource Allocation

    The elasticity and lack of upfront capital investment offered by cloud computing are appealing to many businesses. There is much discussion of the benefits and costs of the cloud model and of how to move legacy applications onto the cloud platform. Here we study a different problem: how can a cloud service provider best multiplex its virtual resources onto the physical hardware? This is important because much of the touted gain in the cloud model comes from such multiplexing. Studies have found that servers in many existing data centers are often severely under-utilized due to over-provisioning for peak demand. The cloud model is expected to make such practice unnecessary by offering automatic scale-up and scale-down in response to load variation. Besides reducing hardware cost, it also saves on electricity, which contributes a significant portion of the operational expenses of large data centers. Proper allocation of the various virtualized resources must be based on these cloud load predictions. The presence of heterogeneous applications, such as content delivery networks, web applications, and MapReduce tasks, complicates this process. Cloud workloads with conflicting resource allocation needs for communication and information processing further exacerbate the difficulty.
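The abstract does not specify the prediction model, so as a minimal illustration of window-based load forecasting, the sketch below fits an ordinary least-squares line over the last `k` load samples and extrapolates one step ahead. It is a stand-in for whatever ML model the paper actually uses; all names are illustrative.

```python
# Minimal load-prediction sketch: fit load = slope*t + intercept over the last
# k observations and extrapolate one step. A placeholder for the paper's model.

def predict_next(history, k=5):
    """history: list of past load samples; returns the forecast for the next step."""
    window = history[-k:]
    n = len(window)
    ts = list(range(n))
    mean_t = sum(ts) / n
    mean_y = sum(window) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, window))
    den = sum((t - mean_t) ** 2 for t in ts)
    slope = num / den if den else 0.0
    intercept = mean_y - slope * mean_t
    return slope * n + intercept   # extrapolate to the next time index

loads = [10, 12, 14, 16, 18]       # steadily rising demand
forecast = predict_next(loads)     # -> 20.0 for this perfectly linear series
```

A real predictor would also handle seasonality and the heterogeneous workload mix the abstract mentions, but the allocation step downstream consumes the forecast the same way.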

    Systematic survey on evolution of cloud architectures

    Cloud architectures are becoming an active area of research. The quality and durability of a software system are defined by its architecture, yet the architectural approaches used to build cloud-based systems are not available in a blended fashion that would yield an effective universal architecture solution. This paper contributes a systematic literature review (SLR) to assist researchers striving to contribute to this area. The main objective of the review is to systematically identify and analyse recently published research on software architecture for the cloud with regard to research activity, tools and techniques used, proposed approaches, and domains. The applied method is an SLR over four selected electronic databases, following the guidelines of Kitchenham and Charters (2007). Out of 400 classified publications, we regard 121 as relevant to our research domain. We outline a taxonomy of their topics and domains, and provide lists of the methods used and approaches proposed. At present, there is little research coverage of software architectures for the cloud, while other disciplines have become more active. Future work is to develop a secure architecture that achieves quality of service and meets service level agreements.

    Cloud Resource Management With Turnaround Time Driven Auto-Scaling

    Cloud resource management research and techniques have received considerable attention in recent years. In particular, numerous recent studies have focused on determining the relationship between server-side system information and the performance experienced, with the aim of reducing resource wastage. However, the genuine experience of clients cannot be readily understood from collected server-side information alone. In this paper, a cloud resource management framework with two novel turnaround-time-driven auto-scaling mechanisms is proposed for ensuring the stability of service performance. In the first mechanism, turnaround-time monitors are deployed on the client side instead of the more traditional server side, and the information collected outside the server drives a dynamic auto-scaling operation. In the second mechanism, a schedule-based auto-scaling preconfiguration maker is designed to test and identify the amount of resources required in the cloud. The reported experimental results demonstrate that with our framework, stable service quality can be ensured and, moreover, a certain amount of quality variation can be handled, increasing the stability of service performance.
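The first mechanism's core idea, client-side turnaround-time samples driving scale-out/scale-in decisions, can be sketched as a simple threshold rule. The thresholds, tolerance band, and instance limits below are illustrative assumptions, not values from the paper.

```python
# Hedged sketch: average the client-side turnaround-time samples and compare
# against a target with a tolerance band. Thresholds are made up for illustration.

def scaling_decision(turnaround_samples, target, tolerance=0.2, current=2,
                     min_instances=1, max_instances=10):
    """Return the new instance count given client-side turnaround times (seconds)."""
    avg = sum(turnaround_samples) / len(turnaround_samples)
    if avg > target * (1 + tolerance) and current < max_instances:
        return current + 1          # responses too slow: scale out
    if avg < target * (1 - tolerance) and current > min_instances:
        return current - 1          # ample headroom: scale in
    return current                  # within tolerance band: hold steady

decision = scaling_decision([1.5, 1.7, 1.6], target=1.0, current=2)  # scale out
```

The point of measuring on the client side is that `turnaround_samples` includes network and queueing delay invisible to server-side monitors, which is exactly the gap the paper identifies.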

    Online Job Scheduling in Distributed Machine Learning Clusters

    Large-scale distributed machine learning systems are now deployed to support various analytics and intelligence services in IT firms. To train on a large dataset and derive a prediction/inference model, e.g., a deep neural network, multiple workers run in parallel to train partitions of the input dataset and update shared model parameters. In a shared cluster handling multiple training jobs, a fundamental issue is how to efficiently schedule jobs and set the number of concurrent workers for each job, such that server resources are maximally utilized and model training can be completed in time. Targeting a distributed machine learning system using the parameter server framework, we design an online algorithm for scheduling arriving jobs and deciding the adjusted numbers of concurrent workers and parameter servers for each job over its course, to maximize the overall utility of all jobs, contingent on their completion times. Our online algorithm design utilizes a primal-dual framework coupled with efficient dual subroutines, achieving good long-term performance guarantees with polynomial time complexity. The practical effectiveness of the online algorithm is evaluated using trace-driven simulation and testbed experiments, which demonstrate that it outperforms scheduling algorithms commonly adopted in today's cloud systems.
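The allocation sub-problem, how many workers each job should get, can be illustrated with a toy greedy rule that is much simpler than the paper's primal-dual algorithm: hand out workers one at a time to the job with the highest marginal utility. The concave utility `w * log(1 + n)` and the job weights are illustrative assumptions capturing diminishing returns per extra worker.

```python
# Toy greedy stand-in for worker allocation across training jobs. Utilities are
# concave (diminishing returns), so each extra worker helps a busy job less.
import math

def allocate_workers(jobs, total_workers):
    """jobs: dict job_id -> utility weight; returns dict job_id -> worker count."""
    alloc = {j: 0 for j in jobs}
    for _ in range(total_workers):
        # marginal gain of one more worker under utility w * log(1 + n)
        best = max(jobs, key=lambda j: jobs[j] * (math.log(2 + alloc[j]) -
                                                  math.log(1 + alloc[j])))
        alloc[best] += 1
    return alloc

alloc = allocate_workers({"job_a": 3.0, "job_b": 1.0}, total_workers=4)
# The heavier job gets most, but not all, of the workers.
```

The actual paper additionally handles online job arrivals, completion-time-dependent utilities, and parameter-server placement; this sketch shows only the resource-splitting intuition.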

    Resource Scaling in a Cloud Computing Environment (Resurssien skaalaus pilvilaskentaympäristössä)

    This thesis examines the characteristics of a good model for resource scaling in a cloud computing environment. A cloud computing environment enables the implementation of cloud services. Cloud services make use of data-centre servers, which perform the required computation. This allows businesses and consumers to outsource the computation their service applications require, and improves the availability of those applications. The resources in question are server processors, main memory, storage space, and network bandwidth. Proper resource scaling reduces waste, which improves cost-efficiency by cutting unnecessary electricity consumption. Proactive scaling prepares for growing resource demand; it is needed because reactive scaling is too slow, and its importance is greatest when resource demand grows suddenly. Reactive scaling responds to changed demand, but there is a delay in provisioning resources, so reactive scaling alone may not meet demand quickly enough, which degrades the Quality of Service (QoS). It is important that service quality remains stable, because poor quality leads to losing customers. QoS is defined in the Service Level Agreement (SLA), a contract on the level of service between the cloud provider and the party renting data-centre capacity. Resource scaling has a direct effect on how successfully cloud services can be delivered in a cloud computing environment. Scaling is commonly divided into horizontal and vertical scaling in cloud environments that use virtual machines: horizontal scaling adjusts the number of virtual machines running application instances, while vertical scaling adjusts the resources available to those virtual machines.
A good way to scale is to combine horizontal and vertical scaling, which enables fine-grained scaling and thereby a high utilization rate. Good resource scaling provides high utilization without degrading service quality. Scaling must respond to resource demand dynamically and in real time, and sudden workload spikes must be handled quickly, proactively when necessary, so that service quality remains high.
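The combined strategy the thesis recommends, vertical scaling for fine-grained adjustments with horizontal scaling as the fallback, can be sketched as follows. The VM size cap and step size are illustrative assumptions.

```python
# Illustrative combined scaling policy: grow an existing VM's resources first
# (vertical), and launch a new VM (horizontal) only when every VM is at its cap.

def scale_up(vms, vm_cap=8, step=2):
    """vms: list of per-VM resource units (e.g., vCPUs), mutated in place."""
    for i, size in enumerate(vms):
        if size + step <= vm_cap:
            vms[i] = size + step     # vertical: enlarge an existing VM
            return vms
    vms.append(step)                 # horizontal: add a new VM
    return vms

fleet = [8, 6]
scale_up(fleet)    # vertical step: the second VM grows to the cap -> [8, 8]
scale_up(fleet)    # every VM capped, so a new VM is launched -> [8, 8, 2]
```

Vertical steps are fine-grained and fast but bounded by the host; horizontal steps add capacity without bound but in coarser increments, which is why the thesis argues the combination yields the highest utilization.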

    Implementation and analysis of low latency video-conferencing through edge cloud computing

    Edge cloud computing, which essentially brings servers as close to users as possible, is seen as a key enabler of 5G networks. Among the benefits this approach can provide, this master's thesis focuses on the reduction of latency. First, an edge network model that combines this paradigm with Software-Defined Networking (SDN) is presented to illustrate a potential production scenario. Then, a videoconferencing application is chosen as a case study of a latency-sensitive, bandwidth-intensive application, and the traffic it generates is inspected. Based on this analysis, a methodology for computing latency is proposed and subsequently used during the test runs. Lastly, a testbed analogous to the model presented earlier showcases the benefits of this approach. The results show an improvement in the quality of the videoconference through a noticeable reduction in latency when the servers are at the edge. Moreover, the feasibility of a dynamic environment in which the server can be live-migrated is demonstrated. To provide a complete quality overview, the impact of the available bandwidth and of packet loss is evaluated as well.
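The abstract mentions a methodology for computing latency from captured traffic but does not describe it; as a minimal stand-in, the sketch below computes mean one-way latency from paired send/receive timestamps. It assumes synchronized clocks at both capture points, something a real testbed must also ensure; all names are illustrative.

```python
# Hypothetical latency computation from per-packet timestamps (seconds).
# One-way latency requires that sender and receiver clocks are synchronized.

def average_latency_ms(send_ts, recv_ts):
    """Mean one-way latency in milliseconds over paired packet timestamps."""
    samples = [(r - s) * 1000 for s, r in zip(send_ts, recv_ts)]
    return sum(samples) / len(samples)

sends = [0.000, 0.020, 0.040]
recvs = [0.015, 0.038, 0.052]
lat = average_latency_ms(sends, recvs)   # (15 + 18 + 12) / 3 = 15.0 ms
```

Moving the server to the edge shortens the network path, which shows up directly as smaller `recv_ts - send_ts` differences in a measurement like this.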

    Resource Management In Cloud And Big Data Systems

    Cloud computing is a paradigm shift in computing, where services are offered and acquired on demand in a cost-effective way. These services are often virtualized, and they can handle the computing needs of big data analytics. The ever-growing demand for cloud services arises in many areas including healthcare, transportation, energy systems, and manufacturing. However, cloud resources such as computing power, storage, energy, dollars for infrastructure, and dollars for operations, are limited. Effective use of the existing resources raises several fundamental challenges that place cloud resource management at the heart of the cloud providers' decision-making process. One of these challenges faced by the cloud providers is to provision, allocate, and price the resources such that their profit is maximized and the resources are utilized efficiently. In addition, executing large-scale applications in clouds may require resources from several cloud providers. Another challenge when processing data-intensive applications is minimizing their energy costs. Electricity used in US data centers in 2010 accounted for about 2% of total electricity used nationwide. In addition, the energy consumed by the data centers is growing at over 15% annually, and the energy costs make up about 42% of the data centers' operating costs. Therefore, it is critical for the data centers to minimize their energy consumption when offering services to customers. In this Ph.D. dissertation, we address these challenges by designing, developing, and analyzing mechanisms for resource management in cloud computing systems and data centers. The goal is to allocate resources efficiently while optimizing a global performance objective of the system (e.g., maximizing revenue, maximizing social welfare, or minimizing energy). We improve the state-of-the-art in both methodologies and applications. 
As for methodologies, we introduce novel resource management mechanisms based on mechanism design, approximation algorithms, cooperative game theory, and hedonic games. These mechanisms can be applied in cloud virtual machine (VM) allocation and pricing, cloud federation formation, and energy-efficient computing. In this dissertation, we outline our contributions and possible directions for future research in this field.
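The VM allocation and pricing problem the dissertation targets can be illustrated with a classic density heuristic, which underlies many approximation-based allocation mechanisms. This greedy routine is not the dissertation's mechanism; requests, bids, and capacity are made-up values.

```python
# Illustrative greedy VM allocation: rank requests by bid per VM (density) and
# allocate while capacity remains. A simple approximation heuristic, not the
# dissertation's actual mechanism.

def allocate_vms(requests, capacity):
    """requests: list of (user, vms_wanted, bid). Returns (winners, revenue)."""
    ranked = sorted(requests, key=lambda r: r[2] / r[1], reverse=True)
    winners, revenue, used = [], 0, 0
    for user, vms, bid in ranked:
        if used + vms <= capacity:   # grant the request if it still fits
            winners.append(user)
            used += vms
            revenue += bid
    return winners, revenue

reqs = [("u1", 4, 20), ("u2", 2, 14), ("u3", 3, 9)]
winners, revenue = allocate_vms(reqs, capacity=6)
```

A mechanism-design treatment would additionally choose payments so that bidding one's true value is optimal (truthfulness), which greedy-plus-pay-your-bid does not guarantee.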

    Investigations into Elasticity in Cloud Computing

    The pay-as-you-go model supported by existing cloud infrastructure providers is appealing to most application service providers as a way to deliver their applications in the cloud. Within this context, elasticity of applications has become one of the most important features in cloud computing: it enables real-time acquisition and release of compute resources to meet application performance demands. In this thesis we investigate the problem of delivering cost-effective elasticity services for cloud applications. Traditionally, application-level elasticity addresses the question of how to scale applications up and down to meet their performance requirements, but does not adequately address minimising the costs of using the service. With this limitation in mind, we propose a scaling approach that uses cost-aware criteria to detect the bottlenecks within multi-tier cloud applications and scales these applications only at the bottleneck tiers, reducing the costs incurred by consuming cloud infrastructure resources. Our approach is generic for a wide class of multi-tier applications, and we demonstrate its effectiveness by studying the behaviour of an example electronic commerce application. Furthermore, we consider the characteristics of the algorithms implementing the business logic of cloud applications and investigate elasticity at the algorithm level: when dealing with large-scale data under resource and time constraints, the algorithm's output should be elastic with respect to the resources consumed. We propose a novel framework to guide the development of elastic algorithms that adapt to the available budget while guaranteeing that the quality of the output result (e.g., prediction accuracy for classification tasks) improves monotonically with the budget used.
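The core of the thesis's scaling approach, detecting the bottleneck tier and scaling only there, can be sketched with a simple utilization-based rule. The thesis uses cost-aware criteria whose details are not given in the abstract, so picking the most utilized tier below is an illustrative simplification, and the tier names and numbers are made up.

```python
# Sketch of bottleneck-tier scaling: instead of scaling the whole stack, find
# the saturated tier and add instances only there. Utilizations are made up.

def bottleneck_tier(utilizations):
    """Return the tier with the highest utilization; scaling only this tier
    avoids paying for extra capacity at tiers that are not saturated."""
    return max(utilizations, key=utilizations.get)

tiers = {"web": 0.55, "app": 0.92, "db": 0.70}
target = bottleneck_tier(tiers)   # only this tier receives new instances
```

Scaling all three tiers would triple the incremental cost for the same throughput gain, since the web and database tiers above still have headroom; that cost gap is what the cost-aware criteria exploit.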

    Scalability performance measurement and testing of cloud-based software services

    Cloud-based software services have become more popular and dependable and are ideal for businesses with growing or changing workload demands. These services are increasing rapidly due to reduced hosting costs and the increased availability and efficiency of computing resources. The delivery of cloud-based software services rests on the underlying cloud infrastructure supported by cloud providers, which delivers the potential for scalability under the pay-as-you-go model. Performance and scalability testing and measurement of those services are necessary for future optimisation and growth of cloud computing, to support the Service Level Agreement (SLA)-compliant quality of cloud services, especially in the context of a rapidly expanding quantity of service delivery. This thesis addresses an important issue, understanding the scalability of cloud-based software services from a technical perspective, which is increasingly important as more software solutions are migrated to the cloud. A novel approach for testing and quantifying the scalability performance of cloud-based software services is described. Two technical scalability metrics for software services deployed and distributed in cloud environments have been formulated: a volume scalability metric and a quality scalability metric, based on the number of software instances and the average response time. The experimental analysis comprises three stages. The first stage demonstrates the approach and the metrics using a real-world cloud-based software service running on the Amazon EC2 cloud under three demand scenarios. The second stage extends the practicality of the metrics with experiments on two public cloud environments (Amazon EC2 and Microsoft Azure) with two cloud-based software services to demonstrate the use of these metrics. 
The experimental analysis considers three sets of comparisons, providing the platform on which to construct the metrics as a basis that can be used effectively to compare the scalability of software on cloud environments, consequently supporting deployment decisions with technical arguments. Moreover, the work integrates the technical scalability metrics with an earlier utility-oriented scalability metric. The third stage is a case study of application-level fault injection using real-world cloud-based software services running on the Amazon EC2 cloud, demonstrating the effect of fault scenarios on scalability behaviour. The results show that the technical metrics explicitly quantify the technical scalability performance of cloud-based software services, and that they allow clear assessment of the impact of demand scenarios, cloud platform, and fault injection on the services' scalability behaviour. The studies undertaken in this thesis provide a valuable insight into the scalability of cloud-based software service delivery.
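The abstract names the two metrics (volume and quality, built from the number of software instances and the average response time) but not their exact formulas, so the ratios below are a plausible stand-in rather than the thesis's definitions: how instance count and average response time change relative to a change in demand.

```python
# Hypothetical forms of the two scalability metrics; the thesis's actual
# definitions may differ. All input values are illustrative.

def volume_scalability(instances_lo, instances_hi, demand_lo, demand_hi):
    """Instance growth relative to demand growth; < 1 means instances grow
    more slowly than demand (sub-linear resource growth: good scalability)."""
    return (instances_hi / instances_lo) / (demand_hi / demand_lo)

def quality_scalability(rt_lo, rt_hi):
    """Ratio of average response times at low vs high demand; values near 1
    mean service quality held up under the heavier load."""
    return rt_lo / rt_hi

v = volume_scalability(instances_lo=2, instances_hi=6, demand_lo=100, demand_hi=400)
q = quality_scalability(rt_lo=0.20, rt_hi=0.25)
# Here demand quadrupled while instances only tripled (v = 0.75), and average
# response time grew modestly (q = 0.8).
```

Metrics in this shape make the thesis's comparisons concrete: the same service measured on two platforms, or under fault injection, yields directly comparable `v` and `q` values.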