5 research outputs found
Characterizing Service Level Objectives for Cloud Services: Motivation of Short-Term Cache Allocation Performance Modeling
Service level objectives (SLOs) stipulate performance goals for cloud applications, microservices, and infrastructure. SLOs are widely used, in part, because system managers can tailor goals to their products, companies, and workloads. Systems research intended to support strong SLOs should target realistic performance goals used by system managers in the field. Evaluations conducted with uncommon SLO goals may not translate to real systems. Some textbooks discuss the structure of SLOs, but (1) they only sketch SLO goals and (2) they use outdated examples. We mined real SLOs published on the web, extracted their goals, and characterized them. Many web documents discuss SLOs loosely, but few provide details and reflect real settings. Systematic literature review (SLR) prunes results and reduces bias by (1) modeling the expected SLO structure and (2) detecting and removing outliers. We collected 75 SLOs in which response time, query percentile, and reporting period were specified. We used these SLOs to confirm and refute common perceptions. For example, we found few SLOs with response time guarantees below 10 ms for 90% or more of queries. This reality bolsters the perception that single-digit-millisecond SLOs face fundamental research challenges. This work was funded by NSF Grants 1749501 and 1350941.
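The SLO structure mined in this study, a response-time bound, a query percentile, and a reporting period, can be captured in a few lines. The sketch below illustrates that three-field structure and a compliance check; the field names and the helper are assumptions for illustration, not the paper's extraction schema.

```python
from dataclasses import dataclass

@dataclass
class SLO:
    """An SLO goal of the form characterized in the study:
    'percentile % of queries answered within response_time_ms,
    measured per reporting period'. Field names are illustrative."""
    response_time_ms: float
    percentile: float      # e.g. 99.0 means 99% of queries
    period_days: int       # length of the reporting period

def meets_slo(latencies_ms, slo):
    """Check one reporting period's latency samples against the goal."""
    if not latencies_ms:
        return True        # no traffic: vacuously compliant
    within = sum(1 for t in latencies_ms if t <= slo.response_time_ms)
    return 100.0 * within / len(latencies_ms) >= slo.percentile

slo = SLO(response_time_ms=300.0, percentile=99.0, period_days=30)
print(meets_slo([120.0] * 99 + [450.0], slo))  # True: 99% within 300 ms
```

For instance, an SLO such as "99% of queries within 300 ms per 30-day period" sits comfortably inside the range the study observed, whereas a sub-10 ms bound at the same percentile was rarely found.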
Elastic Provisioning of Cloud Caches: a Cost-aware TTL Approach
We consider elastic resource provisioning in the cloud, focusing on in-memory
key-value stores used as caches. Our goal is to dynamically scale resources to
the traffic pattern minimizing the overall cost, which includes not only the
storage cost, but also the cost due to misses. In fact, a small variation on
the cache miss ratio may have a significant impact on user perceived
performance in modern web services, which in turn has an impact on the overall
revenues for the content provider that uses those services. We propose and
study a dynamic algorithm for TTL caches, which is able to obtain
close-to-minimal costs. Since high-throughput caches require low complexity
operations, we discuss a practical implementation of such a scheme requiring
constant overhead per request, independent of the cache size. We evaluate
our solution with real-world traces collected from Akamai, and show that we are
able to obtain a 17% decrease in the overall cost compared to a baseline static
configuration.
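The cost tradeoff described above, where storage cost grows with the TTL while miss cost shrinks, can be illustrated with a toy trace replay. This is a minimal sketch under a simple no-reset TTL model with synthetic Pareto-distributed popularity; it is not the paper's algorithm, cost parameters, or the Akamai traces.

```python
import random

def simulate_cost(trace, ttl, storage_price, miss_cost):
    """Replay a (time, key) trace through a simple TTL cache (no timer reset
    on hit) and return total cost = storage cost + miss cost."""
    expiry = {}
    misses = 0
    for t, key in trace:
        if expiry.get(key, 0.0) <= t:   # expired or never cached: a miss
            misses += 1
            expiry[key] = t + ttl       # fetch and cache for ttl seconds
    item_seconds = misses * ttl         # each miss stores the item for ttl
    return storage_price * item_seconds + miss_cost * misses

random.seed(0)
keys = [int(random.paretovariate(1.2)) for _ in range(20000)]  # skewed popularity
trace = [(i * 0.01, k) for i, k in enumerate(keys)]            # ~100 requests/s
best = min((simulate_cost(trace, T, storage_price=1e-4, miss_cost=0.02), T)
           for T in (1, 5, 10, 30, 60, 120, 300))
print("best TTL:", best[1])
```

Note that the per-request work here is one dictionary lookup and one update, i.e. constant overhead independent of the cache size, matching the design goal stated in the abstract.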
Leveraging spot instances for resource provisioning in serverless computing
Cloud computing has become a dominant paradigm across the IT industry. However, keeping cloud costs under control is a major challenge for organizations. One option for saving costs is using spot instances: virtual machines with heavily discounted prices at the expense of lower reliability and availability.
Serverless computing is a paradigm that allows developers to build and deploy applications in the cloud without provisioning or managing backend infrastructure. Function as a Service (FaaS) is the prevalent delivery model of this paradigm; it allows developers to execute functions in the cloud in response to a request or an event. The developer focuses only on the code, and the cloud provider handles the execution and scaling of the functions. This is convenient for developers, but it comes with some limitations and can become very expensive at scale.
This thesis investigates leveraging spot instances for running serverless functions, potentially achieving both higher flexibility and lower cost than commercial FaaS solutions. For this purpose, we present a system design suitable for applications that tolerate some execution latency, and implement it on Google Cloud Platform. Our implementation is compared against Google Cloud Run, a service that offers similar functionality.
Our system achieves significant cost savings: assuming a function execution time of two minutes, our system costs the same as the Cloud Run solution at around 8,000 requests per month, and at, for example, 20,000 requests per month it costs less than half as much as Cloud Run. However, one important design decision is that a spot instance is provisioned on the fly for every request. While this introduces latency, it also allows the system to avoid any significant reduction in reliability, as confirmed in our evaluation.
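The reported crossover can be sketched as simple linear cost arithmetic: a flat monthly overhead for the spot-based system plus a low per-request cost, against a purely per-request managed service. All prices below are placeholders chosen only to reproduce the shape of the reported result (break-even near 8,000 requests/month, less than half the cost at 20,000); they are not actual GCP or Cloud Run prices.

```python
# Illustrative cost model; all prices are assumed placeholders, not GCP's.
FAAS_COST_PER_REQ = 0.004    # managed FaaS: pure per-request billing
SPOT_COST_PER_REQ = 0.0005   # discounted spot compute per request
SPOT_FIXED_MONTHLY = 28.0    # fixed orchestration overhead of the spot system

def faas_cost(requests):
    return FAAS_COST_PER_REQ * requests

def spot_cost(requests):
    return SPOT_FIXED_MONTHLY + SPOT_COST_PER_REQ * requests

# The spot system wins once its per-request savings cover the fixed overhead.
breakeven = SPOT_FIXED_MONTHLY / (FAAS_COST_PER_REQ - SPOT_COST_PER_REQ)
print(round(breakeven))                    # -> 8000 requests/month
print(faas_cost(20000), spot_cost(20000))  # -> 80.0 38.0
```

With these placeholder numbers the spot-based system matches the FaaS price at 8,000 requests and costs less than half as much at 20,000, mirroring the pattern the evaluation reports.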
Next Generation Cloud Computing Architectures: Performance and Pricing
Cloud providers need to optimize container deployments to efficiently utilize their network, compute, and storage resources. In addition, they require an attractive pricing strategy for compute services such as containers, virtual machines, and serverless computing in order to attract users, maximize profits, and achieve the desired utilization of their resources. This thesis tackles the twofold challenge of achieving high performance in container deployments and identifying pricing for compute services.
For performance, the thesis presents a transport-adaptive network architecture (D-TAIL) that improves tail latencies. Existing transport protocols such as pFabric and Homa [1, 2] use the Shortest Remaining Processing Time (SRPT) scheduling policy, which is known to starve long flows because SRPT prioritizes short flows. D-TAIL addresses this limitation by taking the age of a flow into consideration when deciding its priority. D-TAIL shows maximum reductions of 72%, 29.66%, and 28.39% in 99th-percentile flow completion time (FCT) for the transport protocols DCTCP, pFabric, and Homa, respectively. In addition, the thesis presents a container deployment design that utilizes a peer-to-peer network and a virtual file system with content-addressable storage to address the problem of cold starts in existing container deployment systems. The proposed design increases compute availability, reduces storage requirements, and prevents network bottlenecks.
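The core idea of folding flow age into an SRPT-style priority can be sketched as follows. The linear `remaining - alpha * age` combination and the toy scheduler are assumptions for illustration, not D-TAIL's actual priority function.

```python
def priority(remaining, age, alpha):
    # Lower value is scheduled first. alpha = 0 is pure SRPT; alpha > 0 lets
    # waiting flows "age up", bounding how long a large flow can be starved.
    return remaining - alpha * age

def schedule(flows, alpha, quantum=1):
    """flows: {name: size in transmission units}. Returns completion order."""
    rem = dict(flows)
    age = {f: 0 for f in flows}
    order = []
    while rem:
        pick = min(rem, key=lambda f: priority(rem[f], age[f], alpha))
        rem[pick] -= quantum              # transmit one quantum of `pick`
        for f in rem:
            if f != pick:
                age[f] += quantum         # every other flow waits and ages
        if rem[pick] <= 0:
            order.append(pick)
            del rem[pick], age[pick]
    return order

print(schedule({"long": 6, "a": 2, "b": 2}, alpha=0))  # -> ['a', 'b', 'long']
```

With alpha = 0 this degenerates to pure SRPT; under a continuous stream of short arrivals, pure SRPT can defer the long flow indefinitely, whereas any alpha > 0 bounds its waiting time because its priority keeps improving as it ages.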
For pricing, the thesis studies the tradeoffs between serverless computing (SC) and traditional cloud computing (virtual machines, VM) using realistic cost models, queueing-theoretic performance models, and a game-theoretic formulation. For customers, we identify the workload distribution between SC and VM that minimizes their cost while maintaining a given performance constraint. For the cloud provider, we identify the SC and VM prices that maximize its profit. The main result is the identification and characterization of three optimal operational regimes for both customers and the provider, which leverage either SC only, VM only, or both in a hybrid configuration.
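The three regimes can be illustrated with a deliberately simplified model: one VM treated as an M/M/1 queue with a flat price, serverless billed per request with no queueing, and a mean-response-time constraint on the VM. This is a sketch under those assumptions, not the thesis's cost or game-theoretic model.

```python
def min_cost_split(lam, mu, D, c_vm, c_sc):
    """Split an arrival rate `lam` between one VM (an M/M/1 queue with
    service rate `mu`) and serverless (SC), keeping the VM's mean response
    time 1 / (mu - lam_vm) below the deadline D.
    Returns (fraction routed to SC, total cost rate)."""
    lam_vm_max = max(mu - 1.0 / D, 0.0)   # heaviest load the VM can absorb
    f = max(1.0 - lam_vm_max / lam, 0.0)  # overflow fraction routed to SC
    hybrid = c_vm + c_sc * f * lam        # flat VM price + per-request SC
    pure_sc = c_sc * lam                  # regime that skips the VM entirely
    return (1.0, pure_sc) if pure_sc < hybrid else (f, hybrid)

print(min_cost_split(lam=100.0, mu=80.0, D=0.1, c_vm=50.0, c_sc=1.0))
```

Depending on the parameters, the returned split is f = 0 (VM only), 0 < f < 1 (hybrid), or f = 1 (SC only), mirroring the three operational regimes identified in the abstract.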
Models of vertical interconnection in future Internet networks
Interconnection, whose primary aim is to enable communication between the end users of different undertakings, i.e., to enable access to another undertaking's services, was introduced after the liberalization of the telecommunications market. Vertical interconnection represents the physical and logical linking of undertakings at different network levels. Considering the technical and economic aspects of interconnection is a crucial issue in the future Internet environment. In general, the undertakings in a vertical interconnection in the future Internet environment include content and application providers, Internet service providers, Content Delivery Network (CDN) providers, and cloud providers. The development of Cloud Computing introduced significant changes in the Internet environment, above all the ability to access scalable and shareable physical or virtual resources. This creates an elastic platform that provides dynamic and simple scalability, simplifies infrastructure provisioning, and improves performance. Bandwidth-demanding services and applications, along with the wide adoption of Cloud Computing, cause continuous growth of Internet traffic, which requires technologies capable of satisfying ever stricter bandwidth requirements. Elastic optical networks have been developed as a promising solution for the transport network layer: they provide sufficient bandwidth along with highly efficient spectrum utilization.
Given the changes in the future Internet environment and the importance of vertical interconnection, it is necessary to consider and properly define the technical and economic relations between undertakings in the future Internet, which is the research subject of this doctoral dissertation.
Within the dissertation, a model is proposed for determining a proper interconnection agreement between the undertakings in a vertical interconnection, namely a content and application provider and an Internet service provider, in a content provisioning process with partial migration of content to a cloud provider's resources. The analysis comprises different interconnection agreements and determines the appropriate agreement depending on the providers' target profits and an acceptable rejection rate for end users' content requests. This analysis is extended to determine an adequate pricing and resource allocation mechanism for the cloud provider's resources. A new, hybrid model for access to cloud resources is proposed; it provides satisfactory results in terms of minimizing costs and minimizing the request rejection rate.
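One standard way to quantify the rejection rate that such a hybrid access model trades against cost is the Erlang-B blocking formula. The sketch below (a reserved pool whose blocked requests overflow to pricier on-demand capacity, with assumed prices) is an illustrative stand-in, not the dissertation's actual model.

```python
def erlang_b(load, servers):
    """Erlang-B blocking probability for `servers` parallel resources offered
    `load` Erlangs, computed with the standard stable recurrence."""
    b = 1.0                              # B(0) = 1
    for n in range(1, servers + 1):
        b = load * b / (n + load * b)
    return b

def hybrid_cost(load, reserved, reserved_price, ondemand_price):
    """Hybrid access sketch: `reserved` instances are paid for up front;
    requests that would be rejected overflow to on-demand capacity instead."""
    overflow = load * erlang_b(load, reserved)
    return reserved * reserved_price + overflow * ondemand_price

# Size the reserved pool to minimize total cost for 20 Erlangs of demand.
best = min(range(31), key=lambda c: hybrid_cost(20.0, c, 1.0, 3.0))
print("reserved instances:", best)
```

Sweeping the reserved-pool size in this way exposes the tradeoff the dissertation studies: a larger reserved pool raises the fixed cost but drives the rejection (overflow) rate toward zero.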