Optimizing simultaneous autoscaling for serverless cloud computing
This paper explores resource allocation in serverless cloud computing
platforms and proposes an optimization approach for autoscaling systems.
Serverless computing relieves users from resource management tasks, enabling
focus on application functions. However, dynamic resource allocation and
function replication based on changing loads remain crucial. Typically,
autoscalers in these platforms utilize threshold-based mechanisms to adjust
function replicas independently. We model applications as interconnected graphs
of functions, where requests probabilistically traverse the graph, triggering
associated function execution. Our objective is to develop a control policy
that optimally allocates resources on servers, minimizing failed requests and
response time in reaction to load changes. Using a fluid approximation model
and Separated Continuous Linear Programming (SCLP), we derive an optimal
control policy that determines the number of resources per replica and the
required number of replicas over time. We evaluate our approach using a
simulation framework built with Python and simpy. Comparing against
threshold-based autoscaling, our approach demonstrates significant improvements
in average response times and failed requests, ranging from 15% to over 300% in
most cases. We also explore the impact of system and workload parameters on
performance, providing insights into the behavior of our optimization approach
under different conditions. Overall, our study contributes to advancing
resource allocation strategies, enhancing efficiency and reliability in
serverless cloud computing platforms.
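To make the graph model in this abstract concrete, the following minimal sketch computes how often each function is expected to run per external request when requests traverse the graph probabilistically, then sizes replicas against a fixed utilisation target. It is not the paper's SCLP policy; it only illustrates the kind of static, per-function sizing that such a policy would be compared with. The function names, routing probabilities, rates, and the 0.7 utilisation target are all hypothetical.

```python
import math


def expected_visits(routing, entry, iterations=1000, tol=1e-12):
    """Expected invocations of each function per external request.

    routing[f][g] is the probability that a request leaving f triggers g;
    any remaining probability means the request leaves the application.
    Solved by fixed-point iteration of v = e + P^T v.
    """
    names = list(routing)
    visits = {f: 0.0 for f in names}
    for _ in range(iterations):
        new = {f: (1.0 if f == entry else 0.0) for f in names}
        for f in names:
            for g, p in routing[f].items():
                new[g] += visits[f] * p
        delta = max(abs(new[f] - visits[f]) for f in names)
        visits = new
        if delta < tol:
            break
    return visits


def replicas_needed(visits, arrival_rate, service_rate, target_utilisation=0.7):
    """Replicas per function so that offered load stays under the target."""
    return {
        f: max(1, math.ceil(arrival_rate * v / (service_rate[f] * target_utilisation)))
        for f, v in visits.items()
    }


# Hypothetical four-function application.
routing = {
    "gateway": {"auth": 1.0},
    "auth": {"catalog": 0.8, "checkout": 0.2},
    "catalog": {"checkout": 0.25},
    "checkout": {},
}
service_rate = {"gateway": 50.0, "auth": 40.0, "catalog": 20.0, "checkout": 10.0}

visits = expected_visits(routing, "gateway")
replicas = replicas_needed(visits, arrival_rate=100.0, service_rate=service_rate)
```

Because the visit counts couple the functions, a load change at the entry point shifts the required replicas of every downstream function at once, which is the motivation the abstract gives for scaling functions jointly rather than independently.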
Queueing networks: solutions and applications
During the past two decades queueing network models have proven to be a versatile tool for computer system and computer communication system performance evaluation. This chapter provides a survey of the field with a particular emphasis on applications. We start with a brief historical retrospective which also serves to introduce the major issues and application areas. Formal results for product form queueing networks are reviewed with particular emphasis on the implications for computer systems modeling. Computation algorithms, sensitivity analysis and optimization techniques are among the topics covered. Many of the important applications of queueing networks are not amenable to exact analysis and an (often confusing) array of approximation methods has been developed over the years. A taxonomy of approximation methods is given and used as the basis for surveying the major approximation methods that have been studied. The application of queueing networks to a number of areas is surveyed, including computer system capacity planning, packet switching networks, parallel processing, database systems and availability modeling.
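As one example of the computation algorithms available for product form networks, the sketch below implements exact Mean Value Analysis (MVA) for a closed, single-class network of fixed-rate queueing stations with an optional think time. MVA is a standard algorithm for this class of model; the abstract does not say which algorithms the chapter covers, so its choice here is an assumption, as are the service demands used in the example.

```python
def mean_value_analysis(demands, population, think_time=0.0):
    """Exact MVA for a closed single-class product form network.

    demands[k] is the total service demand (visits x service time) at
    station k. Returns system throughput, per-station residence times,
    and per-station mean queue lengths at the given population.
    """
    queue = [0.0] * len(demands)
    residence = [0.0] * len(demands)
    throughput = 0.0
    for n in range(1, population + 1):
        # Arrival theorem: an arriving customer sees the network in
        # equilibrium with one customer fewer.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        throughput = n / (think_time + sum(residence))
        # Little's law applied at each station.
        queue = [throughput * r for r in residence]
    return throughput, residence, queue


# Hypothetical demands in seconds: CPU 0.05, disk 0.03.
throughput, residence, queue = mean_value_analysis([0.05, 0.03], population=10)
```

Throughput is bounded above by the reciprocal of the largest demand (here 1 / 0.05 = 20 requests per second), so the station with the largest demand is the bottleneck to target in capacity planning.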
Generalised analytic queueing network models. The need, creation, development and validation of mathematical and computational tools for the construction of analytic queueing network models capturing more critical system behaviour.
Modelling is an important technique in the comprehension and
management of complex systems. Queueing network models capture
most relevant information from computer system and network
behaviour. The construction and resolution of these models is
constrained by many factors. Approximations retain detail that is
lost in exact solutions and/or provide results at lower cost than
simulation.
Information at the resource and interactive command level is
gathered with monitors under ULTRIX. Validation studies indicate
central processor service times are highly variable on the
system. More pessimistic predictions assuming this variability
are in part verified by observation.
The utility of the Generalised Exponential (GE) as a
distribution parameterised by mean and variance is explored.
Small networks of GE service centres can be solved exactly using
methods proposed for Generalised Stochastic Petri Nets. For two
centre systems of GE type a new technique simplifying the balance
equations is developed. A very efficient "building block"
is presented for exactly solving two centre systems with service
or transfer blocking, Bernoulli feedback and load dependent rate,
multiple GE servers. In the tandem finite buffer algorithm the
building block illustrates problems encountered modelling high
variability in blocking networks.
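To illustrate how a GE distribution is fixed by its mean and variance, the sketch below uses the form common in the queueing literature, F(t) = 1 - tau * exp(-tau * mu * t) for t >= 0, with mean 1/mu and tau = 2 / (C^2 + 1), where C^2 is the squared coefficient of variation. This form is assumed here and requires C^2 >= 1; the thesis itself may use a different notation. A sample is zero with probability 1 - tau and exponential with rate tau * mu otherwise.

```python
import random


def ge_parameters(mean, variance):
    """Return (tau, rate) of the GE distribution with this mean and variance."""
    scv = variance / mean ** 2  # squared coefficient of variation
    if scv < 1.0:
        raise ValueError("this GE form needs variance >= mean squared")
    tau = 2.0 / (scv + 1.0)
    rate = tau / mean  # rate of the exponential branch, tau * mu
    return tau, rate


def sample_ge(mean, variance, rng):
    """Draw one GE variate: an atom at zero mixed with an exponential."""
    tau, rate = ge_parameters(mean, variance)
    if rng.random() >= tau:
        return 0.0
    return rng.expovariate(rate)


# Hypothetical service time with mean 2.0 and variance 12.0 (C^2 = 3).
rng = random.Random(1)
samples = [sample_ge(2.0, 12.0, rng) for _ in range(100000)]
sample_mean = sum(samples) / len(samples)
sample_var = sum((x - sample_mean) ** 2 for x in samples) / len(samples)
```

The atom at zero can be read as a batch of customers completing service at the same instant, which gives one way to see why high variability is hard to model in networks with blocking.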
A parametric validation study is made of approximations for
single class closed networks of First-Come-First-Served (FCFS)
centres with general service times. The multiserver extension
using the building block is validated. Finally the Maximum
Entropy approximation is extended to FCFS centres with multiple
chains and implemented with computationally efficient
convolution