40 research outputs found
A Game-Theoretic Approach to Strategic Resource Allocation Mechanisms in Edge and Fog Computing
With the rapid growth of the Internet of Things (IoT), cloud-centric application management raises questions about quality of service for real-time applications. Fog and edge computing (FEC) complement the cloud by filling the gap between cloud and IoT. Managing multiple resources across distributed, independently administered FEC nodes is a key challenge in ensuring the quality of the end-user's experience. To improve resource utilisation and system performance, researchers have proposed many fair allocation mechanisms for resource management. Dominant Resource Fairness (DRF), a resource allocation policy for multiple resource types, meets most of the required fair allocation properties. However, DRF is designed for centralised resource allocation and does not consider the effects (or feedbacks) of large-scale distributed environments such as multi-controller software-defined networking (SDN). Nash bargaining from micro-economic theory and competitive equilibrium from equal incomes (CEEI) are well suited to dynamic optimisation problems that 'proportionately' share resources among distributed participants. Although CEEI's decentralised policy guarantees load balancing for performance isolation, it is not fault-proof for computation offloading.
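For reference, DRF's progressive-filling allocation can be sketched in a few lines of Python. The capacities and per-task demands below are the illustrative values from the original DRF example (a cluster of 9 CPUs and 18 GB of memory); this is a minimal sketch, not the thesis's generalised weighted scheme.

```python
# Minimal sketch of DRF progressive filling: repeatedly grant one task to
# the user with the lowest dominant share until a resource saturates.
def drf_allocate(capacity, demands, max_rounds=1000):
    n = len(demands)
    used = [0.0] * len(capacity)   # total resources consumed so far
    tasks = [0] * n                # tasks granted per user

    def dominant_share(i):
        # dominant share = max over resources of (allocated / capacity)
        return max(tasks[i] * demands[i][r] / capacity[r]
                   for r in range(len(capacity)))

    for _ in range(max_rounds):
        i = min(range(n), key=dominant_share)   # progressive filling
        nxt = [used[r] + demands[i][r] for r in range(len(capacity))]
        if any(nxt[r] > capacity[r] for r in range(len(capacity))):
            break                               # a resource is saturated
        used, tasks[i] = nxt, tasks[i] + 1
    return tasks

# 9 CPUs, 18 GB; user A needs <1 CPU, 4 GB> per task, user B <3 CPU, 1 GB>.
print(drf_allocate([9, 18], [[1, 4], [3, 1]]))  # -> [3, 2]
```

Both users end with equal dominant shares (12/18 of memory for A, 6/9 of CPU for B), which is the fairness property DRF targets.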
This thesis proposes a hybrid, fair allocation mechanism for the rejuvenation of decentralised SDN controller deployment. We apply multi-agent reinforcement learning (MARL), robust against adversarial controllers, to enable efficient priority scheduling for FEC. Motivated by software cybernetics and homeostasis, weighted DRF is generalised by applying the principle of feedback (positive and/or negative network effects) from reverse game theory (GT) to design hybrid scheduling schemes for joint multi-resource, multi-task offloading and forwarding in FEC environments.
The first study addresses monotonic scheduling for joint offloading at the federated edge by proposing a truthful (algorithmic) mechanism to neutralise harmful negative and positive distributive bargaining externalities. The IP-DRF scheme is a MARL approach that applies partition form games (PFG) to guarantee second-best Pareto optimality
(SBPO) in the allocation of multiple resources from a deterministic policy, under both population and resource non-monotonicity settings. In the second study, we propose the DFog-DRF scheme to address truthful fog scheduling with bottleneck fairness in fault-prone wireless hierarchical networks, applying constrained coalition formation (CCF) games to implement MARL. The multi-objective optimisation problem of fog throughput maximisation is solved via a constraint dimensionality reduction methodology that uses fairness constraints for efficient placement of gateways and low-level controllers.
For evaluation, we develop an agent-based framework to implement fair allocation policies in distributed data centre environments. Empirically, the deterministic policy of the IP-DRF scheme provides SBPO and reduces average execution and turnaround times by 19% and 11.52%, respectively, compared to the Nash bargaining and CEEI deterministic policies for 57,445 cloudlets in population non-monotonic settings. Task processing cost shows significant improvement (6.89% and 9.03% for fixed and variable pricing) in the resource non-monotonic setting, using 38,000 cloudlets. The DFog-DRF scheme, benchmarked against the asset fair (MIP) policy, shows superior performance (less than 1% difference in time complexity) for up to 30 FEC nodes. Furthermore, empirical results with 210 mobiles and 420 applications demonstrate the efficacy of our hybrid scheduling scheme for hierarchical clustering, considering latency and network usage for throughput maximisation.
Abubakar Tafawa Balewa University, Bauchi (TETFund, Nigeria)
Resource Management and Scheduling for Big Data Applications in Cloud Computing Environments
This chapter presents the software architectures of big data processing platforms. It provides in-depth knowledge of the resource management techniques involved in deploying big data processing systems in cloud environments. It starts from the very basics and gradually introduces the core components of resource management, which we have divided into multiple layers. It covers state-of-the-art practices and research in SLA-based resource management, with a specific focus on job scheduling mechanisms.
Comment: 27 pages, 9 figures
A survey and classification of software-defined storage systems
The exponential growth of digital information is imposing increasing scale and efficiency demands on modern storage infrastructures. As infrastructure complexity increases, so does the difficulty in ensuring quality of service, maintainability, and resource fairness, raising unprecedented performance, scalability, and programmability challenges. Software-Defined Storage (SDS) addresses these challenges by cleanly disentangling control and data flows, easing management, and improving control functionality of conventional storage systems. Despite its momentum in the research community, many aspects of the paradigm are still unclear, undefined, and unexplored, leading to misunderstandings that hamper the research and development of novel SDS technologies. In this article, we present an in-depth study of SDS systems, providing a thorough description and categorization of each plane of functionality. Further, we propose a taxonomy and classification of existing SDS solutions according to different criteria. Finally, we provide key insights about the paradigm and discuss potential future research directions for the field.
This work was financed by the Portuguese funding agency FCT - Fundação para a Ciência e a Tecnologia through national funds, the PhD grant SFRH/BD/146059/2019, the project ThreatAdapt (FCT-FNR/0002/2018), the LASIGE Research Unit (UIDB/00408/2020), and co-funded by FEDER, where applicable.
Operating system support for warehouse-scale computing
Modern applications are increasingly backed by large-scale data centres. Systems software in these data centre environments, however, faces substantial challenges: the lack of uniform resource abstractions makes sharing and resource management inefficient, infrastructure software lacks end-to-end access control mechanisms, and work placement ignores the effects of hardware heterogeneity and workload interference.
In this dissertation, I argue that uniform, clean-slate operating system (OS) abstractions designed to support distributed systems can make data centres more efficient and secure. I present a novel distributed operating system for data centres, focusing on two OS components: the abstractions for resource naming, management and protection, and the scheduling of work to compute resources.
First, I introduce a reference model for a decentralised, distributed data centre OS, based on pervasive distributed objects and inspired by concepts in classic 1980s distributed OSes. Translucent abstractions free users from having to understand implementation details, but enable introspection for performance optimisation. Fine-grained access control is supported by combining storable, communicable identifier capabilities and context-dependent, ephemeral handle capabilities. Finally, multi-phase I/O requests implement optimistically concurrent access to objects while supporting diverse application-level consistency policies.
Second, I present the DIOS operating system, an implementation of my model as an extension to Linux. The DIOS system call API is centred around distributed objects, globally resolvable names, and translucent references that carry context-sensitive object meta-data. I illustrate how these concepts support distributed applications, and evaluate the performance of DIOS in microbenchmarks and a data-intensive MapReduce application. I find that it offers improved, fine-grained isolation of resources, while permitting flexible sharing.
Third, I present the Firmament cluster scheduler, which generalises prior work on scheduling via minimum-cost flow optimisation. Firmament can flexibly express many scheduling policies using pluggable cost models; it makes high-quality placement decisions based on fine-grained information about tasks and resources; and it scales the flow-based scheduling approach to very large clusters. In two case studies, I show that Firmament supports policies that reduce colocation interference between tasks and that it successfully exploits flexibility in the workload to improve the energy efficiency of a heterogeneous cluster. Moreover, my evaluation shows that Firmament scales the minimum-cost flow optimisation to clusters of tens of thousands of machines while still making sub-second placement decisions.
St John's College Supplementary Emolument Fund
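The scheduling-as-optimisation idea can be illustrated on a toy instance. The sketch below is a stdlib-only illustration of the objective: it enumerates feasible task-to-machine assignments and picks the cheapest, whereas Firmament solves the equivalent problem at scale as a minimum-cost flow over a graph of task, aggregator, and machine nodes. The task names, costs, and slot counts are illustrative, not taken from the dissertation.

```python
# Toy version of flow-based placement: minimise total arc cost of assigning
# tasks to machines, subject to per-machine slot capacities. A real scheduler
# solves this as min-cost flow rather than by enumeration.
from itertools import product

tasks = ["t1", "t2"]
machines = ["m1", "m2"]
cost = {("t1", "m1"): 1, ("t1", "m2"): 4,   # e.g. locality, interference
        ("t2", "m1"): 5, ("t2", "m2"): 2}
slots = {"m1": 1, "m2": 1}                   # per-machine task capacity

best = None
for assign in product(machines, repeat=len(tasks)):
    if any(assign.count(m) > slots[m] for m in machines):
        continue                             # respect machine capacities
    total = sum(cost[(t, m)] for t, m in zip(tasks, assign))
    if best is None or total < best[0]:
        best = (total, dict(zip(tasks, assign)))

print(best)  # -> (3, {'t1': 'm1', 't2': 'm2'})
```

In the flow formulation, each task supplies one unit of flow, machine-to-sink arcs carry slot capacities, and the optimal flow selects exactly this cheapest feasible assignment.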
DARP
Effective Resource and Workload Management in Data Centers
The increasing demand for storage, computation, and business continuity has driven the growth of data centers. Managing data centers efficiently is difficult because of the wide variety of data center applications, their ever-changing intensities, and the fact that application performance targets may differ widely. Server virtualization has been a game-changing technology for IT, making it possible to support multiple virtual machines (VMs) simultaneously. This dissertation focuses on how virtualization technologies can be used to develop new tools for maintaining high resource utilization, achieving high application performance, and reducing the cost of data center management.
For multi-tiered applications, bursty workload traffic can significantly degrade performance. This dissertation proposes AWAIT, an admission control algorithm for handling overload conditions in multi-tier web services. AWAIT places requests of accepted sessions on hold and refuses to admit new sessions when the system experiences a sudden workload surge. To meet the service-level objective, AWAIT serves the requests in the blocking queue with high priority. The size of the queue is determined dynamically according to workload burstiness.
Many admission control policies are triggered by instantaneous measurements of system resource usage, e.g., CPU utilization. This dissertation first demonstrates that directly measuring virtual machine resource utilizations with standard tools does not always yield accurate estimates. A directed factor graph (DFG) model is defined to capture the dependencies among multiple types of resources across physical and virtual layers.
Virtualized data centers enable resource sharing among hosted applications to achieve high resource utilization. However, it is difficult to satisfy application SLOs on a shared infrastructure, as application workload patterns change over time. AppRM, an automated management system, not only allocates the right amount of resources to applications to meet their performance targets but also adapts to dynamic workloads using an adaptive model.
Server consolidation is one of the key applications of server virtualization. This dissertation proposes a VM consolidation mechanism, first by extending the fair load balancing scheme to multi-dimensional vector scheduling, and then by using a queueing network model to capture service contention for a particular virtual machine placement.
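The hold-and-prioritise idea behind AWAIT can be sketched minimally. This is an illustrative toy, not the dissertation's algorithm: the fixed queue bound stands in for the burstiness-derived sizing, and the overload signal is supplied by the caller rather than measured.

```python
# Sketch of AWAIT-style admission control: during a surge, requests from
# already-admitted sessions are parked in a bounded queue (served with
# priority later), while brand-new sessions are refused outright.
from collections import deque

class AdmissionController:
    def __init__(self, queue_size):
        self.queue_size = queue_size    # sized from workload burstiness
        self.blocked = deque()          # high-priority blocking queue
        self.sessions = set()           # currently admitted sessions

    def admit(self, session, overloaded):
        if session in self.sessions:    # request from an accepted session
            if overloaded:
                if len(self.blocked) < self.queue_size:
                    self.blocked.append(session)
                    return "held"       # parked, served with priority
                return "dropped"        # queue full
            return "served"
        if overloaded:
            return "refused"            # no new sessions during a surge
        self.sessions.add(session)
        return "served"

ac = AdmissionController(queue_size=1)
ac.admit("s1", overloaded=False)        # s1 admitted under normal load
print(ac.admit("s1", overloaded=True))  # -> held
print(ac.admit("s2", overloaded=True))  # -> refused
```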
Stochastic Model Predictive Control and Machine Learning for the Participation of Virtual Power Plants in Simultaneous Energy Markets
The emergence of distributed energy resources in the electricity system gives rise to new scenarios in which domestic consumers (end-users) can be aggregated to participate in energy markets, acting as prosumers. Every prosumer operates as an individual energy node, with its own renewable generation source, its controllable and non-controllable energy loads, and even its own individual tariffs to trade under. The nodes can form aggregations, which are managed by a system operator.
Participation in energy markets is not trivial for individual prosumers, owing to aspects such as the technical requirements that must be satisfied or the need to trade a minimum volume of energy. These requirements can be met by defining aggregated participations.
In this context, aggregators handle the difficult task of coordinating and stabilizing the prosumers' operations, not only at the individual level but also at the system level, so that the set of energy nodes behaves as a single entity with respect to the market. The system operator can act as a trading and distributing company, or as a trading company only. For this reason, the optimization model must consider not only aggregated tariffs but also individual tariffs, to allow individual billing for each energy node. Each energy node must have the required technical and legal competences, as well as the necessary equipment, to manage its participation in energy markets or to delegate it to the system operator. This aggregation according to business rules, and not only to physical location, is known as a virtual power plant.
The optimization of aggregated participation in the different energy markets requires introducing the concept of dynamic storage virtualization. Every energy node in the system under study therefore has a battery installed to store excess energy. This dynamic virtualization defines logical partitions in the storage system to allow its use for different purposes. As an example, two partitions can be defined: one for aggregated participation in the day-ahead market, and the other for the demand-response program.
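The logical-partition idea can be sketched as follows. The `VirtualBattery` class, its partition names, and the capacities are illustrative inventions for this sketch, not the thesis's implementation.

```python
# Sketch of dynamic storage virtualization: one physical battery split into
# logical partitions (e.g. day-ahead market vs demand-response), with
# partition sizes re-definable at runtime as the context changes.
class VirtualBattery:
    def __init__(self, capacity_kwh, partitions):
        assert sum(partitions.values()) <= capacity_kwh
        self.capacity = capacity_kwh
        self.limits = dict(partitions)          # logical partition sizes
        self.level = {name: 0.0 for name in partitions}

    def charge(self, partition, kwh):
        free = self.limits[partition] - self.level[partition]
        stored = min(kwh, free)                 # clip to the partition limit
        self.level[partition] += stored
        return stored                           # excess is not stored

    def repartition(self, partitions):
        # dynamic virtualization: resize partitions without moving energy
        assert sum(partitions.values()) <= self.capacity
        assert all(partitions[n] >= self.level[n] for n in self.level)
        self.limits = dict(partitions)

bat = VirtualBattery(10.0, {"day_ahead": 6.0, "demand_response": 4.0})
print(bat.charge("day_ahead", 7.0))             # -> 6.0 (clipped)
bat.repartition({"day_ahead": 7.0, "demand_response": 3.0})
print(bat.charge("day_ahead", 2.0))             # -> 1.0
```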
Several criteria must be considered when defining the participation strategy. A risky strategy yields more benefits in terms of trading, but it is also more likely to incur penalties for failing to meet the contract because of uncertainties or operation errors. A conservative strategy, on the other hand, performs worse economically in terms of trading but reduces these potential penalties. The inclusion of dynamic intent profiles makes it possible to place risky bids when the expected forecast error in generation, load, or failures is low, and conservative bids otherwise.
The system operator is the agent who decides how much energy is reserved for trading, how much for energy-node self-consumption, how much for demand-response program participation, and so on. The large number of variables and states makes this problem too complex to solve with classical methods, especially since even slightly wrong decisions can have significant economic consequences in the short term.
The concept of dynamic storage virtualization has been studied and implemented to allow simultaneous participation in multiple energy markets. These simultaneous participations can be optimized for potential profit, potential risk, or a combination of both under more advanced criteria drawn from the system operator's know-how.
Day-ahead bidding algorithms, demand-response participation optimization, and a penalty-reduction operation control algorithm have been developed. A stochastic layer has been defined and implemented to improve the robustness of these inherently forecast-dependent systems. The layer is formulated with chance constraints and can incorporate an intelligent agent based on an encoder-decoder architecture built from neural networks composed of gated recurrent units.
The formulation and implementation keep all the algorithms fully decoupled, with no dependencies among them. Nevertheless, they remain coordinated, because the execution of each one considers both the current scenario and the selected strategy. This enables a wider and better-defined context and a more realistic, accurate situational awareness.
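As a minimal illustration of the chance-constrained layer, assume (purely for this sketch) Gaussian forecast errors: a bid that is deliverable with probability at least 1 − ε shifts the forecast down by the corresponding Gaussian quantile. The forecast numbers are illustrative.

```python
# Chance-constrained day-ahead bid under a Gaussian forecast-error model:
# choose the largest bid b with P(actual >= b) >= 1 - epsilon, i.e.
# b = forecast - z_{1-epsilon} * sigma. Small epsilon -> conservative bid.
from statistics import NormalDist

def chance_constrained_bid(forecast_kwh, sigma_kwh, epsilon):
    z = NormalDist().inv_cdf(1 - epsilon)   # Gaussian quantile
    return forecast_kwh - z * sigma_kwh

risky = chance_constrained_bid(100.0, 10.0, epsilon=0.5)   # bid = forecast
safe = chance_constrained_bid(100.0, 10.0, epsilon=0.05)   # conservative
print(round(risky, 1), round(safe, 1))  # -> 100.0 83.6
```

This is the trade-off the dynamic intent profiles navigate: ε close to 0.5 corresponds to a risky bid, small ε to a penalty-averse one.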
In addition to the relevant simulation runs, the platform has also been tested on a real system composed of 40 energy nodes over one year on the German island of Borkum. This experience allowed very satisfactory conclusions to be drawn about deploying the platform in real environments.
Adaptive Asynchronous Control and Consistency in Distributed Data Exploration Systems
Advances in machine learning and streaming systems provide a backbone to transform vast arrays of raw data into valuable information. Leveraging distributed execution, analysis engines can process this information effectively within an iterative data exploration workflow to solve problems at unprecedented rates. However, with increased input dimensionality, a desire to simultaneously share and isolate information, as well as overlapping and dependent tasks, this process is becoming increasingly difficult to maintain. User interaction derails exploratory progress due to manual oversight of lower-level tasks such as tuning parameters, adjusting filters, and monitoring queries. We identify human-in-the-loop management of data generation and distributed analysis as an inhibiting problem precluding efficient online, iterative data exploration, which causes delays in knowledge discovery and decision making. The flexible and scalable systems implementing the exploration workflow require semi-autonomous methods integrated as architectural support to reduce human involvement. We thus argue that an abstraction layer providing adaptive asynchronous control and consistency management over a series of individual tasks coordinated to achieve a global objective can significantly improve data exploration effectiveness and efficiency. This thesis introduces methodologies that autonomously coordinate distributed execution at a lower level in order to synchronize multiple efforts as part of a common goal. We demonstrate the impact on data exploration through serverless simulation ensemble management and multi-model machine learning, showing improved performance and reduced resource utilization that enable a more productive semi-autonomous exploration workflow. We focus on the specific genres of molecular dynamics and personalized healthcare; however, the contributions are applicable to a wide variety of domains.
Strategic and operational services for workload management in the cloud
In hosting environments such as Infrastructure as a Service (IaaS) clouds, desirable application performance is typically guaranteed through the use of Service Level Agreements (SLAs), which specify minimal fractions of resource capacities that must be allocated by a service provider for unencumbered use by customers to ensure proper operation of their workloads. Most IaaS offerings are presented to customers as fixed-size and fixed-price SLAs that do not match the needs of specific applications well. Furthermore, arbitrary colocation of applications with different SLAs may result in inefficient utilization of hosts' resources, leading to economically undesirable customer behavior.
In this thesis, we propose the design and architecture of a Colocation as a Service (CaaS) framework: a set of strategic and operational services that allow the efficient colocation of customer workloads. CaaS strategic services provide customers the means to specify their application workload using an SLA language that gives them the opportunity and incentive to take advantage of any tolerances they may have regarding the scheduling of their workloads. CaaS operational services provide the information necessary for, and carry out the reconfigurations mandated by, strategic services. We recognize that there may be multiple, yet functionally equivalent, ways to express an SLA. Towards that end, we present a service that allows the provably-safe transformation of SLAs from one form to another for the purpose of achieving more efficient colocation. Our CaaS framework could be incorporated into an IaaS offering by providers, or it could be implemented as a value-added proposition by IaaS resellers. To establish the practicality of such offerings, we present a prototype implementation of our proposed CaaS framework.
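The notion of a provably-safe SLA transformation can be illustrated with a deliberately simplified model: an SLA "(C, T)" promises at least C resource units in every aligned period of length T. Under that model, reserving the same rate over a shorter, evenly dividing period subsumes the original SLA, because each original window contains T/T' full shorter windows. Both the model and the `subsumes` check below are illustrative assumptions for this sketch, not the thesis's actual SLA language.

```python
# Sketch of a safe SLA transformation check for the toy model
# "(C, T): at least C units in every aligned period T".
def subsumes(sla_a, sla_b):
    """True if honouring sla_a = (C_a, T_a) also honours sla_b = (C_b, T_b),
    in the simple case where T_a evenly divides T_b."""
    (c_a, t_a), (c_b, t_b) = sla_a, sla_b
    if t_b % t_a != 0:
        return False               # windows don't align; can't conclude
    # t_b // t_a full windows of sla_a fit inside one window of sla_b
    return c_a * (t_b // t_a) >= c_b

original = (100, 10)     # 100 units every 10 time units
transformed = (50, 5)    # same rate, finer granularity: safe substitute
print(subsumes(transformed, original))  # -> True
print(subsumes((40, 5), original))      # -> False
```

A transformation toward finer granularity like this can make colocation easier: smoother reservations pack better onto shared hosts without violating the customer's original guarantee.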