SPACE4AI-R: a Runtime Management Tool for AI Applications Component Placement and Resource Scaling in Computing Continua
The recent migration towards the Internet of Things has driven the rise of the Computing Continuum paradigm, where Edge and Cloud resources coordinate to support the execution of Artificial Intelligence (AI) applications, becoming the foundation of use cases spanning from predictive maintenance to machine vision and healthcare. This generates a fragmented scenario where computing and storage power are distributed among multiple devices with highly heterogeneous capacities. The runtime management of AI applications executed in the Computing Continuum is challenging and requires ad-hoc solutions. We propose SPACE4AI-R, which combines Random Search and Stochastic Local Search algorithms to cope with workload fluctuations by identifying the minimum-cost reconfiguration of the initial production deployment, while providing performance guarantees across heterogeneous resources including Edge devices and servers, Cloud GPU-based Virtual Machines, and Function-as-a-Service solutions. Experimental results prove the efficacy of our tool, yielding up to 60% cost reductions against a static design-time placement, with a maximum execution time under 1.5 s in the most complex scenarios.
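The abstract does not detail the search loop itself; as a rough illustration, a stochastic local search for a minimum-cost reconfiguration could look like the sketch below. The `cost`, `feasible`, and `neighbors` callables are invented placeholders for the tool's internal cost and performance models, not the SPACE4AI-R API.

```python
import random

def stochastic_local_search(initial, cost, feasible, neighbors, iters=1000):
    """Repeatedly try a random neighboring placement; accept a move only
    when the new placement meets the performance constraints and costs less."""
    best, best_cost = initial, cost(initial)
    for _ in range(iters):
        candidate = random.choice(neighbors(best))  # random reconfiguration move
        if feasible(candidate) and cost(candidate) < best_cost:
            best, best_cost = candidate, cost(candidate)
    return best, best_cost
```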
Generalized Nash equilibria for SaaS/PaaS Clouds
Cloud computing is an emerging technology that allows users to access computing resources on a pay-per-use basis. The main challenges in this area are efficient performance management and the minimization of energy costs. In this paper, we model the service provisioning problem of Cloud Platform-as-a-Service systems as a Generalized Nash Equilibrium Problem and show that a potential function for the game exists. Moreover, we prove that the social optimum problem is convex and derive some properties of social optima from the corresponding Karush-Kuhn-Tucker system. Next, we propose a distributed solution algorithm based on best-response dynamics and prove its convergence to generalized Nash equilibria. Finally, we numerically evaluate equilibria in terms of their efficiency with respect to the social optimum of the Cloud by varying the initial solution of our algorithm. Numerical results show that our algorithm is scalable and very efficient, and can thus be adopted for the runtime management of very large-scale systems.
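As a sketch of how such a distributed algorithm could proceed, the loop below iterates best-response dynamics until no player changes strategy, which is the fixed-point condition characterizing a (generalized) Nash equilibrium. The `best_response` callable is a placeholder for each provider's local optimization, which is not reproduced here.

```python
def best_response_dynamics(profile, best_response, max_rounds=100):
    """Iterate over players, letting each switch to a best response against
    the current strategies of the others; stop at a fixed point."""
    for _ in range(max_rounds):
        changed = False
        for i in range(len(profile)):
            br = best_response(i, profile)  # player i's optimal reply
            if br != profile[i]:
                profile[i] = br
                changed = True
        if not changed:  # no player can improve unilaterally: equilibrium
            break
    return profile
```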
MALIBOO: When Machine Learning meets Bayesian Optimization
Bayesian Optimization (BO) is an efficient method for finding optimal cloud computing configurations for several types of applications. Machine Learning (ML) methods, on the other hand, can provide useful knowledge about the application at hand thanks to their predictive capabilities. In this paper, we propose a hybrid algorithm that is based on BO and integrates elements from ML techniques to find the optimal configuration of time-constrained recurring jobs executed in cloud environments. The algorithm is tested on edge computing and Apache Spark big data applications. The results we achieve show that this algorithm reduces the number of infeasible executions by a factor of up to 2-3 compared with state-of-the-art techniques.
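A rough sketch of such a hybrid loop is shown below: a Gaussian-process surrogate drives a standard BO acquisition, while an ML classifier trained on past runs down-weights configurations predicted to violate the time constraint. The objective, feasibility check, acquisition, and penalty weight are illustrative assumptions, not the actual MALIBOO algorithm.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.ensemble import RandomForestClassifier

def hybrid_bo(objective, meets_deadline, candidates, n_init=5, n_iter=20):
    """candidates: 2D array of configurations; objective/meets_deadline
    stand in for running the job and checking its time constraint."""
    rng = np.random.default_rng(0)
    X = candidates[rng.choice(len(candidates), n_init, replace=False)]
    y = np.array([objective(x) for x in X])
    ok = np.array([meets_deadline(x) for x in X])
    for _ in range(n_iter):
        gp = GaussianProcessRegressor().fit(X, y)
        mu, sigma = gp.predict(candidates, return_std=True)
        if ok.any() and (~ok).any():  # need both classes to train the classifier
            p_feas = RandomForestClassifier().fit(X, ok).predict_proba(candidates)[:, 1]
        else:
            p_feas = np.ones(len(candidates))
        score = mu - 1.96 * sigma + 1e3 * (1.0 - p_feas)  # LCB + infeasibility penalty
        x_next = candidates[int(np.argmin(score))]
        X = np.vstack([X, x_next])
        y = np.append(y, objective(x_next))
        ok = np.append(ok, meets_deadline(x_next))
    # best configuration among the feasible evaluations, if any exist
    return X[ok][int(np.argmin(y[ok]))] if ok.any() else X[int(np.argmin(y))]
```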
Bayesian optimization with machine learning for big data applications in the cloud
Bayesian Optimization is a promising method for efficiently finding optimal cloud computing configurations for big data applications. Machine Learning methods can provide useful knowledge about the application at hand thanks to their predictive capabilities. In this paper, we propose a hybrid algorithm that is based on Bayesian Optimization and integrates elements from Machine Learning techniques to tackle time-constrained optimization problems in a cloud computing setting.
A Random Greedy based Design Time Tool for AI Applications Component Placement and Resource Selection in Computing Continua
Artificial Intelligence (AI) and Deep Learning (DL) are pervasive today, with applications spanning from personal assistants to healthcare. Nowadays, the accelerated migration towards mobile computing and the Internet of Things, where a huge amount of data is generated by widespread end devices, is driving the rise of the edge computing paradigm, where computing resources are distributed among devices with highly heterogeneous capacities. In this fragmented scenario, efficient component placement and resource allocation algorithms are crucial to best orchestrate the computing continuum resources. In this paper, we propose a tool to effectively address the component placement problem for AI applications at design time. Through a randomized greedy algorithm, our approach identifies a minimum-cost placement that provides performance guarantees across heterogeneous resources, including edge devices, cloud GPU-based Virtual Machines, and Function-as-a-Service solutions. Finally, we compare the randomized greedy method with the HyperOpt framework and demonstrate that our proposed approach converges to a near-optimal solution much faster, especially in large-scale systems.
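A minimal sketch of a randomized greedy placement under assumed cost and performance models is given below; resources are represented as dicts with a `cost` field, and `meets_response_time` is a stand-in for the performance check. This illustrates the general idea, not the tool's exact algorithm.

```python
import random

def random_greedy(components, resources, meets_response_time, restarts=100):
    """Repeat from scratch: visit components in random order, greedily
    assigning each to its cheapest feasible resource; keep the best run."""
    best, best_cost = None, float("inf")
    for _ in range(restarts):
        placement, total = {}, 0.0
        for comp in random.sample(components, len(components)):
            # greedy step: cheapest resource that still meets the constraints
            options = sorted(resources[comp], key=lambda r: r["cost"])
            choice = next((r for r in options
                           if meets_response_time(placement, comp, r)), None)
            if choice is None:
                break  # no feasible resource for this component: restart
            placement[comp] = choice
            total += choice["cost"]
        else:  # a full placement was built
            if total < best_cost:
                best, best_cost = placement, total
    return best, best_cost
```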
A Service-Based Framework for Flexible Business Processes
This article describes a framework for the design and enactment of flexible and adaptive business processes. It combines design-time and runtime mechanisms to offer a single integrated solution. The design-time environment supports the specification of process-driven Web applications with Quality of Service (QoS) constraints and monitoring annotations. The runtime environment identifies the actual services from the QoS perspective, oversees the execution through monitoring, and reacts to failures and infringements of QoS constraints. The article also discusses these issues on a proof-of-concept application developed for an industrial supply chain scenario.
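As a toy illustration of the monitor-and-react cycle described here, the loop below checks each bound service's measured QoS against its constraint and triggers a rebinding when the constraint is infringed; all names are invented for the example and do not reflect the framework's API.

```python
def monitor_and_react(bindings, measure, rebind):
    """bindings maps each service to its QoS constraint (e.g., a latency
    bound); measure and rebind are placeholders for the framework's
    monitoring probe and service re-selection step."""
    for service, max_latency in bindings.items():
        if measure(service) > max_latency:  # QoS constraint infringed
            rebind(service)                 # react: switch to a substitute
```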
Optimal Map Reduce Job Capacity Allocation in Cloud Systems
We are entering a Big Data world. Many sectors of our economy are now guided by data-driven decision processes. Big Data and business intelligence applications are facilitated by the MapReduce programming model, while, at the infrastructural layer, cloud computing provides flexible and cost-effective solutions for allocating large clusters on demand. Capacity allocation in such systems is a key challenge: it must guarantee performance for MapReduce jobs while minimizing cloud resource costs. The contribution of this paper is twofold: (i) we provide new upper and lower bounds for MapReduce job execution time in shared Hadoop clusters; (ii) we formulate a linear programming model able to minimize cloud resource costs and job rejection penalties for the execution of jobs of multiple classes with (soft) deadline guarantees. Simulation results show that the execution time of MapReduce jobs falls within 14% of our upper bound on average. Moreover, numerical analyses demonstrate that our method is able to determine the global optimal solution of the linear problem for systems including up to 1,000 user classes in less than 0.5 seconds.
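To make the shape of such a formulation concrete, below is a small, self-contained LP in the same spirit, with made-up coefficients: choose containers x_i per job class and rejection fractions r_i, minimizing container cost plus rejection penalties, where serving class i fully requires a deadline-derived minimum capacity m_i. This is an illustrative toy, not the paper's model.

```python
import numpy as np
from scipy.optimize import linprog

cost = np.array([1.0, 1.0, 1.0])        # cost per container, per class
penalty = np.array([50.0, 80.0, 30.0])  # penalty for rejecting a class entirely
m = np.array([10.0, 25.0, 8.0])         # containers needed to meet each deadline
cluster = 40.0                          # total containers available

# Variables: [x_0, x_1, x_2, r_0, r_1, r_2] with x_i >= m_i * (1 - r_i),
# i.e. -x_i - m_i * r_i <= -m_i, plus sum(x) <= cluster.
c = np.concatenate([cost, penalty])
A_ub = np.block([[-np.eye(3), -np.diag(m)],
                 [np.ones((1, 3)), np.zeros((1, 3))]])
b_ub = np.concatenate([-m, [cluster]])
res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              bounds=[(0, None)] * 3 + [(0, 1)] * 3)
print("containers:", res.x[:3], "rejected fractions:", res.x[3:])
```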
Runtime Management of Artificial Intelligence Applications for Smart Eyewears
Artificial Intelligence (AI) applications are gaining popularity as they seamlessly integrate into end-user devices, enhancing the quality of life. In recent years, there has been a growing focus on designing Smart EyeWear (SEW) that can optimize user experiences based on specific usage domains. However, SEWs face limitations in computational capacity and battery life. This paper investigates SEWs and proposes an algorithm to minimize energy consumption and 5G connection costs while ensuring high Quality of Experience. To achieve this, a management software based on Q-learning offloads some Deep Neural Network (DNN) computations to the user's smartphone and/or the cloud, leveraging the possibility to partition the DNNs. The performance evaluation considers variability in 5G and WiFi bandwidth as well as in cloud latency. Results indicate execution-time violations below 14%, demonstrating that the approach is promising for efficient resource allocation and user satisfaction.
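A minimal tabular Q-learning sketch of the offloading decision is shown below: given an observed network/battery state, pick where to run the next DNN partition. The state encoding, reward shaping, and hyperparameters are assumptions for illustration, not the paper's actual management software.

```python
import random
from collections import defaultdict

ACTIONS = ["eyewear", "smartphone", "cloud"]       # where to run the next partition
Q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1                  # learning rate, discount, exploration

def choose(state):
    """Epsilon-greedy action selection over the offloading targets."""
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(Q[state], key=Q[state].get)

def update(state, action, reward, next_state):
    """Standard Q-learning update; the reward would combine (negative)
    energy use, 5G connection cost, and execution-time violations."""
    target = reward + GAMMA * max(Q[next_state].values())
    Q[state][action] += ALPHA * (target - Q[state][action])
```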
Weight filtration on the cohomology of complex analytic spaces
We extend Deligne's weight filtration to the integer cohomology of complex analytic spaces (endowed with an equivalence class of compactifications). In general, the weight filtration that we obtain is not part of a mixed Hodge structure. Our purely geometric proof is based on cubical descent for resolution of singularities and Poincaré-Verdier duality. Using similar techniques, we introduce the singularity filtration on the cohomology of compactificable analytic spaces. This is a new and natural analytic invariant which does not depend on the equivalence class of compactifications and is related to the weight filtration.
An optimization framework for the capacity allocation and admission control of MapReduce jobs in cloud systems
Nowadays, we live in a Big Data world, and many sectors of our economy are guided by data-driven decision processes. Big Data and Business Intelligence applications are facilitated by the MapReduce programming model, while, at the infrastructural layer, cloud computing provides flexible and cost-effective solutions to provision large clusters on demand. Capacity allocation in such systems, understood as the problem of providing computational power to support concurrent MapReduce applications in a cost-effective fashion, represents a challenge of paramount importance. In this paper, we lay the foundation for a solution implementing admission control and capacity allocation for MapReduce jobs with a priori deadline guarantees. In particular, we target shared Hadoop 2.x clusters supporting batch and/or interactive jobs. We formulate a linear programming model able to minimize cloud resource costs and rejection penalties for the execution of jobs belonging to multiple classes with deadline guarantees. Scalability analyses demonstrate that the proposed method is able to determine the global optimal solution of the linear problem for systems including up to 10,000 classes in less than 1 s.
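For contrast with the continuous rejection fractions sketched earlier, the toy model below expresses admission control with explicit binary accept/reject decisions using PuLP (assumed installed). The numbers are example data, and the paper's own model is a pure LP, so this small MILP is an illustrative variant rather than the actual formulation.

```python
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary

m = [10, 25, 8]     # containers required to meet each class's deadline
pen = [50, 80, 30]  # penalty paid if a class is rejected
C = 40              # cluster capacity

prob = LpProblem("admission_control", LpMinimize)
x = [LpVariable(f"x{i}", lowBound=0) for i in range(3)]    # containers per class
a = [LpVariable(f"a{i}", cat=LpBinary) for i in range(3)]  # 1 = admit class i
prob += lpSum(x) + lpSum(pen[i] * (1 - a[i]) for i in range(3))
for i in range(3):
    prob += x[i] >= m[i] * a[i]  # admitted classes get deadline-sized capacity
prob += lpSum(x) <= C            # do not exceed the cluster
prob.solve()
print([v.value() for v in a], [v.value() for v in x])
```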