1,898 research outputs found

    Optimizing latency and reliability of pipeline workflow applications

Mapping applications onto heterogeneous platforms is a difficult challenge, even for simple application patterns such as pipeline graphs. The problem is even more complex when processors are subject to failure during the execution of the application. In this paper, we study the complexity of a bi-criteria mapping which aims at optimizing the latency (i.e., the response time) and the reliability (i.e., the probability that the computation will be successful) of the application. Latency is minimized by using faster processors, while reliability is increased by replicating computations on a set of processors. However, replication increases latency (additional communications, slower processors). The application fails to be executed only if all the processors fail during execution. While simple polynomial algorithms can be found for fully homogeneous platforms, the problem becomes NP-hard when tackling heterogeneous platforms. This is yet another illustration of the additional complexity added by heterogeneity.
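As a minimal sketch of the trade-off this abstract describes (not the paper's actual model), suppose a pipeline stage is replicated on a set of processors, the stage fails only if every replica fails, and the slowest replica dictates the stage's latency; the per-processor speeds and failure probabilities below are invented for illustration:

```python
from math import prod

def stage_reliability(failure_probs):
    """A replicated stage fails only if every replica fails."""
    return 1.0 - prod(failure_probs)

def stage_latency(work, speeds):
    """All replicas compute; the slowest replica dictates stage latency."""
    return work / min(speeds)

# One pipeline stage, 10 units of work, replicated on three processors.
speeds = [4.0, 2.0, 1.0]      # heterogeneous processor speeds (hypothetical)
fails  = [0.05, 0.10, 0.20]   # per-processor failure probabilities (hypothetical)

print(stage_latency(10.0, speeds))      # 10.0  -- the slow replica hurts latency
print(stage_reliability(fails))         # 0.999 -- but reliability improves
print(stage_latency(10.0, speeds[:1]))  # 2.5   -- no replication: fast but fragile
print(stage_reliability(fails[:1]))     # 0.95
```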

    A Survey of Pipelined Workflow Scheduling: Models and Algorithms

A large class of applications needs to execute the same workflow on different data sets of identical size. Efficient execution of such applications necessitates intelligent distribution of the application components and tasks on a parallel machine, and the execution can be orchestrated by utilizing task-, data-, pipelined-, and/or replicated-parallelism. The scheduling problem that encompasses all of these techniques is called pipelined workflow scheduling, and it has been widely studied in the last decade. Multiple models and algorithms have flourished to tackle various programming paradigms, constraints, machine behaviors, or optimization goals. This paper surveys the field by summing up and structuring known results and approaches.
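To make the vocabulary concrete (a hedged illustration, not any specific algorithm from the survey), the sketch below evaluates one interval mapping of a linear chain: pipelined parallelism lets consecutive data sets overlap, so the period (inverse throughput) is set by the most loaded processor, while latency sums the stage times:

```python
def evaluate_interval_mapping(stage_work, intervals, speeds):
    """stage_work: work per pipeline stage; intervals: list of (lo, hi)
    stage-index ranges, one per processor; speeds: processor speeds."""
    loads = [sum(stage_work[lo:hi]) / s for (lo, hi), s in zip(intervals, speeds)]
    period = max(loads)    # a new data set enters every `period` time units
    latency = sum(loads)   # one data set traverses every processor in turn
    return period, latency

work = [3.0, 5.0, 2.0, 4.0]
# Processor 0 runs stages 0-1, processor 1 runs stages 2-3.
print(evaluate_interval_mapping(work, [(0, 2), (2, 4)], [2.0, 2.0]))  # (4.0, 7.0)
```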

    Optimization and Management of Large-scale Scientific Workflows in Heterogeneous Network Environments: From Theory to Practice

Next-generation computation-intensive scientific applications feature large-scale computing workflows of various structures, which can be as simple as linear pipelines or as complex as Directed Acyclic Graphs (DAGs). Supporting such computing workflows and optimizing their end-to-end network performance are crucial to the success of scientific collaborations that require fast system response, smooth data flow, and reliable distributed operation. We construct analytical cost models and formulate a class of workflow mapping problems with different mapping objectives and network constraints. The difficulty of these mapping problems essentially arises from the topological matching nature in the spatial domain, which is further compounded by the resource-sharing complexity in the temporal dimension. We provide detailed computational complexity analysis and design optimal or heuristic algorithms with rigorous correctness proofs or performance analysis. We decentralize the proposed mapping algorithms and also investigate these optimization problems in unreliable network environments for fault tolerance. To examine and evaluate the performance of the workflow mapping algorithms before actual deployment and implementation, we implement a simulation program that simulates the execution dynamics of distributed computing workflows. We also develop a scientific workflow automation and management platform based on an existing workflow engine for experimentation in real environments. The performance superiority of the proposed mapping solutions is illustrated by extensive simulation-based comparisons with existing algorithms and further verified by large-scale experiments on real-life scientific workflow applications through effective system implementation and deployment in real networks.
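For intuition about the DAG-structured workflows mentioned above (a generic sketch under simplifying assumptions, not the dissertation's cost model), end-to-end latency under a given mapping reduces to a longest-path computation over task finish times, with edge weights for data transfers:

```python
from collections import defaultdict

def end_to_end_latency(tasks, edges):
    """tasks: {name: execution time on its mapped node}, given in
    topological order; edges: (u, v, transfer time) over a DAG."""
    finish = {t: tasks[t] for t in tasks}   # entry tasks start at time 0
    preds = defaultdict(list)
    for u, v, c in edges:
        preds[v].append((u, c))
    for v in tasks:                         # relies on topological ordering
        if preds[v]:
            finish[v] = tasks[v] + max(finish[u] + c for u, c in preds[v])
    return max(finish.values())

tasks = {"a": 2.0, "b": 3.0, "c": 1.0, "d": 2.0}
edges = [("a", "b", 0.5), ("a", "c", 1.0), ("b", "d", 0.5), ("c", "d", 0.5)]
print(end_to_end_latency(tasks, edges))  # 8.0, via the path a -> b -> d
```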

    A Decision Framework for Allocation of Constellation-Scale Mission Compute Functionality to Ground and Edge Computing

This paper explores constellation-scale architectural trades, highlights dominant factors, and presents a decision framework for migrating or sharing mission compute functionality between ground and space segments. Over recent decades, sophisticated logic has been developed for scheduling and tasking of space assets, as well as processing and exploitation of satellite data, and this software has been traditionally hosted in ground computing. Current efforts exist to migrate this software to ground cloud-based services. The option and motivation to host some of this logic “at the edge” within the space segment has arisen as space assets are proliferated, are interlinked via transport networks, and are networked with multi-domain assets. Examples include edge-based Battle Management, Command, Control, and Communications (BMC3) being developed by the Space Development Agency and future onboard computing for commercial constellations. Edge computing pushes workload, computation, and storage closer to data sources and onto devices at the edge of the network. Potential benefits of edge computing include increased speed of response, system reliability, robustness to disrupted networks, and data security. Yet, space-based edge nodes have disadvantages including power and mass limitations, constant physical motion, difficulty of physical access, and potential vulnerability to attacks. This paper presents a structured decision framework with justifying rationale to provide insights and begin to address a key question of what mission compute functionality should be allocated to the space-based edge, and under what mission or architectural conditions, versus to conventional ground-based systems. The challenge is to identify the Pareto-dominant trades and impacts to mission success. This framework will not exhaustively address all missions, architectures, and CONOPs; however, it is intended to provide generalized guidelines and heuristics to support architectural decision-making. Via effects-based simulation and analysis, a set of hypotheses about ground- and edge-based architectures is evaluated and summarized along with prior research. Results for a set of key metrics and decision drivers show that edge computing for specific functionality is quantitatively valuable, especially for interoperable, multi-domain, collaborative assets.
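One dominant factor the paper weighs, speed of response, can be caricatured with a back-of-the-envelope comparison (all numbers invented for illustration, not results from the paper): the space-based edge node computes more slowly than a ground system, but avoids shipping raw data over the space-to-ground link:

```python
def response_time_ground(data_mb, downlink_mbps, compute_s, uplink_mbps, result_mb):
    """Ship sensor data to ground, process there, return tasking/results."""
    return data_mb * 8 / downlink_mbps + compute_s + result_mb * 8 / uplink_mbps

def response_time_edge(compute_s_edge):
    """Process onboard; only the (small) result may move later."""
    return compute_s_edge

# Hypothetical numbers: 200 MB of imagery, 100 Mbps links, ground
# compute 5x faster than the power-limited space-based edge node.
ground = response_time_ground(200, 100, 2.0, 100, 1)
edge   = response_time_edge(2.0 * 5)
print(f"ground: {ground:.1f} s, edge: {edge:.1f} s")  # ground: 18.1 s, edge: 10.0 s
```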

    Bi-criteria Pipeline Mappings for Parallel Image Processing

Mapping workflow applications onto parallel platforms is a challenging problem, even for simple application patterns such as pipeline graphs. Several antagonistic criteria should be optimized, such as throughput and latency (or a combination). Typical applications include digital image processing, where images are processed in steady-state mode. In this paper, we study the mapping of a particular image processing application, JPEG encoding. Mapping pipelined JPEG encoding onto parallel platforms is useful, for instance, for encoding Motion JPEG images. As the bi-criteria mapping problem is NP-complete, we concentrate on the evaluation and performance of polynomial heuristics.
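A toy illustration of why the two criteria are antagonistic (not one of the paper's heuristics): enumerating all interval mappings of a short chain onto identical processors and keeping only the Pareto-optimal (period, latency) points shows that grouping stages avoids communication and lowers latency, while splitting them lowers the period; stage and communication costs below are made up:

```python
from itertools import combinations

def pareto_interval_mappings(work, comm, n_procs, speed=1.0):
    """Enumerate all ways to cut the chain into <= n_procs intervals.
    work[i]: stage cost; comm[i]: cost of the edge between stages i and
    i+1, paid only when the two stages land on different processors."""
    n = len(work)
    points = []
    for k in range(1, n_procs + 1):
        for cuts in combinations(range(1, n), k - 1):
            bounds = [0, *cuts, n]
            loads = [sum(work[lo:hi]) / speed
                     for lo, hi in zip(bounds, bounds[1:])]
            period = max(loads)
            latency = sum(loads) + sum(comm[c - 1] for c in cuts)
            points.append((period, latency, cuts))
    # Keep Pareto-optimal points: no other point is better on both criteria.
    return [p for p in points
            if not any(q[0] <= p[0] and q[1] < p[1] or q[0] < p[0] and q[1] <= p[1]
                       for q in points)]

work, comm = [4, 3, 2, 5], [1, 1, 1]
for period, latency, cuts in sorted(pareto_interval_mappings(work, comm, 3)):
    print(f"cuts={cuts} period={period} latency={latency}")
# cuts=(1, 3) period=5 latency=16
# cuts=(2,)   period=7 latency=15
# cuts=()     period=14 latency=14
```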

    Resource provisioning and scheduling algorithms for hybrid workflows in edge cloud computing

In recent years, Internet of Things (IoT) technology has been involved in a wide range of application domains to provide real-time monitoring, tracking, and analysis services. The worldwide number of IoT-connected devices is projected to increase to 43 billion by 2023, and IoT technologies are expected to be engaged in 25% of the business sector. Latency-sensitive applications such as intelligent video surveillance, smart homes, autonomous vehicles, and augmented reality are all emergent research directions in industry and academia. These applications require connecting large numbers of sensing devices to attain the desired level of service quality for decision accuracy in a timely manner. Moreover, continuous data streams impose processing large amounts of data, which adds a huge overhead on computing and network resources. Thus, latency-sensitive and resource-intensive applications introduce new challenges for current computing models, i.e., batch and stream. In this thesis, we refer to the integrated application model of stream and batch applications as a hybrid workflow model. The main challenge of the hybrid model is achieving the quality of service (QoS) requirements of the two computation systems. This thesis provides a systematic and detailed modeling of hybrid workflows, describing the internal structure of each application type for purposes of resource estimation, system tuning, and cost modeling. For optimizing the execution of hybrid workflows, this thesis proposes algorithms, techniques, and frameworks for resource provisioning and task scheduling on various computing systems, including cloud, edge cloud, and cooperative edge cloud. Overall, the experimental results in this thesis provide strong evidence for the proposed understanding and vision of integrating stream and batch applications, and for how edge computing and other emergent technologies such as 5G networks and IoT will contribute to more sophisticated and intelligent solutions in many disciplines for a safer, more secure, healthy, smart, and sustainable society.
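A loose sketch of what modeling a hybrid workflow for resource provisioning can look like (invented names and formulas, not the thesis's actual formulation): the stream side needs enough sustained capacity to keep up with the arrival rate, while the batch side needs enough aggregate capacity to meet a deadline:

```python
from dataclasses import dataclass

@dataclass
class StreamStage:
    arrival_rate: float   # items per second
    cost_per_item: float  # CPU-seconds per item

@dataclass
class BatchJob:
    total_work: float     # CPU-seconds
    deadline: float       # seconds

def cores_needed(stream: StreamStage, batch: BatchJob) -> float:
    # Stream: sustained utilization must cover the arrival rate.
    stream_cores = stream.arrival_rate * stream.cost_per_item
    # Batch: aggregate capacity must finish all work before the deadline.
    batch_cores = batch.total_work / batch.deadline
    return stream_cores + batch_cores

print(cores_needed(StreamStage(100.0, 0.02), BatchJob(3600.0, 600.0)))  # 8.0
```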

    Scheduling linear chain streaming applications on heterogeneous systems with failures

In this paper, we study the problem of optimizing the throughput of streaming applications for heterogeneous platforms subject to failures. Applications are linear graphs of tasks (pipelines), with a type associated to each task. The challenge is to map each task onto one machine of a target platform, each machine having to be specialized to process only one task type, given that every machine is able to process all the types before being specialized, in order to avoid costly setups. The objective is to maximize the throughput, i.e., the rate at which jobs can be processed when accounting for failures. Each instance can thus be performed by any machine specialized in its type, and the workload of the system can be shared among a set of specialized machines. For identical machines, we prove that an optimal solution can be computed in polynomial time. However, the problem becomes NP-hard when two machines may compute the same task type at different speeds. Several polynomial time heuristics are designed for the most realistic specialized settings. Simulation results assess their efficiency, showing that the best heuristics obtain a good throughput, much better than the throughput obtained with a random mapping. Moreover, the throughput is close to the optimal solution in the particular cases where the optimal throughput can be computed.
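To make the throughput objective concrete (a simplified stand-in for the paper's model, with failures folded into each machine's effective rate), the sketch below pools the specialized machines serving each task type; the system throughput is bound by the bottleneck type:

```python
def system_throughput(type_work, assignment, speeds, success):
    """type_work[t]: work of type t in one job; assignment[m]: the task
    type machine m is specialized in; speeds[m], success[m]: machine m's
    speed and probability of not failing a unit of work (a simplification)."""
    capacity = {t: 0.0 for t in type_work}
    for m, t in enumerate(assignment):
        capacity[t] += speeds[m] * success[m]   # effective pooled rate
    # Jobs per time unit: the scarcest type relative to its demand binds.
    return min(capacity[t] / w for t, w in type_work.items())

type_work = {"filter": 2.0, "encode": 6.0}          # hypothetical job profile
assignment = ["filter", "encode", "encode"]          # one specialization per machine
print(system_throughput(type_work, assignment,
                        [3.0, 4.0, 4.0], [0.9, 0.95, 0.95]))  # ~1.27 jobs/unit time
```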

    Recent Advances in Machine Learning for Network Automation in the O-RAN

The evolution of network technologies has witnessed a paradigm shift toward open and intelligent networks, with the Open Radio Access Network (O-RAN) architecture emerging as a promising solution. O-RAN introduces disaggregation and virtualization, enabling network operators to deploy multi-vendor and interoperable solutions. However, managing and automating the complex O-RAN ecosystem presents numerous challenges. To address this, machine learning (ML) techniques have gained considerable attention in recent years, offering promising avenues for network automation in O-RAN. This paper presents a comprehensive survey of the current research efforts on network automation using ML in O-RAN. We begin by providing an overview of the O-RAN architecture and its key components, highlighting the need for automation. Subsequently, we delve into O-RAN support for ML techniques. The survey then explores challenges in network automation using ML within the O-RAN environment, followed by the existing research studies discussing the application of ML algorithms and frameworks for network automation in O-RAN. The survey further discusses research opportunities by identifying important aspects where ML techniques can be beneficial.