163 research outputs found

    Using simple PID-inspired controllers for online resilient resource management of distributed scientific workflows

    Get PDF
    Scientific workflows have become mainstream for conducting large-scale scientific research. As a result, many workflow applications and Workflow Management Systems (WMSs) have been developed as part of the cyberinfrastructure to allow scientists to execute their applications seamlessly on a range of distributed platforms. Although the scientific community has addressed this challenge from both theoretical and practical approaches, failure prediction, detection, and recovery still raise many research questions. In this paper, we propose an approach inspired by the control theory developed as part of autonomic computing to predict failures before they happen, and mitigated them when possible. The proposed approach is inspired on the proportional–integral–derivative controller (PID controller) control loop mechanism, which is widely used in industrial control systems, where the controller will react to adjust its output to mitigate faults. PID controllers aim to detect the possibility of a non-steady state far enough in advance so that an action can be performed to prevent it from happening. To demonstrate the feasibility of the approach, we tackle two common execution faults of large scale data-intensive workflows—data storage overload and memory overflow. We developed a simulator, which implements and evaluates simple standalone PID-inspired controllers to autonomously manage data and memory usage of a data-intensive bioinformatics workflow that consumes/produces over 4.4 TB of data, and requires over 24 TB of memory to run all tasks concurrently. Experimental results obtained via simulation indicate that workflow executions may significantly benefit from the controller-inspired approach, in particular under online and unknown conditions. Simulation results show that nearly-optimal executions (slowdown of 1.01) can be attained when using our proposed method, and faults are detected and mitigated far in advance of their occurrence

    Modern software cybernetics: new trends

    Get PDF
    Software cybernetics research is to apply a variety of techniques from cybernetics research to software engineering research. For more than fifteen years since 2001, there has been a dramatic increase in work relating to software cybernetics. From cybernetics viewpoint, the work is mainly on the first-order level, namely, the software under observation and control. Beyond the first-order cybernetics, the software, developers/users, and running environments influence each other and thus create feedback to form more complicated systems. We classify software cybernetics as Software Cybernetics I based on the first-order cybernetics, and as Software Cybernetics II based on the higher order cybernetics. This paper provides a review of the literature on software cybernetics, particularly focusing on the transition from Software Cybernetics I to Software Cybernetics II. The results of the survey indicate that some new research areas such as Internet of Things, big data, cloud computing, cyber-physical systems, and even creative computing are related to Software Cybernetics II. The paper identifies the relationships between the techniques of Software Cybernetics II applied and the new research areas to which they have been applied, formulates research problems and challenges of software cybernetics with the application of principles of Phase II of software cybernetics; identifies and highlights new research trends of software cybernetic for further research

    Modern software cybernetics: New trends

    Get PDF
    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.Software cybernetics research is to apply a variety of techniques from cybernetics research to software engineering research. For more than fifteen years since 2001, there has been a dramatic increase in work relating to software cybernetics. From cybernetics viewpoint, the work is mainly on the first-order level, namely, the software under observation and control. Beyond the first-order cybernetics, the software, developers/users, and running environments influence each other and thus create feedback to form more complicated systems. We classify software cybernetics as Software Cybernetics I based on the first-order cybernetics, and as Software Cybernetics II based on the higher order cybernetics. This paper provides a review of the literature on software cybernetics, particularly focusing on the transition from Software Cybernetics I to Software Cybernetics II. The results of the survey indicate that some new research areas such as Internet of Things, big data, cloud computing, cyber-physical systems, and even creative computing are related to Software Cybernetics II. The paper identifies the relationships between the techniques of Software Cybernetics II applied and the new research areas to which they have been applied, formulates research problems and challenges of software cybernetics with the application of principles of Phase II of software cybernetics; identifies and highlights new research trends of software cybernetic for further research

    DFlow: Efficient Dataflow-based Invocation Workflow Execution for Function-as-a-Service

    Full text link
    The Serverless Computing is becoming increasingly popular due to its ease of use and fine-grained billing. These features make it appealing for stateful application or serverless workflow. However, current serverless workflow systems utilize a controlflow-based invocation pattern to invoke functions. In this execution pattern, the function invocation depends on the state of the function. A function can only begin executing once all its precursor functions have completed. As a result, this pattern may potentially lead to longer end-to-end execution time. We design and implement the DFlow, a novel dataflow-based serverless workflow system that achieves high performance for serverless workflow. DFlow introduces a distributed scheduler (DScheduler) by using the dataflow-based invocation pattern to invoke functions. In this pattern, the function invocation depends on the data dependency between functions. The function can start to execute even its precursor functions are still running. DFlow further features a distributed store (DStore) that utilizes effective fine-grained optimization techniques to eliminate function interaction, thereby enabling efficient data exchange. With the support of DScheduler and DStore, DFlow can achieving an average improvement of 60% over CFlow, 40% over FaaSFlow, 25% over FaasFlowRedis, and 40% over KNIX on 99%-ile latency respectively. Further, it can improve network bandwidth utilization by 2x-4x over CFlow and 1.5x-3x over FaaSFlow, FaaSFlowRedis and KNIX, respectively. DFlow effectively reduces the cold startup latency, achieving an average improvement of 5.6x over CFlow and 1.1x over FaaSFlowComment: 22 pages, 13 figure

    Workflow models for heterogeneous distributed systems

    Get PDF
    The role of data in modern scientific workflows becomes more and more crucial. The unprecedented amount of data available in the digital era, combined with the recent advancements in Machine Learning and High-Performance Computing (HPC), let computers surpass human performances in a wide range of fields, such as Computer Vision, Natural Language Processing and Bioinformatics. However, a solid data management strategy becomes crucial for key aspects like performance optimisation, privacy preservation and security. Most modern programming paradigms for Big Data analysis adhere to the principle of data locality: moving computation closer to the data to remove transfer-related overheads and risks. Still, there are scenarios in which it is worth, or even unavoidable, to transfer data between different steps of a complex workflow. The contribution of this dissertation is twofold. First, it defines a novel methodology for distributed modular applications, allowing topology-aware scheduling and data management while separating business logic, data dependencies, parallel patterns and execution environments. In addition, it introduces computational notebooks as a high-level and user-friendly interface to this new kind of workflow, aiming to flatten the learning curve and improve the adoption of such methodology. Each of these contributions is accompanied by a full-fledged, Open Source implementation, which has been used for evaluation purposes and allows the interested reader to experience the related methodology first-hand. The validity of the proposed approaches has been demonstrated on a total of five real scientific applications in the domains of Deep Learning, Bioinformatics and Molecular Dynamics Simulation, executing them on large-scale mixed cloud-High-Performance Computing (HPC) infrastructures

    The Complete Reference (Volume 4)

    Get PDF
    This is the fourth volume of the successful series Robot Operating Systems: The Complete Reference, providing a comprehensive overview of robot operating systems (ROS), which is currently the main development framework for robotics applications, as well as the latest trends and contributed systems. The book is divided into four parts: Part 1 features two papers on navigation, discussing SLAM and path planning. Part 2 focuses on the integration of ROS into quadcopters and their control. Part 3 then discusses two emerging applications for robotics: cloud robotics, and video stabilization. Part 4 presents tools developed for ROS; the first is a practical alternative to the roslaunch system, and the second is related to penetration testing. This book is a valuable resource for ROS users and wanting to learn more about ROS capabilities and features.info:eu-repo/semantics/publishedVersio

    Engineering Self-Adaptive Collective Processes for Cyber-Physical Ecosystems

    Get PDF
    The pervasiveness of computing and networking is creating significant opportunities for building valuable socio-technical systems. However, the scale, density, heterogeneity, interdependence, and QoS constraints of many target systems pose severe operational and engineering challenges. Beyond individual smart devices, cyber-physical collectives can provide services or solve complex problems by leveraging a “system effect” while coordinating and adapting to context or environment change. Understanding and building systems exhibiting collective intelligence and autonomic capabilities represent a prominent research goal, partly covered, e.g., by the field of collective adaptive systems. Therefore, drawing inspiration from and building on the long-time research activity on coordination, multi-agent systems, autonomic/self-* systems, spatial computing, and especially on the recent aggregate computing paradigm, this thesis investigates concepts, methods, and tools for the engineering of possibly large-scale, heterogeneous ensembles of situated components that should be able to operate, adapt and self-organise in a decentralised fashion. The primary contribution of this thesis consists of four main parts. First, we define and implement an aggregate programming language (ScaFi), internal to the mainstream Scala programming language, for describing collective adaptive behaviour, based on field calculi. Second, we conceive of a “dynamic collective computation” abstraction, also called aggregate process, formalised by an extension to the field calculus, and implemented in ScaFi. Third, we characterise and provide a proof-of-concept implementation of a middleware for aggregate computing that enables the development of aggregate systems according to multiple architectural styles. Fourth, we apply and evaluate aggregate computing techniques to edge computing scenarios, and characterise a design pattern, called Self-organising Coordination Regions (SCR), that supports adjustable, decentralised decision-making and activity in dynamic environments.Con lo sviluppo di informatica e intelligenza artificiale, la diffusione pervasiva di device computazionali e la crescente interconnessione tra elementi fisici e digitali, emergono innumerevoli opportunità per la costruzione di sistemi socio-tecnici di nuova generazione. Tuttavia, l'ingegneria di tali sistemi presenta notevoli sfide, data la loro complessità—si pensi ai livelli, scale, eterogeneità, e interdipendenze coinvolti. Oltre a dispositivi smart individuali, collettivi cyber-fisici possono fornire servizi o risolvere problemi complessi con un “effetto sistema” che emerge dalla coordinazione e l'adattamento di componenti fra loro, l'ambiente e il contesto. Comprendere e costruire sistemi in grado di esibire intelligenza collettiva e capacità autonomiche è un importante problema di ricerca studiato, ad esempio, nel campo dei sistemi collettivi adattativi. Perciò, traendo ispirazione e partendo dall'attività di ricerca su coordinazione, sistemi multiagente e self-*, modelli di computazione spazio-temporali e, specialmente, sul recente paradigma di programmazione aggregata, questa tesi tratta concetti, metodi, e strumenti per l'ingegneria di ensemble di elementi situati eterogenei che devono essere in grado di lavorare, adattarsi, e auto-organizzarsi in modo decentralizzato. Il contributo di questa tesi consiste in quattro parti principali. In primo luogo, viene definito e implementato un linguaggio di programmazione aggregata (ScaFi), interno al linguaggio Scala, per descrivere comportamenti collettivi e adattativi secondo l'approccio dei campi computazionali. In secondo luogo, si propone e caratterizza l'astrazione di processo aggregato per rappresentare computazioni collettive dinamiche concorrenti, formalizzata come estensione al field calculus e implementata in ScaFi. Inoltre, si analizza e implementa un prototipo di middleware per sistemi aggregati, in grado di supportare più stili architetturali. Infine, si applicano e valutano tecniche di programmazione aggregata in scenari di edge computing, e si propone un pattern, Self-Organising Coordination Regions, per supportare, in modo decentralizzato, attività decisionali e di regolazione in ambienti dinamici
    corecore