858 research outputs found

    A Survey of Pipelined Workflow Scheduling: Models and Algorithms

    A large class of applications needs to execute the same workflow on different data sets of identical size. Efficient execution of such applications necessitates intelligent distribution of the application components and tasks on a parallel machine, and the execution can be orchestrated by utilizing task-, data-, pipelined-, and/or replicated-parallelism. The scheduling problem that encompasses all of these techniques is called pipelined workflow scheduling, and it has been widely studied in the last decade. Multiple models and algorithms have flourished to tackle various programming paradigms, constraints, machine behaviors or optimization goals. This paper surveys the field by summing up and structuring known results and approaches.

    Resource provisioning and scheduling algorithms for hybrid workflows in edge cloud computing

    In recent years, Internet of Things (IoT) technology has been involved in a wide range of application domains to provide real-time monitoring, tracking and analysis services. The worldwide number of IoT-connected devices is projected to increase to 43 billion by 2023, and IoT technologies are expected to be engaged in 25% of the business sector. Latency-sensitive applications such as intelligent video surveillance, smart homes, autonomous vehicles, and augmented reality are all emergent research directions in industry and academia. These applications must connect a large number of sensing devices to attain the desired level of service quality and decision accuracy in a timely manner. Moreover, continuous data streams require processing large amounts of data, which adds a huge overhead on computing and network resources. Thus, latency-sensitive and resource-intensive applications introduce new challenges for current computing models, i.e., batch and stream. In this thesis, we refer to the integrated application model of stream and batch applications as a hybrid workflow model. The main challenge of the hybrid model is achieving the quality of service (QoS) requirements of the two computation systems. This thesis provides a systematic and detailed modeling of hybrid workflows, which describes the internal structure of each application type for purposes of resource estimation, model systems tuning, and cost modeling. For optimizing the execution of hybrid workflows, this thesis proposes algorithms, techniques and frameworks to serve resource provisioning and task scheduling on various computing systems including cloud, edge cloud and cooperative edge cloud.
Overall, the experimental results provided in this thesis offer strong evidence for a different understanding of, and vision for, integrating stream and batch applications, and show how edge computing and other emergent technologies such as 5G networks and IoT will contribute to more sophisticated and intelligent solutions in many disciplines, for a safer, more secure, healthier, smarter and more sustainable society.

    Deadline-Aware Reservation-Based Scheduling

    The ever-growing need to improve return-on-investment (ROI) for cluster infrastructure that processes data which is being continuously generated at a higher rate than ever before introduces new challenges for big-data processing frameworks. Highly complex mixed workloads arriving at modern clusters, along with a growing number of time-sensitive critical production jobs, require cluster management systems to evolve. Most big-data systems are not only required to guarantee that production jobs will complete before their deadline, but also to minimize the latency for best-effort jobs to increase ROI. This research presents DARSS, a deadline-aware reservation-based scheduling system. DARSS addresses the above-stated problem by using a reservation-based approach to scheduling that supports temporal requirements of production jobs while keeping the latency for best-effort jobs low. Fine-grained resource allocation enables DARSS to schedule more tasks than a coarser-grained approach would. Furthermore, DARSS schedules production jobs as close to their deadlines as possible. This scheduling policy allows the system to maximize the number of low-priority tasks that can be scheduled opportunistically. DARSS is a scalable system that can be integrated with YARN. DARSS is evaluated on a simulated cluster of 300 nodes against a workload derived from Google Borg's trace. DARSS is compared with Microsoft's Rayon and YARN's built-in scheduler. DARSS achieves a better production job acceptance rate than both YARN and Rayon. The experiments show that all of the production jobs accepted by DARSS complete before their deadlines. Furthermore, DARSS services a higher number of best-effort jobs than Rayon. Finally, DARSS has lower latency for best-effort jobs than Rayon.
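
The abstract does not describe how DARSS is built, so the following is only a minimal sketch of the stated policy of placing production jobs as close to their deadlines as possible, which leaves earlier slots free for best-effort work. The discrete time slots, the single-unit `capacity` model, and the names `Job` and `reserve_latest` are all assumptions made for illustration; none of them come from DARSS, YARN, or Rayon.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    duration: int  # contiguous time slots needed (assumed: one capacity unit each)
    deadline: int  # the job must finish at or before this slot index


def reserve_latest(jobs, capacity, horizon):
    """Place each job in the latest contiguous window that still meets its deadline.

    Returns (plan, free): plan maps job name -> start slot (jobs that cannot be
    placed are left out), and free[t] is the capacity left in slot t.
    """
    free = [capacity] * horizon
    plan = {}
    # Handle the tightest deadlines first so they get first pick of early slots.
    for job in sorted(jobs, key=lambda j: j.deadline):
        latest_start = min(job.deadline, horizon) - job.duration
        # Walk backwards from the latest feasible start until a window fits.
        for start in range(latest_start, -1, -1):
            window = range(start, start + job.duration)
            if all(free[t] > 0 for t in window):
                for t in window:
                    free[t] -= 1
                plan[job.name] = start
                break
    return plan, free
```

With two jobs of length 2 that share deadline 4 on a single-capacity cluster, the first is reserved in slots 2 and 3 and the second falls back to slots 0 and 1; a job whose duration exceeds its deadline is simply not admitted.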

    SciDAC Institute: Combinatorial Scientific Computing and Petascale Simulations (CSCAPES). Final Report

    Final project report

    Model-driven Scheduling for Distributed Stream Processing Systems

    Distributed Stream Processing frameworks are being commonly used with the evolution of the Internet of Things (IoT). These frameworks are designed to adapt to the dynamic input message rate by scaling in/out. Apache Storm, originally developed by Twitter, is a widely used stream processing engine, while others include Flink and Spark Streaming. To run streaming applications successfully, there is a need to know the optimal resource requirement, as over-estimation of resources adds extra cost. We therefore need a strategy to determine the optimal resource requirement for a given streaming application. In this article, we propose a model-driven approach for scheduling streaming applications that effectively utilizes a priori knowledge of the applications to provide predictable scheduling behavior. Specifically, we use application performance models to offer reliable estimates of the resource allocation required. Further, this intuition also drives resource mapping, and helps narrow the gap between the estimated and actual dataflow performance and resource utilization. Together, this model-driven scheduling approach gives predictable application performance and resource utilization behavior for executing a given DSPS application at a target input stream rate on distributed resources. Comment: 54 pages

    A Genetic Algorithm to Schedule Workflow Collections on a SOA-Grid with Communication Costs

    In this paper we study the problem of scheduling a collection of workflows, identical or not, on a SOA grid. A workflow (job) is represented by a directed acyclic graph (DAG) with typed tasks. All of the grid hosts are able to process a set of task types with unrelated processing costs and are able to transmit files through communication links for which the communication times are not negligible. The goal is to minimize the maximum completion time (makespan) of the workflows. To solve this problem we propose a genetic approach. The contributions of this paper are both the design of a Genetic Algorithm taking the communication costs into account and the performance analysis.
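
The abstract gives no detail on the genetic algorithm itself, so the sketch below covers only the objective it names: the makespan of a DAG mapped onto hosts when transfers between hosts take time. It could serve as the fitness function that a genetic search minimizes. The function name `makespan`, the dictionary layouts, and the simplification that each host runs its tasks one at a time in list order are assumptions for illustration, not the paper's model.

```python
def makespan(tasks, deps, assignment, cost, comm):
    """Completion time of the last task under a fixed task-to-host mapping.

    tasks:      task ids in a topological order of the DAG
    deps:       dict task -> list of predecessor tasks
    assignment: dict task -> host
    cost:       dict (task, host) -> processing time on that host
    comm:       dict (src_host, dst_host) -> transfer time (same host costs 0)
    """
    host_free = {}  # time at which each host finishes its previous task
    finish = {}     # finish time of each scheduled task
    for task in tasks:
        host = assignment[task]
        ready = 0
        for pred in deps.get(task, []):
            pred_host = assignment[pred]
            delay = 0 if pred_host == host else comm[(pred_host, host)]
            ready = max(ready, finish[pred] + delay)
        # A task starts once its inputs have arrived and its host is idle.
        start = max(ready, host_free.get(host, 0))
        finish[task] = start + cost[(task, host)]
        host_free[host] = finish[task]
    return max(finish.values())
```

In a three-task example where `C` depends on `A` and `B`, running `B` on a second host in parallel can still lose to running everything on one host once the transfer time is large enough, which is the effect the communication-aware objective is meant to capture.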

    ASCR/HEP Exascale Requirements Review Report

    This draft report summarizes and details the findings, results, and recommendations derived from the ASCR/HEP Exascale Requirements Review meeting held in June, 2015. The main conclusions are as follows. 1) Larger, more capable computing and data facilities are needed to support HEP science goals in all three frontiers: Energy, Intensity, and Cosmic. The expected scale of the demand at the 2025 timescale is at least two orders of magnitude -- and in some cases greater -- than that available currently. 2) The growth rate of data produced by simulations is overwhelming the current ability, of both facilities and researchers, to store and analyze it. Additional resources and new techniques for data analysis are urgently needed. 3) Data rates and volumes from HEP experimental facilities are also straining the ability to store and analyze large and complex data volumes. Appropriately configured leadership-class facilities can play a transformational role in enabling scientific discovery from these datasets. 4) A close integration of HPC simulation and data analysis will aid greatly in interpreting results from HEP experiments. Such an integration will minimize data movement and facilitate interdependent workflows. 5) Long-range planning between HEP and ASCR will be required to meet HEP's research needs. To best use ASCR HPC resources the experimental HEP program needs a) an established long-term plan for access to ASCR computational and data resources, b) an ability to map workflows onto HPC resources, c) the ability for ASCR facilities to accommodate workflows run by collaborations that can have thousands of individual members, d) to transition codes to the next-generation HPC platforms that will be available at ASCR facilities, e) to build up and train a workforce capable of developing and using simulations and analysis to support HEP scientific research on next-generation systems. Comment: 77 pages, 13 figures; draft report, subject to further revision

    FORTE: an extensible framework for robustness and efficiency in data transfer pipelines

    In the age of big data and growing product complexity, it is common to monitor many aspects of a product or system, in order to extract well-founded intelligence and draw conclusions, to continue driving innovation. Automating and scaling processes in data pipelines becomes essential to keep pace with increasing rates of data generated by such practices, while meeting security, governance, scalability and resource-efficiency demands. We present FORTE, an extensible framework for robustness and transfer-efficiency in data pipelines. We identify sources of potential bottlenecks and explore the design space of approaches to deal with the challenges they pose. We study and evaluate synergetic effects of data compression and in-memory processing as well as task scheduling, in association with pipeline performance. A prototype of FORTE is implemented and studied in a use-case at Volvo Trucks for high-volume production-level data sets, in the order of magnitude of hundreds of gigabytes to terabytes per burst. Various general-purpose lossless data compression algorithms are evaluated, in order to balance compression effectiveness and time in the pipeline. All in all, FORTE makes it possible to deal with trade-offs and achieve benefits in latency and sustainable rate (up to 1.8 times better) and in effectiveness of resource utilisation, all while also enabling additional features such as integrity verification, logging, monitoring and traceability, as well as cataloguing of transferred data. We also note that the resource efficiency improvements achievable with FORTE, and its extensibility, can imply further benefits regarding scheduling, orchestration and energy-efficiency in such pipelines.
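
The abstract does not say which compression algorithms FORTE evaluated or how. As a minimal sketch of weighing compression effectiveness against time, the snippet below compares three lossless codecs from the Python standard library on the same payload. The choice of `zlib`, `bz2` and `lzma`, and the helper name `compare_codecs`, are assumptions for illustration only.

```python
import bz2
import lzma
import time
import zlib

# Standard-library lossless codecs, each as a (compress, decompress) pair.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def compare_codecs(payload: bytes):
    """Return {codec: {"ratio": ..., "seconds": ...}} for a non-empty payload."""
    results = {}
    for name, (compress, decompress) in CODECS.items():
        started = time.perf_counter()
        packed = compress(payload)
        elapsed = time.perf_counter() - started
        # Verify the round trip so a ratio is never reported for corrupted data.
        if decompress(packed) != payload:
            raise ValueError(f"{name} round trip failed")
        results[name] = {
            "ratio": len(payload) / len(packed),  # above 1 means the data shrank
            "seconds": elapsed,                   # compression time only
        }
    return results
```

Timings from a single run on a small payload are noisy; for a real comparison one would repeat the measurement on representative data and also time decompression.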

    Trusted resource allocation in volunteer edge-cloud computing for scientific applications

    Data-intensive science applications in fields such as bioinformatics, health sciences, and material discovery are becoming increasingly dynamic and demanding in their resource requirements. Researchers using these applications, which are based on advanced scientific workflows, frequently require a diverse set of resources that are often not available within private servers or a single Cloud Service Provider (CSP). For example, a user working with Precision Medicine applications would prefer only those CSPs who follow guidelines from HIPAA (Health Insurance Portability and Accountability Act) for implementing their data services and might want services from other CSPs for economic viability. With the generation of more and more data, these workflows often require deployment and dynamic scaling of multi-cloud resources in an efficient and high-performance manner (e.g., quick setup, reduced computation time, and increased application throughput). At the same time, users seek to minimize the costs of configuring the related multi-cloud resources. While performance and cost are among the key factors in CSP resource selection, scientific workflows often process proprietary or confidential data, which introduces additional constraints on security posture. Thus, users have to make an informed decision on the selection of resources that are most suited for their applications while trading off between the key factors of resource selection, which are performance, agility, cost, and security (PACS). Furthermore, even with the most efficient resource allocation across multiple clouds, the cost to solution might not be economical for all users, which has led to the development of new paradigms of computing such as volunteer computing, where users utilize volunteered cyber resources to meet their computing requirements.
For economical and readily available resources, it is essential that such volunteered resources can integrate well with cloud resources to provide the most efficient computing infrastructure for users. In this dissertation, individual stages in the lifecycle of resource brokering for users, such as user requirement collection, users' resource preferences, resource brokering and task scheduling, are tackled. For collection of user requirements, a novel approach through an iterative design interface is proposed. In addition, a fuzzy inference-based approach is proposed to capture users' biases and expertise for guiding resource selection for their applications. The results showed improvement in performance, i.e., time to execute, in 98 percent of the studied applications. The data collected on users' requirements and preferences is later used by an optimizer engine and machine learning algorithms for resource brokering. For resource brokering, a new integer linear programming based solution (OnTimeURB) is proposed, which creates multi-cloud template solutions for resource allocation while also optimizing performance, agility, cost, and security. The solution was further improved by the addition of a machine learning model based on a naive Bayes classifier, which captures the true QoS of cloud resources for guiding template solution creation. The proposed solution was able to improve the time to execute for as much as 96 percent of the largest applications. As discussed above, to fulfill the need for economical computing resources, a new paradigm of computing, viz. Volunteer Edge Computing (VEC), is proposed, which reduces cost and improves performance and security by creating edge clusters comprising volunteered computing resources close to users. The initial results have shown improved time of execution for application workflows against state-of-the-art solutions while utilizing only the most secure VEC resources.
Consequently, we have utilized reinforcement learning based solutions to characterize volunteered resources for their availability and flexibility towards implementation of security policies. The characterization of volunteered resources facilitates efficient allocation of resources and scheduling of workflow tasks, which improves performance and throughput of workflow executions. The VEC architecture is further validated with state-of-the-art bioinformatics workflows and manufacturing workflows. Includes bibliographical references.