
    A Survey of Pipelined Workflow Scheduling: Models and Algorithms

    A large class of applications needs to execute the same workflow on different data sets of identical size. Efficient execution of such applications necessitates intelligent distribution of the application components and tasks on a parallel machine, and the execution can be orchestrated by exploiting task, data, pipelined, and/or replicated parallelism. The scheduling problem that encompasses all of these techniques is called pipelined workflow scheduling, and it has been widely studied in the last decade. Multiple models and algorithms have flourished to tackle various programming paradigms, constraints, machine behaviors, and optimization goals. This paper surveys the field by summing up and structuring known results and approaches.
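
    The core notions above (mapping pipeline stages to processors, with throughput limited by the bottleneck stage) can be illustrated with a small sketch. The stage weights, the interval-mapping heuristic, and the two-processor setup below are invented for illustration and are not taken from the survey.

        # Minimal sketch (not from the survey): map the stages of a linear
        # pipeline onto processors and estimate throughput and latency.
        # Stage weights and the heuristic are hypothetical.

        def map_stages(stage_work, num_procs):
            """Assign contiguous intervals of stages to processors, roughly
            balancing the cumulative work placed on each processor."""
            total = sum(stage_work)
            target = total / num_procs
            mapping, current, acc = [], [], 0.0
            for i, w in enumerate(stage_work):
                current.append(i)
                acc += w
                if acc >= target and len(mapping) < num_procs - 1:
                    mapping.append(current)
                    current, acc = [], 0.0
            mapping.append(current)
            return mapping

        def period_and_latency(stage_work, mapping):
            loads = [sum(stage_work[i] for i in group) for group in mapping]
            period = max(loads)   # one data set completed every 'period' time units
            latency = sum(loads)  # time for a single data set to traverse the pipeline
            return period, latency

        work = [4.0, 1.0, 3.0, 2.0]               # per-stage cost per data set
        groups = map_stages(work, num_procs=2)    # e.g. [[0, 1], [2, 3]]
        print(groups, period_and_latency(work, groups))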

    Optimization and Management of Large-scale Scientific Workflows in Heterogeneous Network Environments: From Theory to Practice

    Next-generation computation-intensive scientific applications feature large-scale computing workflows of various structures, ranging from simple linear pipelines to complex Directed Acyclic Graphs (DAGs). Supporting such computing workflows and optimizing their end-to-end network performance are crucial to the success of scientific collaborations that require fast system response, smooth data flow, and reliable distributed operation. We construct analytical cost models and formulate a class of workflow mapping problems with different mapping objectives and network constraints. The difficulty of these mapping problems essentially arises from the topological matching nature in the spatial domain, which is further compounded by the complexity of resource sharing in the temporal dimension. We provide detailed computational complexity analysis and design optimal or heuristic algorithms with rigorous correctness proofs or performance analysis. We decentralize the proposed mapping algorithms and also investigate these optimization problems in unreliable network environments for fault tolerance. To examine and evaluate the performance of the workflow mapping algorithms before actual deployment and implementation, we implement a simulation program that simulates the execution dynamics of distributed computing workflows. We also develop a scientific workflow automation and management platform based on an existing workflow engine for experimentation in real environments. The performance superiority of the proposed mapping solutions is illustrated by extensive simulation-based comparisons with existing algorithms and further verified by large-scale experiments on real-life scientific workflow applications through effective system implementation and deployment in real networks.
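
    As an illustration of the style of analytical cost model the abstract refers to, the sketch below computes end-to-end delay and pipelined throughput for a linear workflow mapped onto a path of network nodes. All numbers and the one-to-one module-to-node mapping are hypothetical; this is not the dissertation's actual formulation.

        # Hypothetical cost model: a linear workflow of modules mapped
        # one-to-one onto a path of network nodes.

        comp = [10.0, 6.0, 8.0]    # computational demand of each module
        data = [2.0, 1.0]          # data volume sent between consecutive modules
        speed = [5.0, 2.0, 4.0]    # processing speed of the node hosting each module
        bandwidth = [1.0, 0.5]     # bandwidth of the link between consecutive nodes

        # End-to-end delay: sum of all computation and transfer times.
        delay = sum(c / s for c, s in zip(comp, speed)) + \
                sum(d / b for d, b in zip(data, bandwidth))

        # Pipelined throughput (frame rate): limited by the slowest
        # computation or transfer stage.
        bottleneck = max([c / s for c, s in zip(comp, speed)] +
                         [d / b for d, b in zip(data, bandwidth)])
        print(f"end-to-end delay = {delay:.2f}, max throughput = {1.0 / bottleneck:.3f}")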

    SciDAC Institute: Combinatorial Scientific Computing and Petascale Simulations (CSCAPES). Final Report

    Final project report.

    Demonstration of latency-aware 5G network slicing on optical metro networks

    The H2020 METRO-HAUL European project has architected a latency-aware, cost-effective, agile, and programmable optical metro network. This includes the design of semi-disaggregated metro nodes with compute and storage capabilities, which interface effectively with both 5G access and multi-Tbit/s elastic optical networks in the core. In this paper, we report the automated deployment of 5G services, in particular a public safety video surveillance use case employing low-latency object detection and tracking with on-camera and on-the-edge analytics. The demonstration features flexible deployment of network slice instances, implemented in terms of ETSI NFV Network Services. We summarize the key findings in a detailed analysis of end-to-end quality of service, service setup time, and soft-failure detection time. The results show that the round-trip time over an 80 km link is under 800 µs and the service deployment time is under 180 seconds. Funding: Horizon 2020 Framework Programme (761727); Bundesministerium für Bildung und Forschung (16KIS0979K).
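
    A quick back-of-the-envelope calculation (not part of the paper's measurement methodology) shows that fibre propagation alone accounts for most of the reported sub-800 µs round-trip time over 80 km, assuming a typical group index of about 1.468 for standard single-mode fibre.

        # Back-of-the-envelope check: propagation delay over an 80 km fibre span.

        c_km_per_s = 299_792.458   # speed of light in vacuum
        n = 1.468                  # typical group index of standard single-mode fibre
        distance_km = 80

        one_way_us = distance_km / (c_km_per_s / n) * 1e6   # about 5 microseconds per km
        rtt_us = 2 * one_way_us
        print(f"one-way ≈ {one_way_us:.0f} µs, fibre-only RTT ≈ {rtt_us:.0f} µs")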

    Resource management for data streaming applications

    This dissertation investigates novel middleware mechanisms for building streaming applications. Developing streaming applications is a challenging task because (i) they are continuous in nature; (ii) they require fusion of data coming from multiple sources to derive higher-level information; (iii) they require efficient transport of data from/to distributed sources and sinks; (iv) they need access to heterogeneous resources spanning sensor networks and high-performance computing; and (v) they are time-critical in nature. My thesis is that an intuitive programming abstraction will make it easier to build dynamic, distributed, and ubiquitous data streaming applications. Moreover, such an abstraction will enable an efficient allocation of shared and heterogeneous computational resources, thereby making it easier for domain experts to build these applications. In support of the thesis, I present a novel programming abstraction, called DFuse, that makes it easier to develop these applications. A domain expert only needs to specify the input and output connections to fusion channels, and the fusion functions. The subsystems developed in this dissertation take care of instantiating the application, allocating resources for it (via the scheduling heuristic developed in this dissertation), and dynamically managing those resources (via the dynamic scheduling algorithm presented in this dissertation). Through extensive performance evaluation, I demonstrate that resources are allocated efficiently to optimize the throughput and latency of an application.
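
    The sketch below illustrates the kind of fusion-channel abstraction the abstract describes: an application writer wires inputs to a channel and supplies a fusion function, while placement and scheduling are left to the runtime. The class, its methods, and the sample feeds are hypothetical and do not reproduce the actual DFuse API.

        # Hypothetical sketch of a fusion-channel abstraction; this is not
        # the actual DFuse API.

        class FusionChannel:
            def __init__(self, name, inputs, fusion_fn):
                self.name = name            # channel identifier
                self.inputs = inputs        # upstream channels or raw data sources
                self.fusion_fn = fusion_fn  # user-supplied fusion function

            def pull(self):
                # Fetch one item from every input and fuse them into a
                # higher-level result; placement and scheduling would be
                # handled by the runtime, not by the application writer.
                items = [src.pull() if isinstance(src, FusionChannel) else next(src)
                         for src in self.inputs]
                return self.fusion_fn(items)

        # Two hypothetical sensor feeds fused into a composite reading.
        sensor_a = iter([21.5, 21.7, 22.0])
        sensor_b = iter([22.1, 22.3, 22.4])
        composite = FusionChannel("avg-temperature", [sensor_a, sensor_b],
                                  fusion_fn=lambda xs: sum(xs) / len(xs))
        print(composite.pull())   # average of the two latest samples -> 21.8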

    Orchestration of machine learning workflows on Internet of Things data

    Applications empowered by machine learning (ML) and the Internet of Things (IoT) are changing the way people live and impacting a broad range of industries. However, creating and automating ML workflows at scale using real-world IoT data often leads to complex systems-integration and production issues. Examples of challenges faced during the development of these ML applications include glue code, hidden dependencies, and data pipeline jungles. This research proposes the Machine Learning Framework for IoT data (ML4IoT), which is designed to orchestrate ML workflows to perform training and enable inference by ML models on IoT data. In the proposed framework, containerized microservices are used to automate the execution of tasks specified in ML workflows, which are defined through REST APIs. To address the problem of integrating big data tools and machine learning into a unified platform, the proposed framework enables the definition and execution of end-to-end ML workflows on large volumes of IoT data. In addition, to address the challenges of running multiple ML workflows in parallel, ML4IoT has been designed to use container-based components that provide a convenient mechanism for training and deploying numerous ML models in parallel. Finally, to address the common production issues faced during the development of ML applications, the proposed framework uses a microservices architecture to bring flexibility, reusability, and extensibility. Through the experiments, we demonstrate the feasibility of ML4IoT, which managed to train and deploy predictive ML models on two types of IoT data. The obtained results suggest that the proposed framework can manage real-world IoT data, providing the elasticity to execute 32 ML workflows in parallel, which were used to train 128 ML models simultaneously. The results also demonstrate that, in ML4IoT, the performance of rendering online predictions is not affected when 64 ML models are deployed concurrently to infer new information from online IoT data.
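
    As a rough illustration of defining an ML workflow through a REST API, the sketch below posts a workflow description to a hypothetical endpoint. The URL, payload schema, and field names are invented for this example and are not ML4IoT's actual interface.

        # Hypothetical illustration of submitting an ML workflow definition
        # over REST; endpoint and schema are invented for this sketch.

        import requests

        workflow = {
            "name": "air-quality-forecast",
            "data_source": {"type": "mqtt", "topic": "sensors/air"},
            "preprocessing": ["drop_nulls", "resample_5min"],
            "training": {"model": "random_forest", "target": "pm25"},
            "deployment": {"replicas": 2, "mode": "online-inference"},
        }

        # The framework would parse such a definition and launch containerized
        # microservices for each step of the workflow.
        response = requests.post("http://ml4iot.example/api/v1/workflows",
                                 json=workflow)
        print(response.status_code)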