164 research outputs found

    Quality of service based data-aware scheduling

    Distributed supercomputers have been widely used for solving complex computational problems and modeling complex phenomena such as black holes, the environment, supply-chain economics, etc. In this work we analyze the use of these distributed supercomputers for time-sensitive data-driven applications. We present the scheduling challenges involved in running deadline-sensitive applications on shared distributed supercomputers running large parallel jobs and introduce a "data-aware" scheduling paradigm that overcomes these challenges by making use of Quality of Service classes for running applications on shared resources. We evaluate the new data-aware scheduling paradigm using an event-driven hurricane simulation framework which attempts to run various simulations modeling storm surge, wave height, etc. in a timely fashion to be used by first responders and emergency officials. We further generalize the work and demonstrate with examples how data-aware computing can be used in other applications with similar requirements.

    A Decentralized Framework for the Optimal Coordination of Distributed Energy Resources

    Demand-response aggregators are faced with the challenge of how to best manage numerous and heterogeneous distributed energy resources (DERs). This paper proposes a decentralized methodology for optimal coordination of DERs. The proposed approach is based on Dantzig-Wolfe decomposition and column generation, which allows the integration of any type of resource whose operation can be formulated within a mixed-integer linear program. We show that the proposed framework offers the same guarantees of optimality as a centralized formulation, with the added benefits of distributed computation, enhanced privacy, and higher robustness to changes in the problem data. The practical efficiency of the algorithm is demonstrated through extensive computational experiments on a set of instances generated using data from Ontario energy markets. The proposed approach was able to solve all test instances to proven optimality, while achieving significant speed-ups over a centralized formulation solved by state-of-the-art optimization software.

    Big Data Management Using Scientific Workflows

    Humanity is rapidly approaching a new era, where every sphere of activity will be informed by the ever-increasing amount of data. Making use of big data has the potential to improve numerous avenues of human activity, including scientific research, healthcare, energy, education, transportation, environmental science, and urban planning, just to name a few. However, making such progress requires managing terabytes and even petabytes of data, generated by billions of devices, products, and events, often in real time, in different protocols, formats and types. The volume, velocity, and variety of big data, known as the 3 Vs, present formidable challenges, unmet by the traditional data management approaches. Traditionally, many data analyses have been performed using scientific workflows, tools for formalizing and structuring complex computational processes. While scientific workflows have been used extensively in structuring complex scientific data analysis processes, little work has been done to enable scientific workflows to cope with the three big data challenges on the one hand, and to leverage the dynamic resource provisioning capability of cloud computing to analyze big data on the other hand. In this dissertation, to facilitate efficient composition, verification, and execution of distributed large-scale scientific workflows, we first propose a formal approach to scientific workflow verification, including a workflow model, and the notion of a well-typed workflow. Our approach translates a scientific workflow into an equivalent typed lambda expression, and typechecks the workflow. We then propose a type-theoretic approach to the shimming problem in scientific workflows, which occurs when connecting related but incompatible components. We reduce the shimming problem to a runtime coercion problem in the theory of type systems, and propose a fully automated and transparent solution. Our technique algorithmically inserts invisible shims into the workflow specification, thereby resolving the shimming problem for any well-typed workflow. Next, we identify a set of important challenges for running big data workflows in the cloud. We then propose a generic, implementation-independent system architecture that addresses many of these challenges. Finally, we develop a cloud-enabled big data workflow management system, called DATAVIEW, that delivers a specific implementation of our proposed architecture. To further validate our proposed architecture, we conduct a case study in which we design and run a big data workflow from the automotive domain using the Amazon EC2 cloud environment.

    Energy scheduling model to optimize transition routes towards 100% renewable urban districts

    The purpose of this paper is to develop a model to analyze options for 100% renewable urban districts which self-consume locally generated renewable energy as much as possible and import (or export) energy from (or to) external grids as little as possible. Energy scheduling algorithms are developed to prioritize energy generation and storage of local renewable energy. The model is applied in a Dutch case study in which three renewable energy system concepts are evaluated against the case reference. Optimal capacities are determined for minimal operational costs including a penalty on carbon dioxide production. The attractiveness of these concepts is discussed in relation to costs, environmental concerns and applicability within the Dutch context of the energy transition.

    Partitioning workflow applications over federated clouds to meet non-functional requirements

    PhD Thesis. With cloud computing, users can acquire computer resources when they need them on a pay-as-you-go business model. Because of this, many applications are now being deployed in the cloud, and there are many different cloud providers worldwide. Importantly, all these various infrastructure providers offer services with different levels of quality. For example, cloud data centres are governed by the privacy and security policies of the country where the centre is located, while many organisations have created their own internal "private cloud" to meet security needs. With all these varieties and uncertainties, application developers who decide to host their system in the cloud face the issue of which cloud to choose to get the best operational conditions in terms of price, reliability and security. And the decision becomes even more complicated if their application consists of a number of distributed components, each with slightly different requirements. Rather than trying to identify the single best cloud for an application, this thesis considers an alternative approach, that is, combining different clouds to meet users' non-functional requirements. Cloud federation offers the ability to distribute a single application across two or more clouds, so that the application can benefit from the advantages of each one of them. The key challenge for this approach is how to find the distribution (or deployment) of application components which can yield the greatest benefits. In this thesis, we tackle this problem and propose a set of algorithms, and a framework, to partition a workflow-based application over federated clouds in order to exploit the strengths of each cloud. The specific goal is to split a distributed application structured as a workflow such that the security and reliability requirements of each component are met, whilst the overall cost of execution is minimised. To achieve this, we propose and evaluate a cloud broker for partitioning a workflow application over federated clouds. The broker integrates with the e-Science Central cloud platform to automatically deploy a workflow over public and private clouds. We developed a deployment planning algorithm to partition a large workflow application across federated clouds so as to meet security requirements and minimise the monetary cost. A more generic framework is then proposed to model, quantify and guide the partitioning and deployment of workflows over federated clouds. This framework considers the situation where changes in cloud availability (including cloud failure) arise during workflow execution.

    Semantics-enriched workflow creation and management system with an application to document image analysis and recognition

    Scientific workflow systems are an established means to model and execute experiments or processing pipelines. Nevertheless, designing workflows can be a daunting task for users due to the complexities of the systems and the sheer number of available processing nodes, each having different compatibility/applicability characteristics. This thesis explores how concepts of the Semantic Web can be used to augment workflow systems in order to assist researchers as well as non-expert users in creating valid and effective workflows. A prototype workflow creation/management system has been developed, including components for ontology modelling, workflow composition, and workflow repositories. Semantics are incorporated as a lightweight layer, permeating all aspects of the system and workflows, including retrieval, composition, and validation. Document image analysis and recognition is used as a representative application domain to evaluate the validity of the system. A new semantic model is proposed, covering a wide range of aspects of the target domain and adjacent fields. Real-world use cases demonstrate the assistive features and the automated workflow creation. On that basis, the prototype workflow creation/management system is compared to other state-of-the-art workflow systems and it is shown how those could benefit from the semantic model. The thesis concludes with a discussion on how a complete infrastructure based on semantics-enriched datasets, workflow systems, and sharing platforms could represent the next step in automation within document image analysis and other domains.