1,325 research outputs found

    funcX: A Federated Function Serving Fabric for Science

    Full text link
    Exploding data volumes and velocities, new computational methods and platforms, and ubiquitous connectivity demand new approaches to computation in the sciences. These new approaches must enable computation to be mobile, so that, for example, it can occur near data, be triggered by events (e.g., arrival of new data), be offloaded to specialized accelerators, or run remotely where resources are available. They also require new design approaches in which monolithic applications can be decomposed into smaller components, that may in turn be executed separately and on the most suitable resources. To address these needs we present funcX---a distributed function as a service (FaaS) platform that enables flexible, scalable, and high performance remote function execution. funcX's endpoint software can transform existing clouds, clusters, and supercomputers into function serving systems, while funcX's cloud-hosted service provides transparent, secure, and reliable function execution across a federated ecosystem of endpoints. We motivate the need for funcX with several scientific case studies, present our prototype design and implementation, show optimizations that deliver throughput in excess of 1 million functions per second, and demonstrate, via experiments on two supercomputers, that funcX can scale to more than more than 130000 concurrent workers.Comment: Accepted to ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC 2020). arXiv admin note: substantial text overlap with arXiv:1908.0490

    A Framework for Model-Driven Scientific Workflow Engineering

    Get PDF
    So-called scientific workflows are one important means in the context of data-intensive science for reliable and efficient scientific data processing in distributed computing infrastructures such as Grids. Scientific Workflow Management Systems (SWfMS) help scientists model and run scientific workflows, whereas a domain-specific layer for workflow modeling by a scientist and a technical layer for automated workflow execution can be distinguished. Initially, many SWfMS were developed from scratch using custom workflow technologies languages without application of already existing and established business workflow technologies. Among the reasons were different life cycles for scientific and business workflows as well as incompatible interfaces and communication protocols of the respective execution infrastructures. Meanwhile, several business IT infrastructures have evolved to serviceoriented architectures (SOAs), for which many Web service standards and technologies have been developed. The Web Services Business Process Execution Language (BPEL), for example, is a well-accepted standard for the implementation and execution of business workflows in SOAs. The SOA architecture pattern has been adopted in scientific IT infrastructures by so-called Service Grids based on existing standards and technologies. Due to this development, BPEL is also suitable for the execution of scientific workflows at the technical layer, which has been elaborated on in many publications and projects. However, BPEL is a workflow language for IT experts and is originally not suited for scientific workflow modeling by a scientist at the domain-specific layer. A domain-specific abstraction of BPEL is therefore required that can be specifically tailored for scientific workflow modeling as well as a corresponding mapping to the technical layer. These challenges of the domain-specific abstraction and the mapping are addressed in this thesis with the help of the Business Process Model and Notation (BPMN) standard and technologies from Model-Driven Software Development (MDSD). Therefore, the MoDFlow approach for Model-Driven Scientific WorkFlow Engineering is presented to map domain-specific scientific workflow models via a BPMN-based intermediate layer to an executable workflow model. The intermediate layer is specified by MoDFlow.BPMN, which is a BPMN metamodel subset with custom extensions for the scientific domain. MoDFlow.BPMN2BPEL defines three consecutive transformation steps to map MoDFlow.BPMN to BPEL for workflow execution. Furthermore, different methods to utilize and extend MoDFlow.BPMN and MoDFlow.BPMN2BPEL are described in the MoDFlow approach, in which the definition of so-called domain-specific languages (DSLs) for the modeling of scientific workflows at the domain-specific layer is focused. The MoDFlow framework is an implementation of the MoDFlow approach, which is based on the Eclipse Modeling Framework (EMF). The MoDFlow framework is evaluated in three application scenarios, in which different utilization and extension mechanisms are examined. The first two application scenarios investigate the technical feasibility of the approach and support scientific workflows with parameter sweeps that are executed on a Grid infrastructure. The third application scenario has been conducted in collaboration with the PubFlow project, which aims to create an infrastructure to model and execute data publication workflows. Based on the Xtext framework, a textual DSL and a corresponding language infrastructure is defined for this purpose that supports developers in creating data publication workflows. This scenario aims to illustrate the practicability of the MoDFlow framework. PubFlow currently plans to implement an additional graphical DSL based on the BPMN notation and a corresponding workflow editor for scientists

    Dynamic adaptation of interaction models for stateful web services

    Get PDF
    Dissertação para obtenção do Grau de Mestre em Engenharia InformáticaWireless Sensor Networks (WSNs) are accepted as one of the fundamental technologies for current and future science in all domains, where WSNs formed from either static or mobile sensor devices allow a low cost high-resolution sensing of the environment. Such opens the possibility of developing new kinds of crucial applications or providing more accurate data to more traditional ones. For instance, examples may range from large-scale WSNs deployed on oceans contributing to weather prediction simulations; to high number of diverse Sensor devices deployed over a geographical area at different heights from the ground for collecting more accurate data for cyclic wildfire spread simulations; or to networks of mobile phone devices contributing to urban traffic management via Participatory Sensing applications. In order to simplify data access, network parameterisation, and WSNs aggregation, WSNs have been integrated in Web environments, namely through high level standard interfaces like Web services. However, the typical interface access usually supports a restricted number of interaction models and the available mechanisms for their run-time adaptation are still scarce. Nevertheless, applications demand a richer and more flexible control on interface accesses – e.g. such accesses may depend on contextual information and, consequently, may evolve in time. Additionally, Web services have become increasingly popular in the latest years, and their usage led to the need of aggregating and coordinating them and also to represent state in between Web services invocations. Current standard composition languages for Web services (wsbpel,wsci,bpml) deal with the traditional forms of service aggregation and coordination, while WS-Resource framework (wsrf) deals with accessing services pertaining state concerns (relating both executing applications and the runtime environment). Subjacent to the notion of service coordination is the need to capture dependencies among them (through the workflow concept, for instance), reuse common interaction models, e.g. embodied in common behavioural Patterns like Client/Server, Publish/- Subscriber, Stream, and respond to dynamic events in the system (novel user requests, service failures, etc.). Dynamic adaptation, in particular, is a pressing requirement for current service-based systems due to the increasing trend on XaaS ("everything as a service") which promises to reduce costs on application development and infrastructure support, as is already apparent in the Cloud computing domain. Therefore, the self-adaptive (or dynamic/adaptive) systems present themselves as a solution to the above concerns. However, since they comprise a vast area, this thesis only focus on self-adaptive software. Concretely, we propose a novel model for dynamic interactions, in particular with Stateful Web Services, i.e. services interfacing continued activities. The solution consists on a middleware prototype based on pattern abstractions which may be able to provide (novel) richer interaction models and a few structured dynamic adaptation mechanisms, which are captured in the context of a "Session" abstraction. The middleware was implemented and uses a pre-existent framework supporting Web enabled access to WSNs, and some evaluation scenarios were tested in this setting. Namely, this area was chosen as the application domain that contextualizes this work as it contributes to the development of increasingly important applications needing highresolution and low cost sensing of environment. The result is a novel way to specify richer and dynamic modes of accessing and acquiring data generated by WSNs.Este trabalho foi parcialmente financiado pelo Centro de Informática e Tecnologias da Informação (CITI), e pela Fundação para a Ciência e a Tecnologia (FCT / MCTES) em projectos de investigaçã

    trackr: A Framework for Enhancing Discoverability and Reproducibility of Data Visualizations and Other Artifacts in R

    Full text link
    Research is an incremental, iterative process, with new results relying and building upon previous ones. Scientists need to find, retrieve, understand, and verify results in order to confidently extend them, even when the results are their own. We present the trackr framework for organizing, automatically annotating, discovering, and retrieving results. We identify sources of automatically extractable metadata for computational results, and we define an extensible system for organizing, annotating, and searching for results based on these and other metadata. We present an open-source implementation of these concepts for plots, computational artifacts, and woven dynamic reports generated in the R statistical computing language
    corecore