
    Enforcing QoS in scientific workflow systems enacted over Cloud infrastructures

    The ability to support Quality of Service (QoS) constraints is an important requirement in some scientific applications. With the increasing use of Cloud computing infrastructures, where access to resources is shared, dynamic and provisioned on-demand, identifying how QoS constraints can be supported becomes an important challenge. However, access to dedicated resources is often not possible in existing Cloud deployments, and many commercial providers offer only limited QoS guarantees (often restricted to error rate and availability, rather than particular QoS metrics such as latency or access time). We propose a workflow system architecture which enforces QoS for the simultaneous execution of multiple scientific workflows over a shared infrastructure (such as a Cloud environment). Our approach involves multiple pipeline workflow instances, with each instance having its own QoS requirements. These workflows are composed of a number of stages, with each stage being mapped to one or more physical resources. A stage involves a combination of data access, computation and data transfer capability. A token bucket-based data throttling framework is embedded into the workflow system architecture. Each workflow instance stage regulates the amount of data that is injected into the shared resources, allowing for bursts of data to be injected while at the same time providing isolation of workflow streams. We demonstrate our approach by using the Montage workflow, and develop a Reference net model of the workflow.
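    A minimal sketch of the token bucket idea behind such throttling (the class, rates, and capacities here are illustrative assumptions, not the paper's implementation):

```python
import time

class TokenBucket:
    """Token bucket regulating how much data a workflow stage may inject.

    rate: tokens (e.g. MB) replenished per second; capacity bounds bursts.
    """
    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def try_send(self, amount: float) -> bool:
        """Consume `amount` tokens if available; otherwise refuse for now."""
        now = time.monotonic()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False

# Hypothetical stage allowance: sustained 10 MB/s, bursts of up to 50 MB.
stage_bucket = TokenBucket(rate=10.0, capacity=50.0)
if stage_bucket.try_send(8.0):
    pass  # inject 8 MB into the shared infrastructure
```

    Because every stage draws from its own bucket, a burst in one workflow stream cannot exhaust the allowance of another, which is the isolation property the abstract describes.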

    Feedback-control & queueing theory-based resource management for streaming applications

    Recent advances in sensor technologies and instrumentation have led to an extraordinary growth of data sources and streaming applications. A wide variety of devices, from smart phones to dedicated sensors, can collect and stream large amounts of data at unprecedented rates, and a number of distinct streaming data models have been proposed. Typical applications include smart cities and built environments, for instance, where sensor-based infrastructures continue to increase in scale and variety. Understanding how such streaming content can be processed within some time threshold remains a non-trivial and important research topic. We investigate how a cloud-based computational infrastructure can autonomically respond to such streaming content, offering Quality of Service guarantees. We propose an autonomic controller (based on feedback control and queueing theory) to elastically provision virtual machines to meet performance targets associated with a particular data stream. Evaluation is carried out using a federated Cloud-based infrastructure (implemented using CometCloud), where the allocation of new resources can be based on: (i) differences between sites, i.e. the types of resources supported (e.g. GPU vs. CPU-only); (ii) cost of execution; (iii) failure rate and likely resilience, etc. In particular, we demonstrate how Little's Law, a widely used result in queueing theory, can be adapted to support dynamic control in the context of such resource provisioning.
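    To make the Little's Law adaptation concrete, here is a hedged sketch of how a controller might size a VM pool from an observed arrival rate; the headroom heuristic is an assumption for illustration, not the controller from the paper:

```python
import math

def required_vms(arrival_rate: float, service_time: float,
                 target_response: float) -> int:
    """Estimate how many VMs keep a stream within its response-time target.

    Little's Law (L = lambda * W) relates arrival rate, time in system,
    and the average number of items in flight; the headroom factor below
    is an illustrative way to provision more aggressively as the target
    approaches the bare per-item service time.
    """
    if target_response <= service_time:
        raise ValueError("target must exceed the per-item service time")
    in_flight = arrival_rate * service_time  # L = lambda * W, with W = service time
    headroom = target_response / (target_response - service_time)
    return max(1, math.ceil(in_flight * headroom))

# e.g. 200 events/s, 0.05 s per event on one VM, 0.2 s end-to-end target
print(required_vms(200.0, 0.05, 0.2))  # -> 14 VMs
```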

    Comparing FutureGrid, Amazon EC2, and Open Science Grid for Scientific Workflows

    Scientists have a number of computing infrastructures available to conduct their research, including grids and public or private clouds. This paper explores the use of these cyberinfrastructures to execute scientific workflows, an important class of scientific applications. It examines the benefits and drawbacks of cloud and grid systems using the case study of an astronomy application. The application analyzes data from the NASA Kepler mission in order to compute periodograms, which help astronomers detect the periodic dips in the intensity of starlight caused by exoplanets as they transit their host star. In this paper we describe our experiences modeling the periodogram application as a scientific workflow using Pegasus, and deploying it on the FutureGrid scientific cloud testbed, the Amazon EC2 commercial cloud, and the Open Science Grid. We compare and contrast the infrastructures in terms of setup, usability, cost, resource availability, and performance.

    A Software Architecture Assisting Workflow Executions on Cloud Resources

    An enterprise providing services handled by means of workflows needs to monitor and control their execution, gather usage data, determine priorities, and make proper use of cloud computing resources. This paper proposes a software architecture that connects unaware services to components handling workflow monitoring and management concerns. Moreover, the provided components enhance the dependability of services while letting developers focus only on the business logic.
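    As a hedged illustration of how 'unaware' business logic can be connected to monitoring and dependability concerns, the wrapper below keeps those concerns out of the service code itself (the decorator and retry policy are hypothetical, not the components of the proposed architecture):

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("workflow-monitor")

def monitored(step):
    """Attach monitoring and simple retry-based dependability to a service
    function that knows nothing about either concern."""
    @functools.wraps(step)
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        for attempt in range(1, 4):  # retry up to three times
            try:
                result = step(*args, **kwargs)
                log.info("%s ok in %.3fs", step.__name__, time.monotonic() - start)
                return result
            except Exception as exc:
                log.warning("%s failed (attempt %d): %s", step.__name__, attempt, exc)
        raise RuntimeError(f"{step.__name__} failed after retries")
    return wrapper

@monitored
def transform(record: str) -> str:
    return record.upper()  # pure business logic, no monitoring code

print(transform("order-42"))
```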

    Model-driven development of data intensive applications over cloud resources

    The proliferation of sensors in recent years has generated large amounts of raw data, forming data streams that need to be processed. In many cases, cloud resources are used for such processing, exploiting their flexibility, but these sensor streaming applications often need to support operational and control actions with real-time and low-latency requirements that go beyond the cost-effective and flexible solutions supported by existing cloud frameworks, such as Apache Kafka, Apache Spark Streaming, or Map-Reduce Streams. In this paper, we describe a model-driven and stepwise-refinement methodological approach for streaming applications executed over clouds. The central role is assigned to a set of Petri Net models for specifying functional and non-functional requirements. They support model reuse, and a way to combine formal analysis, simulation, and approximate computation of minimal and maximal boundaries of non-functional requirements when the problem is either mathematically or computationally intractable. We show how our proposal can assist developers in their design and implementation decisions from a performance perspective. The methodology is intended for all stages of the engineering process: we can (i) analyse how an application can be mapped onto cloud resources, and (ii) obtain key performance indicators, including throughput or economic cost, so that developers are assisted in their development tasks and in their decision making. To illustrate our approach, we use the pipelined wavefront array.
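    A minimal sketch of the kind of token-game simulation a Petri net model enables; this toy two-stage pipeline is far simpler than the models in the paper, and the net structure is an assumption for illustration:

```python
from collections import Counter

class PetriNet:
    """Minimal place/transition net for quick what-if simulation."""
    def __init__(self, transitions, marking):
        # transitions: name -> (consumed {place: n}, produced {place: n})
        self.transitions = transitions
        self.marking = Counter(marking)

    def enabled(self, name: str) -> bool:
        consumed, _ = self.transitions[name]
        return all(self.marking[p] >= n for p, n in consumed.items())

    def fire(self, name: str) -> None:
        consumed, produced = self.transitions[name]
        assert self.enabled(name), f"{name} is not enabled"
        for p, n in consumed.items():
            self.marking[p] -= n
        for p, n in produced.items():
            self.marking[p] += n

# Two-stage stream pipeline: in -> stage1 -> buffer -> stage2 -> out,
# with one CPU token per stage modelling the shared resource.
net = PetriNet(
    transitions={
        "stage1": ({"in": 1, "cpu1": 1}, {"buffer": 1, "cpu1": 1}),
        "stage2": ({"buffer": 1, "cpu2": 1}, {"out": 1, "cpu2": 1}),
    },
    marking={"in": 3, "cpu1": 1, "cpu2": 1},
)
while net.enabled("stage1") or net.enabled("stage2"):
    for t in ("stage1", "stage2"):
        if net.enabled(t):
            net.fire(t)
print(dict(net.marking))  # all three input tokens end up in "out"
```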

    A formal architecture-centric and model driven approach for the engineering of science gateways

    From n-Tier client/server applications, to more complex academic Grids, or even the most recent and promising industrial Clouds, the last decade has witnessed significant developments in distributed computing. In spite of this conceptual heterogeneity, Service-Oriented Architecture (SOA) seems to have emerged as the common and underlying abstraction paradigm, even though different standards and technologies are applied across application domains. Suitable access to data and algorithms resident in SOAs via so-called ‘Science Gateways’ has thus become a pressing need in order to realize the benefits of distributed computing infrastructures. In an attempt to inform service-oriented systems design and developments in Grid-based biomedical research infrastructures, the applicant has consolidated work from three complementary experiences in European projects, which have developed and deployed large-scale production-quality infrastructures and, more recently, Science Gateways to support research in breast cancer, pediatric diseases and neurodegenerative pathologies respectively. In analyzing the requirements from these biomedical applications the applicant was able to elaborate on commonly faced issues in Grid development and deployment, while proposing an adapted and extensible engineering framework. Grids implement a number of protocols, applications and standards, and attempt to virtualize and harmonize access to them. Most Grid implementations are therefore instantiated as superposed software layers, often resulting in a low quality of services and quality of applications, making design and development increasingly complex and rendering classical software engineering approaches unsuitable for Grid developments. The applicant proposes the application of a formal Model-Driven Engineering (MDE) approach to service-oriented developments, making it possible to define Grid-based architectures and Science Gateways that satisfy quality-of-service requirements, execution platform and distribution criteria at design time. A novel investigation is thus presented on the applicability of the resulting grid MDE (gMDE) to specific examples, and conclusions are drawn on the benefits of this approach and its possible application to other areas, in particular that of Distributed Computing Infrastructures (DCI) interoperability, Science Gateways and Cloud architecture developments.

    A Process Framework for Managing Quality of Service in Private Cloud

    As information systems leaders tap into the global market of cloud computing-based services, they struggle to maintain consistent application performance due to the lack of a process framework for managing quality of service (QoS) in the cloud. Guided by disruptive innovation theory, the purpose of this case study was to identify a process framework for meeting the QoS requirements of private cloud service users. Private cloud implementation was explored by selecting an organization in California through purposeful sampling. Information was gathered by interviewing 23 information technology (IT) professionals, a mix of frontline engineers, managers, and leaders involved in the implementation of private cloud. Another source of data was documents such as standard operating procedures, policies, and guidelines related to private cloud implementation. Interview transcripts and documents were coded and sequentially analyzed. Three prominent themes emerged from the analysis of the data: (a) end-user expectations, (b) application architecture, and (c) trending analysis. The findings of this study may help IT leaders manage QoS in cloud infrastructure effectively and deliver reliable application performance, which may in turn increase the customer population and profitability of organizations. This study may contribute to positive social change as information systems managers and workers can learn and apply the process framework for delivering stable and reliable cloud-hosted applications.

    Computational resource management for data-driven applications with deadline constraints

    Recent advances in the type and variety of sensing technologies have led to an extraordinary growth in the volume of data being produced, and to a number of streaming applications that make use of this data. Sensors typically monitor environmental or physical phenomena at predefined time intervals, or when triggered by user-defined events. Understanding how such streaming content (the raw data or events) can be processed within a time threshold remains an important research challenge. We investigate how a cloud-based computational infrastructure can autonomically respond to such streaming content, offering quality of service guarantees. In particular, we contextualize our approach using an electric vehicle (EV) charging scenario, where such vehicles need to connect to the electrical grid to charge their batteries. There has been an emerging interest in EV aggregators (primarily intermediate brokers able to estimate aggregate charging demand for a collection of EVs) to coordinate the charging process. We consider predicting EV charging demand as a potential workload with execution-time constraints. We assume that an EV aggregator manages a number of geographic areas and a pool of computational resources in a cloud computing cluster to support scheduling of EV charging. The objective is to ensure that there is enough computational capacity to satisfy the requirements for managing EV battery charging requests within specific time constraints.
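    As a hedged back-of-envelope illustration of the capacity question, the helper below estimates how many cluster nodes keep the per-area demand predictions within a deadline; the function and the numbers are hypothetical, not the paper's scheduler:

```python
import math

def nodes_needed(areas: int, predict_seconds: float,
                 deadline_seconds: float) -> int:
    """How many nodes must run in parallel so that one demand prediction
    per geographic area completes before the deadline, assuming each node
    runs its predictions sequentially?"""
    runs_per_node = max(1, math.floor(deadline_seconds / predict_seconds))
    return math.ceil(areas / runs_per_node)

# e.g. 120 areas, 30 s per prediction, a 5-minute scheduling window
print(nodes_needed(120, 30.0, 300.0))  # each node fits 10 runs -> 12 nodes
```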