
    Feedback-control & queueing theory-based resource management for streaming applications

    Recent advances in sensor technologies and instrumentation have led to an extraordinary growth of data sources and streaming applications. A wide variety of devices, from smart phones to dedicated sensors, can collect and stream large amounts of data at unprecedented rates, and a number of distinct streaming data models have been proposed. Typical applications include smart cities and built environments, for instance, where sensor-based infrastructures continue to increase in scale and variety. Understanding how such streaming content can be processed within some time threshold remains a non-trivial and important research topic. We investigate how a cloud-based computational infrastructure can autonomically respond to such streaming content, offering Quality of Service guarantees. We propose an autonomic controller (based on feedback control and queueing theory) to elastically provision virtual machines to meet performance targets associated with a particular data stream. Evaluation is carried out using a federated cloud-based infrastructure (implemented using CometCloud), where the allocation of new resources can be based on: (i) differences between sites, i.e. the types of resources supported (e.g. GPU vs. CPU only); (ii) cost of execution; (iii) failure rate and likely resilience. In particular, we demonstrate how Little's Law, a widely used result in queueing theory, can be adapted to support dynamic control in the context of such resource provisioning.
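    As a rough illustration of how Little's Law (L = λW: the mean number of items in the system equals the arrival rate times the mean time spent in the system) can drive elastic VM provisioning, the following minimal sketch sizes a VM pool from a measured arrival rate and a latency target. This is a sketch under assumed parameters, not the paper's controller; the function name and arguments are illustrative.

```python
import math

def required_vms(arrival_rate, target_latency, per_vm_concurrency):
    """Size the VM pool so the expected in-flight work fits the latency target.

    arrival_rate       : events/second entering the stream (lambda)
    target_latency     : desired mean time in system, in seconds (W)
    per_vm_concurrency : events a single VM can hold in flight at once (assumed known)
    """
    # Little's Law: mean number of in-flight events L = lambda * W.
    in_flight = arrival_rate * target_latency
    return max(1, math.ceil(in_flight / per_vm_concurrency))

# Example: 5,000 events/s with a 2 s latency target and 1,000 in-flight
# events per VM gives ceil(10000 / 1000) = 10 VMs.
print(required_vms(5000, 2.0, 1000))
```

    In a feedback loop, the arrival rate would be re-measured over a sliding window and the pool resized whenever the recommendation changes.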

    Model-driven development of data intensive applications over cloud resources

    The proliferation of sensors over recent years has generated large amounts of raw data, forming data streams that need to be processed. In many cases, cloud resources are used for such processing, exploiting their flexibility, but these sensor streaming applications often need to support operational and control actions with real-time, low-latency requirements that go beyond the cost-effective and flexible solutions supported by existing cloud frameworks, such as Apache Kafka, Apache Spark Streaming, or Map-Reduce Streams. In this paper, we describe a model-driven, stepwise-refinement methodological approach for streaming applications executed over clouds. The central role is assigned to a set of Petri net models for specifying functional and non-functional requirements. They support model reuse, and a way to combine formal analysis, simulation, and approximate computation of minimal and maximal bounds on non-functional requirements when the problem is either mathematically or computationally intractable. We show how our proposal can assist developers in their design and implementation decisions from a performance perspective. The methodology is intended for all stages of the engineering process and supports performance analysis: we can (i) analyse how an application can be mapped onto cloud resources, and (ii) obtain key performance indicators, including throughput and economic cost, so that developers are assisted in their development tasks and in their decision making. To illustrate our approach, we make use of the pipelined wavefront array.
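    As a hedged illustration of the kind of key performance indicators such an analysis yields, the sketch below bounds the steady-state throughput of a two-stage pipeline by its slowest stage and derives a naive hourly cost. The rates, replica counts, and prices are invented for the example and are not taken from the paper's Petri net models.

```python
# Hedged sketch: estimate throughput and cost for a two-stage streaming pipeline,
# in the spirit of a timed Petri net / queueing analysis. All values are assumptions.

def pipeline_kpis(stage_service_rates, replicas, vm_price_per_hour):
    """stage_service_rates: items/s one replica of each stage can process.
    replicas: number of VM replicas allocated per stage.
    vm_price_per_hour: price of one VM-hour."""
    # Steady-state throughput of a pipeline is bounded by its slowest stage.
    throughput = min(rate * n for rate, n in zip(stage_service_rates, replicas))
    cost_per_hour = sum(replicas) * vm_price_per_hour
    return throughput, cost_per_hour

tput, cost = pipeline_kpis([120.0, 80.0], [1, 2], 0.10)
print(f"throughput = {tput:.0f} items/s, cost = {cost:.2f} $/h")  # 120 items/s, 0.30 $/h
```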

    A Model for Scientific Workflows with Parallel and Distributed Computing

    In the last decade we witnessed an immense evolution of computing infrastructures in terms of processing, storage and communication. On one hand, developments in hardware architectures have made it possible to run multiple virtual machines on a single physical machine. On the other hand, the increase in available network bandwidth has enabled the widespread use of distributed computing infrastructures, for example based on clusters, grids and clouds. These factors have enabled different scientific communities to develop and implement complex scientific applications, possibly involving large amounts of data. However, due to their structural complexity, these applications require decomposition models that allow multiple tasks to run in parallel and distributed environments. The scientific workflow concept arises naturally as a way to model applications composed of multiple activities. In fact, over the past decades many initiatives have been undertaken to model application development using the workflow paradigm, both in the business and in the scientific domains. Despite such intensive efforts, however, current scientific workflow systems and tools still have limitations that pose difficulties for the development of emerging large-scale, distributed and dynamic applications. This dissertation proposes the AWARD model for scientific workflows with parallel and distributed computing. AWARD is an acronym for Autonomic Workflow Activities Reconfigurable and Dynamic. The AWARD model has the following main characteristics. It is based on a decentralized execution control model in which multiple autonomic workflow activities interact by exchanging tokens through input and output ports. The activities can be executed separately in diverse computing environments, such as a single computer or multiple virtual machines running on distributed infrastructures such as clusters and clouds. It provides basic workflow patterns for parallel and distributed application decomposition, as well as other useful patterns supporting feedback loops and load balancing. The model can express applications based on a finite or infinite number of iterations, thus allowing long-running workflows, which are typical in scientific experimentation, to be modeled. A distinctive contribution of the AWARD model is its support for dynamic reconfiguration of long-running workflows. A dynamic reconfiguration makes it possible to modify the structure of the workflow, for example to introduce new activities or to modify the connections between activity input and output ports. The activity behavior can also be modified, for example by dynamically replacing the activity algorithm. In addition to the proposal of a new workflow model, this dissertation presents the implementation of a fully functional software architecture that supports the AWARD model. The implemented prototype was used to validate and refine the model across multiple workflow scenarios, with experimental results demonstrating the advantages of the major characteristics and contributions of the AWARD model. The prototype was also used to develop application cases, such as a workflow supporting an implementation of the MapReduce model and a workflow supporting a text mining application developed by an external user. The extensive experimental work confirmed the adequacy of the AWARD model and its implementation for developing applications that exploit parallelism and distribution using the scientific workflow paradigm.
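    To make the token-passing idea concrete, here is a minimal, hedged sketch of activities that read tokens from input ports and emit results on output ports. The class and method names are illustrative and do not reflect the AWARD prototype's actual API.

```python
# Hedged sketch of token-passing activities: each activity reads tokens from an
# input port, applies its algorithm, and emits tokens on an output port.
import queue
import threading

class Activity(threading.Thread):
    def __init__(self, name, algorithm, in_port, out_port):
        super().__init__(daemon=True)
        self.name, self.algorithm = name, algorithm
        self.in_port, self.out_port = in_port, out_port

    def run(self):
        while True:
            token = self.in_port.get()      # block until a token arrives
            if token is None:               # poison pill ends the activity
                self.out_port.put(None)
                break
            self.out_port.put(self.algorithm(token))

# Two activities wired into a tiny pipeline: double each value, then label it.
source, a_to_b, b_to_out = queue.Queue(), queue.Queue(), queue.Queue()
Activity("double", lambda t: t * 2, source, a_to_b).start()
Activity("label", lambda t: f"result={t}", a_to_b, b_to_out).start()

for t in [1, 2, 3, None]:
    source.put(t)
while (tok := b_to_out.get()) is not None:
    print(tok)   # result=2, result=4, result=6
```

    Because each activity owns only its ports and algorithm, a reconfiguration can rewire ports or swap the algorithm without stopping the rest of the workflow, which is the property the AWARD model emphasises.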

    Distributed computing practice for large-scale science and engineering applications

    It is generally accepted that the ability to develop large-scale distributed applications has lagged seriously behind other developments in cyberinfrastructure. In this paper, we provide insight into how such applications have been developed and an understanding of why developing applications for distributed infrastructure is hard. Our approach is unique in the sense that it is centered around half a dozen existing scientific applications; we posit that these scientific applications are representative of the characteristics, requirements, and challenges of the bulk of current distributed applications on production cyberinfrastructure (such as the US TeraGrid). We provide a novel and comprehensive analysis of such distributed scientific applications. Specifically, we survey existing models and methods for large-scale distributed applications and identify commonalities, recurring structures, patterns and abstractions. We find that many ad hoc solutions are employed to develop and execute distributed applications, which results in a lack of generality and in the inability of distributed applications to be extensible and independent of infrastructure details. In our analysis, we introduce the notion of application vectors: a novel way of understanding the structure of distributed applications. Important contributions of this paper include identifying patterns derived from a wide range of real distributed applications, as well as an integrated approach to analyzing applications, programming systems and patterns, resulting in the ability to provide a critical assessment of the current practice of developing, deploying and executing distributed applications. Gaps and omissions in the state of the art are identified, and directions for future research are outlined.

    Structured Dynamic Reconfiguration of Web Service Workflows

    The concept of a service is a simple but powerful abstraction for representing and integrating entities from different contexts, e.g. business Web services, Grid computational services, or the Internet of Things. Workflow management systems, in turn, are a standard solution for specifying and executing service compositions, for which additional dynamic reconfiguration mechanisms must be guaranteed. These arise from new requirements, e.g. coordinating services that represent quite distinct entities and/or form large-scale applications, or from application requirements that must be incorporated while the workflow is executing. In this work we propose extending a business-oriented workflow tool, i.e. one offering service orchestration according to a control-flow model, with dynamic reconfiguration mechanisms based on the concept of pattern. The implemented patterns follow the dataflow model characteristic of the scientific domain and of event-based systems. The dynamic reconfigurations are well defined and confined, since they are restricted to these patterns.
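    A minimal, hedged sketch of the confinement idea: the only run-time change allowed is swapping the dataflow pattern that routes items between downstream services. The pattern classes and names below are illustrative assumptions, not constructs of the extended workflow tool.

```python
# Hedged sketch: dynamic reconfiguration confined to a routing pattern.

class BroadcastPattern:
    """Send every incoming item to all downstream services."""
    def route(self, item, services):
        return [(s, item) for s in services]

class RoundRobinPattern:
    """Send each incoming item to one downstream service in turn (load balancing)."""
    def __init__(self):
        self.next = 0
    def route(self, item, services):
        target = services[self.next % len(services)]
        self.next += 1
        return [(target, item)]

class WorkflowStage:
    def __init__(self, pattern, services):
        self.pattern, self.services = pattern, services
    def reconfigure(self, pattern):       # reconfiguration touches only the pattern
        self.pattern = pattern
    def dispatch(self, item):
        return self.pattern.route(item, self.services)

stage = WorkflowStage(BroadcastPattern(), ["svcA", "svcB"])
print(stage.dispatch("event-1"))          # both services receive the item
stage.reconfigure(RoundRobinPattern())    # dynamic reconfiguration at run time
print(stage.dispatch("event-2"))          # only one service receives it
```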

    Dynamic adaptation of interaction models for stateful web services

    Dissertation submitted for the degree of Master in Informatics Engineering. Wireless Sensor Networks (WSNs) are accepted as one of the fundamental technologies for current and future science in all domains, where WSNs formed from either static or mobile sensor devices allow low-cost, high-resolution sensing of the environment. This opens the possibility of developing new kinds of crucial applications or providing more accurate data to more traditional ones. Examples range from large-scale WSNs deployed on oceans contributing to weather prediction simulations; to large numbers of diverse sensor devices deployed over a geographical area at different heights from the ground, collecting more accurate data for cyclic wildfire spread simulations; or to networks of mobile phone devices contributing to urban traffic management via participatory sensing applications. In order to simplify data access, network parameterisation, and WSN aggregation, WSNs have been integrated into Web environments, namely through high-level standard interfaces such as Web services. However, the typical interface access usually supports a restricted number of interaction models, and the available mechanisms for their run-time adaptation are still scarce. Nevertheless, applications demand richer and more flexible control over interface accesses; for example, such accesses may depend on contextual information and, consequently, may evolve over time. Additionally, Web services have become increasingly popular in recent years, and their usage has led to the need to aggregate and coordinate them, and also to represent state between Web service invocations. Current standard composition languages for Web services (WS-BPEL, WSCI, BPML) deal with the traditional forms of service aggregation and coordination, while the WS-Resource Framework (WSRF) deals with accessing services with state concerns (relating both to executing applications and to the runtime environment). Underlying the notion of service coordination is the need to capture dependencies among services (through the workflow concept, for instance), to reuse common interaction models, e.g. embodied in common behavioural patterns such as Client/Server, Publish/Subscriber, and Stream, and to respond to dynamic events in the system (novel user requests, service failures, etc.). Dynamic adaptation, in particular, is a pressing requirement for current service-based systems due to the increasing trend towards XaaS ("everything as a service"), which promises to reduce the costs of application development and infrastructure support, as is already apparent in the Cloud computing domain. Self-adaptive (or dynamic/adaptive) systems therefore present themselves as a solution to the above concerns; however, since they comprise a vast area, this thesis focuses only on self-adaptive software. Concretely, we propose a novel model for dynamic interactions, in particular with Stateful Web Services, i.e. services interfacing continued activities. The solution consists of a middleware prototype based on pattern abstractions, able to provide (novel) richer interaction models and a few structured dynamic adaptation mechanisms, which are captured in the context of a "Session" abstraction. The middleware was implemented on top of a pre-existing framework supporting Web-enabled access to WSNs, and evaluation scenarios were tested in this setting. This area was chosen as the application domain that contextualises this work because it contributes to the development of increasingly important applications needing high-resolution, low-cost sensing of the environment. The result is a novel way to specify richer and dynamic modes of accessing and acquiring data generated by WSNs. This work was partially funded by the Centro de Informática e Tecnologias da Informação (CITI) and by the Fundação para a Ciência e a Tecnologia (FCT/MCTES) through research projects.
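    A minimal, hedged sketch of the Session idea: a session wraps access to a stateful service and lets the client adapt the interaction model (request/response versus a periodic stream) at run time. All class and method names are illustrative assumptions, not the middleware's API.

```python
# Hedged sketch: a Session whose interaction pattern can be adapted dynamically.
import time
from typing import Callable

class Session:
    def __init__(self, fetch: Callable[[], dict]):
        self.fetch = fetch              # stateful service access, e.g. a WSN gateway call
        self.pattern = "request_response"

    def reconfigure(self, pattern: str):
        """Dynamically adapt the interaction model for this session."""
        self.pattern = pattern

    def interact(self, on_data: Callable[[dict], None], samples: int = 3, period: float = 1.0):
        if self.pattern == "request_response":
            on_data(self.fetch())       # one synchronous read
        elif self.pattern == "stream":
            for _ in range(samples):    # periodic pull, emulating a data stream
                on_data(self.fetch())
                time.sleep(period)

# Usage with a fake sensor reading:
session = Session(fetch=lambda: {"temperature": 21.5})
session.interact(print)                           # single request/response
session.reconfigure("stream")                     # run-time adaptation
session.interact(print, samples=2, period=0.1)    # stream of two readings
```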

    BRAHMA(+): A Framework for Resource Scaling of Streaming and ASAP Time-Varying Workflows

    Automatic scaling of complex software-as-a-service application workflows is one of the most important problems concerning resource management in clouds. In this paper, we study the automatic workflow resource scaling problem for streaming and ASAP workflows, and its time-varying variant where the workflow resource requirements change over time. Service components of streaming workflows execute concurrently, while those of ASAP workflows execute sequentially. We propose an intelligent framework, BRAHMA(+), which can learn the workflow behavior and construct a knowledge base that serves as its decision-making engine. The proposed resource provisioning algorithms leverage this learned information curated in the knowledge base to make informed and intelligent scaling decisions. Additionally, BRAHMA(+) employs online-learning strategies to keep the knowledge base up to date, thereby accommodating changes in the workflow resource requirements over time. We evaluate the proposed algorithms using CloudSim simulations. Results on streaming and ASAP workflows, with both static and time-varying resource requirements, show that the proposed algorithms are effective and produce good cost-quality trade-offs. The proactive and hybrid algorithms meet the service level agreements and restrict deadline violations to a small fraction (3%-5% in the considered scenarios), while suffering only a marginal increase in average cost per component compared to the baseline algorithms described.
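    As a hedged illustration of how a learned knowledge base can back proactive scaling decisions, the sketch below records which VM counts met deadlines for past workload levels and recommends an allocation for a predicted load. The bucketing scheme and names are illustrative assumptions, not BRAHMA(+)'s actual algorithms.

```python
# Hedged sketch: a simple online-updated knowledge base for proactive scaling.

class KnowledgeBase:
    def __init__(self):
        self.entries = {}                      # workload bucket -> cheapest VM count that met the deadline

    def update(self, workload, vms, deadline_met):
        bucket = round(workload, -2)           # coarse bucket, e.g. nearest 100 req/s
        if deadline_met:
            best = self.entries.get(bucket)
            self.entries[bucket] = vms if best is None else min(best, vms)

    def recommend(self, predicted_workload, default=1):
        bucket = round(predicted_workload, -2)
        return self.entries.get(bucket, default)

kb = KnowledgeBase()
kb.update(workload=480, vms=4, deadline_met=True)   # learned from a past run
kb.update(workload=480, vms=3, deadline_met=True)   # a cheaper allocation also worked
print(kb.recommend(500))                            # -> 3 VMs for a similar predicted load
```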

    Exploring power behaviors and trade-offs of in-situ data analytics

    As scientific applications target exascale, challenges related to data and energy are becoming dominant concerns. For example, coupled simulation workflows are increasingly adopting in-situ data processing and analysis techniques to address the costs and overheads of data movement and I/O. However, it is also critical to understand these overheads and the associated trade-offs from an energy perspective. The goal of this paper is to explore data-related energy/performance trade-offs for end-to-end simulation workflows running at scale on current high-end computing systems. Specifically, this paper presents: (1) an analysis of the data-related behaviors of a combustion simulation workflow with an in-situ data analytics pipeline, running on the Titan system at ORNL; (2) a power model based on system power and data exchange patterns, which is empirically validated; and (3) the use of the model to characterize the energy behavior of the workflow and to explore energy/performance trade-offs on current as well as emerging systems.
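    As a hedged illustration of the kind of accounting such a power model supports, the sketch below estimates workflow energy as compute time at node power plus data-movement time at transfer power, and compares an in-situ configuration against shipping raw data out for analysis. All coefficients are invented for the example; they are not the paper's validated model.

```python
# Hedged sketch: energy = compute time * node power + transfer time * transfer power.

def workflow_energy_joules(compute_seconds, node_power_watts,
                           bytes_moved, bandwidth_bytes_per_s, transfer_power_watts):
    compute_energy = compute_seconds * node_power_watts
    transfer_time = bytes_moved / bandwidth_bytes_per_s
    transfer_energy = transfer_time * transfer_power_watts
    return compute_energy + transfer_energy

# In-situ analysis spends a little more compute but moves far less data (made-up numbers):
in_situ  = workflow_energy_joules(620, 300, 2e9,  1e9, 250)   # analysis co-located with simulation
post_hoc = workflow_energy_joules(600, 300, 2e11, 1e9, 250)   # raw data shipped out for analysis
print(f"in-situ: {in_situ:.0f} J, post-hoc: {post_hoc:.0f} J")
```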