231 research outputs found

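    Providing robustness to workflow schedulers sensitive to uncertainties in the available bandwidth (original title: Provendo robustez a escalonadores de workflows sensíveis às incertezas da largura de banda disponível)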

    Advisors: Edmundo Roberto Mauro Madeira, Luiz Fernando Bittencourt. Thesis (doctorate), Universidade Estadual de Campinas, Instituto de Computação. Summary (translated from Portuguese): For schedulers of scientific applications modelled as workflows to derive efficient schedules in hybrid clouds, they must be given, in addition to a description of the applications' computational demand, information on the computing power of the available resources, especially data on the available bandwidth. However, the imprecision of measurement tools causes the available-bandwidth information supplied to schedulers to differ from the real values that should be considered to obtain near-optimal schedules. Schedulers designed specifically for hybrid clouds simply ignore such imprecision and end up producing misleading, low-performance schedules, which makes them sensitive to uncertain information. This thesis introduces a proactive procedure that provides a certain level of robustness to schedules derived from schedulers not designed to be robust to the uncertainties arising from imprecise information given by network measurement tools. To turn uncertainty-sensitive schedules into schedules that are robust to these imprecisions, the procedure refines (deflates) the bandwidth estimates before the non-robust scheduler uses them. With refined estimates of the available bandwidth, schedulers that were initially sensitive to uncertainty began to produce schedules with a certain level of robustness to these imprecisions. The effectiveness and efficiency of the proposed procedure are evaluated through simulation. Schedules generated by schedulers using the proposed procedure are compared with those produced by the same schedulers without it. The simulation results show that the proposed procedure can provide robustness to bandwidth-information uncertainty for schedules derived from schedulers that are not robust to it. In addition, this thesis proposes a scheduler for scientific applications composed of a set of workflows. The novelty of this scheduler is that it is flexible, that is, it allows different categories of objective functions to be used. Although this flexibility is new to the state of the art, the scheduler is also sensitive to bandwidth imprecision. However, the procedure proved able to make it robust to such uncertainties. The thesis shows that the proposed procedure increased the effectiveness and efficiency of non-robust workflow schedulers designed for hybrid clouds, since they began to produce schedules with a certain level of robustness in the presence of uncertain estimates of the available bandwidth. The proposed procedure is therefore an important tool for improving schedulers that are sensitive to uncertain bandwidth estimates and designed for a computing environment in which these values are imprecise by nature. This thesis thus proposes a procedure that improves the execution of scientific applications in hybrid clouds.
Abstract: To derive efficient schedules for the tasks of scientific applications modelled as workflows, schedulers need information on the application demands as well as on resource availability, especially the available bandwidth.
However, to achieve near-optimal schedules, the scheduler should take into account the lack of precision of the bandwidth estimates provided by monitoring and measurement tools. Uncertainty about the available bandwidth can result from imprecise network measurement and monitoring tools and/or from their inability to estimate, during the scheduling step, the real bandwidth that will be available to the application. Schedulers specially designed for hybrid clouds simply ignore the inaccuracies of the given estimates and end up producing non-robust, low-performance schedules, which makes them sensitive to the uncertainties stemming from these networking tools. This thesis introduces a proactive procedure that provides a certain level of robustness for schedules derived from schedulers that were not designed to be robust to uncertain bandwidth estimates produced by unreliable networking tools. To make non-robust schedulers robust, the procedure deflates the imprecise bandwidth estimates before they are used as input to the schedulers. With refined (deflated) estimates of the available bandwidth, non-robust schedulers that were initially sensitive to these uncertainties started to produce robust schedules that are insensitive to the inaccuracies. The effectiveness and efficiency of the procedure in providing robustness to non-robust schedulers are evaluated through simulation. Schedules generated by schedulers made robust through the procedure are compared with those produced by the original, sensitive schedulers. In addition, this thesis introduces a flexible scheduler for a special case of scientific applications modelled as a set of workflows grouped into ensembles. Although the novelty of this scheduler is that its objective function can be replaced according to the user's needs, it is still a non-robust scheduler.
However, the procedure was able to provide the robustness this flexible scheduler needs to produce robust schedules under uncertain bandwidth estimates. The thesis shows that the proposed procedure enhanced the robustness of workflow schedulers designed especially for hybrid clouds, as they started to produce robust schedules in the presence of uncertainties stemming from networking tools. The proposed procedure is therefore an important tool for furnishing robustness to non-robust schedulers originally designed for a computational environment in which bandwidth is very likely to vary and cannot be estimated precisely in advance, and it thereby improves the execution of scientific applications in hybrid clouds. Degree: Doctorate (Doutorado), Computer Science (Ciência da Computação), Doutor em Ciência da Computação. Grant: 2012/02778-6, FAPES
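The deflation step described in this abstract can be illustrated with a minimal sketch. It is not the thesis's procedure: the abstract does not say how the deflation amount is chosen, so `deflation_factor`, `deflate`, and `transfer_time_s` are hypothetical names and the fixed multiplicative factor is an assumption made only for illustration.

```python
def deflate(estimate_mbps: float, deflation_factor: float) -> float:
    """Shrink a measured bandwidth estimate before handing it to a scheduler.

    deflation_factor is a hypothetical parameter in [0, 1); how it should be
    chosen is not stated in the abstract.
    """
    if not 0.0 <= deflation_factor < 1.0:
        raise ValueError("deflation_factor must be in [0, 1)")
    return estimate_mbps * (1.0 - deflation_factor)


def transfer_time_s(data_megabytes: float, bandwidth_mbps: float) -> float:
    """Time to move data over a link: megabytes -> megabits, then divide."""
    return data_megabytes * 8.0 / bandwidth_mbps


measured = 100.0                          # Mbit/s reported by a measurement tool
conservative = deflate(measured, 0.25)    # 75.0 Mbit/s seen by the scheduler

# The scheduler now plans for a slower inter-cloud transfer of a 75 MB file.
optimistic_plan = transfer_time_s(75.0, measured)        # 6.0 s
conservative_plan = transfer_time_s(75.0, conservative)  # 8.0 s
```

Planning with the deflated value makes the scheduler budget more time for data transfers, so the schedule is less likely to miss its deadline when the real bandwidth turns out lower than measured.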

    Maestro: Achieving scalability and coordination in centralized network control plane

    A modern network control plane that supports versatile communication services (e.g., performance differentiation, access control, virtualization) is highly complex. Different control components, such as routing protocols, security policy enforcers, resource allocation planners, and quality of service modules, interact with each other in the control plane to realize complicated control objectives. These components need to coordinate their actions, and sometimes they have conflicting goals that require careful handling. Furthermore, many of these existing components are distributed protocols running on a large number of network devices. Because protocol state is distributed in the network, it is very difficult to tightly coordinate the actions of these distributed control components, so inconsistent control actions can create serious problems in the network. As a result, it is difficult to ensure optimality and consistency among all the components. To address this complexity, researchers have proposed different approaches, among which the centralized control plane architecture has become widely accepted as a key to solving the problem. By centralizing the control functionality in a single management station, we can minimize the state distributed in the network and thus better control the consistency of that state. However, the centralized architecture has fundamental limitations. First, it is more difficult to scale up to large network sizes or high request rates. In addition, it is equally important to service requests fairly and maintain low request-handling latency while sustaining highly scalable throughput. Second, centralized routing control is neither as responsive nor as robust to failures as distributed routing protocols.
One approach to enhancing responsiveness and robustness is to coordinate the centralized control plane with distributed routing protocols. In this thesis, we develop a centralized network control system, called Maestro, to address these fundamental limitations of the centralized network control plane. First, we use Maestro as the central controller for a flow-based routing network, in which a large number of requests are sent to the controller at a very high rate for processing. Such a network requires the central controller to be extremely scalable. Using Maestro, we systematically explore and study multiple design choices to optimally utilize modern multi-core processors, to fairly distribute computation resources, and to efficiently amortize unavoidable overhead. We show that a Maestro design based on the abstraction that each individual thread services switches in a round-robin manner can achieve excellent throughput scalability while maintaining far superior and near-optimal max-min fairness. At the same time, Maestro's workload-adaptive request batching achieves low latency even at high throughput. Second, we use Maestro to coordinate centralized controls with distributed routing protocols in a network, realizing a hybrid control plane framework that is more responsive and robust than a purely centralized control plane, and more globally optimized and consistent than a purely distributed control plane. In effect, we get the advantages of both the centralized and the distributed solutions. Through experimental evaluations, we show that such coordination between the centralized controls and distributed routing protocols can improve the SLA compliance of the entire network.

    Queue-priority optimized algorithm: a novel task scheduling for runtime systems of application integration platforms

    The need to integrate applications and services into enterprise business processes has increased with the advance of cloud and mobile applications. Enterprises now deal with high volumes of data from the cloud and from mobile applications, in addition to their own. Integration tools must therefore adapt to handle high volumes of data and to exploit the scalability of cloud computational resources without increasing enterprise operating costs. Integration platforms are tools that integrate enterprises' applications through integration processes, which are workflows composed of a set of atomic tasks connected through communication channels. Many integration platforms schedule tasks for execution by computational resources using the first-in-first-out heuristic. This article proposes a Queue-priority algorithm that uses a novel heuristic and tackles high volumes of data in the task scheduling of integration processes. The heuristic is optimized with Particle Swarm Optimization. The results of our experiments were confirmed by statistical tests and validated the proposal as a feasible alternative for improving how integration platforms execute integration processes under a high volume of data.

    Automatic Latency Management for ROS 2: Benefits, Challenges, and Open Problems

    Wireless Data Acquisition for Edge Learning: Data-Importance Aware Retransmission

    By deploying machine-learning algorithms at the network edge, edge learning can leverage the enormous real-time data generated by billions of mobile devices to train AI models, which enable intelligent mobile applications. In this emerging research area, one key direction is to efficiently utilize radio resources for wireless data acquisition to minimize the latency of executing a learning task at an edge server. Along this direction, we consider the specific problem of deciding on retransmission in each communication round so as to ensure both the reliability and the quantity of the training data and thereby accelerate model convergence. To solve the problem, a new retransmission protocol called data-importance aware automatic-repeat-request (importance ARQ) is proposed. Unlike classic ARQ, which focuses merely on reliability, importance ARQ selectively retransmits a data sample based on its uncertainty, which helps learning and can be measured using the model under training. Underpinning the proposed protocol is an elegant communication-learning relation derived between two corresponding metrics, namely signal-to-noise ratio (SNR) and data uncertainty. This relation facilitates the design of a simple threshold-based policy for importance ARQ. The policy is first derived for the classic support vector machine (SVM) classifier, where the uncertainty of a data sample is measured by its distance to the decision boundary. The policy is then extended to the more complex model of convolutional neural networks (CNN), where data uncertainty is measured by entropy. Extensive experiments have been conducted for both the SVM and CNN using real datasets with balanced and imbalanced distributions.
Experimental results demonstrate that importance ARQ effectively copes with channel fading and noise in wireless data acquisition to achieve faster model convergence than the conventional channel-aware ARQ. Comment: This is an updated version: 1) extension to general classifiers; 2) consideration of imbalanced classification in the experiments. Submitted to IEEE Journal for possible publication.
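The two uncertainty measures named in this abstract, and the idea of a threshold-based retransmission rule, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the derived SNR-uncertainty relation, so the linear threshold in `should_retransmit` and its parameters `base_threshold_db` and `slope_db` are hypothetical, as are all function names.

```python
import math
from typing import Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a predicted class distribution.

    Higher entropy means the model is less certain about the sample.
    """
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def svm_distance(weights: Sequence[float], bias: float,
                 x: Sequence[float]) -> float:
    """Distance of sample x to a linear SVM decision boundary w.x + b = 0.

    A smaller distance means a more uncertain sample.
    """
    norm = math.sqrt(sum(w * w for w in weights))
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return abs(score) / norm


def should_retransmit(received_snr_db: float, uncertainty: float,
                      base_threshold_db: float, slope_db: float) -> bool:
    """Hypothetical rule: the SNR a sample must reach grows with its uncertainty.

    The linear form is an assumption, not the relation derived in the paper.
    """
    return received_snr_db < base_threshold_db + slope_db * uncertainty


uncertain = entropy([0.5, 0.5])   # maximally uncertain binary prediction
confident = entropy([1.0, 0.0])   # fully confident prediction
retry_uncertain = should_retransmit(5.0, uncertain, 4.0, 2.0)
retry_confident = should_retransmit(5.0, confident, 4.0, 2.0)
```

With these illustrative numbers, the uncertain sample is retransmitted at an SNR at which the confident sample is accepted.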

    Runtime Adaptation of Scientific Service Workflows

    Software landscapes are subject to change rather than being complete once they have been built. Changes may be caused by modified customer behavior, a shift to new hardware resources, or otherwise changed requirements. In such situations, several challenges arise. New architectural models have to be designed and implemented, existing software has to be integrated, and, finally, the new software has to be deployed, monitored, and, where appropriate, optimized during runtime under realistic usage scenarios. All of these situations often demand manual intervention, which makes them error-prone. This thesis addresses these types of runtime adaptation. Based on service-oriented architectures, an environment is developed that enables the integration of existing software (i.e., the wrapping of legacy software as web services). A workflow modeling tool is presented that aims at ease of use by separating the role of the workflow expert from that of the domain expert. For use after workflows have been developed, tools are presented that observe the executing infrastructure and perform automatic scale-in and scale-out operations. Infrastructure-as-a-Service providers are used to scale the infrastructure in a transparent and cost-efficient way. The necessary middleware tools are deployed automatically. The use of a distributed infrastructure can lead to communication problems. To keep workflows robust, these exceptional cases need to be treated. But in this way, the process logic of a workflow gets mixed up and bloated with infrastructural details, which increases its complexity. In this work, a module is presented that deals automatically with infrastructural faults and thereby makes it possible to keep these two layers separate. When services or their components are hosted in a distributed environment, some requirements need to be addressed at each service separately.
Techniques such as object-oriented programming or design patterns like the interceptor pattern ease the adaptation of service behavior or structures. Still, these methods require modifying the configuration or the implementation of each individual service. On the other hand, aspect-oriented programming allows functionality to be woven into existing code even without access to its source. Since the functionality needs to be woven into the code, it depends on the specific implementation. In a service-oriented architecture, where the implementation of a service is unknown, this approach clearly has its limitations. The request/response aspects presented in this thesis overcome this obstacle and provide a new, SOA-compliant method for weaving functionality into the communication layer of web services. The main contributions of this thesis are the following. Shifting towards a service-oriented architecture: the generic and extensible Legacy Code Description Language and the corresponding framework make it possible to wrap existing software, e.g., as web services, which can afterwards be composed into a workflow with SimpleBPEL without overburdening the domain expert with technical details that are instead handled by a workflow expert. Runtime adaptation: based on the standardized Business Process Execution Language, an automatic scheduling approach is presented that monitors all used resources and automatically provisions new machines when a scale-out becomes necessary. If the load on the resources drops, e.g., because of fewer workflow executions, a scale-in is also performed automatically. The scheduling algorithm takes the data transfer between the services into account in order to prevent allocations that would increase the workflow's makespan through unnecessary or disadvantageous data transfers.
Furthermore, a multi-objective scheduling algorithm based on a genetic algorithm can additionally consider cost, so that users can define their own preferences, ranging from optimized workflow execution times to minimized costs. Possible communication errors are automatically detected and, subject to certain constraints, corrected. Adaptation of communication: the presented request/response aspects make it possible to weave functionality into the communication of web services. Because the pointcut language relies only on the exchanged documents, the implementation of the services need be neither known nor available. The weaving process itself is modeled using web services. In this way, the concept of request/response aspects is naturally embedded into a service-oriented architecture.

    Possibilistic decision theory: from theoretical foundations to influence diagrams methodology

    Decision making is a multidisciplinary field related to several disciplines such as economics and operations research. Expected utility theory has been proposed to model and solve decision problems. This theory has been called into question by several paradoxes (Allais, Ellsberg) that exposed the limits of its applicability. Moreover, the probabilistic framework it uses is not appropriate in certain situations (total ignorance, qualitative uncertainty).
To overcome these limitations, several studies have been developed, based on the one hand on the use of Choquet and Sugeno integrals as decision criteria and on the other hand on the use of an uncertainty theory other than probability theory to model uncertainty. Our main idea is to combine these two lines of research to develop, within the framework of sequential decision making, decision models based on Choquet integrals as decision criteria and on possibility theory to represent uncertainty. Our goal is to develop graphical decision models, which are compact and simple models for decision making in a possibilistic context. We are particularly interested in possibilistic decision trees and influence diagrams and in their evaluation algorithms.
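As a small worked illustration of the decision criterion named in this abstract, the discrete Choquet integral of a utility function can be computed with respect to a possibility or necessity measure. This sketch uses the standard textbook definitions and is not taken from the thesis; the function names, the states `a` and `b`, and the numbers are all hypothetical.

```python
from typing import Callable, Dict, Iterable


def possibility(pi: Dict[str, float], event: Iterable[str]) -> float:
    """Possibility of an event: the maximum possibility degree of its states."""
    return max((pi[s] for s in event), default=0.0)


def necessity(pi: Dict[str, float], event: Iterable[str]) -> float:
    """Necessity of an event: 1 minus the possibility of its complement."""
    complement = set(pi) - set(event)
    return 1.0 - possibility(pi, complement)


def choquet(utility: Dict[str, float],
            capacity: Callable[[Iterable[str]], float]) -> float:
    """Discrete Choquet integral of utility with respect to a capacity.

    States are sorted by increasing utility; each increment in utility is
    weighted by the capacity of the set of states that reach at least it.
    """
    states = sorted(utility, key=utility.get)
    total, previous = 0.0, 0.0
    for i, state in enumerate(states):
        at_least = states[i:]   # states with utility >= utility[state]
        total += (utility[state] - previous) * capacity(at_least)
        previous = utility[state]
    return total


pi = {"a": 1.0, "b": 0.5}        # possibility distribution over states
utility = {"a": 10.0, "b": 20.0}

optimistic = choquet(utility, lambda e: possibility(pi, e))   # 15.0
pessimistic = choquet(utility, lambda e: necessity(pi, e))    # 10.0
```

The possibility-based value weights the better outcome by how possible it is, while the necessity-based value counts only the utility that is certain to be reached.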