16 research outputs found

    Management of generic and multi-platform workflows for exploiting heterogeneous environments on e-Science

    Scientific workflows (SWFs) are widely used to model applications in e-Science. In this programming model, a scientific application is described as a set of tasks with dependencies among them. Over the last decades, scientific workflows have been executed successfully on the available computing infrastructures (supercomputers, clusters and grids) using Workflow Management Systems (WMSs), software that orchestrates the workload on top of these infrastructures. However, because each computing infrastructure has its own architecture and each scientific application exploits one of these infrastructures efficiently, it is necessary to organize the way in which applications are executed. WMSs need to get the most out of all the available computing and storage resources. Traditionally, scientific workflow applications have been deployed extensively on high-performance computing infrastructures (such as supercomputers and clusters) and grids. In recent years, however, the advent of cloud computing has opened the door to using on-demand infrastructures to complement or even replace local ones. This raises new issues, such as the integration of hybrid resources and the trade-off between infrastructure reuse and elasticity, all judged on cost-efficiency. The main contribution of this thesis is an ad-hoc solution for managing workflows that exploits the capabilities of cloud computing orchestrators to deploy resources on demand according to the workload, and that combines heterogeneous cloud providers (such as on-premise and public clouds) with traditional infrastructures (supercomputers and clusters) to minimize cost and response time. The thesis does not propose yet another WMS; it demonstrates the benefits of integrating cloud orchestration when running complex workflows. It presents experiments with several configurations and multiple heterogeneous backends, using a realistic comparative genomics workflow called Orthosearch, in which memory-intensive workload is migrated to public infrastructures while other blocks of the experiment keep running locally. The running time and cost of the experiments are computed and best practices are suggested.
    Carrión Collado, AA. (2017). Management of generic and multi-platform workflows for exploiting heterogeneous environments on e-Science [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/86179
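The programming model described in the abstract above, a set of tasks with dependencies among them, can be illustrated with a small dependency graph. The following is only a sketch: the task names are hypothetical and are not taken from the Orthosearch workflow, and a real WMS does far more than compute an ordering.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
workflow = {
    "fetch_genomes": set(),
    "align": {"fetch_genomes"},
    "cluster_orthologs": {"align"},
    "report": {"align", "cluster_orthologs"},
}


def execution_order(tasks):
    """Return one valid order in which every task follows its dependencies."""
    return list(TopologicalSorter(tasks).static_order())


order = execution_order(workflow)
print(order)
```

A scheduler could walk this order and dispatch each task to whichever backend suits it, which is the placement decision the thesis studies.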

    Development of high performance computing cluster for evaluation of sequence alignment algorithms

    As biological databases grow rapidly, biologists and computer scientists face the challenge of developing algorithms and databases to manage the increasing volume of data. Many algorithms have been developed to align the sequences stored in biological databases; some are slow, while others fail to produce reasonable results. As more data is generated and time-consuming algorithms are developed to handle it, specialized computers are needed for the computations, and researchers are typically limited by the computational power of their own machines. The field of High Performance Computing (HPC) addresses this challenge and can do so cost-effectively: instead of expensive equipment, old computers can be combined into a powerful system. This is the premise of this research, which explores the setup of a low-cost Beowulf cluster and then evaluates its performance in running sequence alignment algorithms. The dissertation uses a mixed-method methodology consisting of a literature study and theoretical and practice-based work. It includes a proof of concept in which the Beowulf cluster is designed and implemented to run the sequence alignment algorithms and the performance tests. The dissertation first gives an overview of existing sequence alignment algorithms and their timeline. It then presents the design and implementation of the Beowulf cluster, followed by experiments on the cluster's baseline performance. A detailed timeline of the sequence alignment algorithms and a comparison between ClustalW-MPI and T-Coffee (Tree-based Consistency Objective Function For alignment Evaluation) are presented as part of the findings. The observed efficiency of the cluster was 19.8%, unexpectedly far below the 83.3% predicted by the theoretical cluster calculator. The gap between theoretical and experimental performance is attributed to the slow 100 Mbps network, the low processor speed of 2.50 GHz, and the low memory of 2 GB.
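The abstract does not say how the efficiency figure was computed. A common convention for clusters is the ratio of measured to theoretical peak performance, sketched below under that assumption. The node count, cores per node, floating-point operations per cycle, and measured throughput are all invented for illustration; only the 2.50 GHz clock speed comes from the abstract.

```python
def peak_gflops(nodes, cores_per_node, clock_ghz, flops_per_cycle):
    """Theoretical peak throughput of a homogeneous cluster, in GFLOPS."""
    return nodes * cores_per_node * clock_ghz * flops_per_cycle


def efficiency(measured_gflops, theoretical_gflops):
    """Fraction of the theoretical peak that a benchmark actually achieved."""
    return measured_gflops / theoretical_gflops


# Hypothetical cluster: 4 nodes, 2 cores each, 2.5 GHz, 4 FLOPs per cycle.
peak = peak_gflops(nodes=4, cores_per_node=2, clock_ghz=2.5, flops_per_cycle=4)

# Hypothetical benchmark result, chosen only to show the size of the ratio.
measured = 15.84
print(f"peak = {peak} GFLOPS, efficiency = {efficiency(measured, peak):.1%}")
```

Under this convention, a slow interconnect lowers the measured figure without changing the peak, which is consistent with the abstract attributing the gap partly to the 100 Mbps network.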

    Microgrids: Planning, Protection and Control

    This Special Issue will include papers related to the planning, protection, and control of smart grids and microgrids, and their applications in industry, transportation, water, waste, and urban and residential infrastructure. Authors are encouraged to present their latest research; reviews on topics including methods, approaches, systems, and technology; and interfaces to other domains such as big data, cybersecurity, human–machine, sustainability, and smart cities. The planning side of microgrids might include technology selection, scheduling, interconnected microgrids, and their integration with regional energy infrastructures. The protection side of microgrids might include topics related to protection strategies, risk management, protection technologies, abnormal scenario assessments, equipment and system protection layers, fault diagnosis, validation and verification, and intelligent safety systems. The control side of smart grids and microgrids might include control strategies, intelligent control algorithms and systems, control architectures, technologies, embedded systems, monitoring, and deployment and implementation.

    Cost-effective resource management for distributed computing

    Current distributed computing and resource management infrastructures (e.g., clusters and grids) suffer from a wide variety of resource management problems, including scalability bottlenecks, resource allocation delays, limited quality-of-service (QoS) support, and a lack of cost-aware and service level agreement (SLA) mechanisms. This thesis addresses these issues with a cost-effective resource management solution in which geographically distributed resources are managed in resource units under the control of a Virtual Authority (VA). A VA is a collection of resources controlled, but not necessarily owned, by a group of users or by an authority representing a group of users. It leverages the fact that resources in disparate locations have varying usage levels. Dividing resources into smaller units called VAs gives users a choice between a variety of cost models, and each VA can rent resources from resource providers when necessary or rent out its own resources when underloaded. Resource management is simplified because the user and the owner of a resource deal only with the VA: all permissions and charges are associated directly with it. The VA is controlled by a "rental" policy, supported by a pool of resources that the system may rent from external resource providers. As far as scheduling is concerned, the VA is independent of competitors and can concentrate on managing its own resources. As a result, the VA offers scalable resource management with minimal infrastructure and operating costs. We demonstrate the feasibility of the VA through a practical implementation of a prototype system and illustrate its quantitative advantages through extensive simulations. We also perform a cost-benefit analysis of current distributed resource infrastructures to demonstrate the potential cost benefit of a VA system. We then propose a costing model for evaluating the cost-effectiveness of the VA approach, using an economic approach that captures the revenue generated from applications and the expenses incurred from renting resources. Based on this costing methodology, we present rental policies that can potentially offer effective mechanisms for running distributed and parallel applications without a heavy upfront investment and without the cost of maintaining idle resources. We test the effectiveness of the proposed rental approaches using real workload trace data. Finally, we propose an extension to the VA framework that promotes long-term negotiations and rentals based on service level agreements or long-term contracts. Based on the extended framework, we present new SLA-aware policies and evaluate them using real workload traces to demonstrate their effectiveness in improving rental decisions.
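The abstract describes a costing model that weighs revenue from applications against the expense of renting resources, but it does not give the model itself. The sketch below is one simple reading of that idea, not the thesis's formulation: the `Job` fields, the flat per-node-hour rental price, and the rule of renting only the shortfall beyond owned capacity are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    revenue: float     # income earned when the job completes (hypothetical unit)
    node_hours: float  # compute the job consumes


def profit(jobs, owned_node_hours, rental_price_per_node_hour):
    """Revenue minus the cost of renting whatever owned capacity cannot cover."""
    demand = sum(job.node_hours for job in jobs)
    rented = max(0.0, demand - owned_node_hours)
    revenue = sum(job.revenue for job in jobs)
    return revenue - rented * rental_price_per_node_hour


# Hypothetical workload: 12 node-hours of demand against 10 owned node-hours.
jobs = [Job(revenue=10.0, node_hours=4.0), Job(revenue=6.0, node_hours=8.0)]
print(profit(jobs, owned_node_hours=10.0, rental_price_per_node_hour=0.5))
```

A rental policy in this framing is a rule for when the rented term is worth paying, for example renting only while the marginal revenue of queued jobs exceeds the rental price.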

    Framework for building “business portals” to manage IaaS consumer requests on HP Cloud

    Dissertation submitted for the degree of Master in Informatics Engineering. HP CloudSystem Matrix (CSM) is part of an HP software stack for cloud computing that covers all the service levels considered relevant: IaaS (Infrastructure as a Service), PaaS (Platform as a Service) and SaaS (Software as a Service). Although it is the base of this stack, i.e., it provides the IaaS level, it is an extremely complex product because it interacts with all infrastructures: compute (i.e., physical or virtual servers), storage (from internal disks to disks in storage servers), and interconnect (Ethernet and FC networks). Despite all the complexity of the real and virtual infrastructure it manages, CSM makes delivering infrastructure for application support to consumers conceptually simple: 1) the administrator defines which infrastructure resources are available to be part of the “cloud offering”; 2) the architect defines templates for the architectures considered suitable for consumers' needs (e.g., a 3-tier architecture for an ERP (Enterprise Resource Planning) solution); and 3) the consumer chooses the template that best fits their needs and submits a provisioning request for the infrastructure. The interaction between the different parties (1), (2), (3) and CSM takes place mainly through portals. However, especially in the consumer's case, the portal provided by the product has been considered “complex”, because it presents overly technical information; “rigid”, because it cannot be customized (for example, to suppress the “overly technical information”); and “coarse”, because it does not allow finer specification of the characteristics of the infrastructure to be provisioned (for example, it allows varying the number of CPUs and the amount of memory of a server, but not choosing the technology of the disks to be provisioned, e.g., SSD instead of FC, 15K instead of 10K rpm). Thus, the final goal of the dissertation is to develop a framework that makes it possible, based on an extensible and configurable set of predefined options and on customizable layouts, to define portals that integrate with HP CloudSystem Matrix and give users (consumers) an interaction that is not only simpler but also more versatile. This work covers cloud service and deployment models; virtualization (not only of servers, but also of storage and networks), the cornerstone of all cloud technology; and the modules and APIs available to interoperate with CSM, namely API-MOE and API-VMware. Finally, a framework is presented with a multi-tier (N-tier) architecture implemented with standard technologies: TCP/IP for the communications stack, REST (Representational State Transfer) to govern client/server interaction and information exchange, and XML (Extensible Markup Language) and JSON (JavaScript Object Notation) as data formats.

    Architecture of an SLA controller for federated cloud environments

    Dissertation (master's), Universidade de Brasília, Instituto de Ciências Exatas, Departamento de Ciência da Computação, 2017. With increasing investment in the cloud computing industry and the ever-growing battle for market share, competition among cloud service offerings is becoming ever fiercer. To stay in this market, some cloud providers have come together to form federations. This process involves contracts, which aim to ensure that service providers deliver what they offer. These contracts, called Service Level Agreements (SLAs), identify the parties involved and specify the minimum expectations and obligations between them, aiming to improve the quality of service and the relationship between client and provider. Thus, to better meet the needs of users of federated cloud platforms, this work proposes the implementation of an SLA controller capable of managing service level agreements between federated clouds and the users of a platform. This controller should work dynamically, automatically, transparently and simply. To demonstrate the efficiency of the proposed SLA controller, the BioNimbuZ platform was used as a case study for the implementation.