21 research outputs found

    Tolerância a falhas em sistemas MPI com grupos dinâmicos de processos recomendados e registro de mensagens distribuído baseado em paxos

    Get PDF
    Orientador : Prof. Dr. Elias P. Duarte Jr.Tese (doutorado) - Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defesa: Curitiba, 11/05/2017Inclui referências : f. 93-103Área de concentração : Ciência da computaçãoResumo: Os sistemas HPC (High-Performance Computing) são geralmente empregados para executar aplicações de longa duração, incluindo, por exemplo, simulações científicas e industriais complexas. Construir sistemas HPC tolerante a falhas permanece um desafio à medida que o tamanho desses sistemas aumenta. Esta tese de doutorado apresenta duas estratégias de tolerância a falhas para sistemas HPC baseados em MPI. A primeira contribuição apresenta uma solução para lidar com a variabilidade de desempenho que afeta negativamente ou inviabiliza a execução das aplicações HPC. Este é o caso dos clusters compartilhados onde um nodo computacional pode se tornar muito lento e comprometer a execução de toda a aplicação. Esta tese propõe um novo modelo de diagnóstico em nível de sistema onde os processos executam testes entre si a fim de determinar se são recomendados ou não-recomendados. Os processos classificados como recomendados formam um grupo dinâmico, chamado de DGRP (Dynamic Group of Recommended Processes), e são responsáveis por executar a aplicação. Os processos testados como não-recomendados são removidos do DGRP. Um processo pode reingressar ao DGRP após uma rodada de consenso executada pelos processos do DGRP. O modelo foi implementado e empregado para monitorar os processos em um cluster compartilhado multiusuário. No estudo de caso apresentado, os processos do DGRP executam o algoritmo de ordenação paralela Hyperquicksort. O Hyperquicksort é implementado e adaptado para se reconfigurar em tempo de execução a fim de suportar até n ?? 1 processos não-recomendados (em um sistema com n processos). Os resultados obtidos demonstram a sua eficiência. A segunda contribuição desta tese se insere na técnica de rollback-recovery na sua variante chamada de registro de mensagens. O registro de mensagens não requer a sincronização dos processos para salvar o estado da aplicação e evita que todos os processos reiniciem a partir do último estado salvo. No entanto, a maioria dos protocolos de registro de mensagens conta com um componente centralizado e que não tolera falhas, chamado de event logger, para armazenar as informações de recuperação, isto é, os determinantes. Esta tese de doutorado propõe o primeiro event logger distribuído e tolerante a falhas para os protocolos de registro de mensagens. Duas implementações baseadas no algoritmo de consenso Paxos, chamadas de Paxos Clássico e Paxos Paralelo, foram realizadas para o event logger. Um protocolo pessimista de registro de mensagens é construído e implementado para interagir com o event logger proposto e realizar a recuperação automática das aplicações MPI. O desempenho dos event loggers é avaliado perante a aplicação AMG (Algebraic MultiGrid) e as aplicações do NAS Parallel benchmark. A recuperação é avaliada através do algoritmo paralelo de Gusfield e a aplicação AMG. Resultados demonstram que o event logger baseado em Paxos Paralelo tem desempenho comparável ou superior ao da abordagem centralizada e que o protocolo proposto realiza a recuperação da aplicação eficientemente. Palavras-chave: Tolerância a Falhas em MPI, DGRP, Registro de Mensagens, Paxos Paralelo.Abstract: HPC systems are employed to execute long-running applications including, for example, complex industrial and scientific simulations. Building robust, fault-tolerant HPC systems remains a challenge as the size of the system grows. This doctoral thesis presents two faulttolerant strategies for HPC systems based on MPI. Our first contribution presents a solution to deal with the performance variation of HPC system processes that negatively a_ect or even prevent the execution of HPC applications. This is the case in shared clusters in which a single node can become too slow and can thus compromise the entire application execution. This thesis proposes a new system-level diagnosis model in which processes execute tests among themselves in order to determine whether they are recommended or non-recommended. Processes classified as recommended form a Dynamic Group of Recommended Processes (DGRP), which is responsible for running the application. A process can rejoin the DGRP after a round of consensus executed by the DGRP processes. The model was implemented and used to monitor processes in a shared multi-user cluster. In the case study presented, the DGRP processes execute the parallel sorting algorithm Hyperquicksort. Hyperquicksort is implemented and adapted to reconfigure itself at runtime in order to proceed even if up to N ?? 1 processes become non-recommended (N is the total number of processes). Results are presented showing that the strategy is e_cient. The second contribution of this thesis is in the field of the rollbackrecovery technique in its variant based on message logging. Message logging does not require all processes to coordinate in order to save their states during normal execution. Neither does it require to restart all processes from the last saved states after a single process fails. However, most existing message logging protocols rely on a centralized entity which does not tolerate failures, called event logger, which stores recovery information called determinants. This thesis proposes, to the best of our knowledge, the first distributed and fault-tolerant event logger. Two implementations are presented based on the Paxos consensus algorithm, called Classic Paxos and Parallel Paxos. A pessimistic message logging protocol is built and implemented based on the proposed event logger to perform automatic recovery of MPI applications after failures. We evaluate the performance of the event logger using both the AMG (Algebraic MultiGrid) application and NAS Parallel benchmark applications. Application recovery is evaluated in two case studies based on Gusfield's parallel cut tree algorithm and the AMG application. Results show that the event logger based on Parallel Paxos performs as well as or better than a centralized event logger and that the proposed recovery protocol is also e_cient. Keywords: Fault Tolerance in MPI, DGRP, Message Logging, Parallel Paxos

    Transposição de autenticação em arquiteturas orientadas a serviços através de identidades federadas

    Get PDF
    Dissertação (mestrado) - Universidade Federal de Santa Catarina, Centro Tecnológico. Programa de Pós-Graduação em Engenharia Elétrica.A transposição de autenticação consiste, basicamente, em permitir o acesso a recursos controlados em um domínio diferente daquele do cliente, usando credenciais fornecidas no seu domínio de origem. No entanto, grandes são os empecilhos para a concretização de tais sistemas e vão desde a falta de interoperabilidade entre protocolos, linguagens de programação, plataformas operacionais, utilizadas no seu projeto, até as diferentes infra-estruturas de segurança subjacentes. Além disso, a segurança computacional (computer security) se apresenta como um requisito indispensável neste cenário. Embora Serviços Web (Web Services) se configurem como a tecnologia mais indicada para contornar os desafios da transposição, o grande número de especificações e padrões, destinados a Serviços Web, tanto impõe um certo nível de complexidade quanto dificulta a sua adoção. Algumas propostas ainda são imaturas e carecem de um estudo e implementação que abordem as soluções como um todo para uma boa compreensão do potencial e das limitações da tecnologia. Para que a identidade do usuário seja reconhecida em diversos domínios administrativos, o gerenciamento de identidades federadas desponta como uma solução promissora. Além disso, devido a dinâmica do ambiente dos Serviços Web, no qual serviços muitas vezes são agregados de forma temporária para atender um determinado fluxo de negócios, o gerenciamento de identidade ganha cada vez mais destaque na literatura. O objetivo deste trabalho é propor um modelo para a transposição de autenticação - e a conseqüente autorização descentralizada e transposição de atributos do cliente - entre domínios administrativos com diferentes infra-estruturas de segurança. Neste modelo, cada domínio agrupa clientes e provedores de serviço de acordo com a tecnologia de segurança usada. São assumidos, na construção do modelo, padrões relacionados a tecnologias de Serviços Web e os conceitos de gerenciamento de identidades federadas. Um protótipo é implementado a fim de verificar a aplicabilidade do modelo. Authentication transposition means, basically, to allow access to secure resources at a domain distinct from that client, using credential provided at your original domain. However, it#s considerably hard to conclude those systems because of the lack of interoperability between protocols, programming languages, operational platforms, used in the project, and the distinct underlying security infra-structure. Also, the computer security is a crucial requirement in this scenario. ThoughWeb Services is the more indicated technology to enfold the challenges of transposition, the large amount of specifications and patterns, designated to Web Services, imposes a certain level of complexity and makes it harder to adopt this technology. Some propositions are still immature and need more study and implementation that approaches the solutions as a whole. This will allow a good understanding of the potential and limitation of the technology. For the users# identity to be acknowledged in various administration domains, managing federated domains becomes a promising solution. Besides, because of the Web Services environments dynamic, which services usually are aggregated temporarily to attend a specific business flow, identity managing is cited more and more in the literature. The purpose of this thesis is to propose a model for authentication transposition - and the consequent decentralized authorization and transposition of the client attributes. In this model, each domain groups clients and services providers according to the security technology in use. In the model definition, Web Services patterns and federated identity management concepts are used. A prototype is implemented in order to verify the applicability model

    Implementação de Difusão Atômica Baseada em Diagnóstico com Testes Imperfeitos

    Get PDF
    Este trabalho descreve duas implementações da difusão atômica em um conjunto de processos identificados como estáveis. A primeira implementação da difusão atômica é baseada no algoritmo de consenso Paxos. A segunda implementação assume um processo estável como sequenciador fixo. Os processos estáveis são identificados e mantidos por meio de uma estratégia de testes definida em um novo modelo de diagnóstico em nível de sistema que se apoia em testes ditos imperfeitos. Este trabalho descreve a implementação da estratégia de testes. Resultados de simulação mostram o desempenho da estratégia de seleção de nodos estáveis, bem como uma comparação experimental do número de mensagens e a latência das duas implementações da difusão atômica

    Produtividade da soja em associação ao fungo micorrízico arbuscular Rhizophagus clarus cultivada em condições de campo

    Get PDF
    Soybeans are key to agribusiness progress, but their production can be affected by climate change. Thus, alternatives that increase the yield of the plants under adverse conditions are fundamental, and the arbuscular mycorrhizal fungi (FMA) stand out. Therefore, they associate the roots of the plants, increasing the absorption of water and nutrients. Thus, the objective of this work was to evaluate soybean yield in the field experiment in association with FMA Rhizophagus clarus under conditions of irrigated and non-irrigated system. In the end, agronomic and symbiosis parameters were evaluated with FMA. The experimental design was in randomized blocks with subdivided plots, the means obtained were submitted to analysis of variance and compared by the Tukey test (5%), using SISVAR software. Soybean plants when associated with FMA and cultivated in non-irrigated conditions, obtained higher productivity than plants in the irrigated system and weight of 1000 grains. In this way, it is concluded that inoculation benefits soybean yield under non-irrigated system conditions.A soja é fundamental para o progresso do agronegócio, porém sua produtividade pode ser afetada pelas mudanças climáticas. Assim, alternativas que aumentem o rendimento das plantas em condições adversas são fundamentais, e os fungos micorrízicos arbusculares (FMA) destacam-se, pois, associam-se as raízes das plantas aumentando a absorção de água e nutrientes. Assim, o objetivo deste trabalho foi avaliar a produtividade de plantas de soja a campo experimental em associação com o FMA Rhizophagus clarus sob condição de sistema irrigado e não irrigado. Ao final, avaliou-se parâmetros agronômicos e de simbiose com o FMA. O delineamento experimental foi em blocos casualizados com parcelas subdivididas, as médias obtidas foram submetidas à análise de variância e comparadas pelo teste Tukey (5%), utilizando software SISVAR. Plantas de soja quando associadas com FMA e cultivadas em condição não irrigada, obtiveram maior produtividade do que plantas no sistema irrigado, além de peso de 1000 grãos. Desta forma, conclui-se que a inoculação beneficia a produtividade da soja em condições de sistema não irrigado

    Pervasive gaps in Amazonian ecological research

    Get PDF
    Biodiversity loss is one of the main challenges of our time,1,2 and attempts to address it require a clear un derstanding of how ecological communities respond to environmental change across time and space.3,4 While the increasing availability of global databases on ecological communities has advanced our knowledge of biodiversity sensitivity to environmental changes,5–7 vast areas of the tropics remain understudied.8–11 In the American tropics, Amazonia stands out as the world’s most diverse rainforest and the primary source of Neotropical biodiversity,12 but it remains among the least known forests in America and is often underrepre sented in biodiversity databases.13–15 To worsen this situation, human-induced modifications16,17 may elim inate pieces of the Amazon’s biodiversity puzzle before we can use them to understand how ecological com munities are responding. To increase generalization and applicability of biodiversity knowledge,18,19 it is thus crucial to reduce biases in ecological research, particularly in regions projected to face the most pronounced environmental changes. We integrate ecological community metadata of 7,694 sampling sites for multiple or ganism groups in a machine learning model framework to map the research probability across the Brazilian Amazonia, while identifying the region’s vulnerability to environmental change. 15%–18% of the most ne glected areas in ecological research are expected to experience severe climate or land use changes by 2050. This means that unless we take immediate action, we will not be able to establish their current status, much less monitor how it is changing and what is being lostinfo:eu-repo/semantics/publishedVersio

    Pervasive gaps in Amazonian ecological research

    Get PDF

    Pervasive gaps in Amazonian ecological research

    Get PDF
    Biodiversity loss is one of the main challenges of our time,1,2 and attempts to address it require a clear understanding of how ecological communities respond to environmental change across time and space.3,4 While the increasing availability of global databases on ecological communities has advanced our knowledge of biodiversity sensitivity to environmental changes,5,6,7 vast areas of the tropics remain understudied.8,9,10,11 In the American tropics, Amazonia stands out as the world's most diverse rainforest and the primary source of Neotropical biodiversity,12 but it remains among the least known forests in America and is often underrepresented in biodiversity databases.13,14,15 To worsen this situation, human-induced modifications16,17 may eliminate pieces of the Amazon's biodiversity puzzle before we can use them to understand how ecological communities are responding. To increase generalization and applicability of biodiversity knowledge,18,19 it is thus crucial to reduce biases in ecological research, particularly in regions projected to face the most pronounced environmental changes. We integrate ecological community metadata of 7,694 sampling sites for multiple organism groups in a machine learning model framework to map the research probability across the Brazilian Amazonia, while identifying the region's vulnerability to environmental change. 15%–18% of the most neglected areas in ecological research are expected to experience severe climate or land use changes by 2050. This means that unless we take immediate action, we will not be able to establish their current status, much less monitor how it is changing and what is being lost

    Pervasive gaps in Amazonian ecological research

    Get PDF
    Biodiversity loss is one of the main challenges of our time,1,2 and attempts to address it require a clear understanding of how ecological communities respond to environmental change across time and space.3,4 While the increasing availability of global databases on ecological communities has advanced our knowledge of biodiversity sensitivity to environmental changes,5,6,7 vast areas of the tropics remain understudied.8,9,10,11 In the American tropics, Amazonia stands out as the world's most diverse rainforest and the primary source of Neotropical biodiversity,12 but it remains among the least known forests in America and is often underrepresented in biodiversity databases.13,14,15 To worsen this situation, human-induced modifications16,17 may eliminate pieces of the Amazon's biodiversity puzzle before we can use them to understand how ecological communities are responding. To increase generalization and applicability of biodiversity knowledge,18,19 it is thus crucial to reduce biases in ecological research, particularly in regions projected to face the most pronounced environmental changes. We integrate ecological community metadata of 7,694 sampling sites for multiple organism groups in a machine learning model framework to map the research probability across the Brazilian Amazonia, while identifying the region's vulnerability to environmental change. 15%–18% of the most neglected areas in ecological research are expected to experience severe climate or land use changes by 2050. This means that unless we take immediate action, we will not be able to establish their current status, much less monitor how it is changing and what is being lost

    Catálogo Taxonômico da Fauna do Brasil: setting the baseline knowledge on the animal diversity in Brazil

    Get PDF
    The limited temporal completeness and taxonomic accuracy of species lists, made available in a traditional manner in scientific publications, has always represented a problem. These lists are invariably limited to a few taxonomic groups and do not represent up-to-date knowledge of all species and classifications. In this context, the Brazilian megadiverse fauna is no exception, and the Catálogo Taxonômico da Fauna do Brasil (CTFB) (http://fauna.jbrj.gov.br/), made public in 2015, represents a database on biodiversity anchored on a list of valid and expertly recognized scientific names of animals in Brazil. The CTFB is updated in near real time by a team of more than 800 specialists. By January 1, 2024, the CTFB compiled 133,691 nominal species, with 125,138 that were considered valid. Most of the valid species were arthropods (82.3%, with more than 102,000 species) and chordates (7.69%, with over 11,000 species). These taxa were followed by a cluster composed of Mollusca (3,567 species), Platyhelminthes (2,292 species), Annelida (1,833 species), and Nematoda (1,447 species). All remaining groups had less than 1,000 species reported in Brazil, with Cnidaria (831 species), Porifera (628 species), Rotifera (606 species), and Bryozoa (520 species) representing those with more than 500 species. Analysis of the CTFB database can facilitate and direct efforts towards the discovery of new species in Brazil, but it is also fundamental in providing the best available list of valid nominal species to users, including those in science, health, conservation efforts, and any initiative involving animals. The importance of the CTFB is evidenced by the elevated number of citations in the scientific literature in diverse areas of biology, law, anthropology, education, forensic science, and veterinary science, among others

    Running resilient MPI applications on a Dynamic Group of Recommended Processes

    No full text
    Abstract High-performance computing systems run applications that can take several hours to execute and have to deal with the occurrence of a potentially large number of faults. Most of the existing fault-tolerant strategies for these systems assume crash faults that are permanent events are easily detected. This is not the case in several real systems, in particular in shared clusters, in which even the load variation may cause performance problems that are virtually equivalent to faults. In this work, we present a new model to deal with this problem in which processes execute tests among themselves in order to determine whether the processors (or cores) on which they are running are recommended or non-recommended. Processes classified as recommended form a Dynamic Group of Recommended Processes (DGRP) that runs the application. The DGRP is formed only by processes that have not been tested as non-recommended by all DGRP processes. A process not in the DGRP that is continuously tested as recommended can rejoin the DGRP after a round of consensus executed by DGRP processes. Experimental results are presented obtained from a MPI-based implementation in which the HyperQuickSort parallel sorting algorithm reconfigures itself at runtime to tolerate up to N − 1 faults (in a system with N processes) while sorting up to 1 billion integers
    corecore