
    Analytical Modeling of High Performance Reconfigurable Computers: Prediction and Analysis of System Performance.

    The use of a network of shared, heterogeneous workstations, each harboring a Reconfigurable Computing (RC) system, offers high-performance users an inexpensive platform for a wide range of computationally demanding problems. However, effectively using the full potential of these systems can be challenging without knowledge of the system's performance characteristics. While some performance models exist for shared, heterogeneous workstations, none thus far account for the addition of Reconfigurable Computing systems. This dissertation develops and validates an analytic performance modeling methodology for a class of fork-join algorithms executing on a High Performance Reconfigurable Computing (HPRC) platform. The model includes the effects of the reconfigurable device, application load imbalance, background user load, basic message passing communication, and processor heterogeneity. Three fork-join applications (a Boolean Satisfiability solver, a Matrix-Vector Multiplication algorithm, and an Advanced Encryption Standard algorithm) are used to validate the model on homogeneous and simulated heterogeneous workstations. A synthetic load is used to validate the model under various loading conditions, including simulated heterogeneity in which background loading makes some workstations appear slower than others. The performance modeling methodology proves to be accurate in characterizing the effects of reconfigurable devices, application load imbalance, background user load, and heterogeneity for applications running on shared, homogeneous and heterogeneous HPRC resources. The model error in all cases was found to be less than five percent for application runtimes greater than thirty seconds and less than fifteen percent for runtimes less than thirty seconds. The performance modeling methodology enables us to characterize applications running on shared HPRC resources. Cost functions are used to impose system usage policies, and the results of the modeling methodology are utilized to find the optimal (or near-optimal) set of workstations to use for a given application. The usage policies investigated include determining the computational costs for the workstations and balancing the priority of the background user load against that of the parallel application. The applications studied fall within the Master-Worker paradigm and are well suited for a grid computing approach. A method for using NetSolve, a grid middleware, with the model and cost functions is introduced whereby users can produce optimal workstation sets and schedules for Master-Worker applications running on shared HPRC resources.
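
    As a rough illustration of the kind of prediction and selection such a methodology supports, the Python sketch below estimates a fork-join runtime as the slowest worker's share plus a communication term, and picks the cheapest workstation subset that meets a deadline. All names and parameters (speed, bg_load, rc_speedup, cost, the deadline policy) are hypothetical illustrations, not the dissertation's actual model or cost functions.

```python
# Minimal, illustrative sketch of a fork-join runtime estimate on shared,
# heterogeneous HPRC workstations. All parameters (speed, bg_load, rc_speedup,
# cost) are hypothetical; the dissertation's model is far more detailed.

from dataclasses import dataclass
from itertools import combinations

@dataclass
class Workstation:
    name: str
    speed: float       # relative compute rate (work units per second)
    bg_load: float     # average number of competing background processes
    rc_speedup: float  # speedup factor contributed by the RC device
    cost: float        # arbitrary usage cost per second

def predict_runtime(work: float, comm_per_node: float, nodes: list[Workstation]) -> float:
    """Fork-join estimate: equal work split, runtime set by the slowest node."""
    share = work / len(nodes)
    per_node = [
        share / (w.speed * w.rc_speedup) * (1.0 + w.bg_load)  # slowdown from sharing
        for w in nodes
    ]
    return max(per_node) + comm_per_node * len(nodes)  # join waits for the slowest

def cheapest_under_deadline(work, comm, pool, deadline):
    """Exhaustively pick the lowest-cost subset meeting the deadline (toy policy)."""
    best, best_cost = None, float("inf")
    for r in range(1, len(pool) + 1):
        for subset in combinations(pool, r):
            t = predict_runtime(work, comm, list(subset))
            c = t * sum(w.cost for w in subset)
            if t <= deadline and c < best_cost:
                best, best_cost = subset, c
    return best, best_cost

pool = [
    Workstation("ws1", speed=1.0, bg_load=0.2, rc_speedup=4.0, cost=1.0),
    Workstation("ws2", speed=0.7, bg_load=0.8, rc_speedup=4.0, cost=0.5),
    Workstation("ws3", speed=1.2, bg_load=0.0, rc_speedup=1.0, cost=1.5),  # no RC card
]
print(cheapest_under_deadline(work=1000.0, comm=0.05, pool=pool, deadline=60.0))
```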

    Requirement-aware strategies for scheduling multiple divisible loads in cluster environments

    Ph.D., Doctor of Philosophy

    Personal mobile grids with a honeybee-inspired resource scheduler

    The overall aim of the thesis has been to introduce Personal Mobile Grids (PM-Grids) as a novel paradigm in grid computing that scales grid infrastructures to mobile devices and extends grid entities to individual personal users. In this thesis, architectural designs as well as simulation models for PM-Grids are developed. The core of any grid system is its resource scheduler. However, virtually all current conventional grid schedulers do not address the non-clairvoyant scheduling problem, where job information is not available before the end of execution. Therefore, this thesis proposes a honeybee-inspired resource scheduling heuristic for PM-Grids (HoPe), incorporating a radical approach to grid resource scheduling to tackle this problem. A detailed design and implementation of HoPe with a decentralised, self-managing and adaptive policy are presented. Among the other main contributions are a comprehensive taxonomy of grid systems as well as a detailed analysis of the honeybee colony and its nectar acquisition process (NAP) from the resource scheduling perspective, which have not been presented in any previous work, to the best of our knowledge. PM-Grid designs and the HoPe implementation were evaluated thoroughly through a strictly controlled empirical evaluation framework, with a well-established heuristic in high throughput computing, the opportunistic scheduling heuristic (OSH), as a benchmark algorithm. Comparisons with optimal values and worst bounds are conducted to gain a clear insight into HoPe behaviour, in terms of stability, throughput, turnaround time and speedup, under different running conditions of job counts and grid scales. Experimental results demonstrate the superiority of HoPe, which successfully maintained optimum stability and throughput in more than 95% of the experiments and performed three times better than the OSH under extremely heavy loads. Regarding turnaround time and speedup, HoPe achieved less than 50% of the turnaround time incurred by the OSH, while doubling its speedup, in more than 60% of the experiments. These results indicate the potential of both PM-Grids and HoPe in realising futuristic grid visions. Deployment of PM-Grids in real-life scenarios and utilisation of HoPe in other parallel processing and high-throughput computing systems are therefore recommended.
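
    The abstract does not spell out HoPe's internals, but a waggle-dance-style recruitment rule is one common reading of honeybee-inspired scheduling: nodes that recently delivered good throughput attract proportionally more of the incoming jobs. The Python sketch below is a hypothetical toy along those lines, not the HoPe heuristic itself; the `quality` state, the decay factor, and the node names are assumptions.

```python
# Hypothetical waggle-dance-style dispatcher: each node's recent throughput acts
# like a forager's dance, recruiting a proportional share of new jobs.
# This is an illustrative toy, not the HoPe heuristic from the thesis.

import random

class HoneybeeDispatcher:
    def __init__(self, nodes, decay=0.9):
        self.quality = {n: 1.0 for n in nodes}  # perceived "nectar" quality per node
        self.decay = decay

    def pick_node(self):
        """Sample a node with probability proportional to its advertised quality."""
        nodes = list(self.quality)
        weights = [self.quality[n] for n in nodes]
        return random.choices(nodes, weights=weights, k=1)[0]

    def report(self, node, throughput):
        """Blend the newly observed throughput into the node's quality (dance strength)."""
        self.quality[node] = self.decay * self.quality[node] + (1 - self.decay) * throughput

# Usage: dispatch 1000 jobs to three mobile nodes with unknown, noisy speeds.
true_speed = {"phone_a": 2.0, "phone_b": 0.5, "laptop": 3.0}
d = HoneybeeDispatcher(true_speed)
for _ in range(1000):
    n = d.pick_node()
    observed = random.gauss(true_speed[n], 0.2)   # noisy throughput measurement
    d.report(n, max(observed, 0.01))
print(d.quality)  # faster nodes end up recruiting most of the load
```

    Because job sizes are never known in advance, this kind of feedback-only rule is one way to cope with the non-clairvoyant setting the thesis targets.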

    Application level runtime load management: a Bayesian approach

    Affordable parallel computing on distributed shared systems requires novel approaches to managing the runtime load distribution, since current algorithms fall below expectations. The efficient execution of irregular parallel applications on dynamically shared computing clusters exhibits unpredictable dynamic behaviour, due both to varying application requirements and to varying availability of system resources. This thesis addresses the explicit inclusion, in an application-level scheduling agent's internal model of the world and in its decision-making mechanism, of the uncertainty the agent has about the environment. Bayesian decision networks are introduced and a generic framework is proposed for application-level scheduling, where a probabilistic inference algorithm helps the scheduler to efficiently make decisions with improved predictions, based on available incomplete and aged measured data. An application-level performance model and associated metrics (performance, environment and overheads) are proposed to obtain application and system behaviour estimates, to include in the scheduling agent's model and to support the evaluation. To verify that this novel approach improves the overall application execution time and the scheduling efficiency, a parallel ray tracer was developed as a message-passing, irregular, data-parallel application, and an execution model prototype was built to run on a computing cluster of seven time-shared nodes under dynamically variable synthetic workloads. To assess the effectiveness of the load management, the stochastic scheduler was evaluated rendering several complex scenes and compared with three reference scheduling strategies: a uniform work distribution, a demand-driven work allocation and a sensor-based deterministic scheduling strategy. The evaluation results show considerable performance improvements over blind strategies, and stress the improvements of the decision-network-based scheduler over the sensor-based deterministic approach of identical complexity. Fundação para a Ciência e Tecnologia - PRAXIS XXI 2/2.1/TTT/1557/95
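
    To make scheduling under uncertainty concrete, the sketch below keeps a simple Gaussian belief over each node's effective speed, updates it from noisy and possibly stale completion-time observations, and splits the next batch of ray-tracing tiles in proportion to the posterior means. It is a deliberately simplified stand-in for the thesis's Bayesian decision networks; the update rule, class names, and all numbers are assumptions.

```python
# Simplified Bayesian load manager: maintain a Gaussian belief over each node's
# effective speed and split the next batch of ray-tracing tiles in proportion to
# the posterior means. A stand-in for the thesis's decision networks, not them.

class NodeBelief:
    def __init__(self, prior_mean=1.0, prior_var=1.0, obs_var=0.25):
        self.mean, self.var = prior_mean, prior_var
        self.obs_var = obs_var  # assumed noise of a single speed observation

    def update(self, observed_speed):
        """Standard conjugate Gaussian update for a noisy speed measurement."""
        k = self.var / (self.var + self.obs_var)   # Kalman-style gain
        self.mean += k * (observed_speed - self.mean)
        self.var *= (1 - k)

def split_tiles(total_tiles, beliefs):
    """Give each node a share of tiles proportional to its expected speed."""
    total = sum(b.mean for b in beliefs.values())
    return {n: max(1, round(total_tiles * b.mean / total)) for n, b in beliefs.items()}

beliefs = {f"node{i}": NodeBelief() for i in range(7)}   # seven time-shared nodes
# After each frame, feed back the observed tiles/second per node:
beliefs["node0"].update(2.1)
beliefs["node1"].update(0.6)   # node1 is busy with background users
print(split_tiles(total_tiles=256, beliefs=beliefs))
```

    Because the belief variance never collapses to zero, the manager keeps adapting when background users come and go, which is the behaviour the stochastic scheduler is credited with above.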

    Parallel and Distributed Computing

    The 14 chapters presented in this book cover a wide variety of representative works ranging from hardware design to application development. In particular, the topics addressed include programmable and reconfigurable devices and systems, dependability of GPUs (Graphics Processing Units), network topologies, cache coherence protocols, resource allocation, scheduling algorithms, peer-to-peer networks, large-scale network simulation, and parallel routines and algorithms. In this way, the articles included in this book constitute an excellent reference for engineers and researchers who have particular interests in each of these topics in parallel and distributed computing.