2 research outputs found

    Java virtual machine performance improvement in Cell B.E. architecture

    Get PDF
    Orientador: Rodolfo Jardim de AzevedoDissertação (mestrado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: Esta dissertação concentra-se no atual momento de transição entre as atuais e as novas arquiteturas de processadores, oferecendo uma alternativa para minimizar o impacto desta mudança. Para tal utiliza-se a plataforma Java, que possibilita que o desenvolvimento de aplicações seja independente da arquitetura em que serão executadas. Considerando a arquitetura Cell B.E. como uma nova plataforma que promete desempenho elevado, este trabalho propõe melhorias na Máquina Virtual Java que propiciem um ganho de desempenho na execução de aplicações Java executadas sobre o processador Cell. O objetivo proposto é atingido por meio da utilização do ambiente disponível na própria plataforma Java, o Java Native Interface (JNI), para a implementação de interfaces entre bibliotecas nativas construídas para a arquitetura Cell - com a intenção de obter o máximo desempenho possível - e as aplicações Java. É proposto um modelo para porte e criação das interfaces para bibliotecas e mostra-se a viabilidade da abordagem proposta através de implementações de bibliotecas selecionadas, consolidando a metodologia utilizada. Duas bibliotecas foram portadas completamente como prova de conceito, uma multiplicação de matrizes grandes e o algoritmo RC5. A multiplicação de matrizes obteve um desempenho e escalablidade comparável ao código original em C e em escala muitas vezes superior ao código JNI para arquitetura x86 a ao código Java executando em arquiteturas x86 e Cell. O RC5 executou apenas aproximadamente 0,3 segundos mais lento que o código C original (perda citada em segundos pois se manteve constante independente do tempo levado para as diferentes configurações de execução)Abstract: This dissertation focuses on the present moment of transition between the current and new processor architectures, offering an alternative to minimize the impact of this change. For this, we use the Java platform, which enables an architecture-independent application development. Considering the Cell BE architecture as a new platform that promises high performance, this paper proposes improvements in the Java Virtual Machine that provide performance gains in the execution of Java applications running on the Cell processor. The proposed objective is achieved through the use of the environment available on the Java platform itself, the Java Native Interface (JNI), to implement interfaces between native libraries built for the Cell architecture - with the intention of obtaining the maximum possible performance - and the Java applications. It is proposed a model to port and build interfaces to libraries and it shows the viability of the proposed methodology with the implementation of selected libraries, consolidating the used methodology. Two libraries were completely ported as proof of concept, a multiplication of large matrices and a RC5 algorithm implementation. The matrices multiplication achieved scalability and performance in the same basis as the native implementation and incomparable with JNI implementation targering x86 architecture and Java implementation running in x86 and Cell architectures. The RC5 was just 0.3 seconds slower than the original C code (the loss is put in seconds since it was constant, independent of the execution time taken by different configurations of execution)MestradoComputaçãoMestre em Ciência da Computaçã

    Scalable parallel evolutionary optimisation based on high performance computing

    Get PDF
    Evolutionary algorithms (EAs) have been successfully applied to solve various challenging optimisation problems. Due to their stochastic nature, EAs typically require considerable time to find desirable solutions; especially for increasingly complex and large-scale problems. As a result, many works studied implementing EAs on parallel computing facilities to accelerate the time-consuming processes. Recently, the rapid development of modern parallel computing facilities such as the high performance computing (HPC) bring not only unprecedented computational capabilities but also challenges on designing parallel algorithms. This thesis mainly focuses on designing scalable parallel evolutionary optimisation (SPEO) frameworks which run efficiently on the HPC. Motivated by the interesting phenomenon that many EAs begin to employ increasingly large population sizes, this thesis firstly studies the effect of a large population size through comprehensive experiments. Numerical results indicate that a large population benefits to the solving of complex problems but requires a large number of maximal fitness evaluations (FEs). However, since sequential EAs usually requires a considerable computing time to achieve extensive FEs, we propose a scalable parallel evolutionary optimisation framework that can efficiently deploy parallel EAs over many CPU cores at CPU-only HPC. On the other hand, since EAs using a large number of FEs can produce massive useful information in the course of evolution, we design a surrogate-based approach to learn from this historical information and to better solve complex problems. Then this approach is implemented in parallel based on the proposed scalable parallel framework to achieve remarkable speedups. Since demanding a great computing power on CPU-only HPC is usually very expensive, we design a framework based on GPU-enabled HPC to improve the cost-effectiveness of parallel EAs. The proposed framework can efficiently accelerate parallel EAs using many GPUs and can achieve superior cost-effectiveness. However, since it is very challenging to correctly implement parallel EAs on the GPU, we propose a set of guidelines to verify the correctness of GPU-based EAs. In order to examine these guidelines, they are employed to verify a GPU-based brain storm optimisation that is also proposed in this thesis. In conclusion, the comprehensively experimental study is firstly conducted to investigate the impacts of a large population. After that, a SPEO framework based on CPU-only HPC is proposed and is employed to accelerate a time-consuming implementation of EA. Finally, the correctness verification of implementing EAs based on a single GPU is discussed and the SPEO framework is then extended to be deployed based on GPU-enabled HPC
    corecore