11 research outputs found

    Desenvolvimento de uma aplicação distribuída em um ambiente corporativo utilizando PVM [Development of a distributed application in a corporate environment using PVM]

    There are two ways to increase a company's data processing power. The first is to buy a multiprocessor supercomputer, which entails considerable spending on hardware and on developing software to control those processors. The second is to build a cluster of workstations using a message-passing environment. The second option is far more affordable than the first. This project prioritizes the performance of the parallel application to be designed, as well as the development of an algorithm capable of balancing the load among the machines in the project. It also aims to give the slave machines autonomy and to manage their resources, such as memory and disk space. A small program that computes an integral using the trapezoid-area technique will be used as an example to test the virtual machine. The load will be divided in proportion to the processor speed of each machine
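
    The speed-proportional split described in this abstract can be sketched as follows. This is a minimal single-process illustration, not the thesis code: it makes no PVM calls, and the function names (`trapezoid`, `split_by_speed`, `balanced_integral`) and the use of integer speed ratings are assumptions made for the example.

```python
def trapezoid(f, a, b, n):
    """Composite trapezoid rule with n equal sub-intervals on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def split_by_speed(n, speeds):
    """Split n sub-intervals among workers in proportion to integer speeds.

    Each worker gets the floor of its proportional share; leftover
    sub-intervals are handed out one at a time, starting with worker 0.
    """
    total_speed = sum(speeds)
    counts = [n * s // total_speed for s in speeds]
    for i in range(n - sum(counts)):
        counts[i % len(counts)] += 1
    return counts


def balanced_integral(f, a, b, n, speeds):
    """Integrate f on [a, b], giving each (hypothetical) worker a
    contiguous slice whose size is proportional to its speed.

    In a real cluster each slice would be sent to a separate machine;
    here the slices are simply evaluated one after another.
    """
    h = (b - a) / n
    result = 0.0
    start = 0
    for count in split_by_speed(n, speeds):
        if count > 0:
            lo = a + start * h
            hi = a + (start + count) * h
            result += trapezoid(f, lo, hi, count)
        start += count
    return result


if __name__ == "__main__":
    # Three hypothetical machines with relative speeds 1 : 1 : 2.
    print(split_by_speed(100, [1, 1, 2]))
    print(balanced_integral(lambda x: x * x, 0.0, 1.0, 1000, [1, 1, 2]))
```

    Faster machines receive proportionally more sub-intervals, so all workers would finish at roughly the same time if their speed ratings are accurate.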

    Memory Access Patterns for Cellular Automata Using GPGPUs

    Today's graphical processing units have hundreds of individual processing cores that can be used for general purpose computation of mathematical and scientific problems. Due to their hardware architecture, these devices are especially effective when solving problems that exhibit a high degree of spatial locality. Cellular automata use small, local neighborhoods to determine successive states of individual elements and therefore provide an excellent opportunity for the application of general purpose GPU computing. However, the GPU presents a challenging environment because it lacks many of the features of traditional CPUs, such as automatic, on-chip caching of data. To fully realize the potential of a GPU, specialized memory techniques and patterns must be employed to account for its unique architecture. Several techniques are presented which not only dramatically improve performance, but, in many cases, also simplify implementation. Many of the approaches discussed relate to the organization of data in memory or patterns for accessing that data, while others detail methods of increasing the computation to memory access ratio. The ideas presented are generic, and applicable to cellular automata models as a whole. Example implementations are given for several problems, including the Game of Life and Gaussian blurring, while performance characteristics, such as instruction and memory access counts, are analyzed and compared. A case study is detailed, showing the effectiveness of the various techniques when applied to a larger, real-world problem. Lastly, the reasoning behind each of the improvements is explained, providing general guidelines for determining when a given technique will be most and least effective
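
    To make the "small, local neighborhood" idea concrete, here is a minimal CPU-only sketch of one Game of Life generation. It is an illustration of the access pattern only: it does not implement any of the GPU memory techniques the thesis presents, and the function name `life_step` and the choice of treating cells outside the grid as dead are assumptions for this example.

```python
def life_step(grid):
    """Compute one Game of Life generation on a list-of-lists grid of 0/1.

    Each cell reads only its eight immediate neighbours, which is the
    spatial locality the abstract refers to. Cells outside the grid are
    treated as dead (an assumption of this sketch).
    """
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbours = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        neighbours += grid[rr][cc]
            if grid[r][c] == 1 and neighbours in (2, 3):
                new[r][c] = 1  # live cell survives
            elif grid[r][c] == 0 and neighbours == 3:
                new[r][c] = 1  # dead cell becomes alive
    return new


if __name__ == "__main__":
    blinker = [
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
    ]
    for row in life_step(blinker):
        print(row)
```

    Every cell performs nine reads (itself plus eight neighbours) for a handful of arithmetic operations, and neighbouring cells reread much of the same data, which is why the computation to memory access ratio matters for this class of problem.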

    AUTOMATED TREE-LEVEL FOREST QUANTIFICATION USING AIRBORNE LIDAR

    Traditional forest management relies on a small field sample and interpretation of aerial photography that not only are costly to execute but also yield inaccurate estimates of the entire forest in question. Airborne light detection and ranging (LiDAR) is a remote sensing technology that records point clouds representing the 3D structure of a forest canopy and the terrain underneath. We present a method for segmenting individual trees from the LiDAR point clouds without making prior assumptions about tree crown shapes and sizes. We then present a method that vertically stratifies the point cloud to an overstory and multiple understory tree canopy layers. Using the stratification method, we modeled the occlusion of higher canopy layers with respect to point density. We also present a distributed computing approach that enables processing the massive data of an arbitrarily large forest. Lastly, we investigated using deep learning for coniferous/deciduous classification of point cloud segments representing individual tree crowns. We applied the developed methods to the University of Kentucky Robinson Forest, a natural, majorly deciduous, closed-canopy forest. 90% of overstory and 47% of understory trees were detected with false positive rates of 14% and 2% respectively. Vertical stratification improved the detection rate of understory trees to 67% at the cost of increasing their false positive rate to 12%. According to our occlusion model, a point density of about 170 pt/m² is needed to segment understory trees located in the third layer as accurately as overstory trees. Using our distributed processing method, we segmented about two million trees within a 7400-ha forest in 2.5 hours using 192 processing cores, showing a speedup of ~170. Our deep learning experiments showed high classification accuracies (~82% coniferous and ~90% deciduous) without the need to manually assemble the features. In conclusion, the methods developed are steps forward to remote, accurate quantification of large natural forests at the individual tree level
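
    The distributed-processing figures quoted in this abstract (192 cores, 2.5 hours, speedup of about 170) imply a parallel efficiency and an approximate single-core running time. The short sketch below works through that arithmetic; the function names are hypothetical, and the results inherit the rounding of the abstract's figures, so they are estimates rather than reported values.

```python
def parallel_efficiency(speedup, cores):
    """Conventional parallel efficiency: speedup divided by core count."""
    return speedup / cores


def implied_serial_hours(parallel_hours, speedup):
    """Single-core time implied by a measured parallel time and speedup."""
    return parallel_hours * speedup


if __name__ == "__main__":
    eff = parallel_efficiency(170, 192)
    serial = implied_serial_hours(2.5, 170)
    print(f"efficiency ~ {eff:.2f}")                      # efficiency ~ 0.89
    print(f"implied single-core time ~ {serial:.0f} h")   # ~ 425 h
```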

    MPI-style Web services: An investigation into the potential of using Web services for MPI-style applications

    This research investigates the potential of the Web services architecture to act as a platform for the execution of MPI-style applications. The work in this thesis is based upon extending current Web service methodologies and merging them with ideas from other research domains, such as high performance computing. MPIWS, an API to extend the functionality of standard Web services, is introduced. MPIWS provides MPI-style message passing functionality to facilitate the execution of MPI-style applications using Web service based communication protocols. The thesis then presents a large selection of experiments that perform a comprehensive evaluation of MPIWS's performance. This performance is compared with an existing MPI implementation that has the option of transmitting data either via Java serialised objects, or via the Java native interface to an underlying C implementation of MPI. From the results obtained from these experiments, it can be concluded that using MPIWS for applications requiring MPI-style message passing between services is potentially a practical and efficient way of distributing coarse-grained parallel applications. The results also show that the use of collective communication techniques within the Web services architecture can significantly improve the efficiency of suitable applications such as molecular dynamics simulation. MPI-style communication can also be used to enhance the performance of Web service based workflow execution. Tests conducted have evaluated a range of functionality that can be provided by the MPIWS tool. This evaluation shows that direct messaging between services, without sending data via the workflow manager, can improve the efficiency of Web service based workflow execution

    An efficient parallelization of a real scientific application

    Bibliography: leaves 137-145. In the past decade the cost of computing has come down considerably, making high-powered computing more easily affordable. As a result many institutions and organisations now have networks of high-powered workstations. Such networks provide a large, generally untapped, source of computing power which can be used for running large scientific applications which previously could only be run on supercomputers. This dissertation shows that a substantial improvement in performance can be achieved by the parallelization of a real scientific application for a heterogeneous network of Sun and Silicon Graphics workstations connected by an Ethernet network, but that this is affected by a number of factors. These factors include communication delays, load balancing, and the number of slaves used. This dissertation shows that performance can be improved by sending more, shorter messages, and by overlapping communication with computation. Part of this thesis concerns the difficulties involved in the evaluation of parallel performance on a heterogeneous network. This dissertation shows that conventional methods such as speedup and efficiency are not appropriate for evaluating the performance of a heterogeneous system, and that linear speed gives a much more representative indication of the actual performance achieved. We also proposed new concepts of perfect linear speed and linear efficiency, which help to evaluate the improvement in parallel performance on a heterogeneous system

    Engineering Physics and Mathematics Division progress report for period ending December 31, 1994
