19 research outputs found

    A simple parallel prefix algorithm for compact finite-difference schemes

    A compact scheme is a discretization scheme that is advantageous for obtaining highly accurate solutions. However, the systems that result from compact schemes are tridiagonal systems, which are difficult to solve efficiently on parallel computers. Exploiting their almost symmetric Toeplitz structure, a parallel algorithm, simple parallel prefix (SPP), is proposed. The SPP algorithm requires less memory than the conventional LU decomposition and is highly efficient on parallel machines. It consists of a prefix communication pattern and AXPY operations. Both the computation and the communication can be truncated without degrading the accuracy when the system is diagonally dominant. A formal accuracy study was conducted to provide a simple truncation formula. Experimental results were measured on a MasPar MP-1 SIMD machine and on a Cray 2 vector machine. They show that the simple parallel prefix algorithm is a good algorithm for the compact scheme on high-performance computers.

    Generalized disjunction decomposition for the evolution of programmable logic array structures

    Evolvable hardware refers to a self-reconfigurable electronic circuit, where the circuit configuration is under the control of an evolutionary algorithm. When applied to real-world applications, evolvable hardware has shown scalability to be one of its main deficiencies. In the past few years several techniques have been proposed to avoid and/or solve this problem. Generalized disjunction decomposition (GDD) is one of these proposed methods. GDD was successful for the evolution of large combinational logic circuits based on an FPGA structure when used together with bi-directional incremental evolution and with a (1+λ) evolution strategy. In this paper a modified generalized disjunction decomposition, together with a recently introduced multi-population genetic algorithm, is implemented and tested for its scalability in solving large combinational logic circuits based on Programmable Logic Array (PLA) structures.

    RADIC scalability analysis: functional model

    In parallel systems, a number of measures of performance are not accurate or representative of their functioning. These measures make it possible to quantify the benefit of parallelism. Very often, programs are designed and tested for smaller problems on fewer processing elements. However, the real problems these programs are intended to solve are much larger, and the machines contain a great number of processing elements. Hence, it is necessary to create a model that makes it possible to extrapolate the application execution over a few processing elements to larger machine configurations. These measures are more complex if we consider faults. When we take measurements we must understand the interaction among the system architecture, the application architecture and the fault-tolerant system. In this paper we present a model which analyzes the combination of parallel computer, parallel application and the RADIC fault tolerance architecture. Presented at the IX Workshop Procesamiento Distribuido y Paralelo (WPDP), Red de Universidades con Carreras en Informática (RedUNCI).

    Applications and accuracy of the parallel diagonal dominant algorithm

    The Parallel Diagonal Dominant (PDD) algorithm is a highly efficient, ideally scalable tridiagonal solver. In this paper, a detailed study of the PDD algorithm is given. First the PDD algorithm is introduced. Then the algorithm is extended to solve periodic tridiagonal systems. A variant, the reduced PDD algorithm, is also proposed. Accuracy analysis is provided for a class of tridiagonal systems, the symmetric and anti-symmetric Toeplitz tridiagonal systems. Implementation results show that the analysis gives a good bound on the relative error, and that the algorithm is a good candidate for the emerging massively parallel machines.
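
    For orientation, the sequential baseline that parallel tridiagonal solvers are commonly compared against is the Thomas algorithm. The sketch below is not the PDD algorithm from the abstract; it only illustrates the kind of system in question, a diagonally dominant symmetric Toeplitz tridiagonal matrix. The function names `thomas_solve` and `toeplitz_matvec` and the coefficients 4 and 1 are illustrative assumptions.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the sequential Thomas algorithm.

    lower[i], diag[i], upper[i] are the sub-, main and super-diagonal
    entries of row i; lower[0] and upper[-1] are ignored.
    No pivoting is done, so the matrix is assumed diagonally dominant.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination: each step depends on the previous one,
    # which is the serial dependency that parallel solvers must break.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def toeplitz_matvec(a, b, x):
    """Multiply the symmetric Toeplitz tridiagonal matrix [b, a, b] by x."""
    n = len(x)
    y = []
    for i in range(n):
        value = a * x[i]
        if i > 0:
            value += b * x[i - 1]
        if i < n - 1:
            value += b * x[i + 1]
        y.append(value)
    return y


# Usage: build a right-hand side from a known solution, then recover it.
known = [1.0, 2.0, 3.0, 4.0, 5.0]
rhs = toeplitz_matvec(4.0, 1.0, known)
solution = thomas_solve([1.0] * 5, [4.0] * 5, [1.0] * 5, rhs)
```

    The forward sweep is inherently sequential, which is why dedicated parallel algorithms are studied for these systems.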

    Generalized disjunction decomposition for evolvable hardware

    Evolvable hardware (EHW) refers to self-reconfiguring hardware design, where the configuration is under the control of an evolutionary algorithm (EA). One of the main difficulties in using EHW to solve real-world problems is scalability, which limits the size of the circuit that may be evolved. This paper outlines a new type of decomposition strategy for EHW, the “generalized disjunction decomposition” (GDD), which allows the evolution of large circuits. The proposed method has been extensively tested, not only with multipliers and parity bit problems traditionally used in the EHW community, but also with logic circuits taken from the Microelectronics Center of North Carolina (MCNC) benchmark library and randomly generated circuits. In order to achieve statistically relevant results, each analyzed logic circuit has been evolved 100 times, and the average of these results is presented and compared with other EHW techniques. This approach is necessary because of the probabilistic nature of EA; the same logic circuit may not be solved in the same way if tested several times. The proposed method has been examined in an extrinsic EHW system using the (1+λ) evolution strategy. The results obtained demonstrate that GDD significantly improves the evolution of logic circuits in terms of the number of generations, reduces computational time as it is able to reduce the required time for a single iteration of the EA, and enables the evolution of larger circuits never before evolved. In addition to the proposed method, a short overview of EHW systems together with the most recent applications in electrical circuit design is provided.
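
    The (1+λ) evolution strategy mentioned above keeps one parent, creates λ mutated offspring per generation, and promotes the best offspring if it is at least as fit as the parent. The sketch below is a generic illustration of that loop, not the GDD method or the authors' implementation. The bit-string genome, the truth-table fitness, the mutation rate of 1/length, and every name in the code are assumptions made for the example.

```python
import random


def one_plus_lambda(fitness, genome_length, target_fitness,
                    lam=4, max_generations=10_000, seed=0):
    """Generic (1+lambda) evolution strategy over bit strings.

    Returns the best genome found and the number of generations used.
    """
    rng = random.Random(seed)
    rate = 1.0 / genome_length  # assumed per-bit mutation probability
    parent = [rng.randint(0, 1) for _ in range(genome_length)]
    parent_fit = fitness(parent)
    for generation in range(max_generations):
        if parent_fit >= target_fitness:
            return parent, generation
        # Create lam offspring by flipping each bit with probability `rate`.
        offspring = [
            [bit ^ (rng.random() < rate) for bit in parent]
            for _ in range(lam)
        ]
        best = max(offspring, key=fitness)
        best_fit = fitness(best)
        # Accept ties as well, so the search can drift across plateaus.
        if best_fit >= parent_fit:
            parent, parent_fit = best, best_fit
    return parent, max_generations


# Toy target: the truth table of a 3-input parity function (8 output bits).
TARGET = [bin(i).count("1") % 2 for i in range(8)]


def truth_table_fitness(genome):
    """Count how many truth-table rows the candidate gets right."""
    return sum(g == t for g, t in zip(genome, TARGET))


best, generations = one_plus_lambda(truth_table_fitness, len(TARGET), len(TARGET))
```

    Because the search is probabilistic, different seeds reach the target in different numbers of generations, which is the reason the abstract reports averages over 100 runs.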

    Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

    Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is to 1) develop highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, and 3) incorporate newly developed algorithms into actual simulation packages. The work plan has been well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry which has a better capacity to describe the fluid, and (2) using a compact scheme to gain high-order accuracy in numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm and Reduced Parallel Diagonal Dominant (RPDD) algorithm have been carefully studied on different parallel platforms for different applications, and a NASA simulation code developed by Man M. Rai and his colleagues has been parallelized and implemented based on data dependency analysis. These achievements are addressed in detail in the paper.

    Economic Viability of Software Defined Networking (SDN)

    Economic and operational facets of networks drive the need for significant changes to the fundamentals of networking architectures. The recent momentum of programmable networking efforts illustrates the significance of the economic aspects of network technologies. Software Defined Networking (SDN) has attracted the attention of researchers from both academia and industry as a means to decrease network costs and generate revenue for service providers, owing to the features it promises in networking. In this article, we investigate how programmable network architectures, i.e. SDN technology, affect network economics compared to traditional network architectures, i.e. MPLS technology. We define two metrics, Unit Service Cost Scalability and Cost-to-Service, to evaluate how the SDN architecture performs compared to the MPLS architecture. We also present mathematical models to calculate certain cost components of a network. In addition, we compare different popular SDN control plane models, Centralized Control Plane (CCP), Distributed Control Plane (DCP), and Hierarchical Control Plane (HCP), to understand their economic impact with regard to the defined metrics. We use video traffic with different patterns for the comparison. This work aims to be a useful primer that provides insight into which technology and control plane model are appropriate for a specific service, i.e. video, so that network owners can plan their investments.

    A parallel two-level hybrid method for tridiagonal systems and its application to fast Poisson solvers
