797 research outputs found

    Solution of partial differential equations on vector and parallel computers

    Get PDF
    The present status of numerical methods for partial differential equations on vector and parallel computers was reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed

    Droplet routing for digital microfluidic biochips based on microelectrode dot array architecture

    Get PDF
    A digital microfluidic biochip (DMFB) is a device that digitizes fluidic samples into tiny droplets and operates chemical processes on a single chip. Movement control of droplets can be realized by using electrowetting-on-dielectric (EWOD) technology. DMFBs have high configurability, high sensitivity, low cost and reduced human error as well as a promising future in the applications of point-of-care medical diagnostic, and DNA sequencing. As the demands of scalability, configurability and portability increase, a new DMFB architecture called Microelectrode Dot Array (MEDA) has been introduced recently to allow configurable electrodes shape and more precise control of droplets. The objective of this work is to investigate a routing algorithm which can not only handle the routing problem for traditional DMFBs, but also be able to route different sizes of droplets and incorporate diagonal movements for MEDA. The proposed droplet routing algorithm is based on 3D-A* search algorithm. The simulation results show that the proposed algorithm can reduce the maximum latest arrival time, average latest arrival time and total number of used cells. By enabling channel-based routing in MEDA, the equivalent total number of used cells can be significantly reduced. Compared to all existing algorithms, the proposed algorithm can achieve so far the least average latest arrival time

    Performance Analysis of Hardware/Software Co-Design of Matrix Solvers

    Get PDF
    Solving a system of linear and nonlinear equations lies at the heart of many scientific and engineering applications such as circuit simulation, applications in electric power networks, and structural analysis. The exponentially increasing complexity of these computing applications and the high cost of supercomputing force us to explore affordable high performance computing platforms. The ultimate goal of this research is to develop hardware friendly parallel processing algorithms and build cost effective high performance parallel systems using hardware in order to enable the solution of large linear systems. In this thesis, FPGA-based general hardware architectures of selected iterative methods and direct methods are discussed. Xilinx Embedded Development Kit (EDK) hardware/software (HW/SW) codesigns of these methods are also presented. For iterative methods, FPGA based hardware architectures of Jacobi, combined Jacobi and Gauss-Seidel, and conjugate gradient (CG) are proposed. The convergence analysis of the LNS-based Jacobi processor demonstrates to what extent the hardware resource constraints and additional conversion error affect the convergence of Jacobi iterative method. Matlab simulations were performed to compare the performance of three iterative methods in three ways, i.e., number of iterations for any given tolerance, number of iterations for different matrix sizes, and computation time for different matrix sizes. The simulation results indicate that the key to a fast implementation of the three methods is a fast implementation of matrix multiplication. The simulation results also show that CG method takes less number of iterations for any given tolerance, but more computation time as matrix size increases compared to other two methods, since matrix-vector multiplication is a more dominant factor in CG method than in the other two methods. By implementing matrix multiplications of the three methods in hardware with Xilinx EDK HW/SW codesign, the performance is significantly improved over pure software Power PC (PPC) based implementation. The EDK implementation results show that CG takes less computation time for any size of matrices compared to other two methods in HW/SW codesign, due to that fact that matrix multiplications dominate the computation time of all three methods while CG requires less number of iterations to converge compared to other two methods. For direct methods, FPGA-based general hardware architecture and Xilinx EDK HW/SW codesign of WZ factorization are presented. Single unit and scalable hardware architectures of WZ factorization are proposed and analyzed under different constraints. The results of Matlab simulations show that WZ runs faster than the LU on parallel processors but slower on a single processor. The simulation results also indicate that the most time consuming part of WZ factorization is matrix update. By implementing the matrix update of WZ factorization in hardware with Xilinx EDK HW/SW codesign, the performance is also apparently improved over PPC based pure software implementation

    Bordered Block-Diagonal Preserved Model-Order Reduction for RLC Circuits

    Get PDF
    This thesis details the research of the bordered block-diagonal preserved model-order reduction (BVOR) method and implementation of the corresponding tool designed for facilitating the simulation of industrial, very large sized linear circuits or linear sub-circuits of a nonlinear circuit. The BVOR tool is able to extract the linear RLC parts of the circuit from any given typical SPICE netlist and perform reduction using an appropriate algorithm for optimum efficiency. The implemented algorithms in this tool are bordered block-diagonal matrix solver and bordered block-diagonal matrix based block Arnoldi method

    Fast time- and frequency-domain finite-element methods for electromagnetic analysis

    Get PDF
    Fast electromagnetic analysis in time and frequency domain is of critical importance to the design of integrated circuits (IC) and other advanced engineering products and systems. Many IC structures constitute a very large scale problem in modeling and simulation, the size of which also continuously grows with the advancement of the processing technology. This results in numerical problems beyond the reach of existing most powerful computational resources. Different from many other engineering problems, the structure of most ICs is special in the sense that its geometry is of Manhattan type and its dielectrics are layered. Hence, it is important to develop structure-aware algorithms that take advantage of the structure specialties to speed up the computation. In addition, among existing time-domain methods, explicit methods can avoid solving a matrix equation. However, their time step is traditionally restricted by the space step for ensuring the stability of a time-domain simulation. Therefore, making explicit time-domain methods unconditionally stable is important to accelerate the computation. In addition to time-domain methods, frequency-domain methods have suffered from an indefinite system that makes an iterative solution difficult to converge fast. The first contribution of this work is a fast time-domain finite-element algorithm for the analysis and design of very large-scale on-chip circuits. The structure specialty of on-chip circuits such as Manhattan geometry and layered permittivity is preserved in the proposed algorithm. As a result, the large-scale matrix solution encountered in the 3-D circuit analysis is turned into a simple scaling of the solution of a small 1-D matrix, which can be obtained in linear (optimal) complexity with negligible cost. Furthermore, the time step size is not sacrificed, and the total number of time steps to be simulated is also significantly reduced, thus achieving a total cost reduction in CPU time. The second contribution is a new method for making an explicit time-domain finite-element method (TDFEM) unconditionally stable for general electromagnetic analysis. In this method, for a given time step, we find the unstable modes that are the root cause of instability, and deduct them directly from the system matrix resulting from a TDFEM based analysis. As a result, an explicit TDFEM simulation is made stable for an arbitrarily large time step irrespective of the space step. The third contribution is a new method for full-wave applications from low to very high frequencies in a TDFEM based on matrix exponential. In this method, we directly deduct the eigenmodes having large eigenvalues from the system matrix, thus achieving a significantly increased time step in the matrix exponential based TDFEM. The fourth contribution is a new method for transforming the indefinite system matrix of a frequency-domain FEM to a symmetric positive definite one. We deduct non-positive definite component directly from the system matrix resulting from a frequency-domain FEM-based analysis. The resulting new representation of the finite-element operator ensures an iterative solution to converge in a small number of iterations. We then add back the non-positive definite component to synthesize the original solution with negligible cost

    Krylov's methods in function space for waveform relaxation.

    Get PDF
    by Wai-Shing Luk.Thesis (Ph.D.)--Chinese University of Hong Kong, 1996.Includes bibliographical references (leaves 104-113).Chapter 1 --- Introduction --- p.1Chapter 1.1 --- Functional Extension of Iterative Methods --- p.2Chapter 1.2 --- Applications in Circuit Simulation --- p.2Chapter 1.3 --- Multigrid Acceleration --- p.3Chapter 1.4 --- Why Hilbert Space? --- p.4Chapter 1.5 --- Parallel Implementation --- p.5Chapter 1.6 --- Domain Decomposition --- p.5Chapter 1.7 --- Contributions of This Thesis --- p.6Chapter 1.8 --- Outlines of the Thesis --- p.7Chapter 2 --- Waveform Relaxation Methods --- p.9Chapter 2.1 --- Basic Idea --- p.10Chapter 2.2 --- Linear Operators between Banach Spaces --- p.14Chapter 2.3 --- Waveform Relaxation Operators for ODE's --- p.16Chapter 2.4 --- Convergence Analysis --- p.19Chapter 2.4.1 --- Continuous-time Convergence Analysis --- p.20Chapter 2.4.2 --- Discrete-time Convergence Analysis --- p.21Chapter 2.5 --- Further references --- p.24Chapter 3 --- Waveform Krylov Subspace Methods --- p.25Chapter 3.1 --- Overview of Krylov Subspace Methods --- p.26Chapter 3.2 --- Krylov Subspace methods in Hilbert Space --- p.30Chapter 3.3 --- Waveform Krylov Subspace Methods --- p.31Chapter 3.4 --- Adjoint Operator for WBiCG and WQMR --- p.33Chapter 3.5 --- Numerical Experiments --- p.35Chapter 3.5.1 --- Test Circuits --- p.36Chapter 3.5.2 --- Unstructured Grid Problem --- p.39Chapter 4 --- Parallel Implementation Issues --- p.50Chapter 4.1 --- DECmpp 12000/Sx Computer and HPF --- p.50Chapter 4.2 --- Data Mapping Strategy --- p.55Chapter 4.3 --- Sparse Matrix Format --- p.55Chapter 4.4 --- Graph Coloring for Unstructured Grid Problems --- p.57Chapter 5 --- The Use of Inexact ODE Solver in Waveform Methods --- p.61Chapter 5.1 --- Inexact ODE Solver for Waveform Relaxation --- p.62Chapter 5.1.1 --- Convergence Analysis --- p.63Chapter 5.2 --- Inexact ODE Solver for Waveform Krylov Subspace Methods --- p.65Chapter 5.3 --- Experimental Results --- p.68Chapter 5.4 --- Concluding Remarks --- p.72Chapter 6 --- Domain Decomposition Technique --- p.80Chapter 6.1 --- Introduction --- p.80Chapter 6.2 --- Overlapped Schwarz Methods --- p.81Chapter 6.3 --- Numerical Experiments --- p.83Chapter 6.3.1 --- Delay Circuit --- p.83Chapter 6.3.2 --- Unstructured Grid Problem --- p.86Chapter 7 --- Conclusions --- p.90Chapter 7.1 --- Summary --- p.90Chapter 7.2 --- Future Works --- p.92Chapter A --- Pseudo Codes for Waveform Krylov Subspace Methods --- p.94Chapter B --- Overview of Recursive Spectral Bisection Method --- p.101Bibliography --- p.10
    • …
    corecore