797 research outputs found
Solution of partial differential equations on vector and parallel computers
The present status of numerical methods for partial differential equations on vector and parallel computers was reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed
Droplet routing for digital microfluidic biochips based on microelectrode dot array architecture
A digital microfluidic biochip (DMFB) is a device that digitizes fluidic samples into tiny droplets and operates chemical processes on a single chip. Movement control of droplets can be realized by using electrowetting-on-dielectric (EWOD) technology. DMFBs have high configurability, high sensitivity, low cost and reduced human error as well as a promising future in the applications of point-of-care medical diagnostic, and DNA sequencing. As the demands of scalability, configurability and portability increase, a new DMFB architecture called Microelectrode Dot Array (MEDA) has been introduced recently to allow configurable electrodes shape and more precise control of droplets.
The objective of this work is to investigate a routing algorithm which can not only handle the routing problem for traditional DMFBs, but also be able to route different sizes of droplets and incorporate diagonal movements for MEDA. The proposed droplet routing algorithm is based on 3D-A* search algorithm. The simulation results show that the proposed algorithm can reduce the maximum latest arrival time, average latest arrival time and total number of used cells. By enabling channel-based routing in MEDA, the equivalent total number of used cells can be significantly reduced. Compared to all existing algorithms, the proposed algorithm can achieve so far the least average latest arrival time
Performance Analysis of Hardware/Software Co-Design of Matrix Solvers
Solving a system of linear and nonlinear equations lies at the heart of many scientific and engineering applications such as circuit simulation, applications in electric power networks, and structural analysis. The exponentially increasing complexity of these computing applications and the high cost of supercomputing force us to explore affordable high performance computing platforms. The ultimate goal of this research is to develop hardware friendly parallel processing algorithms and build cost effective high performance parallel systems using hardware in order to enable the solution of large linear systems.
In this thesis, FPGA-based general hardware architectures of selected iterative methods and direct methods are discussed. Xilinx Embedded Development Kit (EDK) hardware/software (HW/SW) codesigns of these methods are also presented. For iterative methods, FPGA based hardware architectures of Jacobi, combined Jacobi and Gauss-Seidel, and conjugate gradient (CG) are proposed. The convergence analysis of the LNS-based Jacobi processor demonstrates to what extent the hardware resource constraints and additional conversion error affect the convergence of Jacobi iterative method. Matlab simulations were performed to compare the performance of three iterative methods in three ways, i.e., number of iterations for any given tolerance, number of iterations for different matrix sizes, and computation time for different matrix sizes. The simulation results indicate that the key to a fast implementation of the three methods is a fast implementation of matrix multiplication. The simulation results also show that CG method takes less number of iterations for any given tolerance, but more computation time as matrix size increases compared to other two methods, since matrix-vector multiplication is a more dominant factor in CG method than in the other two methods. By implementing matrix multiplications of the three methods in hardware with Xilinx EDK HW/SW codesign, the performance is significantly improved over pure software Power PC (PPC) based implementation. The EDK implementation results show that CG takes less computation time for any size of matrices compared to other two methods in HW/SW codesign, due to that fact that matrix multiplications dominate the computation time of all three methods while CG requires less number of iterations to converge compared to other two methods.
For direct methods, FPGA-based general hardware architecture and Xilinx EDK HW/SW codesign of WZ factorization are presented. Single unit and scalable hardware architectures of WZ factorization are proposed and analyzed under different constraints. The results of Matlab simulations show that WZ runs faster than the LU on parallel processors but slower on a single processor. The simulation results also indicate that the most time consuming part of WZ factorization is matrix update. By implementing the matrix update of WZ factorization in hardware with Xilinx EDK HW/SW codesign, the performance is also apparently improved over PPC based pure software implementation
Bordered Block-Diagonal Preserved Model-Order Reduction for RLC Circuits
This thesis details the research of the bordered block-diagonal preserved model-order reduction (BVOR) method and implementation of the corresponding tool designed for facilitating the simulation of industrial, very large sized linear circuits or linear sub-circuits of a nonlinear circuit. The BVOR tool is able to extract the linear RLC parts of the circuit from any given typical SPICE netlist and perform reduction using an appropriate algorithm for optimum efficiency.
The implemented algorithms in this tool are bordered block-diagonal matrix solver and bordered block-diagonal matrix based block Arnoldi method
Fast time- and frequency-domain finite-element methods for electromagnetic analysis
Fast electromagnetic analysis in time and frequency domain is of critical importance to the design of integrated circuits (IC) and other advanced engineering products and systems. Many IC structures constitute a very large scale problem in modeling and simulation, the size of which also continuously grows with the advancement of the processing technology. This results in numerical problems beyond the reach of existing most powerful computational resources. Different from many other engineering problems, the structure of most ICs is special in the sense that its geometry is of Manhattan type and its dielectrics are layered. Hence, it is important to develop structure-aware algorithms that take advantage of the structure specialties to speed up the computation. In addition, among existing time-domain methods, explicit methods can avoid solving a matrix equation. However, their time step is traditionally restricted by the space step for ensuring the stability of a time-domain simulation. Therefore, making explicit time-domain methods unconditionally stable is important to accelerate the computation. In addition to time-domain methods, frequency-domain methods have suffered from an indefinite system that makes an iterative solution difficult to converge fast. The first contribution of this work is a fast time-domain finite-element algorithm for the analysis and design of very large-scale on-chip circuits. The structure specialty of on-chip circuits such as Manhattan geometry and layered permittivity is preserved in the proposed algorithm. As a result, the large-scale matrix solution encountered in the 3-D circuit analysis is turned into a simple scaling of the solution of a small 1-D matrix, which can be obtained in linear (optimal) complexity with negligible cost. Furthermore, the time step size is not sacrificed, and the total number of time steps to be simulated is also significantly reduced, thus achieving a total cost reduction in CPU time. The second contribution is a new method for making an explicit time-domain finite-element method (TDFEM) unconditionally stable for general electromagnetic analysis. In this method, for a given time step, we find the unstable modes that are the root cause of instability, and deduct them directly from the system matrix resulting from a TDFEM based analysis. As a result, an explicit TDFEM simulation is made stable for an arbitrarily large time step irrespective of the space step. The third contribution is a new method for full-wave applications from low to very high frequencies in a TDFEM based on matrix exponential. In this method, we directly deduct the eigenmodes having large eigenvalues from the system matrix, thus achieving a significantly increased time step in the matrix exponential based TDFEM. The fourth contribution is a new method for transforming the indefinite system matrix of a frequency-domain FEM to a symmetric positive definite one. We deduct non-positive definite component directly from the system matrix resulting from a frequency-domain FEM-based analysis. The resulting new representation of the finite-element operator ensures an iterative solution to converge in a small number of iterations. We then add back the non-positive definite component to synthesize the original solution with negligible cost
Recommended from our members
Simulation for Reliability, Hardware Security, and Ising Computing in VLSI Chip Design
The continued scaling of VLSI circuits has provided a wealth of opportunities andchallenges to the VLSI circuit design area. Both these challenges and opportunities, however,require new simulation tools that can enable their solution or exploitation as classicalmethods typically dealt with problem domains with smaller scales or less complexity. Inthis dissertation, simulation methods are presented to address the emerging VLSI designtopics of Electromigration induced aging and Ising computing and are then applied to theapplication areas of hardware security and graph partitioning respectively.The Electromigration aging effect in VLSI circuits is a long-term reliability issueaffecting current carrying metal wires leading to IR drop degradation. Typically, simpleanalytical equations can determine a wire’s effective age or if it will be affected by the EMaging effect at all. However, these classical methods are overly conservative and can lead toover design or unnecessary design iterations. Furthermore, it is expected that the EM agingeffect will become more severe in future Integrated Cirucits (ICs) due to increasing currentdensities and the prevalance of polycrystaline copper atom structures seen at small wiredimensions. For this reason, more comprehensive simulation techniques that can efficientlysimulate the EM effect with less conservative results can help mitigate overdesign andincrease design margins while reducing design iterations.The area of Hardware Security is becoming increasingly important as the chipsupply chain becomes more globalized and the integrity of chips becomes more diffiuclt toverify. Utilizing the accurate simulation techniques for EM, we can utilize this reliabilityeffect to demonstrate how a reliability based attack could be perpatrated. Furthermore, wecan utilize this aging effect as a defense mechanism to help us validate the integrity of anIC and detect counterfeit chips in the component supply chain market.Ising computing is an emerging method of solving combinatorial optimization problemsby simulating the interactions of so-called spin glasses and their interactions. Borrowingconcepts from quantum computing, this methods mimics the quantum interaction betweenspin glasses in such a way that finding a ground state of these spin glass models leadsto the solution of a particular problem. In this dissertation, effective methods of simulatingthe spin glass interactions using General Purpose Graphics Processing Units (GPGPUs)and finding their ground state are developed.In addition to the GPU based Ising model simulations, important combinatorialproblems can be mapped to the Ising model. In this dissertation the Ising solver is appliedto graph partitioning which can be utilized in VLSI design and many other domains as well.Specifically, solvers for the maxcut problem and the balanced min-cut partitioning problemare developed
Krylov's methods in function space for waveform relaxation.
by Wai-Shing Luk.Thesis (Ph.D.)--Chinese University of Hong Kong, 1996.Includes bibliographical references (leaves 104-113).Chapter 1 --- Introduction --- p.1Chapter 1.1 --- Functional Extension of Iterative Methods --- p.2Chapter 1.2 --- Applications in Circuit Simulation --- p.2Chapter 1.3 --- Multigrid Acceleration --- p.3Chapter 1.4 --- Why Hilbert Space? --- p.4Chapter 1.5 --- Parallel Implementation --- p.5Chapter 1.6 --- Domain Decomposition --- p.5Chapter 1.7 --- Contributions of This Thesis --- p.6Chapter 1.8 --- Outlines of the Thesis --- p.7Chapter 2 --- Waveform Relaxation Methods --- p.9Chapter 2.1 --- Basic Idea --- p.10Chapter 2.2 --- Linear Operators between Banach Spaces --- p.14Chapter 2.3 --- Waveform Relaxation Operators for ODE's --- p.16Chapter 2.4 --- Convergence Analysis --- p.19Chapter 2.4.1 --- Continuous-time Convergence Analysis --- p.20Chapter 2.4.2 --- Discrete-time Convergence Analysis --- p.21Chapter 2.5 --- Further references --- p.24Chapter 3 --- Waveform Krylov Subspace Methods --- p.25Chapter 3.1 --- Overview of Krylov Subspace Methods --- p.26Chapter 3.2 --- Krylov Subspace methods in Hilbert Space --- p.30Chapter 3.3 --- Waveform Krylov Subspace Methods --- p.31Chapter 3.4 --- Adjoint Operator for WBiCG and WQMR --- p.33Chapter 3.5 --- Numerical Experiments --- p.35Chapter 3.5.1 --- Test Circuits --- p.36Chapter 3.5.2 --- Unstructured Grid Problem --- p.39Chapter 4 --- Parallel Implementation Issues --- p.50Chapter 4.1 --- DECmpp 12000/Sx Computer and HPF --- p.50Chapter 4.2 --- Data Mapping Strategy --- p.55Chapter 4.3 --- Sparse Matrix Format --- p.55Chapter 4.4 --- Graph Coloring for Unstructured Grid Problems --- p.57Chapter 5 --- The Use of Inexact ODE Solver in Waveform Methods --- p.61Chapter 5.1 --- Inexact ODE Solver for Waveform Relaxation --- p.62Chapter 5.1.1 --- Convergence Analysis --- p.63Chapter 5.2 --- Inexact ODE Solver for Waveform Krylov Subspace Methods --- p.65Chapter 5.3 --- Experimental Results --- p.68Chapter 5.4 --- Concluding Remarks --- p.72Chapter 6 --- Domain Decomposition Technique --- p.80Chapter 6.1 --- Introduction --- p.80Chapter 6.2 --- Overlapped Schwarz Methods --- p.81Chapter 6.3 --- Numerical Experiments --- p.83Chapter 6.3.1 --- Delay Circuit --- p.83Chapter 6.3.2 --- Unstructured Grid Problem --- p.86Chapter 7 --- Conclusions --- p.90Chapter 7.1 --- Summary --- p.90Chapter 7.2 --- Future Works --- p.92Chapter A --- Pseudo Codes for Waveform Krylov Subspace Methods --- p.94Chapter B --- Overview of Recursive Spectral Bisection Method --- p.101Bibliography --- p.10
- …