196 research outputs found

    Scalable Integrated Circuit Simulation Algorithms for Energy-Efficient Teraflop Heterogeneous Parallel Computing Platforms

    Integrated circuit technology has gone through several decades of aggressive scaling. It is increasingly challenging to analyze the growing design complexity. Post-layout SPICE simulation can be computationally prohibitive due to the huge number of parasitic elements, which can easily inflate the computation and memory cost. As device sizes decrease, circuits become more vulnerable to process variations. Designers need to statistically estimate the probability that a circuit does not meet its performance metric, which requires millions of simulations to capture rare failure events. Recently, multiprocessors with heterogeneous architectures have emerged as mainstream computing platforms. A heterogeneous computing platform can achieve high-throughput, energy-efficient computing. However, applying such platforms is not trivial and requires reinventing existing algorithms to fully utilize the computing resources. This dissertation presents several new algorithms that address these two significant and challenging issues on heterogeneous platforms. Harmonic Balance (HB) analysis is essential for efficient verification of large post-layout RF and microwave integrated circuits (ICs). However, existing methods either suffer from excessively long simulation times and prohibitively large memory consumption or exhibit poor stability. This dissertation introduces a novel transient-simulation guided graph sparsification technique, as well as an efficient runtime performance modeling approach tailored for heterogeneous manycore CPU-GPU computing systems, to build nearly-optimal subgraph preconditioners that can lead to minimum HB simulation runtime. Additionally, we propose a novel heterogeneous parallel sparse block matrix algorithm that takes advantage of the structure of HB Jacobian matrices as well as the GPU's streaming multiprocessors to achieve optimal workload balancing during the preconditioning phase of HB analysis. We also show how the proposed preconditioned iterative algorithm can efficiently adapt to heterogeneous computing systems with different CPU and GPU computing capabilities. Extensive experimental results show that our HB solver can achieve up to 20X speedups and 5X memory reduction when compared with a state-of-the-art direct solver highly optimized for twelve-core CPUs. In today's variation-aware IC designs, cell characterization and SRAM memory yield analysis require many thousands or even millions of repeated SPICE simulations for relatively small nonlinear circuits. In this dissertation, for the first time, we present a massively parallel SPICE simulator on GPU, TinySPICE, for efficiently analyzing small nonlinear circuits. TinySPICE integrates a highly-optimized shared-memory based matrix solver and a fast parametric three-dimensional (3D) LUT-based device evaluation method. A novel circuit clustering method is also proposed to improve the stability and efficiency of the matrix solver. Compared with a CPU-based SPICE simulator, TinySPICE achieves up to 264X speedups for parametric SRAM yield analysis without loss of accuracy.
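
    As a rough illustration of the subgraph-preconditioning idea described above (not the dissertation's implementation), the sketch below factors a sparsified approximation of a Jacobian-like matrix and uses it as a preconditioner for GMRES; the sparsification rule and the names sparsify_jacobian and drop_tol are illustrative assumptions.

```python
# Minimal sketch of sparsified-preconditioner iterative solving, assuming a
# SciPy environment; the drop-tolerance rule below merely stands in for the
# transient-simulation guided graph sparsification described in the abstract.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def sparsify_jacobian(J, drop_tol=1e-3):
    """Keep the diagonal and any off-diagonal entry whose magnitude exceeds
    drop_tol times the largest entry; returns a cheaper-to-factor matrix."""
    J = J.tocoo()
    keep = (J.row == J.col) | (np.abs(J.data) > drop_tol * np.abs(J.data).max())
    return sp.csc_matrix((J.data[keep], (J.row[keep], J.col[keep])), shape=J.shape)

def preconditioned_solve(J, b, drop_tol=1e-3):
    """Solve J x = b with GMRES, preconditioned by an LU factorization of the
    sparsified copy of J (the expensive factorization is done on the sparser
    matrix, while matrix-vector products still use the full J)."""
    lu = spla.splu(sparsify_jacobian(J, drop_tol))
    M = spla.LinearOperator(J.shape, matvec=lu.solve)
    x, info = spla.gmres(J, b, M=M)
    if info != 0:
        raise RuntimeError(f"GMRES did not converge (info={info})")
    return x
```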

    Nonparametric Sparsification of Complex Multiscale Networks

    Many real-world networks tend to be very dense. Particular examples of interest arise in the construction of networks that represent pairwise similarities between objects. In these cases, the networks under consideration are weighted, generally with positive weights between any two nodes. Visualization and analysis of such networks, especially when the number of nodes is large, can pose significant challenges which are often met by reducing the edge set. Any effective “sparsification” must retain and reflect the important structure in the network. A common method is to simply apply a hard threshold, keeping only those edges whose weight exceeds some predetermined value. A more principled approach is to extract the multiscale “backbone” of a network by retaining statistically significant edges through hypothesis testing on a specific null model, or by appropriately transforming the original weight matrix before applying some sort of threshold. Unfortunately, approaches such as these can fail to capture multiscale structure in which there can be small but locally statistically significant similarity between nodes. In this paper, we introduce a new method for backbone extraction that does not rely on any particular null model, but instead uses the empirical distribution of similarity weights to determine and then retain statistically significant edges. We show that our method adapts to the heterogeneity of local edge weight distributions in several paradigmatic real-world networks, and in doing so retains their multiscale structure with relatively insignificant additional computational costs. We anticipate that this simple approach will be of great use in the analysis of massive, highly connected weighted networks.
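
    As a toy sketch of the general idea (not the authors' exact procedure), the code below keeps an edge when its weight falls in the top fraction alpha of the empirical weight distribution at either endpoint, so that locally significant edges survive even in weakly connected neighborhoods; alpha and the function name are illustrative assumptions.

```python
# Sketch of per-node empirical backbone extraction, assuming a simple weighted
# networkx graph with a "weight" attribute on every edge.
import networkx as nx

def empirical_backbone(G, alpha=0.05):
    """Retain edge (u, v) when its weight lies in the top alpha fraction of
    the empirical weight distribution at u or at v; no parametric null model
    is assumed."""
    cutoff = {}
    for node in G:
        w = sorted((d["weight"] for _, _, d in G.edges(node, data=True)),
                   reverse=True)
        if w:
            # Smallest weight still inside the node's top alpha tail.
            cutoff[node] = w[max(int(alpha * len(w)) - 1, 0)]
    backbone = nx.Graph()
    backbone.add_nodes_from(G.nodes(data=True))
    for u, v, d in G.edges(data=True):
        if d["weight"] >= cutoff[u] or d["weight"] >= cutoff[v]:
            backbone.add_edge(u, v, **d)
    return backbone
```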

    Parametric and kinetic minimum spanning trees

    We consider the parametric minimum spanning tree problem, in which we are given a graph with edge weights that are linear functions of a parameter, and wish to compute the sequence of minimum spanning trees generated as the parameter varies. We also consider the kinetic minimum spanning tree problem, in which the parameter represents time and the graph is in addition subject to changes such as edge insertions, deletions, and modifications of the weight functions as time progresses. We solve both problems in time O(n^{2/3} log^{4/3} n) per combinatorial change in the tree (or, with randomization, O(n^{2/3} log n) per change). Our time bounds reduce to O(n^{1/2} log^{3/2} n) per change (O(n^{1/2} log n) randomized) for planar graphs or other minor-closed families of graphs, and to O(n^{1/4} log^{3/2} n) per change (O(n^{1/4} log n) randomized) for planar graphs with weight changes but no insertions or deletions.
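
    For illustration only, the brute-force sketch below samples the parameter and records each distinct minimum spanning tree as the linear edge weights change; it conveys what the sequence of MSTs looks like but does none of the work that yields the sublinear per-change bounds above. The edge encoding (u, v, a, b) with weight a + b*t is an assumption made for the example.

```python
# Naive parametric MST illustration (not the paper's algorithm): edge weights
# are linear functions w(t) = a + b*t of the parameter t.
import networkx as nx

def mst_at(edges, t):
    """edges: list of (u, v, a, b). Return the MST edge set at parameter t."""
    G = nx.Graph()
    for u, v, a, b in edges:
        G.add_edge(u, v, weight=a + b * t)
    return frozenset(frozenset(e) for e in nx.minimum_spanning_edges(G, data=False))

def mst_sequence(edges, t_values):
    """Sample t and report each distinct MST in order of first appearance;
    a true parametric MST algorithm would find the exact breakpoints."""
    sequence = []
    for t in sorted(t_values):
        tree = mst_at(edges, t)
        if not sequence or sequence[-1][1] != tree:
            sequence.append((t, tree))
    return sequence
```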

    Minimum Spanning Trees in Weakly Dynamic Graphs

    In this paper, we study weakly dynamic undirected graphs, which can be used to represent some logistic networks. The goal is to serve all the delivery points in the network. The network exists in a mostly stable environment, except for a few edges known to be non-stable. The weight of each of these non-stable edges may change at any time (bascule or lift bridge, elevator, traffic congestion...). All other edges have stable weights that never change. This problem can then be considered as a Minimum Spanning Tree (MST) problem on a dynamic graph. We propose an efficient polynomial algorithm that computes in advance alternative MSTs for all possible configurations. No additional computation is needed after any change in the problem because the MSTs are already known in all cases. We use these results to compute critical values for the non-stable weights and to pre-compute best paths. When the non-stable weights change, the appropriate MST may then be used directly and immediately, without any recomputation.
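
    The sketch below illustrates the flavor of such precomputation for the simplest case of a single non-stable edge (u, v): the MST of the stable part and the heaviest edge on its u-v tree path are computed once, which gives the critical weight at which the non-stable edge enters the MST. This is an illustrative reduction to the classical cycle property, not the paper's algorithm for general configurations.

```python
# Hedged sketch for one non-stable edge (u, v), assuming G_stable is a weighted
# networkx graph of the stable edges only (the non-stable edge is excluded).
import networkx as nx

def precompute(G_stable, u, v):
    """One-time work: stable MST, heaviest edge on the u-v tree path, and the
    critical weight at which the non-stable edge (u, v) would enter the MST."""
    T = nx.minimum_spanning_tree(G_stable, weight="weight")
    path = nx.shortest_path(T, u, v)                      # unique tree path
    heaviest = max(zip(path, path[1:]),
                   key=lambda e: T[e[0]][e[1]]["weight"])
    return T, heaviest, T[heaviest[0]][heaviest[1]]["weight"]

def mst_for_weight(T, heaviest, critical, u, v, w):
    """Instant answer after a weight change: by the cycle property, (u, v)
    replaces the heaviest path edge exactly when its new weight w is smaller
    than the critical value; otherwise the stable MST is still optimal."""
    if w >= critical:
        return T
    alt = T.copy()
    alt.remove_edge(*heaviest)
    alt.add_edge(u, v, weight=w)
    return alt
```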

    Sparsification Upper and Lower Bounds for Graph Problems and Not-All-Equal SAT

    We present several sparsification lower and upper bounds for classic problems in graph theory and logic. For the problems 4-Coloring, (Directed) Hamiltonian Cycle, and (Connected) Dominating Set, we prove that there is no polynomial-time algorithm that reduces any n-vertex input to an equivalent instance, of an arbitrary problem, with bitsize O(n^{2-epsilon}) for epsilon > 0, unless NP is a subset of coNP/poly and the polynomial-time hierarchy collapses. These results imply that existing linear-vertex kernels for k-Nonblocker and k-Max Leaf Spanning Tree (the parametric duals of (Connected) Dominating Set) cannot be improved to have O(k^{2-epsilon}) edges, unless NP is a subset of coNP/poly. We also present a positive result and exhibit a non-trivial sparsification algorithm for d-Not-All-Equal-SAT. We give an algorithm that reduces an n-variable input with clauses of size at most d to an equivalent input with O(n^{d-1}) clauses, for any fixed d. Our algorithm is based on a linear-algebraic proof of Lovász that bounds the number of hyperedges in critically 3-chromatic d-uniform n-vertex hypergraphs by (n choose d-1). We show that our kernel is tight under the assumption that NP is not a subset of coNP/poly.
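
    To make the Not-All-Equal constraint concrete, here is a tiny brute-force d-NAE-SAT checker (an illustration of the problem only, unrelated to the kernelization technique above); the signed-integer clause encoding is an assumption made for the example.

```python
# Brute-force Not-All-Equal SAT check: a clause is satisfied when its literals
# do not all evaluate to the same truth value.
from itertools import product

def nae_satisfiable(n_vars, clauses):
    """clauses: lists of nonzero ints, +i for variable i and -i for its
    negation (each clause has size at most d)."""
    for bits in product([False, True], repeat=n_vars):
        def value(lit):
            v = bits[abs(lit) - 1]
            return v if lit > 0 else not v
        if all(len({value(l) for l in clause}) == 2 for clause in clauses):
            return True
    return False

# The single clause (x1, x2, not x3) rules out assignments where all three
# literals agree; NAE-satisfying assignments remain, so this prints True.
print(nae_satisfiable(3, [[1, 2, -3]]))
```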
