
    Asynchronous and Multiprecision Linear Solvers - Scalable and Fault-Tolerant Numerics for Energy Efficient High Performance Computing

    Asynchronous methods minimize idle times by removing synchronization barriers, and therefore allow computer systems to be used efficiently. Their high tolerance of communication latencies also improves fault tolerance. Because asynchronous methods additionally enable the use of the power- and energy-saving mechanisms provided by the hardware, they are suitable candidates for the highly parallel and heterogeneous hardware platforms expected in the near future.

    Reduction of large-scale RLCk models via low-rank balanced truncation

    Model order reduction (MOR) is an important step in the design process of integrated circuits. Specifically, the electromagnetic models extracted from modern complex designs contain a large number of passive elements, which limits the simulation process. MOR techniques based on balanced truncation (BT) can overcome these limitations by producing compact reduced-order models (ROMs) that approximate the behavior of the original models at the input/output ports. In this paper, we present a low-rank BT method that exploits the extended Krylov subspace and efficient implementation techniques for the reduction of large-scale models. Experimental evaluation on a diverse set of analog and mixed-signal circuits with millions of elements indicates that ROMs up to 5.5x smaller can be produced with accuracy similar to that of ANSYS RaptorX ROMs.

    Book of Abstracts of the Sixth SIAM Workshop on Combinatorial Scientific Computing

    Book of Abstracts of CSC14, edited by Bora Uçar. The Sixth SIAM Workshop on Combinatorial Scientific Computing, CSC14, was organized at the Ecole Normale Supérieure de Lyon, France, from 21 to 23 July 2014. This two-and-a-half-day event was the sixth in a series that started ten years earlier in San Francisco, USA. The CSC14 Workshop focused on combinatorial mathematics and algorithms in high performance computing, broadly interpreted. The workshop featured three invited talks, 27 contributed talks, and eight poster presentations. The three invited talks focused on two fields of research: randomized algorithms for numerical linear algebra, and network analysis. The contributed talks and the posters targeted modeling, analysis, bisection, clustering, and partitioning of graphs, applied in the context of networks, sparse matrix factorizations, iterative solvers, fast multipole methods, automatic differentiation, high-performance computing, and linear programming. The workshop was held at the premises of the LIP laboratory of ENS Lyon and was generously supported by the LABEX MILYON (ANR-10-LABX-0070, Université de Lyon, within the program "Investissements d'Avenir" ANR-11-IDEX-0007 operated by the French National Research Agency), and by SIAM.

    Power grid verification and optimization

    IR-drop is the voltage drop caused by the impedance of the power grid and the switching of devices. It is important to verify the voltage values of nodes on power grids, and reducing the voltage drops helps the circuit work reliably. Modern power grids are usually very large, which creates runtime and memory bottlenecks for traditional methods. To address these issues, we focus on developing efficient methods for power grid verification and optimization. Three topics relate to power grid verification. First, based on a distributed memory system, we propose an efficient parallel domain decomposition method for power grid DC analysis. The largest power grid that can be solved is not limited by the memory of a single processor. We develop an efficient method to balance the data load across all processors. Only the voltage values of boundary nodes are extracted and exchanged, so the communication overhead is minimal. With over 1000 processors, the proposed method achieves a 110X speedup over a state-of-the-art LU solver, and a power grid with 192 M nodes can be processed within minutes. Second, to accelerate power grid transient analysis, we present PGT_SOLVER. This solver is based on a direct method and is developed on a shared memory system. Advanced techniques such as sparse vectors and solution mapping are developed or utilized to accelerate the forward and backward substitutions in each time step, and multiple threads are used to further reduce runtime. As the first-place winner in the "TAU_2012 power grid simulation contest", PGT_SOLVER effectively reduces the runtime of transient analysis without introducing any error. Its memory consumption is also very affordable. Third, combining the flow of the parallel DC analysis with the techniques of PGT_SOLVER, we develop an effective parallel method for power grid transient analysis. Special considerations, such as power grid partitioning, are made to achieve better performance. With only a few hundred processors, a speedup of over 69X is achieved compared to the sequential PGT_SOLVER. To reduce the memory usage of solving a large power grid, the parallel process can be operated in multiple steps. With fewer processors, the proposed method is still capable of simulating large power grids efficiently. Besides developing parallel solvers to accelerate DC and transient analysis, we also explore methods to reduce the IR-drop values of power grids. These include the optimization of power pads and of the on-chip low-dropout voltage regulator (LDO). With a fixed number of power pads, we develop a method to relocate the pads to optimize DC IR-drop values. A novel IR-drop-driven method is proposed to calculate effective locations for the power pads, and moving pads to these locations effectively reduces the IR-drop values. Multiple power pads are moved simultaneously, which accelerates the optimization. Within a limited number of iterations, the IR-drop values of the power grids are effectively reduced. Transient IR-drop values can be reduced by integrating on-chip LDOs. We propose a simulation-based method to integrate LDOs into the power grids. A hybrid flow, using the Cholesky direct solver and SPICE, performs the transient analysis. With an effective optimization method, a set of LDOs is added to the power grids and placed at locations that effectively reduce transient IR-drop values.
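The abstract above says only that the DC analysis uses domain decomposition and exchanges boundary-node voltages; it does not give the algorithm. As a rough illustration of that general idea, the following sketch uses a textbook Schur-complement formulation on a small resistive mesh. It is an assumption-laden toy, not the thesis's method: the function names (`grid_conductance`, `solve_by_domain_decomposition`), the mesh, the pad model, and the dense NumPy solves are all hypothetical choices made for readability.

```python
import numpy as np

def grid_conductance(n, g=1.0, g_pad=10.0):
    """Conductance matrix of an n x n resistive mesh (hypothetical toy model).

    Node 0 is tied to the supply through a pad of conductance g_pad,
    which makes the matrix nonsingular.
    """
    G = np.zeros((n * n, n * n))
    for r in range(n):
        for c in range(n):
            k = r * n + c
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbours
                rr, cc = r + dr, c + dc
                if rr < n and cc < n:
                    m = rr * n + cc
                    G[k, k] += g
                    G[m, m] += g
                    G[k, m] -= g
                    G[m, k] -= g
    G[0, 0] += g_pad
    return G

def solve_by_domain_decomposition(G, rhs, subdomains, boundary):
    """Solve G v = rhs via a Schur complement on the boundary nodes.

    Assumes the interior index sets in `subdomains` are coupled only
    through `boundary`, so each interior solve is independent and could
    run on its own processor; only boundary quantities are combined.
    """
    rhs = np.asarray(rhs, dtype=float)
    b = np.asarray(boundary)
    S = G[np.ix_(b, b)].copy()   # Schur complement, accumulated below
    g_b = rhs[b].copy()          # reduced right-hand side on the boundary
    local = []
    for idx in subdomains:
        idx = np.asarray(idx)
        G_ii = G[np.ix_(idx, idx)]
        G_ib = G[np.ix_(idx, b)]
        G_bi = G[np.ix_(b, idx)]
        # One local solve yields both G_ii^{-1} G_ib and G_ii^{-1} rhs_i.
        X = np.linalg.solve(G_ii, np.column_stack([G_ib, rhs[idx]]))
        S -= G_bi @ X[:, :-1]
        g_b -= G_bi @ X[:, -1]
        local.append((idx, G_ii, G_ib))
    v = np.zeros_like(rhs)
    v[b] = np.linalg.solve(S, g_b)  # boundary voltages
    for idx, G_ii, G_ib in local:   # back-substitute into each interior
        v[idx] = np.linalg.solve(G_ii, rhs[idx] - G_ib @ v[b])
    return v

# Usage: split a 6 x 6 mesh into left/right halves with one separator column.
n = 6
G = grid_conductance(n)
rng = np.random.default_rng(0)
rhs = rng.uniform(-1e-3, 0.0, n * n)  # load currents drawn from each node
rhs[0] += 10.0 * 1.0                  # pad injection g_pad * Vdd, with Vdd = 1 V
cols = np.arange(n * n) % n
left = np.where(cols < n // 2)[0]
right = np.where(cols > n // 2)[0]
boundary = np.where(cols == n // 2)[0]
v = solve_by_domain_decomposition(G, rhs, [left, right], boundary)
v_ref = np.linalg.solve(G, rhs)       # monolithic reference solution
```

In this toy setup the decomposed solution `v` should agree with the monolithic solve `v_ref` to floating-point tolerance. A production solver would use sparse factorizations and message passing rather than dense blocks in one process.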