
    Distributed Semidefinite Programming with Application to Large-scale System Analysis

    Distributed algorithms for solving coupled semidefinite programs (SDPs) commonly require many iterations to converge. They also place a high computational demand on the computational agents. In this paper we show that if the coupled problem has an inherent tree structure, it is possible to devise an efficient distributed algorithm for solving such problems. This algorithm can potentially enjoy the same efficiency as centralized solvers that exploit sparsity. The proposed algorithm relies on predictor-corrector primal-dual interior-point methods, where we use a message-passing algorithm to compute the search directions distributedly. Message-passing here is closely related to dynamic programming over trees. This allows us to compute the exact search directions in a finite number of steps. Furthermore, this number can be computed a priori and depends only on the coupling structure of the problem. We use the proposed algorithm for analyzing robustness of large-scale uncertain systems distributedly. We test the performance of this algorithm using numerical examples. Comment: 14 pages and 6 figures. Submitted to IEEE Transactions on Automatic Control.

    Extension and optimization of the FIND algorithm: computing Green's and less-than Green's functions (with technical appendix)

    The FIND algorithm is a fast algorithm designed to calculate certain entries of the inverse of a sparse matrix. Such calculation is critical in many applications, e.g., quantum transport in nano-devices. We extended the algorithm to other matrix-inverse-related calculations. Those are required, for example, to calculate the less-than Green's function and the current density through the device. For a 2D device discretized as an N_x x N_y mesh, the best known algorithms have a running time of O(N_x^3 N_y), whereas FIND only requires O(N_x^2 N_y). Even though this complexity has been reduced by an order of magnitude, the matrix inverse calculation is still the most time-consuming part in the simulation of transport problems. We could not reduce the order of complexity, but we were able to significantly reduce the constant factor involved in the computation cost. By exploiting the sparsity and symmetry, the size of the problem beyond which FIND is faster than other methods typically decreases from a 130x130 2D mesh down to a 40x40 mesh. These improvements make the optimized FIND algorithm even more competitive for real-life applications.

    Distributed Localization of Tree-structured Scattered Sensor Networks

    Many distributed localization algorithms are based on relaxed optimization formulations of the localization problem. These algorithms commonly rely on first-order optimization methods, and hence may require many iterations or communications among computational agents. Furthermore, some of these distributed algorithms put a considerable computational demand on the agents. In this paper, we show that for tree-structured scattered sensor networks, that is, networks whose inter-sensor range measurement graphs have few edges (few range measurements among sensors) and can be represented using a tree, it is possible to devise an efficient distributed localization algorithm that relies solely on second-order methods. In particular, we apply a state-of-the-art primal-dual interior-point method to a semidefinite relaxation of the maximum-likelihood formulation of the localization problem. We then show how it is possible to exploit the tree structure in the network and use message-passing, or dynamic programming over trees, to distribute computations among different computational agents. The resulting algorithm requires far fewer iterations and communications among agents to converge to an accurate estimate. Moreover, the number of required communications among agents seems to be less sensitive to the number of sensors in the network, the number of available measurements and the quality of the measurements. This is in stark contrast to distributed algorithms that rely on first-order methods. We illustrate the performance of our algorithm using experiments based on simulated and real data. Comment: 14 pages and 11 figures.

    A distributed-memory hierarchical solver for general sparse linear systems

    We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by every processor. We present various numerical results to demonstrate the versatility and scalability of the parallel algorithm.

    Data-driven approximations of dynamical systems operators for control

    The Koopman and Perron-Frobenius transport operators are fundamentally changing how we approach dynamical systems, providing linear representations for even strongly nonlinear dynamics. Although such a linear representation has tremendous potential benefit for estimation and control, transport operators are infinite-dimensional, making them difficult to work with numerically. Obtaining low-dimensional matrix approximations of these operators is paramount for applications, and the dynamic mode decomposition (DMD) has quickly become a standard numerical algorithm to approximate the Koopman operator. Related methods have seen rapid development, due to a combination of an increasing abundance of data and the extensibility of DMD based on its simple framing in terms of linear algebra. In this chapter, we review key innovations in the data-driven characterization of transport operators for control, providing a high-level and unified perspective. We emphasize important recent developments around sparsity and control, and discuss emerging methods in big data and machine learning. Comment: 37 pages, 4 figures.
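The linear-algebra framing of DMD mentioned in this abstract can be illustrated with a minimal sketch of the standard "exact DMD" recipe. This is not code from the chapter; the function name `dmd`, the rank-truncation argument, and the toy linear system used to generate snapshots are all assumptions made for illustration.

```python
import numpy as np

def dmd(X, Y, rank):
    """Exact DMD sketch: fit a linear map A with Y ~ A X from snapshot pairs.

    X and Y are (state_dim, n_snapshots) arrays where column k of Y is the
    state one step after column k of X. Returns DMD eigenvalues and modes.
    """
    # Truncated SVD of the input snapshots.
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]
    # Y V S^{-1}: dividing column j by s[j] applies S^{-1} on the right.
    YVS = (Y @ Vh.conj().T) / s
    # Reduced operator, i.e. A projected onto the leading left singular vectors.
    A_tilde = U.conj().T @ YVS
    eigvals, W = np.linalg.eig(A_tilde)
    modes = YVS @ W
    return eigvals, modes

# Hypothetical demo: snapshots of a known 2x2 linear system (damped rotation).
A_true = np.array([[0.9, -0.2], [0.2, 0.9]])
snapshots = [np.array([1.0, 0.0])]
for _ in range(20):
    snapshots.append(A_true @ snapshots[-1])
data = np.column_stack(snapshots)
eigvals, modes = dmd(data[:, :-1], data[:, 1:], rank=2)
```

For this toy system the recovered eigenvalues should match those of `A_true` up to floating-point error. On nonlinear or noisy data the result is only a finite-dimensional approximation of the Koopman operator on the chosen observables.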

    Tracking Switched Dynamic Network Topologies from Information Cascades

    Contagions such as the spread of popular news stories, or infectious diseases, propagate in cascades over dynamic networks with unobservable topologies. However, "social signals" such as product purchase time, or blog entry timestamps are measurable, and implicitly depend on the underlying topology, making it possible to track it over time. Interestingly, network topologies often "jump" between discrete states that may account for sudden changes in the observed signals. The present paper advocates a switched dynamic structural equation model to capture the topology-dependent cascade evolution, as well as the discrete states driving the underlying topologies. Conditions under which the proposed switched model is identifiable are established. Leveraging the edge sparsity inherent to social networks, a recursive ℓ1-norm regularized least-squares estimator is put forth to jointly track the states and network topologies. An efficient first-order proximal-gradient algorithm is developed to solve the resulting optimization problem. Numerical experiments on both synthetic data and real cascades measured over the span of one year are conducted, and test results corroborate the efficacy of the advocated approach.
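To make the proximal-gradient idea concrete, here is a minimal sketch of the generic batch method (ISTA) for an ℓ1-regularized least-squares problem, min over x of 0.5*||Ax - b||^2 + lam*||x||_1. It is not the recursive, switched-model estimator proposed in the paper; the names `soft_threshold` and `ista`, the fixed step size, and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1: shrink each entry toward zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def objective(A, b, x, lam):
    """Value of 0.5 * ||A x - b||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

def ista(A, b, lam, n_iter=500):
    """Proximal-gradient (ISTA) iterations with a fixed step size 1/L."""
    # L is the Lipschitz constant of the gradient of the smooth term:
    # the squared largest singular value of A.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)          # gradient of 0.5 * ||A x - b||^2
        x = soft_threshold(x - step * grad, step * lam)
    return x

# Hypothetical demo: recover a sparse vector from noiseless measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 10))
x_true = np.zeros(10)
x_true[[2, 7]] = [1.5, -2.0]
b = A @ x_true
lam = 0.1
x_hat = ista(A, b, lam)
```

The soft-thresholding step is what produces exact zeros in the estimate, which is how the ℓ1 penalty encodes edge sparsity. With the step size 1/L the objective value does not increase from one iteration to the next.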

    Embedded Ensemble Propagation for Improving Performance, Portability and Scalability of Uncertainty Quantification on Emerging Computational Architectures

    Quantifying simulation uncertainties is a critical component of rigorous predictive simulation. A key component of this is forward propagation of uncertainties in simulation input data to output quantities of interest. Typical approaches involve repeated sampling of the simulation over the uncertain input data, and can require numerous samples when accurately propagating uncertainties from large numbers of sources. Often simulation processes from sample to sample are similar and much of the data generated from each sample evaluation could be reused. We explore a new method for implementing sampling methods that simultaneously propagates groups of samples together in an embedded fashion, which we call embedded ensemble propagation. We show how this approach takes advantage of properties of modern computer architectures to improve performance by enabling reuse between samples, reducing memory bandwidth requirements, improving memory access patterns, improving opportunities for fine-grained parallelization, and reducing communication costs. We describe a software technique for implementing embedded ensemble propagation based on the use of C++ templates and describe its integration with various scientific computing libraries within Trilinos. We demonstrate improved performance, portability and scalability for the approach applied to the simulation of partial differential equations on a variety of CPU, GPU, and accelerator architectures, including up to 131,072 cores on a Cray XK7 (Titan).

    A Relaxation-based Network Decomposition Algorithm for Parallel Transient Stability Simulation with Improved Convergence

    Transient stability simulation of a large-scale, interconnected electric power system involves solving a large set of differential algebraic equations (DAEs) at every simulation time-step. With the ever-growing size and complexity of power grids, dynamic simulation becomes more time-consuming and computationally difficult using conventional sequential simulation techniques. To cope with this challenge, this paper aims to develop a fully distributed approach intended for implementation on High Performance Computer (HPC) clusters. A novel, relaxation-based domain decomposition algorithm known as Parallel-General-Norton with Multiple-port Equivalent (PGNME) is proposed as the core technique of a two-stage decomposition approach to divide the overall dynamic simulation problem into a set of subproblems that can be solved concurrently to exploit parallelism and scalability. While the convergence property has traditionally been a concern for relaxation-based decomposition, an estimation mechanism based on multiple-port network equivalent is adopted as the preconditioner to enhance the convergence of the proposed algorithm. The proposed algorithm is illustrated using rigorous mathematics and validated both in terms of speed-up and capability. Moreover, a complexity analysis is performed to support the observation that PGNME scales well when the size of the subproblems is sufficiently large.

    Deformation corrected compressed sensing (DC-CS): a novel framework for accelerated dynamic MRI

    We propose a novel deformation corrected compressed sensing (DC-CS) framework to recover dynamic magnetic resonance images from undersampled measurements. We introduce a generalized formulation that is capable of handling a wide class of sparsity/compactness priors on the deformation corrected dynamic signal. In this work, we consider example compactness priors such as sparsity in the temporal Fourier domain, sparsity in the temporal finite difference domain, and a nuclear norm penalty to exploit low rank structure. Using variable splitting, we decouple the complex optimization problem into simpler and well understood subproblems; the resulting algorithm alternates between simple steps of shrinkage based denoising, deformable registration, and a quadratic optimization step. Additionally, we employ efficient continuation strategies to minimize the risk of convergence to local minima. The proposed formulation contrasts with existing DC-CS schemes that are customized for free breathing cardiac cine applications, and other schemes that rely on fully sampled reference frames or navigator signals to estimate the deformation parameters. The efficient decoupling enabled by the proposed scheme allows it to be applied to a wide range of applications including contrast enhanced dynamic MRI. Through experiments on numerical phantom and in vivo myocardial perfusion MRI datasets, we demonstrate the utility of the proposed DC-CS scheme in providing robust reconstructions with reduced motion artifacts over classical compressed sensing schemes that utilize the compact priors on the original deformation uncorrected signal.

    Using the VBARMS method in parallel computing

    The paper describes an improved parallel MPI-based implementation of VBARMS, a variable block variant of the pARMS preconditioner proposed by Li, Saad and Sosonkina [NLAA, 2003] for solving general nonsymmetric linear systems. The parallel VBARMS solver can automatically detect exact or approximate dense structures in the linear system, and exploits this information to achieve improved reliability and increased throughput during the factorization. A novel graph compression algorithm is discussed that finds these approximate dense block structures and requires only one simple-to-use parameter. A complete study of the numerical and parallel performance of parallel VBARMS is presented for the analysis of large turbulent Navier-Stokes equations on a suite of three-dimensional test cases.