
    Paramotopy: Parameter homotopies in parallel

    Numerical algebraic geometry provides a number of efficient tools for approximating the solutions of polynomial systems. One such tool is the parameter homotopy, which can be an extremely efficient method to solve numerous polynomial systems that differ only in coefficients, not monomials. This technique is frequently used for solving a parameterized family of polynomial systems at multiple parameter values. Parameter homotopies have recently been useful in several areas of application and have been implemented in at least two software packages. This article describes Paramotopy, a new, parallel, optimized implementation of this technique, making use of the Bertini software package. The novel features of this implementation, not available elsewhere, include allowing for the simultaneous solutions of arbitrary polynomial systems in a parameterized family on an automatically generated (or manually provided) mesh in the parameter space of coefficients, front ends and back ends that are easily specialized to particular classes of problems, and adaptive techniques for solving polynomial systems near singular points in the parameter space. This last feature automates and simplifies a task that is important but often misunderstood by non-experts. Comment: Long version of ICMS extended abstract
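    As a rough illustration of the parameter homotopy idea described in this abstract, the sketch below follows the roots of a toy one-variable family f(x; p) = x^2 - p from a parameter value where they are known (p = 1) to a new value (p = 4). The function `track_root`, the toy family, the straight-line parameter path, and the corrector-only stepping (no predictor, no adaptive step size) are all assumptions made for illustration; none of this is Paramotopy's or Bertini's API or algorithm.

```python
def track_root(x, p_start, p_target, steps=100, newton_iters=5):
    """Follow one root of f(x; p) = x**2 - p as p moves from p_start to p_target.

    Hypothetical helper: a corrector-only continuation on a toy family.
    """
    for s in range(1, steps + 1):
        t = s / steps
        # Straight-line path in parameter space (an assumption; it must
        # avoid the singular parameter value p = 0 for this family).
        p = (1 - t) * p_start + t * p_target
        for _ in range(newton_iters):
            # Newton corrector on f(x; p) = x**2 - p, with df/dx = 2x.
            x = x - (x * x - p) / (2 * x)
    return x


# Solve once at p = 1 (roots +1 and -1), then reuse those roots as start
# points for another member of the family, here p = 4.
start_roots = [1.0, -1.0]
end_roots = [track_root(x0, 1.0, 4.0) for x0 in start_roots]
```

    The point of the sketch is only that the start roots are computed once and then reused for every new parameter value, which is where the efficiency mentioned in the abstract comes from.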

    Automatic differentiation in ML: Where we are and where we should be going

    We review the current state of automatic differentiation (AD) for array programming in machine learning (ML), including the different approaches such as operator overloading (OO) and source transformation (ST) used for AD, graph-based intermediate representations for programs, and source languages. Based on these insights, we introduce a new graph-based intermediate representation (IR) which specifically aims to efficiently support fully-general AD for array programming. Unlike existing dataflow programming representations in ML frameworks, our IR naturally supports function calls, higher-order functions and recursion, making ML models easier to implement. The ability to represent closures allows us to perform AD using ST without a tape, making the resulting derivative (adjoint) program amenable to ahead-of-time optimization using tools from functional language compilers, and enabling higher-order derivatives. Lastly, we introduce a proof of concept compiler toolchain called Myia which uses a subset of Python as a front end.

    A Python Extension for the Massively Parallel Multiphysics Simulation Framework waLBerla

    We present a Python extension to the massively parallel HPC simulation toolkit waLBerla. waLBerla is a framework for stencil based algorithms operating on block-structured grids, with the main application field being fluid simulations in complex geometries using the lattice Boltzmann method. Careful performance engineering results in excellent node performance and good scalability to over 400,000 cores. To increase the usability and flexibility of the framework, a Python interface was developed. Python extensions are used at all stages of the simulation pipeline: they simplify and automate scenario setup, evaluation, and plotting. We show how our Python interface outperforms the existing text-file-based configuration mechanism, providing features like automatic nondimensionalization of physical quantities and handling of complex parameter dependencies. Furthermore, Python is used to process and evaluate results while the simulation is running, leading to smaller output files and the possibility to adjust parameters dependent on the current simulation state. C++ data structures are exported such that a seamless interfacing to other numerical Python libraries is possible. The expressive power of Python and the performance of C++ make development of efficient code with low time effort possible.

    A Randomized Algorithm for Approximating the Log Determinant of a Symmetric Positive Definite Matrix

    We introduce a novel algorithm for approximating the logarithm of the determinant of a symmetric positive definite (SPD) matrix. The algorithm is randomized and approximates the traces of a small number of matrix powers of a specially constructed matrix, using the method of Avron and Toledo \cite{AT11}. From a theoretical perspective, we present additive and relative error bounds for our algorithm. Our additive error bound works for any SPD matrix, whereas our relative error bound works for SPD matrices whose eigenvalues lie in the interval $(\theta_1, 1)$, with $0 < \theta_1 < 1$; the latter setting was proposed in \cite{icml2015_hana15}. From an empirical perspective, we demonstrate that a C++ implementation of our algorithm can approximate the logarithm of the determinant of large matrices very accurately in a matter of seconds. Comment: working paper
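    One way to picture the combination of matrix-power traces and randomized trace estimation mentioned in this abstract: for an SPD matrix A with eigenvalues in (0, 1), log det(A) = tr(log A) = -sum_{k>=1} tr((I - A)^k) / k, and each trace can be estimated from random probe vectors. The sketch below truncates that series and uses a Hutchinson-style estimator with random sign vectors. The function name `approx_logdet`, the choice of sign probes, and the default parameters are assumptions for illustration; this is not the authors' C++ implementation, and the "specially constructed matrix" of the paper is replaced here by the plain choice I - A.

```python
import numpy as np


def approx_logdet(A, num_terms=60, num_probes=200, seed=0):
    """Estimate log det(A) for SPD A with eigenvalues in (0, 1).

    Hypothetical helper: truncated Taylor series of log(I - C), C = I - A,
    with each trace tr(C^k) estimated by random sign (Rademacher) probes.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    C = np.eye(n) - A
    G = rng.choice([-1.0, 1.0], size=(n, num_probes))  # probe vectors as columns
    V = G.copy()
    total = 0.0
    for k in range(1, num_terms + 1):
        V = C @ V  # columns now hold C^k g_j
        trace_est = np.sum(G * V) / num_probes  # average of g_j^T C^k g_j
        total -= trace_est / k
    return total


# Small test matrix with eigenvalues spread over [0.3, 0.9].
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((50, 50)))
A = Q @ np.diag(np.linspace(0.3, 0.9, 50)) @ Q.T
estimate = approx_logdet(A)
exact = np.linalg.slogdet(A)[1]
```

    Only matrix-vector style products with C are needed, so the same idea applies to large sparse matrices where a dense factorization would be too expensive. The estimate is random, so its accuracy depends on the number of probes and series terms.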

    Assessing Excel VBA Suitability for Monte Carlo Simulation

    Monte Carlo (MC) simulation includes a wide range of stochastic techniques used to quantitatively evaluate the behavior of complex systems or processes. Microsoft Excel with Visual Basic for Applications (VBA) is, arguably, the most commonly employed general purpose tool for MC simulation. Despite the popularity of Excel in many industries and educational institutions, it has been repeatedly criticized for its flaws and often described as questionable, if not completely unsuitable, for statistical problems. The purpose of this study is to assess the suitability of Excel (specifically its 2010 and 2013 versions) with VBA programming as a tool for MC simulation. The results of the study indicate that Microsoft Excel (versions 2010 and 2013) is a strong Monte Carlo simulation application offering a solid framework of core simulation components, including spreadsheets for data input and output, a VBA development environment, and summary statistics functions. This framework should be complemented with an external high-quality pseudo-random number generator added as a VBA module. A large and diverse category of incidental Excel simulation components, which includes statistical distributions, linear and non-linear regression, and other statistical, engineering and business functions, requires due diligence to determine its suitability for a specific MC project.

    An Optimization Framework to Improve 4D-Var Data Assimilation System Performance

    This paper develops a computational framework for optimizing the parameters of data assimilation systems in order to improve their performance. The approach formulates a continuous meta-optimization problem for parameters; the meta-optimization is constrained by the original data assimilation problem. The numerical solution process employs adjoint models and iterative solvers. The proposed framework is applied to optimize observation values, data weighting coefficients, and the location of sensors for a test problem. The ability to optimize a distributed measurement network is crucial for cutting down operating costs and detecting malfunctions.

    A performance spectrum for parallel computational frameworks that solve PDEs

    Important computational physics problems are often large-scale in nature, and it is highly desirable to have robust and high performing computational frameworks that can quickly address these problems. However, it is no trivial task to determine whether a computational framework is performing efficiently or is scalable. The aim of this paper is to present various strategies for better understanding the performance of any parallel computational framework for solving PDEs. Important performance issues that negatively impact time-to-solution are discussed, and we propose a performance spectrum analysis that can enhance one's understanding of these critical performance issues. As proof of concept, we examine commonly used finite element simulation packages and software and apply the performance spectrum to quickly analyze the performance and scalability across various hardware platforms, software implementations, and numerical discretizations. It is shown that the proposed performance spectrum is a versatile performance model that is not only extendable to more complex PDEs such as hydrostatic ice sheet flow equations, but also useful for understanding hardware performance in a massively parallel computing environment. Potential applications and future extensions of this work are also discussed.

    An efficient null space inexact Newton method for hydraulic simulation of water distribution networks

    Null space Newton algorithms are efficient in solving the nonlinear equations arising in hydraulic analysis of water distribution networks. In this article, we propose and evaluate an inexact Newton method that relies on partial updates of the network pipes' frictional headloss computations to solve the linear systems more efficiently and with numerical reliability. The update set parameters are studied to propose appropriate values. Different null space basis generation schemes are analysed to choose methods for sparse and well-conditioned null space bases resulting in a smaller update set. The Newton steps are computed in the null space by solving sparse, symmetric positive definite systems with sparse Cholesky factorizations. By using the constant structure of the null space system matrices, a single symbolic factorization in the Cholesky decomposition is used multiple times, reducing the computational cost of linear solves. The algorithms and analyses are validated using medium to large-scale water network models. Comment: 15 pages, 9 figures, Preprint extension of Abraham and Stoianov, 2015 (https://dx.doi.org/10.1061/(ASCE)HY.1943-7900.0001089), September 2015. Includes extended exposition, additional case studies and new simulations and analysis

    A Novel Partitioning Method for Accelerating the Block Cimmino Algorithm

    We propose a novel block-row partitioning method in order to improve the convergence rate of the block Cimmino algorithm for solving general sparse linear systems of equations. The convergence rate of the block Cimmino algorithm depends on the orthogonality among the block rows obtained by the partitioning method. The proposed method takes numerical orthogonality among block rows into account by proposing a row inner-product graph model of the coefficient matrix. In the graph partitioning formulation defined on this graph model, the partitioning objective of minimizing the cutsize directly corresponds to minimizing the sum of inter-block inner products between block rows, thus leading to an improvement in the eigenvalue spectrum of the iteration matrix. This in turn leads to a significant reduction in the number of iterations required for convergence. Extensive experiments conducted on a large set of matrices confirm the validity of the proposed method against a state-of-the-art method.
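    For readers unfamiliar with the underlying iteration, the sketch below shows the basic (unaccelerated) block Cimmino update on a small dense system: the rows are split into blocks, the current residual is projected through each block's pseudoinverse, and the summed corrections are applied with a relaxation weight. The function `block_cimmino`, the contiguous two-block split, and the test matrix are assumptions for illustration; the partitioning method proposed in the paper is not reproduced here.

```python
import numpy as np


def block_cimmino(A, b, blocks, omega=1.0, iters=500):
    """Basic block Cimmino iteration for a consistent system A x = b.

    Hypothetical helper: `blocks` is a list of row selections (slices or
    index arrays) that together cover all rows of A.
    """
    x = np.zeros(A.shape[1])
    # Pseudoinverse of each block row A_i, computed once up front.
    pinvs = [np.linalg.pinv(A[rows]) for rows in blocks]
    for _ in range(iters):
        update = np.zeros_like(x)
        for rows, A_pinv in zip(blocks, pinvs):
            # Projection of the error onto the row space of block i.
            update += A_pinv @ (b[rows] - A[rows] @ x)
        x = x + omega * update
    return x


# Toy system whose block rows are nearly orthogonal, so convergence is fast.
rng = np.random.default_rng(0)
A = 4.0 * np.eye(6) + 0.5 * rng.standard_normal((6, 6))
x_true = rng.standard_normal(6)
b = A @ x_true
x = block_cimmino(A, b, [slice(0, 3), slice(3, 6)])
```

    With two blocks, the iteration converges at a rate governed by the angles between the two block row spaces, which is the dependence on block-row orthogonality that the abstract refers to.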

    Generalized Rybicki Press algorithm

    This article discusses a more general and numerically stable Rybicki Press algorithm, which enables inverting and computing determinants of covariance matrices, whose elements are sums of exponentials. The algorithm is true in exact arithmetic and relies on introducing new variables and corresponding equations, thereby converting the matrix into a banded matrix of larger size. Linear complexity banded algorithms for solving linear systems and computing determinants on the larger matrix enable linear complexity algorithms for the initial semi-separable matrix as well. Benchmarks provided illustrate the linear scaling of the algorithm. Comment: 13 pages, 11 figures, 1 table
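    The structure being exploited can be seen in the classical single-exponential special case, sketched below: for sorted times t_1 < ... < t_n and K_ij = exp(-|t_i - t_j| / tau), the inverse of K is tridiagonal and log det(K) = sum_i log(1 - r_i^2) with r_i = exp(-(t_{i+1} - t_i) / tau), which costs O(n). The function `exp_kernel_logdet` and the sample times are assumptions for illustration; the generalized algorithm for sums of exponentials, with its extended banded system, is not implemented here.

```python
import numpy as np


def exp_kernel_logdet(t, tau):
    """O(n) log-determinant of K_ij = exp(-|t_i - t_j| / tau).

    Hypothetical helper: assumes t is sorted and strictly increasing.
    """
    r = np.exp(-np.diff(t) / tau)  # correlation between neighbouring times
    return np.sum(np.log1p(-r**2))


# Compare against the dense O(n^3) computation on a small example.
t = np.array([0.0, 0.7, 1.1, 2.5, 3.0, 4.2])
tau = 2.0
K = np.exp(-np.abs(t[:, None] - t[None, :]) / tau)
fast = exp_kernel_logdet(t, tau)
dense = np.linalg.slogdet(K)[1]
K_inv = np.linalg.inv(K)  # numerically tridiagonal for this kernel
```

    Entries of `K_inv` more than one position away from the diagonal vanish up to rounding, which is the banded structure that makes linear-time solves and determinants possible.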