
    Solving Burgersʼ equation using optimal rational approximations

    Abstract: We solve viscous Burgersʼ equation using a fast and accurate algorithm, referred to here as the reduction algorithm, for computing near optimal rational approximations.

    Given a proper rational function with n poles, the reduction algorithm computes (for a desired L∞-approximation error) a rational approximation of the same form, but with a (near) optimally small number m ≪ n of poles. Although it is well known that (nonlinear) optimal rational approximations are much more efficient than linear representations of functions via a fixed basis (e.g. wavelets), their use in numerical computations has been limited by a lack of efficient, robust, and accurate algorithms. The reduction algorithm presented here reliably computes (near) optimal rational approximations with high accuracy (e.g., ≈ 10⁻¹⁴) and a complexity that is essentially linear in the number n of original poles. A key tool is a recently developed algorithm for computing small con-eigenvalues of Cauchy matrices with high relative accuracy, an impossible task for standard algorithms without extended precision.

    Using the reduction algorithm, we develop a numerical calculus for rational representations of functions. Indeed, while operations such as multiplication and convolution increase the number of poles in the representation, we use the reduction algorithm to maintain an optimally small number of poles.

    To demonstrate the efficiency, robustness, and accuracy of our approach, we solve Burgersʼ equation with small viscosity ν. It is well known that its solutions exhibit moving transition regions of width O(ν), so this equation provides a stringent test for adaptive PDE solvers. We show that optimal rational approximations capture the solutions with high accuracy using a small number of poles. In particular, we solve the equation with local accuracy ϵ = 10⁻⁹ for viscosity as small as ν = 10⁻⁵.
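    The reduction step itself is not spelled out in the abstract, but the growth it counteracts is easy to illustrate. The following minimal NumPy sketch (not the paper's algorithm; the poles and residues are arbitrary illustrative values) shows that multiplying two strictly proper rational functions given in pole/residue form yields a representation whose pole count is the sum of the original counts, which is exactly the growth the reduction algorithm is meant to keep in check.

```python
# Minimal sketch, not the paper's reduction algorithm: it only shows how the
# pole count grows under multiplication of two strictly proper rational
# functions f and g given by simple poles and residues. The specific poles
# and residues below are arbitrary illustrative values.
import numpy as np

def eval_rational(x, poles, residues):
    """Evaluate r(x) = sum_k residues[k] / (x - poles[k])."""
    x = np.asarray(x, dtype=complex)
    return sum(r / (x - p) for p, r in zip(poles, residues))

def multiply_rational(poles_f, res_f, poles_g, res_g):
    """Pole/residue form of f*g for disjoint sets of simple poles.

    The residue of f*g at a pole p of f is res_f(p) * g(p) (and symmetrically
    for poles of g), so the product carries len(poles_f) + len(poles_g) poles.
    """
    poles_h = np.concatenate([poles_f, poles_g])
    res_h = np.concatenate([
        res_f * eval_rational(poles_f, poles_g, res_g),   # g evaluated at poles of f
        res_g * eval_rational(poles_g, poles_f, res_f),   # f evaluated at poles of g
    ])
    return poles_h, res_h

# two small rational functions, real-valued on the real axis
poles_f = np.array([1j, -1j]);        res_f = np.array([0.5, 0.5])
poles_g = np.array([2 + 1j, 2 - 1j]); res_g = np.array([0.5j, -0.5j])
poles_h, res_h = multiply_rational(poles_f, res_f, poles_g, res_g)

x = np.linspace(-5.0, 5.0, 7)
direct = eval_rational(x, poles_f, res_f) * eval_rational(x, poles_g, res_g)
print(np.max(np.abs(direct - eval_rational(x, poles_h, res_h))))  # ~ 1e-16
print(len(poles_h))  # 4 poles: the count doubled under multiplication
```

    In the numerical calculus described in the abstract, every such operation would be followed by the reduction algorithm so that the number of poles stays near-optimally small.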

    Flexible rational approximation and its application for matrix functions

    This paper proposes a unique optimization approach for estimating the minimax rational approximation and its application for evaluating matrix functions. Our method enables the extension to generalized rational approximations and has the flexibility to add constraints. In particular, the latter allows us to control specific properties preferred in matrix function evaluation. For example, in the case of a normal matrix, we can guarantee a bound on the condition number of the matrix that must be inverted when evaluating the rational matrix function. We demonstrate the efficiency of our approach for several applications of matrix functions based on direct spectrum filtering.
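    As a rough illustration of the mechanics involved (not the paper's minimax or generalized approach), the sketch below fits the residues of a rational approximation with fixed, hand-picked poles by plain linear least squares and then applies it to a symmetric, hence normal, test matrix. It also makes the condition-number concern concrete: evaluating r(A) requires inverting the shifted matrices A - z_k I. The target function, poles, and matrix are all illustrative assumptions.

```python
# Minimal sketch, not the paper's method: residues are fit by ordinary linear
# least squares on fixed poles, then the rational function is applied to a
# symmetric (normal) matrix via shifted inverses. f, the poles, and the test
# matrix are illustrative choices.
import numpy as np

f = np.exp                       # target scalar function
a, b = 0.0, 2.0                  # spectral interval assumed for A
x = np.linspace(a, b, 400)       # sample points for the fit
poles = np.array([-1.0, -3.0, -9.0, -27.0])   # fixed poles, away from [a, b]

# linear least-squares fit: f(x) ~ c0 + sum_k w_k / (x - z_k)
basis = np.column_stack([np.ones_like(x)] + [1.0 / (x - z) for z in poles])
coef, *_ = np.linalg.lstsq(basis, f(x), rcond=None)
c0, w = coef[0], coef[1:]

def rational_matfunc(A, c0, w, poles):
    """Evaluate r(A) = c0*I + sum_k w_k (A - z_k I)^{-1}."""
    n = A.shape[0]
    R = c0 * np.eye(n)
    for wk, zk in zip(w, poles):
        R += wk * np.linalg.inv(A - zk * np.eye(n))   # conditioning of this shift matters
    return R

# symmetric test matrix with spectrum inside [a, b]
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
lam = np.linspace(a + 0.1, b - 0.1, 6)
A = Q @ np.diag(lam) @ Q.T

exact = Q @ np.diag(f(lam)) @ Q.T                     # f(A) via eigendecomposition
approx = rational_matfunc(A, c0, w, poles)
print(np.linalg.norm(approx - exact) / np.linalg.norm(exact))  # small if the fit is accurate on the spectrum
```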

    Rational approximation preconditioners for multiphysics problems

    We consider a class of mathematical models describing multiphysics phenomena interacting through interfaces. On such interfaces, the traces of the fields lie (approximately) in the range of a weighted sum of two fractional differential operators. We use a rational function approximation to precondition such operators. We first demonstrate the robustness of the approximation for ordinary functions given by weighted sums of fractional exponents. Additionally, we present more realistic examples utilizing the proposed preconditioning techniques in the interface coupling between the Darcy and Stokes equations.
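    One standard way such rational approximations of fractional powers arise (shown here as a hedged sketch, not the construction used in the paper) is by discretizing an integral representation of λ^(-α) with an exponentially convergent quadrature. The result is a weighted sum of shifted inverses 1/(λ + t_k), which turns into a sum of shifted solves once λ is replaced by a symmetric positive definite operator. The exponent α, the quadrature parameters, and the spectrum range below are illustrative choices.

```python
# Minimal sketch, not the preconditioner construction from the paper: it builds
# a rational approximation to the fractional power lam**(-alpha) from the
# integral representation
#   lam**(-alpha) = (sin(pi*alpha)/pi) * integral_0^inf t**(-alpha) / (t + lam) dt,
# discretized with the substitution t = exp(s) and a truncated trapezoid rule.
# alpha, h, N, and the spectrum range are illustrative choices.
import numpy as np

alpha = 0.5                        # fractional exponent, 0 < alpha < 1
h, N = 0.15, 200                   # quadrature step and truncation
s = h * np.arange(-N, N + 1)       # nodes in the transformed variable
t_k = np.exp(s)                    # the rational approximation has poles at -t_k
w_k = (np.sin(np.pi * alpha) / np.pi) * h * np.exp((1.0 - alpha) * s)

def rational_frac_power(lam):
    """Approximate lam**(-alpha) as sum_k w_k / (lam + t_k)."""
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    return (w_k / (lam[:, None] + t_k[None, :])).sum(axis=1)

lam = np.logspace(-2, 2, 200)      # plausible spectrum range for the operator
rel_err = np.abs(rational_frac_power(lam) - lam**(-alpha)) / lam**(-alpha)
print(rel_err.max())               # small, uniformly over the sampled range
```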

    Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions

    Recent advances in attention-free sequence models rely on convolutions as alternatives to the attention operator at the core of Transformers. In particular, long convolution sequence models have achieved state-of-the-art performance in many domains, but incur a significant cost during auto-regressive inference workloads, naively requiring a full pass (or caching of activations) over the input sequence for each generated token, similarly to attention-based models. In this paper, we seek to enable O(1) compute and memory cost per token in any pre-trained long convolution architecture to reduce memory footprint and increase throughput during generation. Concretely, our method consists of extracting low-dimensional linear state-space models from each convolution layer, building upon rational interpolation and model-order reduction techniques. We further introduce architectural improvements to convolution-based layers such as Hyena: by weight-tying the filters across channels into heads, we achieve higher pre-training quality and reduce the number of filters to be distilled. The resulting model achieves 10x higher throughput than Transformers and 1.5x higher than Hyena at 1.3B parameters, without any loss in quality after distillation.
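    The abstract names rational interpolation and model-order reduction as the extraction tools. The sketch below uses one classical member of that family, a Ho-Kalman/Kung-style realization built from a filter's impulse response, to turn a toy long convolution filter into a small state-space model whose recurrent form costs a fixed amount of work per generated token, independent of sequence length. The toy filter and the state dimension are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch, not the paper's distillation procedure: a Ho-Kalman/Kung-style
# realization (one standard model-order reduction technique) converts a long
# convolution filter h into a small SSM (A, B, C, D). The toy filter and the
# state dimension d are illustrative choices.
import numpy as np

def fit_ssm(h, d):
    """Fit x_{t+1} = A x_t + B u_t, y_t = C x_t + D u_t to impulse response h."""
    m = (len(h) - 1) // 2
    # Hankel matrix of Markov parameters h_1, h_2, ... (h_0 becomes D)
    H = np.array([[h[i + j + 1] for j in range(m)] for i in range(m)])
    U, S, Vt = np.linalg.svd(H)
    Ud, Sd, Vtd = U[:, :d], np.sqrt(S[:d]), Vt[:d, :]
    O = Ud * Sd                    # observability factor: rows C, CA, CA^2, ...
    Ctrb = Sd[:, None] * Vtd       # controllability factor: cols B, AB, A^2 B, ...
    A = np.linalg.pinv(O[:-1]) @ O[1:]   # shift-invariance of the observability factor
    B, C, D = Ctrb[:, 0], O[0], h[0]
    return A, B, C, D

# toy "long convolution" filter: a smooth, decaying impulse response
t = np.arange(512)
h = np.exp(-0.02 * t) * np.cos(0.1 * t) + 0.5 * np.exp(-0.05 * t) * np.cos(0.4 * t)
A, B, C, D = fit_ssm(h, d=4)

# recurrent application: constant work per token, regardless of sequence length
u = np.random.default_rng(0).standard_normal(512)
x = np.zeros(4)
y_ssm = np.empty_like(u)
for k, uk in enumerate(u):
    y_ssm[k] = C @ x + D * uk
    x = A @ x + B * uk

y_conv = np.convolve(u, h)[:512]         # reference: full causal convolution
print(np.max(np.abs(y_ssm - y_conv)))    # small when d captures the filter's effective order
```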