Recursive double-size fixed precision arithmetic
This work is part of the SHIVA (Secured Hardware Immune Versatile Architecture) project, whose purpose is to provide a programmable and reconfigurable hardware module with a high level of security. We propose a recursive double-size fixed-precision arithmetic called RecInt. Our work can be split into two parts. First, we developed a C++ software library with performance comparable to that of GMP. Second, our simple representation of the integers allows an implementation on FPGA. Our idea is to consider sizes that are a power of 2 and to apply doubling techniques to implement them efficiently: we design a recursive data structure where integers of size 2^k, for k > k0, can be stored as two integers of size 2^{k-1}. For k <= k0 we use machine arithmetic instead (k0 depending on the architecture)
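The doubling idea in this abstract can be sketched in a few lines. The following is only an illustration in Python, not the RecInt C++ library: the names `K0`, `split`, `join` and `add` are hypothetical, and the threshold of 64-bit machine words is an assumption. An integer of 2^k bits is split into two halves of 2^{k-1} bits, and addition recurses on the halves while propagating a carry.

```python
K0 = 6  # assumed threshold: sizes up to 2**6 = 64 bits use "machine" arithmetic


def split(x, k):
    """Split a 2**k-bit integer into (high, low) halves of 2**(k-1) bits each."""
    half_bits = 1 << (k - 1)
    return x >> half_bits, x & ((1 << half_bits) - 1)


def join(hi, lo, k):
    """Rebuild a 2**k-bit integer from its two 2**(k-1)-bit halves."""
    half_bits = 1 << (k - 1)
    return (hi << half_bits) | lo


def add(a, b, k, carry=0):
    """Add two 2**k-bit integers; return (sum modulo 2**(2**k), carry_out)."""
    bits = 1 << k
    if k <= K0:
        # Base case: plain arithmetic stands in for a machine-word addition.
        total = a + b + carry
        return total & ((1 << bits) - 1), total >> bits
    a_hi, a_lo = split(a, k)
    b_hi, b_lo = split(b, k)
    lo, c = add(a_lo, b_lo, k - 1, carry)  # low halves first
    hi, c = add(a_hi, b_hi, k - 1, c)      # then high halves with the carry
    return join(hi, lo, k), c
```

For example, `add((1 << 256) - 1, 1, 8)` wraps around to zero with a carry of one, as expected for 256-bit fixed-precision arithmetic.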
Rectangular Full Packed Format for Cholesky's Algorithm: Factorization, Solution and Inversion
We describe a new data format for storing triangular, symmetric, and
Hermitian matrices called RFPF (Rectangular Full Packed Format). The standard
two dimensional arrays of Fortran and C (also known as full format) that are
used to represent triangular and symmetric matrices waste nearly half of the
storage space but provide high performance via the use of Level 3 BLAS.
Standard packed format arrays fully utilize storage (array space) but provide
low performance as there is no Level 3 packed BLAS. We combine the good
features of packed and full storage using RFPF to obtain high performance via
using Level 3 BLAS as RFPF is a standard full format representation. Also, RFPF
requires exactly the same minimal storage as packed format. Each LAPACK full
and/or packed triangular, symmetric, and Hermitian routine becomes a single new
RFPF routine based on eight possible data layouts of RFPF. This new RFPF
routine usually consists of two calls to the corresponding LAPACK full format
routine and two calls to Level 3 BLAS routines. This means no new
software is required. As examples, we present LAPACK routines for Cholesky
factorization, Cholesky solution and Cholesky inverse computation in RFPF to
illustrate this new work and to describe its performance on several commonly
used computer platforms. Performance of LAPACK full routines using RFPF versus
LAPACK full routines using standard format for both serial and SMP parallel
processing is about the same while using half the storage. Compared with
vendor and/or reference packed routines, vendor LAPACK full routines with RFPF
run from about the same speed up to 43 times faster in serial mode and up to
97 times faster in SMP parallel mode
A study of systems implementation languages for the POCCNET system
The results are presented of a study of systems implementation languages for the Payload Operations Control Center Network (POCCNET). Criteria are developed for evaluating the languages, and fifteen existing languages are evaluated on the basis of these criteria
Fast recursive filters for simulating nonlinear dynamic systems
A fast and accurate computational scheme for simulating nonlinear dynamic
systems is presented. The scheme assumes that the system can be represented by
a combination of components of only two different types: first-order low-pass
filters and static nonlinearities. The parameters of these filters and
nonlinearities may depend on system variables, and the topology of the system
may be complex, including feedback. Several examples taken from neuroscience
are given: phototransduction, photopigment bleaching, and spike generation
according to the Hodgkin-Huxley equations. The scheme uses two slightly
different forms of autoregressive filters, with an implicit delay of zero for
feedforward control and an implicit delay of half a sample distance for
feedback control. On a fairly complex model of the macaque retinal horizontal
cell it computes, for a given level of accuracy, 1-2 orders of magnitude faster
than 4th-order Runge-Kutta. The computational scheme has minimal memory
requirements, and is also suited for computation on a stream processor, such as
a GPU (Graphics Processing Unit).
Comment: 20 pages, 8 figures, 1 table. A comparison with 4th-order Runge-Kutta
integration shows that the new algorithm is 1-2 orders of magnitude faster.
The paper is in press now at Neural Computation
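The two building blocks named in this abstract, a first-order low-pass filter and a static nonlinearity, can be illustrated with a minimal feedforward sketch. This uses the textbook exponential discretization of a low-pass filter, which is an assumption and not necessarily one of the two autoregressive forms of the paper; the function names and the choice of `tanh` as nonlinearity are hypothetical.

```python
import math


def lowpass(x, tau, dt, y0=0.0):
    """First-order recursive low-pass filter with time constant tau.

    Uses y[n] = a*y[n-1] + (1-a)*x[n] with a = exp(-dt/tau), a standard
    discretization chosen here for illustration only.
    """
    a = math.exp(-dt / tau)
    y, out = y0, []
    for xn in x:
        y = a * y + (1.0 - a) * xn
        out.append(y)
    return out


def cascade(x, tau, dt, nonlinearity=math.tanh):
    """Feedforward chain: low-pass filter followed by a static nonlinearity."""
    return [nonlinearity(v) for v in lowpass(x, tau, dt)]
```

Each output sample needs only the previous filter state, which is why schemes of this kind have small memory requirements.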
NumGfun: a Package for Numerical and Analytic Computation with D-finite Functions
This article describes the implementation in the software package NumGfun of
classical algorithms that operate on solutions of linear differential equations
or recurrence relations with polynomial coefficients, including what seems to
be the first general implementation of the fast high-precision numerical
evaluation algorithms of Chudnovsky & Chudnovsky. In some cases, our
descriptions contain improvements over existing algorithms. We also provide
references to relevant ideas not currently used in NumGfun
Efficient implementation of the Hardy-Ramanujan-Rademacher formula
We describe how the Hardy-Ramanujan-Rademacher formula can be implemented to
allow the partition function p(n) to be computed with softly optimal
complexity O(n^{1/2+o(1)}) and very little overhead. A new implementation
based on these techniques achieves speedups in excess of a factor 500 over
previously published software and has been used by the author to calculate
p(10^{19}), an exponent twice as large as in previously reported
computations.
We also investigate performance for multi-evaluation of p(n), where our
implementation of the Hardy-Ramanujan-Rademacher formula becomes superior to
power series methods on far denser sets of indices than previous
implementations. As an application, we determine over 22 billion new
congruences for the partition function, extending Weaver's tabulation of 76,065
congruences.
Comment: updated version containing an unconditional complexity proof;
accepted for publication in LMS Journal of Computation and Mathematics
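For checking small values of the partition function, written p(n) here, a simple baseline is Euler's pentagonal number recurrence. This is not the Hardy-Ramanujan-Rademacher formula discussed in the abstract and is far slower for a single large n; it is a power-series style method of the kind the abstract compares against. The function name `partitions_up_to` is hypothetical.

```python
def partitions_up_to(n):
    """Return [p(0), p(1), ..., p(n)] using Euler's pentagonal number recurrence.

    p(m) = sum over k >= 1 of (-1)**(k+1) * (p(m - k(3k-1)/2) + p(m - k(3k+1)/2)),
    where terms with a negative argument are dropped.
    """
    p = [1] + [0] * n
    for m in range(1, n + 1):
        total, k = 0, 1
        while True:
            g1 = k * (3 * k - 1) // 2  # generalized pentagonal number
            if g1 > m:
                break
            sign = 1 if k % 2 else -1
            total += sign * p[m - g1]
            g2 = k * (3 * k + 1) // 2  # its companion
            if g2 <= m:
                total += sign * p[m - g2]
            k += 1
        p[m] = total
    return p
```

A table built this way also makes it easy to spot-check congruences such as Ramanujan's p(5k+4) ≡ 0 (mod 5).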
Affine functions and series with co-inductive real numbers
We extend the work of A. Ciaffaglione and P. Di Gianantonio on mechanical
verification of algorithms for exact computation on real numbers, using
infinite streams of digits implemented as co-inductive types. Four aspects are
studied: the first aspect concerns the proof that digit streams can be related
to the real numbers that are already axiomatized in the proof
system (axiomatized, but with no fixed representation). The second aspect
re-visits the definition of an addition function, looking at techniques to let
the proof search mechanism perform the effective construction of an algorithm
that is correct by construction. The third aspect concerns the definition of a
function to compute affine formulas with positive rational coefficients. This
should be understood as a testbed to describe a technique to combine
co-recursion and recursion to obtain a model for an algorithm that appears at
first sight to be outside the expressive power allowed by the proof system. The
fourth aspect concerns the definition of a function to compute series, with an
application on the series that is used to compute Euler's number e. All these
experiments should be reproducible in any proof system that supports
co-inductive types, co-recursion and general forms of terminating recursion,
but we performed them with the Coq system [12, 3, 14]
Efficient implementation of symplectic implicit Runge-Kutta schemes with simplified Newton iterations
We are concerned with the efficient implementation of symplectic implicit
Runge-Kutta (IRK) methods applied to systems of (not necessarily Hamiltonian)
ordinary differential equations by means of Newton-like iterations. We pay
particular attention to symmetric symplectic IRK schemes (such as collocation
methods with Gaussian nodes). For an s-stage IRK scheme used to integrate a
d-dimensional system of ordinary differential equations, the application of
simplified versions of Newton iterations requires solving at each step several
linear systems (one per iteration) with the same sd x sd real
coefficient matrix. We propose rewriting such sd-dimensional linear systems
as equivalent (s+1)d-dimensional systems that can be solved by performing
the LU decompositions of [s/2]+1 real matrices of size d x d. We
present a C implementation (based on Newton-like iterations) of Runge-Kutta
collocation methods with Gaussian nodes that makes use of such a rewriting of
the linear system and that takes special care in reducing the effect of
round-off errors. We report some numerical experiments that demonstrate the
reduced round-off error propagation of our implementation
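The core idea of a simplified Newton iteration, reusing one frozen coefficient matrix for every iteration within a step, can be shown on the simplest Gauss collocation method, the one-stage implicit midpoint rule. The sketch below is an assumption-laden illustration, not the authors' C implementation: it treats a pendulum (q' = p, p' = -sin q), all names are hypothetical, and the 2 x 2 system is solved by Cramer's rule in place of an LU decomposition. The rewriting of the linear system proposed in the abstract matters for schemes with more stages and is not reproduced here.

```python
import math


def f(y):
    """Pendulum vector field: q' = p, p' = -sin(q)."""
    q, p = y
    return (p, -math.sin(q))


def jac(y):
    """Jacobian of f at y."""
    q, _ = y
    return ((0.0, 1.0), (-math.cos(q), 0.0))


def midpoint_step(y, h, tol=1e-14, max_iter=50):
    """One implicit midpoint step solved by simplified Newton iterations.

    The stage Y satisfies Y = y + (h/2) f(Y). The matrix M = I - (h/2) J(y)
    is built once per step and reused in every iteration.
    """
    (j11, j12), (j21, j22) = jac(y)
    m11, m12 = 1.0 - 0.5 * h * j11, -0.5 * h * j12
    m21, m22 = -0.5 * h * j21, 1.0 - 0.5 * h * j22
    det = m11 * m22 - m12 * m21
    Y = y  # initial guess for the stage value
    for _ in range(max_iter):
        fY = f(Y)
        r0 = Y[0] - y[0] - 0.5 * h * fY[0]  # residual of the stage equation
        r1 = Y[1] - y[1] - 0.5 * h * fY[1]
        d0 = (-r0 * m22 + r1 * m12) / det   # solve M d = -r by Cramer's rule
        d1 = (-m11 * r1 + m21 * r0) / det
        Y = (Y[0] + d0, Y[1] + d1)
        if abs(d0) + abs(d1) < tol:
            break
    fY = f(Y)
    return (y[0] + h * fY[0], y[1] + h * fY[1])


def energy(y):
    """Pendulum Hamiltonian H = p**2/2 - cos(q)."""
    q, p = y
    return 0.5 * p * p - math.cos(q)
```

Because the method is symplectic, repeated calls to `midpoint_step` should keep `energy` close to its initial value over long integrations, provided the iteration is run to convergence.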