
    Finding the "truncated" polynomial that is closest to a function

    When implementing regular enough functions (e.g., elementary or special functions) on a computing system, we frequently use polynomial approximations. In most cases, the polynomial that best approximates (for a given distance and in a given interval) a function has coefficients that are not exactly representable with a finite number of bits. And yet, the polynomial approximations that are actually implemented do have coefficients that are represented with a finite - and sometimes small - number of bits: this is due to the finiteness of the floating-point representations (for software implementations), and to the need to have small, hence fast and/or inexpensive, multipliers (for hardware implementations). We then have to consider polynomial approximations for which the degree-i coefficient has at most m_i fractional bits (in other words, it is a rational number with denominator 2^{m_i}). We provide a general method for finding the best polynomial approximation under this constraint. Then, we suggest refinements that can be used to accelerate our method.
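    A minimal sketch of the setting (not of the paper's algorithm): it builds a near-best degree-4 approximation of exp on [0, 1], then naively rounds the degree-i coefficient to m_i fractional bits and measures the loss of accuracy. The function, the degree and the m_i values are arbitrary choices for illustration; the paper is about finding the best polynomial under this constraint, which can beat such naive rounding.
```python
# Illustrative sketch, not the paper's method: round each coefficient i of a
# good approximation to m_i fractional bits (a multiple of 2^-m_i) and see
# how much the sup-norm error grows.
import numpy as np

f = np.exp
degree = 4
m = [20, 17, 14, 11, 8]   # m_i: allowed fractional bits for the degree-i coefficient (arbitrary example)

# Near-best approximation: interpolation at Chebyshev nodes mapped to [0, 1].
n = degree + 1
k = np.arange(n)
nodes = 0.5 + 0.5 * np.cos((2 * k + 1) * np.pi / (2 * n))
coefs = np.polynomial.polynomial.polyfit(nodes, f(nodes), degree)   # c_0, ..., c_4

# Naive truncation: round c_i to the nearest multiple of 2^-m_i.
rounded = np.array([round(c * 2.0**mi) / 2.0**mi for c, mi in zip(coefs, m)])

xs = np.linspace(0.0, 1.0, 10001)
err_best = np.max(np.abs(f(xs) - np.polynomial.polynomial.polyval(xs, coefs)))
err_naive = np.max(np.abs(f(xs) - np.polynomial.polynomial.polyval(xs, rounded)))
print("sup-norm error, unconstrained coefficients:", err_best)
print("sup-norm error, naively rounded coefficients:", err_naive)
```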

    Correctly rounded multiplication by arbitrary precision constants

    We introduce an algorithm for multiplying a floating-point number x by a constant C that is not exactly representable in floating-point arithmetic. Our algorithm uses a multiplication and a fused multiply-accumulate instruction. We give methods for checking whether, for a given value of C and a given floating-point format, our algorithm returns a correctly rounded result for any x. When it does not, our methods give the values of x for which the multiplication is not correctly rounded.
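    A sketch of such a two-operation scheme, with C = 1/3 as an example constant and random spot-checks against exact rational arithmetic; it assumes Python 3.13's math.fma and does not perform the paper's formal correctness analysis for all x.
```python
# Sketch of a multiplication-plus-FMA scheme (not a proof of correct rounding):
# approximate the constant C by two doubles C_h + C_l, then compute x*C with
# one multiplication and one fused multiply-add. Requires Python 3.13+ (math.fma).
import math, random
from fractions import Fraction

C = Fraction(1, 3)                 # exact constant, not representable in binary FP
C_h = float(C)                     # nearest double to C
C_l = float(C - Fraction(C_h))     # nearest double to the residual C - C_h

def times_C(x):
    # one multiplication and one FMA: round(x*C_h + round(x*C_l))
    return math.fma(x, C_h, x * C_l)

# Spot-check against the correctly rounded product, computed with exact rationals.
random.seed(0)
mismatches = 0
for _ in range(100000):
    x = random.uniform(-1e6, 1e6)
    correctly_rounded = float(Fraction(x) * C)   # exact product, rounded to nearest double
    if times_C(x) != correctly_rounded:
        mismatches += 1
print("mismatches on random sample:", mismatches)
```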

    Chebyshev Interpolation Polynomial-based Tools for Rigorous Computing

    Performing numerical computations, yet being able to provide rigorous mathematical statements about the obtained result, is required in many domains like global optimization, ODE solving, or integration. Taylor models, which associate with a function a pair made of a Taylor approximation polynomial and a rigorous remainder bound, are a widely used rigorous computation tool. This approach benefits from the advantages of numerical methods, but also gives the ability to make reliable statements about the approximated function. Although approximation polynomials based on interpolation at Chebyshev nodes offer a quasi-optimal approximation to a function, together with several other useful features, an analogue of Taylor models based on such polynomials has not yet been well established in the field of validated numerics. This paper presents preliminary work for obtaining such interpolation polynomials together with validated interval bounds for approximating univariate functions. We propose two methods that make this approach practical: one is based on a representation in the Newton basis and the other uses the Chebyshev polynomial basis. We compare the quality of the obtained remainders and the performance of our approaches to those provided by Taylor models.
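    An illustrative, non-validated sketch: it builds the interpolation polynomial at Chebyshev nodes for sin on [0, 1] and pairs it with an empirically sampled remainder estimate. Producing a rigorous (validated) remainder bound is precisely what the paper addresses and is not done here; the function, degree and interval are arbitrary.
```python
# Illustrative only: interpolation polynomial at Chebyshev nodes, paired with
# an *empirically estimated* remainder obtained by dense sampling (not a
# validated enclosure).
import numpy as np

f = np.sin
a, b, degree = 0.0, 1.0, 10
n = degree + 1

k = np.arange(n)
nodes = 0.5 * (a + b) + 0.5 * (b - a) * np.cos((2 * k + 1) * np.pi / (2 * n))  # Chebyshev nodes
coefs = np.polynomial.chebyshev.chebfit(nodes, f(nodes), degree)               # Chebyshev-basis coefficients

xs = np.linspace(a, b, 100001)
remainder_estimate = np.max(np.abs(f(xs) - np.polynomial.chebyshev.chebval(xs, coefs)))
print("degree-10 Chebyshev-node interpolant of sin on [0,1], sampled remainder estimate:",
      remainder_estimate)
```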

    Integer and Floating-Point Constant Multipliers for FPGAs

    Reconfigurable circuits now have a capacity that allows them to be used as floating-point accelerators. They offer massive parallelism, but also the opportunity to design optimised floating-point hardware operators not available in microprocessors. Multiplication by a constant is an important example of such an operator. This article presents an architecture generator for the correctly rounded multiplication of a floating-point number by a constant. This constant can be a floating-point value, but also an arbitrary irrational number. The multiplication of the significands is an instance of the well-studied problem of constant integer multiplication, for which improvements to existing algorithms are also proposed and evaluated.
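    A sketch of the classic shift-and-add baseline for the constant integer multiplication problem mentioned above, using canonical signed digit (CSD) recoding; the constant 231 is an arbitrary example, and the improvements proposed in the paper are not reproduced.
```python
# Textbook baseline for constant integer multiplication: recode the constant
# in canonical signed digit (CSD) form so that x*C becomes a short list of
# shifts combined by additions/subtractions.

def csd_digits(c):
    """Return (digit, shift) pairs with digit in {+1, -1} such that
    c == sum(digit << shift), with no two adjacent nonzero digits."""
    digits, shift = [], 0
    while c != 0:
        if c & 1:
            d = 2 - (c & 3)          # +1 if c = 1 (mod 4), -1 if c = 3 (mod 4)
            digits.append((d, shift))
            c -= d
        c >>= 1
        shift += 1
    return digits

def multiply_by_constant(x, digits):
    # each term is a shifted copy of x, combined with additions/subtractions only
    return sum(d * (x << s) for d, s in digits)

C = 231                              # example constant: 231 = 256 - 32 + 8 - 1
digits = csd_digits(C)
print("CSD of", C, ":", digits, "->", len(digits) - 1, "adders")
assert all(multiply_by_constant(x, digits) == C * x for x in range(1, 1000))
print("shift-and-add network verified for x in 1..999")
```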

    (M,p,k)-friendly points: a table-based method for trigonometric function evaluation

    We present a new way of approximating the sine and cosine functions by a few table look-ups and additions. It consists in first reducing the input range to a very small interval by using rotations with "(M, p, k)-friendly angles", proposed in this work, and then using a bipartite table method on that small interval. An implementation of the method for the 24-bit case is described and compared with CORDIC. Roughly, the proposed scheme offers a speedup of 2 compared with an unfolded double-rotation radix-2 CORDIC.
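    A toy sketch of a plain bipartite table approximation of sine on a small interval, with an arbitrary 4+4+4-bit input split; it does not implement the (M, p, k)-friendly range reduction itself, and the parameters are chosen only for illustration.
```python
# Toy bipartite table method: sin(x0 + x1 + x2) is approximated by
# A(x0, x1) + B(x0, x2), where x0, x1, x2 are the high, middle and low bit
# fields of the input, and B carries a first-order correction in x2.
import math

n0 = n1 = n2 = 4
scale = 2 ** (n0 + n1 + n2 + 4)            # x = X / scale, so x in [0, 2**-4)
w1, w2 = 2 ** (n1 + n2), 2 ** n2           # weights of the x0 and x1 fields within X
d1 = w2 * (2 ** n1 - 1) / 2 / scale        # midpoint offset of the x1 field
d2 = (2 ** n2 - 1) / 2 / scale             # midpoint offset of the x2 field
f, fprime = math.sin, math.cos

A = {(i0, i1): f((i0 * w1 + i1 * w2) / scale + d2)
     for i0 in range(2 ** n0) for i1 in range(2 ** n1)}
B = {(i0, i2): (i2 / scale - d2) * fprime(i0 * w1 / scale + d1 + d2)
     for i0 in range(2 ** n0) for i2 in range(2 ** n2)}

max_err = 0.0
for X in range(2 ** (n0 + n1 + n2)):       # exhaustive check of all 12-bit inputs
    i0 = X >> (n1 + n2)
    i1 = (X >> n2) & (2 ** n1 - 1)
    i2 = X & (2 ** n2 - 1)
    max_err = max(max_err, abs(A[(i0, i1)] + B[(i0, i2)] - f(X / scale)))
print("max absolute error over all inputs:", max_err)
```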

    A path-norm toolkit for modern networks: consequences, promises and challenges

    This work introduces the first toolkit around path-norms that fully encompasses general DAG ReLU networks with biases, skip connections and any operation based on the extraction of order statistics: max pooling, GroupSort, etc. This toolkit notably allows us to establish generalization bounds for modern neural networks that are not only the most widely applicable path-norm based ones, but also recover or beat the sharpest known bounds of this type. These extended path-norms further enjoy the usual benefits of path-norms: ease of computation, invariance under the symmetries of the network, and improved sharpness on layered fully-connected networks compared to the product of operator norms, another commonly used complexity measure. The versatility of the toolkit and its ease of implementation allow us to challenge the concrete promises of path-norm-based generalization bounds, by numerically evaluating the sharpest known bounds for ResNets on ImageNet.
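    A minimal sketch for the simplest case only: the L1 path-norm of a layered, bias-free, fully-connected ReLU network, computed as a product of entrywise-absolute weight matrices and compared with the product of operator norms. The general DAG toolkit of the paper is not reproduced, and the architecture below is arbitrary.
```python
# For a layered, bias-free, fully-connected ReLU network, the L1 path-norm is
# the sum over all input->output paths of |product of weights along the path|,
# i.e. 1^T |W_L| ... |W_1| 1.
import numpy as np

def l1_path_norm(weights):
    """weights: list [W1, ..., WL], layer l mapping R^{n_{l-1}} -> R^{n_l}."""
    v = np.ones(weights[0].shape[1])      # one unit of "path mass" per input coordinate
    for W in weights:
        v = np.abs(W) @ v                 # accumulate |W_L| ... |W_1| 1
    return float(np.sum(v))

rng = np.random.default_rng(0)
weights = [rng.standard_normal((16, 8)),
           rng.standard_normal((16, 16)),
           rng.standard_normal((1, 16))]
print("L1 path-norm:", l1_path_norm(weights))
print("product of operator (spectral) norms:",
      float(np.prod([np.linalg.norm(W, 2) for W in weights])))
```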

    Comparison between binary and decimal floating-point numbers

    We introduce an algorithm to compare a binary floating-point (FP) number and a decimal FP number, assuming the "binary encoding" of the decimal formats is used, and with a special emphasis on the basic interchange formats specified by the IEEE 754-2008 standard for FP arithmetic. It is a two-step algorithm: a first pass, based on the exponents only, quickly eliminates most cases; then, when the first pass does not suffice, a more accurate second pass is performed. We provide an implementation of several variants of our algorithm, and compare them.
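    A sketch of the two-step structure only (not the paper's algorithm or its handling of the IEEE 754-2008 encodings): a cheap first pass uses signs and rough magnitudes, and an exact second pass with rational arithmetic runs only when the first pass cannot decide. The decimal operand is taken here as a pair of integers (M, e) meaning M x 10^e.
```python
# Two-step comparison sketch: binary FP number x (a Python float) versus the
# decimal value M * 10**e.
import math
from fractions import Fraction

LOG2_10 = math.log2(10)

def compare(x, M, e):
    """Return -1, 0 or 1 according to whether x <, ==, > M * 10**e (x a finite float)."""
    x_sign = (x > 0) - (x < 0)
    y_sign = (M > 0) - (M < 0)
    if x_sign != y_sign:
        return 1 if x_sign > y_sign else -1
    if x_sign == 0:
        return 0
    # First pass: compare rough base-2 magnitudes, with a one-binade safety
    # margin to absorb the crude floor/ceil bounds on log2|M * 10**e|.
    ex = math.frexp(x)[1]                                  # 2**(ex-1) <= |x| < 2**ex
    lo = math.floor(M.bit_length() - 1 + e * LOG2_10)      # lower bound on log2|M * 10**e|
    hi = math.ceil(M.bit_length() + e * LOG2_10)           # upper bound on log2|M * 10**e|
    if ex < lo - 1:
        return -x_sign                                     # |x| is certainly smaller
    if ex > hi + 1:
        return x_sign                                      # |x| is certainly larger
    # Second pass: exact comparison with rationals.
    diff = Fraction(x) - Fraction(M) * Fraction(10) ** e
    return (diff > 0) - (diff < 0)

print(compare(1.0, 1, 0))     # 0: equal
print(compare(0.1, 1, -1))    # 1: the binary double 0.1 is slightly above 1/10
```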

    Approximation speed of quantized vs. unquantized ReLU neural networks and beyond

    We deal with two complementary questions about approximation properties of ReLU networks. First, we study how the uniform quantization of ReLU networks with real-valued weights impacts their approximation properties. We establish an upper bound on the minimal number of bits per coordinate needed for uniformly quantized ReLU networks to keep the same polynomial asymptotic approximation speeds as unquantized ones. We also characterize the error of nearest-neighbour uniform quantization of ReLU networks. This is achieved using a new lower bound on the Lipschitz constant of the map that associates the parameters of ReLU networks to their realization, and an upper bound generalizing classical results. Second, we investigate when ReLU networks can be expected, or not, to have better approximation properties than other classical approximation families. Indeed, several approximation families share the following common limitation: their polynomial asymptotic approximation speed of any set is bounded from above by the encoding speed of this set. We introduce a new abstract property of approximation families, called infinite-encodability, which implies this upper bound. Many classical approximation families, defined with dictionaries or ReLU networks, are shown to be infinite-encodable. This unifies and generalizes several situations where this upper bound is known.
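    An empirical illustration of nearest-neighbour uniform quantization of a small ReLU network's weights and of its effect on the realized function; the architecture, quantization steps and inputs are arbitrary, and none of the paper's bounds are computed here.
```python
# Nearest-neighbour uniform quantization (grid step 2^-k) of the parameters of
# a small ReLU network, and the resulting deviation of the realized function
# on sample inputs.
import numpy as np

def relu_net(params, x):
    # params: list of (W, b); x: batch of inputs, shape (batch, d_in)
    for W, b in params[:-1]:
        x = np.maximum(x @ W.T + b, 0.0)
    W, b = params[-1]
    return x @ W.T + b

def quantize(params, k):
    # round every weight and bias to the nearest multiple of 2^-k
    step = 2.0 ** -k
    return [(np.round(W / step) * step, np.round(b / step) * step) for W, b in params]

rng = np.random.default_rng(0)
dims = [4, 32, 32, 1]
params = [(rng.standard_normal((dims[i + 1], dims[i])), rng.standard_normal(dims[i + 1]))
          for i in range(len(dims) - 1)]

x = rng.uniform(-1, 1, size=(1000, dims[0]))
for k in (2, 4, 8, 16):
    err = np.max(np.abs(relu_net(params, x) - relu_net(quantize(params, k), x)))
    print(f"{k:2d} fractional bits -> max output deviation {err:.3e}")
```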