Division and square root for mobile and scientific computing markets

Author: Holimath Vijaykumar
Publication venue: Universidade de Santiago de Compostela. Servizo de Publicacións e Intercambio Científico
Publication date: 01/01/2007

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Institucional da Universidade de Santiago de Compostela

Recent advances in approximation concepts for optimum structural design

Authors: Barthelemy Jean-Francois M.; Haftka Raphael T.

The basic approximation concepts used in structural optimization are reviewed. Some of the most recent developments in that area since the introduction of the concept in the mid-seventies are discussed. The paper distinguishes between local, medium-range, and global approximations; it covers function approximations and problem approximations. It shows that, although the lack of comparative data established on reference test cases prevents an accurate assessment, there have been significant improvements. The largest number of developments has been in the areas of local function approximations and the use of intermediate variable and response quantities. It also appears that some new methodologies are emerging which could greatly benefit from the introduction of new computer architectures.

NASA Technical Reports Server

The complexity of class polynomial computation via floating point approximations

Author: Enge Andreas
Publication date: 25/07/2008

We analyse the complexity of computing class polynomials, which are an important ingredient for CM constructions of elliptic curves, via complex floating point approximations of their roots. The heart of the algorithm is the evaluation of modular functions in several arguments. The fastest of the presented approaches uses a technique devised by Dupont to evaluate modular functions by Newton iterations on an expression involving the arithmetic-geometric mean. It runs in time $O(|D| \log^5 |D| \log \log |D|) = O(|D|^{1 + \epsilon}) = O(h^{2 + \epsilon})$ for any $\epsilon > 0$, where $D$ is the CM discriminant and $h$ is the degree of the class polynomial. Another fast algorithm uses multipoint evaluation techniques known from symbolic computation; its asymptotic complexity is worse by a factor of $\log |D|$. Up to logarithmic factors, this running time matches the size of the constructed polynomials. The estimate also relies on a new result concerning the complexity of enumerating the class group of an imaginary-quadratic order and on a rigorously proven upper bound for the height of class polynomials.
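As a rough illustration of the arithmetic-geometric mean mentioned in this abstract, the following Python sketch iterates the real-valued AGM in double precision. It is not Dupont's technique and does not evaluate modular functions; the function name `agm` and its `tol` and `max_iter` parameters are assumptions made for this example only.

```python
import math


def agm(a, b, tol=1e-15, max_iter=64):
    """Arithmetic-geometric mean of two positive reals (illustrative sketch)."""
    if a <= 0 or b <= 0:
        raise ValueError("agm expects positive arguments")
    for _ in range(max_iter):
        # Stop once the two sequences agree to within the relative tolerance.
        if abs(a - b) <= tol * max(a, b):
            break
        # One AGM step: arithmetic mean and geometric mean of the current pair.
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return (a + b) / 2.0


if __name__ == "__main__":
    print(agm(1.0, math.sqrt(2.0)))
```

The iteration converges quadratically, so a handful of steps suffices at double precision; an arbitrary-precision setting such as the one in the abstract would need a multiprecision complex type instead of floats.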

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Oskar Bordeaux

HAL-Polytechnique

Algorithms and architectures for decimal transcendental function computation

Author: Chen Dongdong
Publication venue: University of Saskatchewan Library
Publication date: 01/01/2011

There is growing commercial demand for decimal floating-point (DFP) arithmetic in applications such as financial analysis, tax calculation, currency conversion, Internet-based applications, and e-commerce. This trend drives further development of DFP arithmetic units that can perform accurate computations with exact decimal operands. Owing to the significance of DFP arithmetic, the IEEE 754-2008 standard for floating-point arithmetic includes it in its specifications. The basic decimal arithmetic unit, such as a decimal adder, subtracter, multiplier, divider, or square-root unit, is a main part of a decimal microprocessor and is attracting growing attention from researchers. Recently, decimal-encoded formats and DFP arithmetic units have been implemented in IBM's System z900, POWER6, and z10 microprocessors. Increasing chip densities and transistor counts give designers more room to add essential application-domain functions to upcoming microprocessors. Decimal transcendental functions, such as the DFP logarithm, antilogarithm, exponential, reciprocal, and trigonometric functions, are useful arithmetic operations in many areas of science and engineering and have been specified as recommended arithmetic in the IEEE 754-2008 standard. Thus, virtually all computing systems that are compliant with the IEEE 754-2008 standard could include a DFP mathematical library providing transcendental function computation. Building on the development of basic decimal arithmetic units, more complex DFP transcendental arithmetic will be the next building block in microprocessors. In this dissertation, we researched and developed several new decimal algorithms and architectures for DFP transcendental function computation.
These designs are based on several different methods: 1) decimal transcendental function computation based on the table-based first-order polynomial approximation method; 2) DFP logarithmic and antilogarithmic converters based on the decimal digit-recurrence algorithm with selection by rounding; 3) a decimal reciprocal unit using efficient table look-up based on Newton-Raphson iterations; and 4) a first radix-100 division unit based on the non-restoring algorithm with a pre-scaling method. Most of the decimal algorithms and architectures for DFP transcendental function computation developed in this dissertation are first attempts to analyze and implement DFP transcendental arithmetic so as to achieve faithful results for DFP operands, as specified in IEEE 754-2008. To help researchers evaluate the hardware performance of DFP transcendental arithmetic units, the proposed architectures based on the different methods are modeled, verified, and synthesized using FPGAs or CMOS standard-cell libraries in ASIC. Some of the implementation results are compared with those of binary radix-16 logarithmic and exponential converters, a recently developed high-performance decimal CORDIC-based architecture, and Intel's DFP transcendental function computation software library. The comparison results show that the proposed architectures achieve significant speed-ups over the above designs in terms of latency. The algorithms and architectures developed in this dissertation provide a useful starting point for future hardware-oriented research on DFP transcendental function computation.
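The idea behind method 3), a table look-up seed refined by Newton-Raphson iterations, can be sketched in software with Python's `decimal` module. This is only an illustration under stated assumptions, not the dissertation's hardware design: the `SEED` table layout, the function name `reciprocal`, the five guard digits, and the requirement that the input be normalized to [1, 10) are all hypothetical choices made for this example.

```python
from decimal import Decimal, localcontext

# Hypothetical seed table: a 3-decimal-place estimate of 1/x for each
# leading two-digit prefix 1.0, 1.1, ..., 9.9 (keys 10..99).
SEED = {
    i: (Decimal(10) / (Decimal(i) + Decimal("0.5"))).quantize(Decimal("0.001"))
    for i in range(10, 100)
}


def reciprocal(x, digits=34):
    """Approximate 1/x for 1 <= x < 10 to about `digits` significant digits."""
    x = Decimal(x)
    if not (Decimal(1) <= x < Decimal(10)):
        raise ValueError("x must be normalized to the interval [1, 10)")
    with localcontext() as ctx:
        ctx.prec = digits + 5              # guard digits (an assumption)
        y = SEED[min(int(x * 10), 99)]     # table look-up on the two leading digits
        correct = 1                        # the seed is good to about one digit
        while correct < digits:
            y = y * (2 - x * y)            # Newton-Raphson step for f(y) = 1/y - x
            correct *= 2                   # each step roughly doubles the correct digits
        ctx.prec = digits
        return +y                          # unary plus rounds to the target precision


if __name__ == "__main__":
    print(reciprocal(Decimal("3")))
```

The default of 34 digits matches the precision of the IEEE 754-2008 decimal128 format. The sketch makes no claim of faithful or correct rounding; it only shows how a coarse table seed reduces the number of quadratic refinement steps.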

eCommons@USASK

University of Saskatchewan Research Archive

On digit-recurrence division algorithms for self-timed circuits

Authors: Boullis Nicolas; Tisserand Arnaud
Publication venue: HAL CCSD
Publication date: 01/01/2001

The optimization of algorithms for self-timed or asynchronous circuits requires specific solutions. Because asynchronous circuits can complete operations in variable time, the average computation time should be optimized, and not only the worst case of the signal propagation. While efficient algorithms and implementations are known for asynchronous addition and multiplication, only straightforward algorithms have been studied for division. This paper compares several digit-recurrence division algorithms in terms of speed, area, and circuit activity (used to estimate power consumption). The comparison is based on simulations of the different operators described at the gate level. This work shows that the best solutions for asynchronous circuits are quite different from those used in synchronous circuits.
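For readers unfamiliar with digit-recurrence division, the following Python sketch shows the textbook radix-2 restoring recurrence, which produces one quotient bit per step. The abstract does not say which algorithms the paper compares, so this is a generic illustration rather than one of the paper's operators; the function name `restoring_divide` and the `n_bits` parameter are assumptions for this example.

```python
def restoring_divide(dividend, divisor, n_bits=32):
    """Radix-2 restoring division on unsigned integers (illustrative sketch).

    Returns (quotient, remainder), producing one quotient bit per iteration.
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    if not 0 <= dividend < (1 << n_bits):
        raise ValueError("dividend must fit in n_bits unsigned bits")
    remainder = 0
    quotient = 0
    for i in range(n_bits - 1, -1, -1):
        # Shift the partial remainder left and bring in the next dividend bit.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        trial = remainder - divisor
        if trial >= 0:
            # Subtraction succeeded: keep it and emit quotient bit 1.
            remainder = trial
            quotient = (quotient << 1) | 1
        else:
            # Subtraction would go negative: keep ("restore") the old
            # remainder and emit quotient bit 0.
            quotient <<= 1
    return quotient, remainder


if __name__ == "__main__":
    print(restoring_divide(100, 7))  # (14, 2)
```

In this software model every iteration costs the same. A self-timed circuit would instead see data-dependent delays in each step, which is why the abstract argues for optimizing the average computation time as well as the worst case.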

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

Survey of Floating-Point Software Arithmetics and Basic Library Mathematical Functions

Author: Lee Keng Ho
Publication venue: ProQuest Dissertations & Theses
Publication date: 01/01/1973

Abstract Not Provided

Glasgow Theses Service

Hybrid preconditioning for iterative diagonalization of ill-conditioned generalized eigenvalue problems in electronic structure calculations

Authors: Bai Zhaojun; Cai Yunfeng; Pask John E.; Sukumar N.
Publication venue: Elsevier BV
Publication date: 01/01/2013

The iterative diagonalization of a sequence of large ill-conditioned generalized eigenvalue problems is a computational bottleneck in quantum mechanical methods employing a nonorthogonal basis for ab initio electronic structure calculations. We propose a hybrid preconditioning scheme to effectively combine global and locally accelerated preconditioners for rapid iterative diagonalization of such eigenvalue problems. In partition-of-unity finite-element (PUFE) pseudopotential density-functional calculations employing a nonorthogonal basis, we show that the hybrid preconditioned block steepest descent method is a cost-effective eigensolver. It outperforms current state-of-the-art global preconditioning schemes, and it is as efficient for the ill-conditioned generalized eigenvalue problems produced by PUFE as the locally optimal block preconditioned conjugate-gradient method is for the well-conditioned standard eigenvalue problems produced by planewave methods.

arXiv.org e-Print Archive