6,496 research outputs found

    Discontinuities in recurrent neural networks

    Get PDF
    This paper studies the computational power of various discontinuous real computational models that are based on the classical analog recurrent neural network (ARNN). This ARNN consists of finite number of neurons; each neuron computes a polynomial net-function and a sigmoid-like continuous activation-function. The authors introducePostprint (published version

    An O(N squared) method for computing the eigensystem of N by N symmetric tridiagonal matrices by the divide and conquer approach

    Get PDF
    An efficient method is proposed to solve the eigenproblem of N by N Symmetric Tridiagonal (ST) matrices. Unlike the standard eigensolvers which necessitate O(N cubed) operations to compute the eigenvectors of such ST matrices, the proposed method computes both the eigenvalues and eigenvectors with only O(N squared) operations. The method is based on serial implementation of the recently introduced Divide and Conquer (DC) algorithm. It exploits the fact that by O(N squared) of DC operations, one can compute the eigenvalues of N by N ST matrix and a finite number of pairs of successive rows of its eigenvector matrix. The rest of the eigenvectors--all of them or one at a time--are computed by linear three-term recurrence relations. Numerical examples are presented which demonstrate the superiority of the proposed method by saving an order of magnitude in execution time at the expense of sacrificing a few orders of accuracy

    Object-oriented domain specific compilers for programming FPGAs

    No full text
    Published versio

    Periodic orbits of the ensemble of Sinai-Arnold cat maps and pseudorandom number generation

    Full text link
    We propose methods for constructing high-quality pseudorandom number generators (RNGs) based on an ensemble of hyperbolic automorphisms of the unit two-dimensional torus (Sinai-Arnold map or cat map) while keeping a part of the information hidden. The single cat map provides the random properties expected from a good RNG and is hence an appropriate building block for an RNG, although unnecessary correlations are always present in practice. We show that introducing hidden variables and introducing rotation in the RNG output, accompanied with the proper initialization, dramatically suppress these correlations. We analyze the mechanisms of the single-cat-map correlations analytically and show how to diminish them. We generalize the Percival-Vivaldi theory in the case of the ensemble of maps, find the period of the proposed RNG analytically, and also analyze its properties. We present efficient practical realizations for the RNGs and check our predictions numerically. We also test our RNGs using the known stringent batteries of statistical tests and find that the statistical properties of our best generators are not worse than those of other best modern generators.Comment: 18 pages, 3 figures, 9 table

    Parametrizable Architecture for Function Recursive Evaluation

    Get PDF
    Paper submitted to the XVIII Conference on Design of Circuits and Integrated Systems (DCIS), Ciudad Real, España, 2003.This paper presents a function evaluation method developed under the scope of recursive expression of function convolution. This approach is based on a unique parametrizable formula capable of providing function points by successive iteration. When tackling design level, it also shows suitable for developing architectural schemes capable of dealing with different speed and precision issues. An architecture for reconfigurable FPGA based in serial distributed arithmetic implements the design for fast prototyping. The case of combined trigonometric functions involved in rotation is analyzed under this scope. Compared with others methods, our proposal offers a good balance between speed and precision

    Design and implementation of digital wave filter adaptors

    Get PDF

    Processor evaluation for low power frequency converter product family

    Get PDF
    TÀssÀ työssÀ tutkitaan markkinoilla olevia tai lÀhitulevaisuudessa markkinoille saapuvia prosessoreja kÀytettÀvÀksi pienitehoisissa taajuusmuuttajissa. Tutkimuksen tarkoitus on selvittÀÀ prosessorin sopivuutta sovellukseen, jossa hinta on merkittÀvÀ tekijÀ. Tutkimuksessa esitettyjen vaatimusten perusteella houkuttelevimmat prosessorit otetaan tarkempaan tutkimukseen. Tarkemman selvityksen jÀlkeen vaatimuksia teknisesti mahdollisimman tarkasti vastaavat prosessorit pyydettiin valmistajalta testattavaksi. Testaaminen suoritettiin lopulta viidelle eri prosessorille, joista kaksi perustui samaan ytimeen. Testaamisen tavoitteena on selvittÀÀ prosessorin sopivuus kÀyttökohteeseensa. Sopivuus testattiin suorittamalla prosessoreissa taajuusmuuttajakÀyttöÀ mallintavaa testikoodia. Tuloksina testikoodin ajamisesta saatiin tietyissÀ aliohjelmissa kulutettu aika sekÀ kulutetut kellosyklit. Suorituskyvyn lisÀksi testaukseen kuului prosessorikohtaisen kÀÀntÀjÀn aikaansaaman koodin koko. Aliohjelmat sisÀlsivÀt sekÀ aritmeettisia, ettÀ loogisia operaatioita, joiden kombinaationa mahdollisimman hyvÀ sopivuus saatiin selvitettyÀ.The aim of this thesis is to study processors to be used in a low power frequency converter. Processors under investigation must be currently or in the near future in the market. The purpose is to examine suitability of a processor to an application in which price is an essential factor. The requirements presented in this study will determine which processor will be reviewed more closely. After a precise review, processor vendors was asked to provide as corresponding device as possible to a test. Testing was accomplished eventually with five different processors of which two were based on a same core. The aim of the testing was to investigate suitability of the processors to their target task. Suitability was tested by executing code that models frequency converter application. As a result, spent time and clock cycles are presented in certain functions. In addition to performance, the testing included evaluation of the size of the output code the compilers created. Functions under test consisted of a combination of arithmetic and logic operations that was used to interpret the suitability of the processor

    Implementation and Evaluation of Algorithmic Skeletons: Parallelisation of Computer Algebra Algorithms

    Get PDF
    This thesis presents design and implementation approaches for the parallel algorithms of computer algebra. We use algorithmic skeletons and also further approaches, like data parallel arithmetic and actors. We have implemented skeletons for divide and conquer algorithms and some special parallel loops, that we call ‘repeated computation with a possibility of premature termination’. We introduce in this thesis a rational data parallel arithmetic. We focus on parallel symbolic computation algorithms, for these algorithms our arithmetic provides a generic parallelisation approach. The implementation is carried out in Eden, a parallel functional programming language based on Haskell. This choice enables us to encode both the skeletons and the programs in the same language. Moreover, it allows us to refrain from using two different languages—one for the implementation and one for the interface—for our implementation of computer algebra algorithms. Further, this thesis presents methods for evaluation and estimation of parallel execution times. We partition the parallel execution time into two components. One of them accounts for the quality of the parallelisation, we call it the ‘parallel penalty’. The other is the sequential execution time. For the estimation, we predict both components separately, using statistical methods. This enables very confident estimations, although using drastically less measurement points than other methods. We have applied both our evaluation and estimation approaches to the parallel programs presented in this thesis. We haven also used existing estimation methods. We developed divide and conquer skeletons for the implementation of fast parallel multiplication. We have implemented the Karatsuba algorithm, Strassen’s matrix multiplication algorithm and the fast Fourier transform. The latter was used to implement polynomial convolution that leads to a further fast multiplication algorithm. Specially for our implementation of Strassen algorithm we have designed and implemented a divide and conquer skeleton basing on actors. We have implemented the parallel fast Fourier transform, and not only did we use new divide and conquer skeletons, but also developed a map-and-transpose skeleton. It enables good parallelisation of the Fourier transform. The parallelisation of Karatsuba multiplication shows a very good performance. We have analysed the parallel penalty of our programs and compared it to the serial fraction—an approach, known from literature. We also performed execution time estimations of our divide and conquer programs. This thesis presents a parallel map+reduce skeleton scheme. It allows us to combine the usual parallel map skeletons, like parMap, farm, workpool, with a premature termination property. We use this to implement the so-called ‘parallel repeated computation’, a special form of a speculative parallel loop. We have implemented two probabilistic primality tests: the Rabin–Miller test and the Jacobi sum test. We parallelised both with our approach. We analysed the task distribution and stated the fitting configurations of the Jacobi sum test. We have shown formally that the Jacobi sum test can be implemented in parallel. Subsequently, we parallelised it, analysed the load balancing issues, and produced an optimisation. The latter enabled a good implementation, as verified using the parallel penalty. We have also estimated the performance of the tests for further input sizes and numbers of processing elements. Parallelisation of the Jacobi sum test and our generic parallelisation scheme for the repeated computation is our original contribution. The data parallel arithmetic was defined not only for integers, which is already known, but also for rationals. We handled the common factors of the numerator or denominator of the fraction with the modulus in a novel manner. This is required to obtain a true multiple-residue arithmetic, a novel result of our research. Using these mathematical advances, we have parallelised the determinant computation using the Gauß elimination. As always, we have performed task distribution analysis and estimation of the parallel execution time of our implementation. A similar computation in Maple emphasised the potential of our approach. Data parallel arithmetic enables parallelisation of entire classes of computer algebra algorithms. Summarising, this thesis presents and thoroughly evaluates new and existing design decisions for high-level parallelisations of computer algebra algorithms
    • 

    corecore