1,437 research outputs found

A bibliography on parallel and vector numerical algorithms

Author: Ortega J. M.
Voigt R. G.

This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming languages, and other topics of interest to scientific computing. Certain conference proceedings and anthologies that have been published in book form are also listed.

NASA Technical Reports Server

A survey of the state of the art and focused research in range systems, task 2

Author: Yao K.

Contract-generated publications that describe the research activities for the reporting period are compiled. Study topics include: equivalent configurations of systolic arrays; least squares estimation algorithms with systolic array architectures; modeling and equalization of nonlinear bandlimited satellite channels; and least squares estimation and Kalman filtering by systolic arrays.

NASA Technical Reports Server

Solution of partial differential equations on vector and parallel computers

Author: Ortega J. M.
Voigt R. G.

The present status of numerical methods for partial differential equations on vector and parallel computers is reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations, as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed.

NASA Technical Reports Server

Tensor Computation: A New Framework for High-Dimensional Problems in EDA

Author: Batselier Kim
Daniel Luca
Liu Haotian
Wong Ngai
Zhang Zheng
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/10/2016

Many critical EDA problems suffer from the curse of dimensionality, i.e. the very fast-scaling computational burden produced by a large number of parameters and/or unknown variables. This phenomenon may be caused by multiple spatial or temporal factors (e.g. 3-D field solver discretizations and multi-rate circuit simulation), nonlinearity of devices and circuits, a large number of design or optimization parameters (e.g. full-chip routing/placement and circuit sizing), or extensive process variations (e.g. variability/reliability analysis and design for manufacturability). The computational challenges generated by such high-dimensional problems are generally hard to handle efficiently with traditional EDA core algorithms that are based on matrix and vector computation. This paper presents "tensor computation" as an alternative general framework for the development of efficient EDA algorithms and tools. A tensor is a high-dimensional generalization of a matrix and a vector, and is a natural choice for both storing and efficiently solving high-dimensional EDA problems. This paper gives a basic tutorial on tensors, demonstrates some recent examples of EDA applications (e.g., nonlinear circuit modeling and high-dimensional uncertainty quantification), and suggests further open EDA problems where the use of tensor computation could be of advantage.

Comment: 14 figures. Accepted by IEEE Trans. CAD of Integrated Circuits and Systems

arXiv.org e-Print Archive

DSpace@MIT

HKU Scholars Hub

Recommended from our members

Harmonic scheduling of linear recurrences in digital filter design

Author: Dutt Nikil
Nicolau Alexandru
Wang Haigeng
Publication venue: eScholarship, University of California
Publication date: 14/02/1992

Linear difference equations involving recurrences are fundamental equations that describe many important signal processing applications. For many high sample rate digital filter applications, we need to effectively parallelize the linear difference equations used to describe digital filters - a difficult task due to the recurrences inherent in the data dependences. We present a novel approach, Harmonic Scheduling, that exploits parallelism in these recurrences beyond loop-carried dependencies, and which generates optimal schedules for parallel evaluation of linear difference equations with resource constraints. This approach also enables us to derive a parallel schedule with minimum control overhead, given an execution time with resource constraints. We also present a Harmonic Scheduling algorithm that generates optimal schedules for digital filters described by second-order difference equations with resource constraints
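To make the data dependence concrete, the following is a minimal sketch of a second-order linear difference equation evaluated sample by sample. It is not the Harmonic Scheduling algorithm itself, which the abstract does not spell out; the function name `second_order_filter` and the coefficient layout `(b0, b1, b2)` and `(a1, a2)` are assumptions made for illustration.

```python
def second_order_filter(x, b, a):
    """Evaluate y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2].

    Hypothetical helper for illustration only. Samples before the start
    of the input are taken to be zero.
    """
    b0, b1, b2 = b
    a1, a2 = a
    x1 = x2 = 0.0  # x[n-1], x[n-2]
    y1 = y2 = 0.0  # y[n-1], y[n-2]
    y = []
    for xn in x:
        # y[n] needs y[n-1] and y[n-2]: this is the recurrence
        # (loop-carried dependence) that blocks naive parallelization.
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        y.append(yn)
        x2, x1 = x1, xn
        y2, y1 = y1, yn
    return y


# Impulse response of y[n] = x[n] + y[n-1] + y[n-2]
impulse_response = second_order_filter([1, 0, 0, 0, 0], (1, 0, 0), (-1, -1))
```

The feed-forward terms in `x` can be computed for all `n` independently; only the two feedback terms force the sequential order, which is the part a parallel schedule has to work around.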

eScholarship - University of California

Fast recursive filters for simulating nonlinear dynamic systems

Author: van Hateren J. H.
Publication date: 30/08/2007

A fast and accurate computational scheme for simulating nonlinear dynamic systems is presented. The scheme assumes that the system can be represented by a combination of components of only two different types: first-order low-pass filters and static nonlinearities. The parameters of these filters and nonlinearities may depend on system variables, and the topology of the system may be complex, including feedback. Several examples taken from neuroscience are given: phototransduction, photopigment bleaching, and spike generation according to the Hodgkin-Huxley equations. The scheme uses two slightly different forms of autoregressive filters, with an implicit delay of zero for feedforward control and an implicit delay of half a sample distance for feedback control. On a fairly complex model of the macaque retinal horizontal cell it computes, for a given level of accuracy, 1-2 orders of magnitude faster than 4th-order Runge-Kutta. The computational scheme has minimal memory requirements, and is also suited for computation on a stream processor, such as a GPU (Graphical Processing Unit).

Comment: 20 pages, 8 figures, 1 table. A comparison with 4th-order Runge-Kutta integration shows that the new algorithm is 1-2 orders of magnitude faster. The paper is in press now at Neural Computation
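As a rough illustration of the kind of building block the abstract refers to, the sketch below implements a first-order low-pass filter as a one-term autoregressive update. It uses the textbook discretization that holds the input constant over each sample, which is an assumption here and not necessarily either of the two filter forms used in the paper; the name `lowpass_first_order` and its parameters are hypothetical.

```python
import math


def lowpass_first_order(x, dt, tau, y0=0.0):
    """Recursive first-order low-pass filter for tau * dy/dt = x - y.

    Assumes the input is constant over each sample interval dt, which
    gives the update y[n] = a*y[n-1] + (1 - a)*x[n] with a = exp(-dt/tau).
    Illustrative only; not the coefficients derived in the paper.
    """
    a = math.exp(-dt / tau)
    y = y0
    out = []
    for xn in x:
        # One multiply-add per sample and a single stored state value.
        y = a * y + (1.0 - a) * xn
        out.append(y)
    return out


# Unit-step response with dt = 1 ms and tau = 10 ms
step_response = lowpass_first_order([1.0] * 50, dt=0.001, tau=0.01)
```

Because each output depends only on the previous output and the current input, the filter stores one number of state per component, which is consistent with the minimal memory requirements the abstract mentions.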

arXiv.org e-Print Archive

CiteSeerX

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

University of Groningen Digital Archive

Dissertations of the University of Groningen

Efficient Parallel Algorithms and VLSI Architectures for Manipulator Jacobian Computation

Author: Lee C. S. G.
Yeung T. B.
Publication venue: Purdue University (bepress)
Publication date: 01/05/1987

Real-time computation of the manipulator Jacobian is examined for execution on uniprocessor computers, parallel computers, and VLSI pipelines. The Jacobian equations are found to have the form of a first-order linear recurrence. The time lower bound for computing the first-order linear recurrence, and hence the Jacobian, is of order O(N) on uniprocessor computers and of order O(log2 N) on parallel SIMD computers, where N is the number of degrees of freedom of the manipulator. The Generalized-k method, which achieves the time lower bound on uniprocessor computers, is derived to compute the Jacobian at any desired reference coordinate frame k, from the base coordinate frame to the end-effector coordinate frame. We find that if the reference coordinate frame k is in the range [3, N-4], then the computational effort is at its minimum. To reduce the computational complexity from O(N) to O(log2 N), we derive the parallel forward and backward recursive doubling algorithm to compute the Jacobian on parallel computers. Again, any reference coordinate frame k can be used, and the minimum computation occurs at k = (N-1)/2. To further reduce the complexity of the Jacobian computation, we design two VLSI systolic pipelined architectures. A linear VLSI pipe, which uses the least number of modular processors, takes 3N floating-point operations to compute the Jacobian, and a parallel VLSI pipe takes 3 floating-point operations. We also show that if the reference coordinate frame is selected at k = (N-1)/2, then the parallel pipe will require the least number of modular processors, and the communication paths are much shorter.
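The O(N) versus O(log2 N) contrast can be sketched on a scalar first-order linear recurrence x[i] = a[i]*x[i-1] + b[i]. This is a simplified illustration of recursive doubling in general, not the authors' forward and backward algorithm for the Jacobian, whose terms are vector and matrix quantities; both function names are hypothetical, and the parallel passes are simulated with an ordinary loop.

```python
def recurrence_sequential(a, b, x0):
    """Evaluate x[i] = a[i]*x[i-1] + b[i] one step at a time (N steps)."""
    x = x0
    xs = []
    for ai, bi in zip(a, b):
        x = ai * x + bi
        xs.append(x)
    return xs


def recurrence_recursive_doubling(a, b, x0):
    """Evaluate the same recurrence by recursive doubling.

    Each pair (A[i], B[i]) stands for the affine map x -> A[i]*x + B[i].
    Every pass composes map i with map i - stride, so the number of
    recurrence steps each map covers doubles per pass. All updates
    within a pass are independent, so a SIMD machine could perform a
    pass as one parallel step; here the passes are simulated serially.
    Returns the values and the number of passes used.
    """
    n = len(a)
    A, B = list(a), list(b)
    stride, passes = 1, 0
    while stride < n:
        new_A, new_B = A[:], B[:]
        for i in range(stride, n):  # independent updates
            new_A[i] = A[i] * A[i - stride]
            new_B[i] = A[i] * B[i - stride] + B[i]
        A, B = new_A, new_B
        stride *= 2
        passes += 1
    # Every map now starts from x0, so one final multiply-add gives x[i].
    return [A[i] * x0 + B[i] for i in range(n)], passes


a = [2, 1, 3, 1, 2, 1, 1, 2]
b = [1, 0, 2, 1, 0, 3, 1, 1]
doubled, passes = recurrence_recursive_doubling(a, b, 1)
```

For the eight-term example the doubling version needs three passes, against eight dependent steps for the sequential loop, at the cost of more total arithmetic.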

Purdue E-Pubs