195,334 research outputs found
Fast matrix multiplication techniques based on the Adleman-Lipton model
On distributed memory electronic computers, the implementation and
association of fast parallel matrix multiplication algorithms has yielded
astounding results and insights. In this discourse, we use the tools of
molecular biology to demonstrate the theoretical encoding of Strassen's fast
matrix multiplication algorithm with DNA based on an -moduli set in the
residue number system, thereby demonstrating the viability of computational
mathematics with DNA. As a result, a general scalable implementation of this
model in the DNA computing paradigm is presented and can be generalized to the
application of \emph{all} fast matrix multiplication algorithms on a DNA
computer. We also discuss the practical capabilities and issues of this
scalable implementation. Fast methods of matrix computations with DNA are
important because they also allow for the efficient implementation of other
algorithms (i.e. inversion, computing determinants, and graph theory) with DNA.Comment: To appear in the International Journal of Computer Engineering
Research. Minor changes made to make the preprint as similar as possible to
the published versio
Parallel matrix multiplication on heterogeneous networks of workstations
Matrix multiplication is taken as a test bed for parallel processing on heterogeneous networks of workstations (local area networks) used as parallel machines. Two algorithms are proposed taking into account the specific kind of parallel hardware provided by local area networks, and experimentation is used to drive the evaluation and identification of possible performance loss. A specific broadcast communication between processes of a parallel application is also proposed, taking advantage of the Ethernet interconnection network to achieve optimized performance. A special emphasis is place on already installed networks of workstations, which provide a hardware zero cost parallel computer; but a homogeneous Beowulf-class system is used to show how the algorithms are also useful on current classical high performance parallel computing with clusters.Eje: LenguajesRed de Universidades con Carreras en Informática (RedUNCI
PARALLEL MATRIX MULTIPLICATION CIRCUITS FOR USE IN KALMAN FILTERING
In this work we propose several ways of the CMOS implementation of a circuit for the multiplication of matrices. We mainly focus on parallel and asynchronous solutions, however serial and mixed approaches are also discussed for the comparison. Practical applications are the motivation behind our investigations. They include fast Kalman filtering commonly used in automotive active safety functions, for example. In such filters, numerous time-consuming operations on matrices are performed. An additional problem is the growing amount of data to be processed. It results from the growing number of sensors in the vehicle as fully autonomous driving is developed. Software solutions may prove themselves to be insuffucient in the nearest future. That is why hardware coprocessors are in the area of our interests as they could take over some of the most time-consuming operations. The paper presents possible solutions, tailored to specific problems (sizes of multiplied matrices, number of bits in signals, etc.). The estimates of the performance made on the basis of selected simulation and measurement results show that multiplication of 3×3 matrices with data rate of 20 100 MSps is achievable in the CMOS 130 nm technology
Applicability of approximate multipliers in hardware neural networks
In recent years there has been a growing interest in hardware neural networks, which express many benefits over conventional software models, mainly in applications where speed, cost, reliability, or energy efficiency are of great importance. These hardware neural networks require many resource-, power- and time-consuming multiplication operations, thus special care must be taken during their design. Since the neural network processing can be performed in parallel, there is usually a requirement for designs with as many concurrent multiplication circuits as possible.
One option to achieve this goal is to replace the complex exact multiplying circuits with simpler, approximate ones. The present work demonstrates the application of approximate multiplying circuits in the design of a feed-forward neural network model with on-chip learning ability. The experiments performed on a heterogeneous Proben1 benchmark dataset show that the adaptive nature of the neural network model successfully compensates for the calculation errors of the approximate multiplying circuits. At the same time, the proposed designs also profit from more computing power and increased energy efficiency
Applicability of approximate multipliers in hardware neural networks
In recent years there has been a growing interest in hardware neural networks, which express many benefits over conventional software models, mainly in applications where speed, cost, reliability, or energy efficiency are of great importance. These hardware neural networks require many resource-, power- and time-consuming multiplication operations, thus special care must be taken during their design. Since the neural network processing can be performed in parallel, there is usually a requirement for designs with as many concurrent multiplication circuits as possible.
One option to achieve this goal is to replace the complex exact multiplying circuits with simpler, approximate ones. The present work demonstrates the application of approximate multiplying circuits in the design of a feed-forward neural network model with on-chip learning ability. The experiments performed on a heterogeneous Proben1 benchmark dataset show that the adaptive nature of the neural network model successfully compensates for the calculation errors of the approximate multiplying circuits. At the same time, the proposed designs also profit from more computing power and increased energy efficiency
Recommended from our members
The AND/OR process model for parallel interpretation of logic programs
Current techniques for interpretation of logic programs involve a sequential search of a global tree of procedure invocations. This dissertation introduces the AND/OR Process Model, a method for interpretation by a system of asychronous, independent processes that communicate only by messages. The method makes it possible to exploit two distinct forms of parallelism. OR parallelism is obtained from evaluating nondeterministic choices in parallel. AND parallelism arises in the execution of deterministic fuctions, such as matrix multiplication of divide and conquer algorithms, that are inherently parallel. The two forms of parallelism can be exploited at the same time. This means AND parallelism can be applied to clauses that are composed of several nondeterministic components, and it can recover from incorrect choices in the solution of these components. In addition to defining parallel computations, the model provides a more defined procedural semantics for logic programs; that is, parallel interpreters based on this model are able to generate answers to queries that cause standard interpreters to go into an infinite loop. The interpretation method is intended to form the theoretical framework of a highly parallel non von Neumann computer architecture; the dissertation concludes with a discussion of issues involved in implementing the abstract interpreter on a multiprocessor
Recommended from our members
The scheduling of sparse matrix-vector multiplication on a massively parallel dap computer
An efficient data structure is presented which supports general unstructured sparse matrix-vector multiplications on a Distributed Array of Processors (DAP). This approach seeks to reduce the inter-processor data movements and organises the operations in batches of massively parallel steps by a heuristic scheduling procedure performed on the host computer.
The resulting data structure is of particular relevance to iterative schemes for solving linear systems. Performance results for matrices taken from well known Linear Programming (LP) test problems are presented and analysed
Collaborative Computation in Self-Organizing Particle Systems
Many forms of programmable matter have been proposed for various tasks. We
use an abstract model of self-organizing particle systems for programmable
matter which could be used for a variety of applications, including smart paint
and coating materials for engineering or programmable cells for medical uses.
Previous research using this model has focused on shape formation and other
spatial configuration problems (e.g., coating and compression). In this work we
study foundational computational tasks that exceed the capabilities of the
individual constant size memory of a particle, such as implementing a counter
and matrix-vector multiplication. These tasks represent new ways to use these
self-organizing systems, which, in conjunction with previous shape and
configuration work, make the systems useful for a wider variety of tasks. They
can also leverage the distributed and dynamic nature of the self-organizing
system to be more efficient and adaptable than on traditional linear computing
hardware. Finally, we demonstrate applications of similar types of computations
with self-organizing systems to image processing, with implementations of image
color transformation and edge detection algorithms
- …