222 research outputs found

Composite Iterative Algorithm and Architecture for q-th Root Calculation

Author: Bruguera Javier
Vazquez Alvaro
Publication venue: HAL CCSD
Publication date: 10/03/2011

This paper presents an algorithm for q-th root extraction, where q is any integer. The algorithm is based on an optimized implementation of X^{1/q} as a sequence of parallel and/or overlapped operations: (1) reciprocal, (2) digit-recurrence logarithm, (3) left-to-right carry-free multiplication and (4) on-line exponential. A detailed error analysis is given and two architectures are proposed, one for low-precision q and one for higher-precision q. The execution time and hardware requirements are estimated for single- and double-precision floating-point computations for several radices; this helps to determine which radices result in the most efficient implementations. The proposed architectures improve on the features of other architectures for q-th root extraction.
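The four operations listed in the abstract above correspond to the identity X^{1/q} = exp((1/q) · ln X). The following is a minimal software sketch of that composition only, assuming X > 0 and q ≠ 0 and using ordinary floating-point library calls; it is not the digit-recurrence hardware described in the paper, and the function name `qth_root` is hypothetical.

```python
import math

def qth_root(x: float, q: int) -> float:
    """Illustrative only: x**(1/q) computed as exp((1/q) * ln(x)).

    Assumes x > 0 and q != 0. The paper overlaps these steps in hardware;
    here they simply run one after another in floating point.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    if q == 0:
        raise ValueError("q must be nonzero")
    reciprocal = 1.0 / q            # step (1): reciprocal of q
    log_x = math.log(x)             # step (2): logarithm of X
    product = reciprocal * log_x    # step (3): multiplication
    return math.exp(product)        # step (4): exponential
```

For example, `qth_root(27.0, 3)` should agree with 3 up to floating-point rounding.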

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

Computing cube root of a positive number

Author: Singh Yumnam Kirani
Publication venue: Assam Don Bosco University
Publication date: 01/03/2016

Proposed here is a new algorithm to compute the cube root of a large positive integer. The algorithm is based on the long division method, also known as the manual method, which is usually used to find the square root of a number. To implement the long division method, the given number is first represented as a radix-10 representa, and Bino's Model of Multiplication is then used to implement the method systematically. A representa is a special array that represents a number so that representas can be treated in the same way as numbers. This reduces the difficulty of handling large numbers on a computer. It also simplifies the implementation of the long division method for finding the cube root of a positive number, from a single-digit number to an arbitrarily large positive number such as the RSA challenge numbers. The algorithm can compute the cube root of a non-perfect cube to any desired precision, and each computed digit of the cube root gives the best precision. Cube roots of 2, 5 and 10 to 30 digits, and the integer parts of the cube roots of the first few and last few RSA challenge numbers, are provided in the experimental results to show that the algorithm computes the cube root of any positive integer, however small or large it may be.

Keywords: Bino's Model of Multiplication, Convolution, Cube of a large number, Large number manipulation, Long division method, RSA challenge numbers, Representa, Cube root computation
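The long-division idea in the abstract above can be sketched as a digit-by-digit cube root. This sketch relies on Python's built-in arbitrary-precision integers rather than the representa arrays and Bino's Model of Multiplication named in the abstract, so it illustrates the general manual method only; the function name is hypothetical.

```python
def integer_cube_root_digits(n: int) -> int:
    """Floor of the cube root of n >= 0, found one decimal digit at a time.

    Digits of n are consumed in groups of three (most significant first),
    as in the manual long-division method. Python integers stand in for
    the paper's representa arrays (an assumption of this sketch).
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = str(n)
    digits = "0" * ((-len(digits)) % 3) + digits   # pad to a multiple of 3
    root = 0
    remainder = 0
    for i in range(0, len(digits), 3):
        # Bring down the next group of three digits.
        remainder = remainder * 1000 + int(digits[i:i + 3])
        base = 10 * root
        # Largest digit d with (base + d)^3 - base^3 <= remainder.
        d = 9
        while (base + d) ** 3 - base ** 3 > remainder:
            d -= 1
        remainder -= (base + d) ** 3 - base ** 3
        root = base + d
    return root
```

Fractional digits can be obtained by scaling: `integer_cube_root_digits(2 * 10**(3 * k))` gives the first k decimal places of the cube root of 2 as an integer.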

Assam Don Bosco University Journals

Evaluating the communications capabilities of the generalized hypercube interconnection network

Author: Krishnamurthy Sanjay
Publication venue: Digital Commons @ NJIT
Publication date: 31/01/1998

This thesis presents results of evaluating the communications capabilities of the generalized hypercube interconnection network. The generalized hypercube has outstanding topological properties, but it has not been implemented on a large scale because of its very high wiring complexity. For this reason, this network has not been studied extensively in the past. However, recent and expected technological advancements will soon render this network viable for massively parallel systems. We first present implementations of randomized many-to-all broadcasting and multicasting on generalized hypercubes, using as the basis the one-to-all broadcast algorithm presented in [3]. We test the proposed implementations under realistic communication traffic patterns and message generations, for the all-port model of communication. Our results show that the size of the intermediate message buffers has a significant effect on the total communication time, and this effect becomes very dramatic for large systems with large numbers of dimensions. We also propose a modification of this multicast algorithm that applies congestion control to improve its performance. The results illustrate a significant improvement in the total execution time and a reduction in the number of message contentions, and also show that the generalized hypercube is a very versatile interconnection network.

Digital Commons @ New Jersey Institute of Technology (NJIT)

Properties and communication algorithms for k-ary n-cube interconnection networks

Author: Ashir Yaagoub A.
Publication venue: Oregon State University

The k-ary n-cube structure is presented in this paper for interconnecting a network of microcomputers in parallel and distributed environments. Machines based on the k-ary n-cube topology have been advocated as ideal parallel architectures for their powerful interconnecting features. In this paper, we examine the k-ary n-cube from the graph theory point of view and consider those features that make its connectivity so attractive. Among other things, we propose several effective global data communication algorithms on the k-ary n-cube interconnection network

ScholarsArchive@OSU

A comparison between hypercube and binary de Bruijn networks

Author: Arkaah Ato Y.
Publication venue: Lehigh Preserve

Lehigh University: Lehigh Preserve

Performance analysis of wormhole routing in multicomputer interconnection networks

Author: Sarbazi-Azad Hamid
Publication date: 01/01/2001

Perhaps the most critical component in determining the ultimate performance potential of a multicomputer is its interconnection network, the hardware fabric supporting communication among individual processors. The message latency and throughput of such a network are affected by many factors of which topology, switching method, routing algorithm and traffic load are the most significant. In this context, the present study focuses on a performance analysis of k-ary n-cube networks employing wormhole switching, virtual channels and adaptive routing, a scenario of especial interest to current research. This project aims to build upon earlier work in two main ways: constructing new analytical models for k-ary n-cubes, and comparing the performance merits of cubes of different dimensionality. To this end, some important topological properties of k-ary n-cubes are explored initially; in particular, expressions are derived to calculate the number of nodes at/within a given distance from a chosen centre. These results are important in their own right but their primary significance here is to assist in the construction of new and more realistic analytical models of wormhole-routed k-ary n-cubes. An accurate analytical model for wormhole-routed k-ary n-cubes with adaptive routing and uniform traffic is then developed, incorporating the use of virtual channels and the effect of locality in the traffic pattern. New models are constructed for wormhole k-ary n-cubes, with the ability to simulate behaviour under adaptive routing and non-uniform communication workloads, such as hotspot traffic, matrix-transpose and digit-reversal permutation patterns. The models are equally applicable to unidirectional and bidirectional k-ary n-cubes and are significantly more realistic than any in use up to now. With this level of accuracy, the effect of each important network parameter on the overall network performance can be investigated in a more comprehensive manner than before. 
Finally, k-ary n-cubes of different dimensionality are compared using the new models. The comparison takes account of various traffic patterns and implementation costs, using both pin-out and bisection bandwidth as metrics. Networks with both normal and pipelined channels are considered. While previous similar studies have only taken account of network channel costs, our model incorporates router costs as well thus generating more realistic results. In fact the results of this work differ markedly from those yielded by earlier studies which assumed deterministic routing and uniform traffic, illustrating the importance of using accurate models to conduct such analyses
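The abstract above mentions expressions for the number of nodes at a given distance from a chosen centre. As a rough illustration, and not the closed-form expressions derived in the thesis, the sketch below counts those nodes numerically for a bidirectional k-ary n-cube by combining the per-dimension ring distances; the function names are hypothetical.

```python
def ring_distance_counts(k: int) -> list[int]:
    """counts[d] = number of nodes at distance d from node 0 on a
    bidirectional ring of k nodes (shortest way around)."""
    counts = [0] * (k // 2 + 1)
    for node in range(k):
        counts[min(node, k - node)] += 1
    return counts

def nodes_at_distance(k: int, n: int) -> list[int]:
    """result[d] = number of nodes at distance d from a fixed node in a
    bidirectional k-ary n-cube, assuming the distance is the sum of the
    per-dimension ring distances."""
    ring = ring_distance_counts(k)
    total = [1]  # zero dimensions: only the centre itself, at distance 0
    for _ in range(n):
        # Add one dimension: convolve the current counts with the ring counts.
        nxt = [0] * (len(total) + len(ring) - 1)
        for i, a in enumerate(total):
            for j, b in enumerate(ring):
                nxt[i + j] += a * b
        total = nxt
    return total
```

With k = 2 this reduces to the binary hypercube, where the counts are binomial coefficients; in every case the counts sum to k^n.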

Glasgow Theses Service

A computer-aided design for digital filter implementation

Author: Lai P. K. M. J.
Publication venue: Department of Electrical Engineering, Imperial College London
Publication date: 01/01/1979

Imperial Users only

Spiral - Imperial College Digital Repository

An algorithm for redundant binary bit-pipelined rational arithmetic

Author: D.W. Matula
P. Kornerup
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Distributed Duplicate Removal

Author: Schlag Sebastian
Publication venue: Karlsruher Institut für Technologie
Publication date: 01/01/2013

The goal of distributed duplicate detection is to identify elements that occur more than once in a large data set distributed across multiple compute nodes. Sanders et al. [48] present a distributed algorithm that solves this problem in a particularly communication-efficient way. In a preprocessing phase, a distributed, space-efficient Bloom filter first identifies as many distinct elements as possible, which greatly reduces the number of elements that still have to be considered. Because false positives also occur in this phase, all elements flagged as potentially non-distinct must be checked again in a second phase, using a classical hash-based algorithm for distributed duplicate detection. This work complements the theoretical analysis with a practical evaluation. To this end, we develop an efficient implementation for shared-nothing systems. Particularly compute-intensive steps of the algorithm are additionally parallelized within a node using shared-memory programming. The results of our experimental study support the advantages of the algorithm predicted by the theory. Our implementation is significantly faster than the best-suited classical approach as long as less than 50% of the input data consists of duplicates. When the algorithm is run on data sets consisting of less than 10% duplicates, the total communication volume is also more than an order of magnitude smaller than that of the classical competitor.
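The two-phase idea in the abstract above (a Bloom filter pass followed by an exact hash-based check of the flagged elements) can be sketched on a single machine as follows. This is an illustration under simplifying assumptions, not the distributed implementation evaluated in the thesis: there is no communication between nodes, and the class name, function name and default parameters are hypothetical.

```python
import hashlib
from collections import Counter

class BloomFilter:
    """Small Bloom filter over strings (illustrative parameters)."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add_and_check(self, item: str) -> bool:
        """Insert item; return True if every bit was already set,
        i.e. the item was possibly seen before."""
        seen = True
        for pos in self._positions(item):
            byte, mask = pos // 8, 1 << (pos % 8)
            if not self.bits[byte] & mask:
                seen = False
                self.bits[byte] |= mask
        return seen

def find_duplicates(items: list[str], num_bits: int = 1024,
                    num_hashes: int = 3) -> set[str]:
    # Phase 1: the Bloom filter flags candidates (may include false positives,
    # but never misses a repeated element).
    bloom = BloomFilter(num_bits, num_hashes)
    candidates = {item for item in items if bloom.add_and_check(item)}
    # Phase 2: exact counting, restricted to the flagged candidates.
    exact = Counter(item for item in items if item in candidates)
    return {item for item, count in exact.items() if count > 1}
```

Because the second phase counts exactly, a false positive from the filter costs extra work but does not change the result, which is why the filter can be kept small.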

KITopen

Serial-data computation in VLSI

Author: Smith Stewart Gresty
Publication venue: The University of Edinburgh
Publication date: 01/01/1987

Edinburgh Research Archive