Search CORE

345 research outputs found

Composite Iterative Algorithm and Architecture for q-th Root Calculation

Author: Bruguera Javier
Vazquez Alvaro
Publication venue: HAL CCSD
Publication date: 10/03/2011
Field of study

An algorithm for the q-th root extraction, being q any integer, is presented in this paper. The algorithm is based on an optimized implementation of X^{1/q} by a sequence of parallel and/or overlapped operations: (1) reciprocal, (2) digit-recurrence logarithm, (3) left-to-right carry-free multiplication and (4) on-line exponential. A detailed error analysis and two architectures are proposed, for low precision q and for higher precision q. The execution time and hardware requirements are estimated for single and double precision floating-point computations for several radices; this helps to determine which radices result in the most efficient implementations. The architectures proposed improve the features of other architectures for q-th root extraction.Dans cet article, nous présentons un algorithme matériel pour l'extraction de la racine q-ième d'un nombre X, où q est un entier naturel non nul. Cet algorithme est basé sur une implantation optimisée de la fonction X^{1/q} par une séquence d'opérations parallèles et/ou superposées: (1) réciproque, (2) logarithme chiffre par chiffre, (3) multiplication de gauche-à-droite sans propagation de retenue et (4) exponentielle en ligne. Une analyse détaillée des erreurs et deux architectures sont proposées, pour q de basse précision et pour q de précision plus haute. Le temps d'exécution et les composants matériels à utiliser sont estimés pour des calculs en virgule flottante simple et double précision et pour plusieurs bases. Cette étude aide à déterminer quelles bases mènent aux implantations les plus efficaces. Les architectures proposées améliorent les caractéristiques d'architectures précédentes destinées à l'extraction des racines

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Rennes 1

Balancing Minimum Free Energy and Codon Adaptation Index for Pareto Optimal RNA Design

Author: El-Kebir Mohammed
Gu Xinyu
Qi Yuanyuan
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 23rd International Workshop on Algorithms in Bioinformatics (WABI 2023)
Publication date: 01/01/2023
Field of study

Dagstuhl Research Online Publication Server

A linear time algorithm for the orbit problem over cyclic groups

Author: Anthony Widjaja
Lin
Sanming Zhou
Publication venue
Publication date: 01/01/2014
Field of study

The orbit problem is at the heart of symmetry reduction methods for model checking concurrent systems. It asks whether two given configurations in a concurrent system (represented as finite strings over some finite alphabet) are in the same orbit with respect to a given finite permutation group (represented by their generators) acting on this set of configurations by permuting indices. It is known that the problem is in general as hard as the graph isomorphism problem, whose precise complexity (whether it is solvable in polynomial-time) is a long-standing open problem. In this paper, we consider the restriction of the orbit problem when the permutation group is cyclic (i.e. generated by a single permutation), an important restriction of the problem. It is known that this subproblem is solvable in polynomial-time. Our main result is a linear-time algorithm for this subproblem.Comment: Accepted in Acta Informatica in Nov 201

arXiv.org e-Print Archive

CiteSeerX

Crossref

Formulas for Continued Fractions. An Automated Guess and Prove Approach

Author: Abramowitz M.
Benoit A.
Cooper K. D.
Cuyt A.
Khovanskii A. N.
Olver F. W. J.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 06/07/2015
Field of study

We describe a simple method that produces automatically closed forms for the coefficients of continued fractions expansions of a large number of special functions. The function is specified by a non-linear differential equation and initial conditions. This is used to generate the first few coefficients and from there a conjectured formula. This formula is then proved automatically thanks to a linear recurrence satisfied by some remainder terms. Extensive experiments show that this simple approach and its straightforward generalization to difference and

q

-difference equations capture a large part of the formulas in the literature on continued fractions.Comment: Maple worksheet attache

arXiv.org e-Print Archive

HAL-ENS-LYON

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

Bipartite Matching with Linear Edge Weights

Author: Domanic Nevzat Onur
Lam Chi-Kit
Plaxton C. Gregory
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 27th International Symposium on Algorithms and Computation (ISAAC 2016)
Publication date: 01/01/2016
Field of study

Consider a complete weighted bipartite graph G in which each left vertex u has two real numbers intercept and slope, each right vertex v has a real number quality, and the weight of any edge (u, v) is defined as the intercept of u plus the slope of u times the quality of v. Let m (resp., n) denote the number of left (resp., right) vertices, and assume that m geq n. We develop a fast algorithm for computing a maximum weight matching (MWM) of such a graph. Our algorithm begins by computing an MWM of the subgraph induced by the n right vertices and an arbitrary subset of n left vertices; this step is straightforward to perform in O(n log n) time. The remaining m - n left vertices are then inserted into the graph one at a time, in arbitrary order. As each left vertex is inserted, the MWM is updated. It is relatively straightforward to process each such insertion in O(n) time; our main technical contribution is to improve this time bound to O(sqrt{n} log^2 n). This result has an application related to unit-demand auctions. It is well known that the VCG mechanism yields a suitable solution (allocation and prices) for any unit-demand auction. The graph G may be viewed as encoding a special kind of unit-demand auction in which each left vertex u represents a unit-demand bid, each right vertex v represents an item, and the weight of an edge (u, v) represents the offer of bid u on item v. In this context, our fast insertion algorithm immediately provides an O(sqrt{n} log^2 n)-time algorithm for updating a VCG allocation when a new bid is received. We show how to generalize the insertion algorithm to update (an efficient representation of) the VCG prices within the same time bound

Dagstuhl Research Online Publication Server

Parallelization of dynamic programming recurrences in computational biology

Author: Jacob Arpith
Publication venue: Washington University Open Scholarship
Publication date: 01/01/2010
Field of study

The rapid growth of biosequence databases over the last decade has led to a performance bottleneck in the applications analyzing them. In particular, over the last five years DNA sequencing capacity of next-generation sequencers has been doubling every six months as costs have plummeted. The data produced by these sequencers is overwhelming traditional compute systems. We believe that in the future compute performance, not sequencing, will become the bottleneck in advancing genome science. In this work, we investigate novel computing platforms to accelerate dynamic programming algorithms, which are popular in bioinformatics workloads. We study algorithm-specific hardware architectures that exploit fine-grained parallelism in dynamic programming kernels using field-programmable gate arrays: FPGAs). We advocate a high-level synthesis approach, using the recurrence equation abstraction to represent dynamic programming and polyhedral analysis to exploit parallelism. We suggest a novel technique within the polyhedral model to optimize for throughput by pipelining independent computations on an array. This design technique improves on the state of the art, which builds latency-optimal arrays. We also suggest a method to dynamically switch between a family of designs using FPGA reconfiguration to achieve a significant performance boost. We have used polyhedral methods to parallelize the Nussinov RNA folding algorithm to build a family of accelerators that can trade resources for parallelism and are between 15-130x faster than a modern dual core CPU implementation. A Zuker RNA folding accelerator we built on a single workstation with four Xilinx Virtex 4 FPGAs outperforms 198 3 GHz Intel Core 2 Duo processors. Furthermore, our design running on a single FPGA is an order of magnitude faster than competing implementations on similar-generation FPGAs and graphics processors. Our work is a step toward the goal of automated synthesis of hardware accelerators for dynamic programming algorithms

Washington University St. Louis: Open Scholarship