
192 research outputs found

Cyclotomic Identity Testing and Applications

Author: Balaji Nikhil
Perifel Sylvain
Shirmohammadi Mahsa
Worrell James
Publication date: 26/07/2020

We consider the cyclotomic identity testing problem: given a polynomial $f(x_1,\ldots,x_k)$, decide whether $f(\zeta_n^{e_1},\ldots,\zeta_n^{e_k})$ is zero, for $\zeta_n = e^{2\pi i/n}$ a primitive complex $n$-th root of unity and integers $e_1,\ldots,e_k$. We assume that $n$ and $e_1,\ldots,e_k$ are represented in binary and consider several versions of the problem, according to the representation of $f$. For the case that $f$ is given by an algebraic circuit we give a randomized polynomial-time algorithm with two-sided errors, showing that the problem lies in BPP. In case $f$ is given by a circuit of polynomially bounded syntactic degree, we give a randomized algorithm with two-sided errors that runs in poly-logarithmic parallel time, showing that the problem lies in BPNC. In case $f$ is given by a depth-2 $\Sigma\Pi$ circuit (or, equivalently, as a list of monomials), we show that the cyclotomic identity testing problem lies in NC. Under the generalised Riemann hypothesis, we are able to extend this approach to obtain a polynomial-time algorithm also for a very simple subclass of depth-3 $\Sigma\Pi\Sigma$ circuits. We complement this last result by showing that for a more general class of depth-3 $\Sigma\Pi\Sigma$ circuits, a polynomial-time algorithm for the cyclotomic identity testing problem would yield a sub-exponential-time algorithm for polynomial identity testing. Finally, we use cyclotomic identity testing to give a new proof that equality of compressed strings, i.e., strings presented using context-free grammars, can be decided in coRNC: randomized NC with one-sided errors.
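To make the problem statement concrete, the following is a minimal sketch of one standard way to test a sparse polynomial (a list of monomials) at roots of unity: reduce modulo primes $p \equiv 1 \pmod n$ and replace $\zeta_n$ by an element of order $n$ in $\mathbb{F}_p$. This is not the authors' algorithm. The function names, the trial-division helpers, and the `trials` parameter are all assumptions made for illustration, and the search for primes is only practical for small $n$, whereas the abstract assumes $n$ is given in binary. A nonzero residue proves the value is nonzero; passing every check only suggests it is zero.

```python
import random


def is_prime(m):
    """Trial-division primality test; adequate only for small illustrative inputs."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True


def prime_factors(m):
    """Return the set of prime factors of m by trial division."""
    factors = set()
    d = 2
    while d * d <= m:
        while m % d == 0:
            factors.add(d)
            m //= d
        d += 1
    if m > 1:
        factors.add(m)
    return factors


def primitive_nth_root(n, p, rng):
    """Return an element of order exactly n in F_p (p prime, n divides p - 1)."""
    qs = prime_factors(n)
    while True:
        g = rng.randrange(1, p)
        w = pow(g, (p - 1) // n, p)  # w lies in the subgroup of order n
        if all(pow(w, n // q, p) != 1 for q in qs):
            return w


def cyclotomic_zero_test(monomials, n, exponents, trials=3, seed=0):
    """Test whether f(zeta_n^{e_1}, ..., zeta_n^{e_k}) == 0.

    monomials: list of (integer coefficient, tuple of degrees d_1..d_k).
    Returns False if the value is certainly nonzero, True if it vanished
    modulo every prime that was tried (so it is probably zero).
    """
    rng = random.Random(seed)
    p = n + 1  # candidates n+1, 2n+1, ... are all congruent to 1 mod n
    checked = 0
    while checked < trials:
        if is_prime(p):
            w = primitive_nth_root(n, p, rng)
            total = 0
            for coeff, degs in monomials:
                # Each monomial evaluates to a single power of zeta_n.
                e = sum(d * x for d, x in zip(degs, exponents)) % n
                total = (total + coeff * pow(w, e, p)) % p
            if total != 0:
                return False
            checked += 1
        p += n
    return True
```

For example, `cyclotomic_zero_test([(1, (0,)), (1, (1,)), (1, (2,))], 3, (1,))` checks $1 + \zeta_3 + \zeta_3^2$, which is zero, while dropping the last monomial gives $1 + \zeta_3$, which is not.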

arXiv.org e-Print Archive

HAL Descartes

Oxford University Research Archive

Optimisations arithmétiques et synthèse de haut niveau

Author: Uguen Yohann
Publication venue: HAL CCSD
Publication date: 13/11/2019

High-level synthesis (HLS) tools offer increased productivity for FPGA programming. However, due to their relatively young nature, they still lack many arithmetic optimizations. This thesis proposes safe arithmetic optimizations that should always be applied, exploiting the specific context in which operators are instantiated. Some of these optimizations are simple operator specializations that follow C semantics. Others require departing from the semantics embedded in high-level input languages, which are inherited from software programming, for an improved accuracy/cost/performance trade-off. To demonstrate this claim, the sum of products of floating-point numbers is used as a case study. The sum is performed in a fixed-point format tailored to the application, according to the context in which the operator is instantiated. In some cases, there is not enough information about the input data to tailor the fixed-point accumulator. The fall-back strategy used in this thesis is to generate an accumulator covering the entire floating-point range. This thesis explores different strategies for implementing such a large accumulator, including new ones. Using a two's complement representation instead of sign+magnitude is shown to save resources and to reduce the accumulation loop delay. Based on a tapered precision scheme and an exact accumulator, the posit number system claims to be a candidate to replace the IEEE floating-point format. A thorough analysis of posit operators is performed, using the same level of hardware optimization as state-of-the-art floating-point operators. Their cost remains much higher than that of their floating-point counterparts in terms of resource usage and performance. Finally, this thesis presents a compatibility layer for HLS tools that allows a single code base to be deployed on multiple tools. This library implements a strongly typed custom-size integer type alongside a set of optimized custom operators.
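The idea of an accumulator covering the entire floating-point range can be modelled in software. The sketch below is an assumption-laden illustration, not the thesis's hardware design: it uses Python's arbitrary-precision integers as a stand-in for a very wide two's complement fixed-point register, and the name `exact_dot` and the constant `SCALE_BITS` are hypothetical. Every finite double is an integer multiple of $2^{-1074}$, so every product of two doubles is an integer multiple of $2^{-2148}$ and can be accumulated without rounding.

```python
from fractions import Fraction

# Products of two finite doubles are integer multiples of 2**-2148,
# because each double is an integer multiple of 2**-1074.
SCALE_BITS = 2 * 1074


def exact_dot(xs, ys):
    """Sum of products with a single rounding at the very end.

    The accumulator is a plain Python int, used here as a software model
    of a wide fixed-point register (an illustrative assumption).
    """
    acc = 0
    for x, y in zip(xs, ys):
        prod = Fraction(x) * Fraction(y)        # exact rational product
        scaled = prod * (1 << SCALE_BITS)       # shift into fixed-point units
        assert scaled.denominator == 1          # holds for finite doubles
        acc += scaled.numerator                 # exact integer accumulation
    return float(Fraction(acc, 1 << SCALE_BITS))  # one final rounding


def naive_dot(xs, ys):
    """Ordinary floating-point accumulation, rounding after every step."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total
```

With `xs = [1e16, 1.0, -1e16]` and `ys = [1.0, 1.0, 1.0]`, the naive loop loses the middle term to rounding, while the exact accumulator keeps it.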

Landau and Ramanujan approximations for divisor sums and coefficients of cusp forms

Author: Ciolan A.
Languasco A.
Moree P.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2023

MPG.PuRe

Design for Implementation of Image Processing Algorithms

Author: Whitesell Jamison D
Publication venue: RIT Scholar Works
Publication date: 01/01/2014

Color image processing algorithms are first developed using a high-level mathematical modeling language. Current integrated development environments offer libraries of intrinsic functions, which on one hand enable faster development, but on the other hand hide the use of fundamental operations. The latter have to be detailed for an efficient hardware and/or software physical implementation. Based on the experience accumulated in the process of implementing a segmentation algorithm, this thesis outlines a design for implementation methodology comprised of a development flow and associated guidelines. The methodology enables algorithm developers to iteratively optimize their algorithms while maintaining the level of image integrity required by their application. Furthermore, it does not require algorithm developers to change their current development process. Rather, the design for implementation methodology is best suited for optimizing a functionally correct algorithm, thus appending to an algorithm developer's design process of choice. The application of this methodology to four segmentation algorithm steps produced measured results with 2-D correlation coefficients (CORR2) better than 0.99, peak-signal-to-noise-ratio (PSNR) better than 70 dB, and structural-similarity-index (SSIM) better than 0.98, for a majority of test cases.

RIT Scholar Works

Simple optimizing JIT compilation of higher-order dynamic programming languages

Author: Saleil Baptiste
Publication date: 01/05/2019

Efficiently implementing dynamic programming languages requires a significant development effort. Over the years, compilers have become more complex. Today, they typically include an interpretation phase, several compilation phases, several intermediate representations and code analyses. These techniques allow dynamic programming languages to be implemented efficiently, but they are difficult to put into practice when development resources are limited. We propose a new approach and new techniques to build optimizing just-in-time compilers for dynamic languages with relatively good performance and low development effort. We present a simple just-in-time compilation approach that implements a language with a single compilation phase, without transforming code into intermediate representations. We explain how basic block versioning, an existing compilation technique, can be extended without significant development effort to work interprocedurally with higher-order programming languages, allowing interprocedural optimizations on these languages. We also explain how basic block versioning allows removing operations used to implement dynamic languages that degrade performance, such as type checks, and how compilers can exploit dynamic value representations based on Tagging and NaN-boxing to optimize the generated code with low development effort.
We present our experience of building a JIT compiler using these techniques for the Scheme programming language, to show that they indeed allow building compilers with less development effort than other implementations and that they allow generating efficient code that competes with the best current implementations of the Scheme language.
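As background for the NaN-boxing representation mentioned in the abstract, here is a minimal sketch of the general idea: non-float values are stored inside the unused payload bits of an IEEE 754 quiet NaN, so every value fits in one 64-bit word. The bit layout, the tag constant, and all function names below are assumptions chosen for illustration; they are not taken from the thesis or from any particular Scheme implementation.

```python
import struct

QNAN_BITS = 0x7FF8_0000_0000_0000     # exponent all ones plus the quiet bit
TAG_INT = 0x0001_0000_0000_0000       # hypothetical tag bit marking a boxed integer
PAYLOAD_MASK = 0xFFFF_FFFF            # low 32 bits carry the integer payload


def float_to_bits(x):
    """Reinterpret a double as its 64-bit pattern."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_to_float(bits):
    """Reinterpret a 64-bit pattern as a double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def box_float(x):
    """Floats are stored as themselves; real NaNs are canonicalised
    so they can never collide with a tagged value."""
    if x != x:
        return QNAN_BITS
    return float_to_bits(x)


def box_int(i):
    """Store a signed 32-bit integer in the payload of a quiet NaN."""
    if not -2**31 <= i < 2**31:
        raise OverflowError("integer does not fit in the 32-bit payload")
    return QNAN_BITS | TAG_INT | (i & PAYLOAD_MASK)


def is_boxed_int(bits):
    """A single mask-and-compare replaces a separate type field lookup."""
    mask = QNAN_BITS | TAG_INT
    return (bits & mask) == mask


def unbox_int(bits):
    """Recover the signed integer by sign-extending the 32-bit payload."""
    payload = bits & PAYLOAD_MASK
    return payload - 2**32 if payload >= 2**31 else payload
```

The design point this illustrates is that a type check reduces to a bit mask on the value word itself, which is the kind of check a compiler can then specialise away when the type is already known.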

Dépôt Institutionnel Numérique

Performance Evaluation of Optimal Ate Pairing on Low-Cost Single Microprocessor Platform

Author: Pesonen Mikko
Publication date: 15/10/2020

The framework of low-cost interconnected devices forms a new kind of cryptographic environment with diverse requirements. Due to the minimal resource capacity of the devices, lightweight cryptographic algorithms are favored. Many IoT applications work autonomously and process sensitive data, which emphasizes security needs and might also call for specific security measures. A bilinear pairing is a mapping based on groups formed by elliptic curves over extension fields. Pairings are the key enabler for versatile cryptosystems, such as certificateless signatures and searchable encryption. However, they have a major computational overhead, which conflicts with the requirements of low-cost devices. Nonetheless, bilinear pairings are the only known approach for many cryptographic protocols, so their feasibility should certainly be studied, as they might turn out to be necessary for some future IoT solutions. Promising results already exist for high-frequency CPUs and platforms with hardware extensions. In this work, we study the feasibility of computing the optimal ate pairing over the BN254 curve on a 64 MHz Cortex-M33 based platform by utilizing an optimized open-source library. The project is carried out for the company Nordic Semiconductor. As a result, the pairing was effectively computed in under $26 \times 10^6$ cycles, or in 410 ms. The resulting pairing enables limited use of pairing-based cryptography, with a capacity of at most a few cryptographic operations, such as ID-based key verifications, per second. Referring to other relevant works, a competent pairing application would require either a high-frequency, and thus power-hungry, microprocessor or a customized FPGA. Moreover, it is noted that research in efficient pairing-based cryptography is constantly advancing on every front: efficient algorithms, protocols, and hardware solutions.
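As a sanity check on the reported figures, the cycle count and clock frequency can be converted to wall-clock time. The calculation assumes one cycle per clock tick at the stated 64 MHz and treats the cycle count as the upper bound given in the abstract:

$$
t \;=\; \frac{26 \times 10^{6}\ \text{cycles}}{64 \times 10^{6}\ \text{cycles/s}} \;\approx\; 0.406\ \text{s}
$$

This is consistent with the reported 410 ms and corresponds to between two and three pairings per second, which matches the abstract's statement of at most a few operations per second.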

UTUPub

The Mordell-Weil sieve: Proving non-existence of rational points on curves

Author: Baker
Bruin
Bruin
Cantor
Cassels
Chabauty
Flynn
Hartshorne
Murty
Poonen
Shafarevich (ed.)
Stoll
Stoll
Stoll
Publication venue: 'Wiley'
Publication date: 30/11/2009

We discuss the Mordell-Weil sieve as a general technique for proving results concerning rational points on a given curve. In the special case of curves of genus 2, we describe quite explicitly how the relevant local information can be obtained if one does not want to restrict to mod p information at primes of good reduction. We describe our implementation of the Mordell-Weil sieve algorithm and discuss its efficiency.
Comment: 47 pages, 4 figures. Revised version, following suggestions/questions by the referee. Main changes: (1) We fixed a gap in the proof of Lemma 4.3 and added a discussion of the case r < g-1. (2) We added a heuristic discussion of the success probability at a single prime in Section

arXiv.org e-Print Archive

CiteSeerX

Crossref