Wildcard dimensions, coding theory and fault-tolerant meshes and hypercubes

Author: Bruck Jehoshua
Cypher Robert
Ho Ching-Tien
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1995

Hypercubes, meshes and tori are well-known interconnection networks for parallel computers. The sets of edges in those graphs can be partitioned into dimensions. It is well known that the hypercube can be extended by adding a wildcard dimension, resulting in a folded hypercube that has better fault-tolerant and communication capabilities. First we prove that the folded hypercube is optimal in the sense that only a single wildcard dimension can be added to the hypercube. We then investigate the idea of adding wildcard dimensions to d-dimensional meshes and tori. Using techniques from error-correcting codes we construct d-dimensional meshes and tori with wildcard dimensions. Finally, we show how these constructions can be used to tolerate edge and node faults in mesh and torus networks.

CiteSeerX

Caltech Authors
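
As a rough illustration of the folded hypercube mentioned in the abstract above, the sketch below lists the neighbors of a node under the standard definition: the n ordinary hypercube edges flip one address bit each, and the single extra edge joins a node to its bitwise complement. The function name and the integer encoding of node addresses are assumptions made for this example and are not taken from the paper.

```python
def folded_hypercube_neighbors(node: int, n: int) -> list[int]:
    """Return the neighbors of `node` in an n-dimensional folded hypercube.

    Nodes are the integers 0 .. 2**n - 1, read as n-bit addresses
    (an encoding assumed for this sketch).
    """
    if n < 2:
        raise ValueError("n must be at least 2 so the extra edge is a new edge")
    if not 0 <= node < (1 << n):
        raise ValueError("node address out of range")
    # Ordinary hypercube edges: flip exactly one of the n address bits.
    neighbors = [node ^ (1 << i) for i in range(n)]
    # Extra edge: flip all n bits at once, reaching the complementary node.
    neighbors.append(node ^ ((1 << n) - 1))
    return neighbors
```

Each node therefore has degree n + 1 instead of n, which is where the additional routing and fault-tolerance options come from.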

Simple and Effective Type Check Removal through Lazy Basic Block Versioning

Author: Chevalier-Boisvert Maxime
Feeley Marc
Publication date: 01/01/2015

Dynamically typed programming languages such as JavaScript and Python defer type checking to run time. In order to maximize performance, dynamic language VM implementations must attempt to eliminate redundant dynamic type checks. However, type inference analyses are often costly and involve tradeoffs between compilation time and resulting precision. This has led to the creation of increasingly complex multi-tiered VM architectures. This paper introduces lazy basic block versioning, a simple JIT compilation technique which effectively removes redundant type checks from critical code paths. This novel approach lazily generates type-specialized versions of basic blocks on the fly while propagating context-dependent type information. This does not require the use of costly program analyses, is not restricted by the precision limitations of traditional type analyses and avoids the implementation complexity of speculative optimization techniques. We have implemented intraprocedural lazy basic block versioning in a JavaScript JIT compiler. This approach is compared with a classical flow-based type analysis. Lazy basic block versioning performs as well as or better than it on all benchmarks. On average, 71% of type tests are eliminated, yielding speedups of up to 50%. We also show that our implementation generates more efficient machine code than TraceMonkey, a tracing JIT compiler for JavaScript, on several benchmarks. The combination of implementation simplicity, low algorithmic complexity and good run time performance makes basic block versioning attractive for baseline JIT compilers.

arXiv.org e-Print Archive

CiteSeerX

Dagstuhl Research Online Publication Server

Polynomial-Time Amoeba Neighborhood Membership and Faster Localized Solving

Author: A. Dickenstein
A.G. Khovanskii
A.J. Sommese
C. Beltrán
C. D’Andrea
CTD Ng
D.A. Plaisted
D.J. Bates
F. Bihan
G. Mikhalkin
G.M. Ziegler
I.M. Gel’fand
I.Z. Emiris
J. Verschelde
J.-C. Faugère
J.A. Loera De
L. Bettale
L. Khachiyan
L. Nilsson
L.J. Huan
M. Grötschel
M. Nisse
M. Shub
M.R. Garey
P. Bürgisser
P. Gritzmann
R.M. Karp
S. Arora
Steve Smale
W. Hao
Y. Ning
Publication date: 23/12/2013

We derive efficient algorithms for coarse approximation of algebraic hypersurfaces, useful for estimating the distance between an input polynomial zero set and a given query point. Our methods work best on sparse polynomials of high degree (in any number of variables) but are nevertheless completely general. The underlying ideas, which we take the time to describe in an elementary way, come from tropical geometry. We thus reduce a hard algebraic problem to high-precision linear optimization, proving new upper and lower complexity estimates along the way. Comment: 15 pages, 9 figures. Submitted to a conference proceeding

arXiv.org e-Print Archive

CiteSeerX

Crossref

Type-II/III DCT/DST algorithms with reduced number of arithmetic operations

Author: Ahmed
Arai
Arguello
Astola
Chan
Chen
Crochiere
Duhamel
Feig
Frigo
Gentleman
Gopinath
Guo
Haralick
Hou
Johnson
Kamar
Kok
Krot
Lee
Li
Lundy
Makhoul
Malvar
Martens
Narasimha
Pennebaker
Plonka
Press
Püschel
Qian
Rao
Schatzman
Sorensen
Steidl
Steven G. Johnson
Suehiro
Swarztrauber
Takala
Tasche
Tseng
van Loan
Vetterli
Wang
Xuancheng Shao
Yaroslavskii
Yavne
Publication venue: 'Elsevier BV'
Publication date: 01/01/2009

We present algorithms for the discrete cosine transform (DCT) and discrete sine transform (DST), of types II and III, that achieve a lower count of real multiplications and additions than previously published algorithms, without sacrificing numerical accuracy. Asymptotically, the operation count is reduced from ~ 2N log_2 N to ~ (17/9) N log_2 N for a power-of-two transform size N. Furthermore, we show that a further N multiplications may be saved by a certain rescaling of the inputs or outputs, generalizing a well-known technique for N=8 by Arai et al. These results are derived by considering the DCT to be a special case of a DFT of length 4N, with certain symmetries, and then pruning redundant operations from a recent improved fast Fourier transform algorithm (based on a recursive rescaling of the conjugate-pair split radix algorithm). The improved algorithms for DCT-III, DST-II, and DST-III follow immediately from the improved count for the DCT-II. Comment: 9 pages

arXiv.org e-Print Archive

CiteSeerX

Crossref
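
The abstract above treats the DCT as a special case of a DFT of length 4N with certain symmetries. A minimal sketch of that embedding, assuming the unnormalized DCT-II convention X_k = sum_n x_n cos(pi (n + 1/2) k / N), is given below. It uses a naive O(N^2) DFT purely to check the identity and does not reproduce the paper's pruned fast algorithm. All function names are hypothetical.

```python
import cmath
import math


def dct2_direct(x):
    """Unnormalized DCT-II computed straight from its definition."""
    n = len(x)
    return [
        sum(x[j] * math.cos(math.pi * (j + 0.5) * k / n) for j in range(n))
        for k in range(n)
    ]


def naive_dft(y):
    """Plain O(M^2) DFT, used here only to verify the embedding."""
    m = len(y)
    return [
        sum(y[j] * cmath.exp(-2j * cmath.pi * j * k / m) for j in range(m))
        for k in range(m)
    ]


def dct2_via_dft(x):
    """DCT-II of length N obtained from a DFT of length 4N.

    The inputs are placed at the odd indices of a length-4N sequence and
    mirrored so the sequence is real and even; even indices stay zero.
    """
    n = len(x)
    y = [0.0] * (4 * n)
    for j, value in enumerate(x):
        y[2 * j + 1] = value          # odd index in the first half
        y[4 * n - 2 * j - 1] = value  # mirrored copy in the second half
    spectrum = naive_dft(y)
    # Each mirrored pair contributes 2*cos(...), so halve the real part.
    return [spectrum[k].real / 2 for k in range(n)]
```

Only the first N outputs of the length-4N DFT are needed, and three quarters of its inputs are zero or duplicated, which is the redundancy a pruned FFT can exploit.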

Fast and Accurate Multiclass Inference for MI-BCIs Using Large Multiscale Temporal and Spectral Features

Author: Benini Luca
Cavigelli Lukas
Hersche Michael
Rahimi Abbas
Rellstab Tino
Schiavone Pasquale Davide
Publication date: 01/01/2018

Accurate, fast, and reliable multiclass classification of electroencephalography (EEG) signals is a challenging task towards the development of motor imagery brain-computer interface (MI-BCI) systems. We propose enhancements to different feature extractors, along with a support vector machine (SVM) classifier, to simultaneously improve classification accuracy and execution time during training and testing. We focus on the well-known common spatial pattern (CSP) and Riemannian covariance methods, and significantly extend these two feature extractors to multiscale temporal and spectral cases. The multiscale CSP features achieve 73.70 ± 15.90% (mean ± standard deviation across 9 subjects) classification accuracy that surpasses the state-of-the-art method [1], 70.6 ± 14.70%, on the 4-class BCI competition IV-2a dataset. The Riemannian covariance features outperform the CSP by achieving 74.27 ± 15.5% accuracy and executing 9x faster in training and 4x faster in testing. Using more temporal windows for Riemannian features results in 75.47 ± 12.8% accuracy with 1.6x faster testing than CSP. Comment: Published as a conference paper at the IEEE European Signal Processing Conference (EUSIPCO), 201

arXiv.org e-Print Archive

Repository for Publications and Research Data

On the Complexity of the F5 Gröbner basis Algorithm

Author: Bardet Magali
Faugère Jean-Charles
Salvy Bruno
Publication date: 17/07/2014

We study the complexity of Gröbner bases computation, in particular in the generic situation where the variables are in simultaneous Noether position with respect to the system. We give a bound on the number of polynomials of degree d in a Gröbner basis computed by Faugère's F5 algorithm (Fau02) in this generic case for the grevlex ordering (which is also a bound on the number of polynomials for a reduced Gröbner basis, independently of the algorithm used). Next, we analyse more precisely the structure of the polynomials in the Gröbner bases with signatures that F5 computes and use it to bound the complexity of the algorithm. Our estimates show that the version of F5 we analyse, which uses only standard Gaussian elimination techniques, outperforms row reduction of the Macaulay matrix with the best known algorithms for moderate degrees, and even for degrees up to the thousands if Strassen's multiplication is used. The degree being fixed, the factor of improvement grows exponentially with the number of variables. Comment: 24 pages

arXiv.org e-Print Archive

HAL-ENS-LYON

HAL - Normandie Université

INRIA a CCSD electronic archive server

Hal-Diderot

Speeding up the constraint-based method in difference logic

Author: A Miné
A Podelski
A Schrijver
AM Frisch
C Borralleras
D Beyer
D Jovanović
D Kapur
D Larraz
D Monniaux
DL Dill
J Argelich
J Marques-Silva
M Bofill
MA Colón
N Bjørner
R Dechter
R Nieuwenhuis
R Sebastiani
S Sankaranarayanan
SK Lahiri
TH Cormen
WH Press
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

"The final publication is available at http://link.springer.com/chapter/10.1007%2F978-3-319-40970-2_18". Over the years the constraint-based method has been successfully applied to a wide range of problems in program analysis, from invariant generation to termination and non-termination proving. Quite often the semantics of the program under study as well as the properties to be generated belong to difference logic, i.e., the fragment of linear arithmetic where atoms are inequalities of the form u − v ≤ k. However, so far constraint-based techniques have not exploited this fact: in general, Farkas’ Lemma is used to produce the constraints over template unknowns, which leads to non-linear SMT problems. Based on classical results of graph theory, in this paper we propose new encodings for generating these constraints when program semantics and templates belong to difference logic. Thanks to this approach, instead of a heavyweight non-linear arithmetic solver, a much cheaper SMT solver for difference logic or linear integer arithmetic can be employed for solving the resulting constraints. We present encouraging experimental results that show the high impact of the proposed techniques on the performance of the VeryMax verification system. Peer Reviewed. Postprint (author's final draft)

Crossref

UPCommons. Portal del coneixement obert de la UPC
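
The last abstract above relies on classical graph-theoretic results for difference logic, whose atoms are usually written u − v ≤ k. One such well-known result is that a conjunction of these atoms is satisfiable exactly when the associated constraint graph has no negative-weight cycle. The sketch below checks this with Bellman-Ford relaxation. It illustrates the textbook result only, not the encodings proposed in the paper, and the function name and tuple layout are assumptions for the example.

```python
def solve_difference_constraints(num_vars, constraints):
    """Return values satisfying every x[u] - x[v] <= k, or None if impossible.

    `constraints` is a list of (u, v, k) triples over variable indices
    0 .. num_vars - 1 (a layout assumed for this sketch).
    """
    # Starting every distance at 0 is equivalent to adding a virtual
    # source with a zero-weight edge to each variable.
    dist = [0] * num_vars
    # Without a negative cycle, relaxation settles within num_vars passes;
    # if every one of num_vars + 1 passes still changes something, the
    # constraint graph contains a negative-weight cycle.
    for _ in range(num_vars + 1):
        changed = False
        for u, v, k in constraints:
            # Constraint x[u] - x[v] <= k is an edge v -> u of weight k.
            if dist[v] + k < dist[u]:
                dist[u] = dist[v] + k
                changed = True
        if not changed:
            return dist
    return None
```

For instance, x0 − x1 ≤ 3 together with x1 − x0 ≤ −5 forms a cycle of total weight −2, so the function returns None for that pair.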