8,305 research outputs found

    Wildcard dimensions, coding theory and fault-tolerant meshes and hypercubes

    Get PDF
    Hypercubes, meshes and tori are well known interconnection networks for parallel computers. The sets of edges in those graphs can be partitioned to dimensions. It is well known that the hypercube can be extended by adding a wildcard dimension resulting in a folded hypercube that has better fault-tolerant and communication capabilities. First we prove that the folded hypercube is optimal in the sense that only a single wildcard dimension can be added to the hypercube. We then investigate the idea of adding wildcard dimensions to d-dimensional meshes and tori. Using techniques from error correcting codes we construct d-dimensional meshes and tori with wildcard dimensions. Finally, we show how these constructions can be used to tolerate edge and node faults in mesh and torus networks

    Simple and Effective Type Check Removal through Lazy Basic Block Versioning

    Get PDF
    Dynamically typed programming languages such as JavaScript and Python defer type checking to run time. In order to maximize performance, dynamic language VM implementations must attempt to eliminate redundant dynamic type checks. However, type inference analyses are often costly and involve tradeoffs between compilation time and resulting precision. This has lead to the creation of increasingly complex multi-tiered VM architectures. This paper introduces lazy basic block versioning, a simple JIT compilation technique which effectively removes redundant type checks from critical code paths. This novel approach lazily generates type-specialized versions of basic blocks on-the-fly while propagating context-dependent type information. This does not require the use of costly program analyses, is not restricted by the precision limitations of traditional type analyses and avoids the implementation complexity of speculative optimization techniques. We have implemented intraprocedural lazy basic block versioning in a JavaScript JIT compiler. This approach is compared with a classical flow-based type analysis. Lazy basic block versioning performs as well or better on all benchmarks. On average, 71% of type tests are eliminated, yielding speedups of up to 50%. We also show that our implementation generates more efficient machine code than TraceMonkey, a tracing JIT compiler for JavaScript, on several benchmarks. The combination of implementation simplicity, low algorithmic complexity and good run time performance makes basic block versioning attractive for baseline JIT compilers

    Polynomial-Time Amoeba Neighborhood Membership and Faster Localized Solving

    Full text link
    We derive efficient algorithms for coarse approximation of algebraic hypersurfaces, useful for estimating the distance between an input polynomial zero set and a given query point. Our methods work best on sparse polynomials of high degree (in any number of variables) but are nevertheless completely general. The underlying ideas, which we take the time to describe in an elementary way, come from tropical geometry. We thus reduce a hard algebraic problem to high-precision linear optimization, proving new upper and lower complexity estimates along the way.Comment: 15 pages, 9 figures. Submitted to a conference proceeding

    Type-II/III DCT/DST algorithms with reduced number of arithmetic operations

    Full text link
    We present algorithms for the discrete cosine transform (DCT) and discrete sine transform (DST), of types II and III, that achieve a lower count of real multiplications and additions than previously published algorithms, without sacrificing numerical accuracy. Asymptotically, the operation count is reduced from ~ 2N log_2 N to ~ (17/9) N log_2 N for a power-of-two transform size N. Furthermore, we show that a further N multiplications may be saved by a certain rescaling of the inputs or outputs, generalizing a well-known technique for N=8 by Arai et al. These results are derived by considering the DCT to be a special case of a DFT of length 4N, with certain symmetries, and then pruning redundant operations from a recent improved fast Fourier transform algorithm (based on a recursive rescaling of the conjugate-pair split radix algorithm). The improved algorithms for DCT-III, DST-II, and DST-III follow immediately from the improved count for the DCT-II.Comment: 9 page

    Fast and Accurate Multiclass Inference for MI-BCIs Using Large Multiscale Temporal and Spectral Features

    Full text link
    Accurate, fast, and reliable multiclass classification of electroencephalography (EEG) signals is a challenging task towards the development of motor imagery brain-computer interface (MI-BCI) systems. We propose enhancements to different feature extractors, along with a support vector machine (SVM) classifier, to simultaneously improve classification accuracy and execution time during training and testing. We focus on the well-known common spatial pattern (CSP) and Riemannian covariance methods, and significantly extend these two feature extractors to multiscale temporal and spectral cases. The multiscale CSP features achieve 73.70±\pm15.90% (mean±\pm standard deviation across 9 subjects) classification accuracy that surpasses the state-of-the-art method [1], 70.6±\pm14.70%, on the 4-class BCI competition IV-2a dataset. The Riemannian covariance features outperform the CSP by achieving 74.27±\pm15.5% accuracy and executing 9x faster in training and 4x faster in testing. Using more temporal windows for Riemannian features results in 75.47±\pm12.8% accuracy with 1.6x faster testing than CSP.Comment: Published as a conference paper at the IEEE European Signal Processing Conference (EUSIPCO), 201

    On the Complexity of the F5 Gr\"obner basis Algorithm

    Get PDF
    We study the complexity of Gr\"obner bases computation, in particular in the generic situation where the variables are in simultaneous Noether position with respect to the system. We give a bound on the number of polynomials of degree dd in a Gr\"obner basis computed by Faug\`ere's F5F_5 algorithm~(Fau02) in this generic case for the grevlex ordering (which is also a bound on the number of polynomials for a reduced Gr\"obner basis, independently of the algorithm used). Next, we analyse more precisely the structure of the polynomials in the Gr\"obner bases with signatures that F5F_5 computes and use it to bound the complexity of the algorithm. Our estimates show that the version of~F5F_5 we analyse, which uses only standard Gaussian elimination techniques, outperforms row reduction of the Macaulay matrix with the best known algorithms for moderate degrees, and even for degrees up to the thousands if Strassen's multiplication is used. The degree being fixed, the factor of improvement grows exponentially with the number of variables.Comment: 24 page

    Speeding up the constraint-based method in difference logic

    Get PDF
    "The final publication is available at http://link.springer.com/chapter/10.1007%2F978-3-319-40970-2_18"Over the years the constraint-based method has been successfully applied to a wide range of problems in program analysis, from invariant generation to termination and non-termination proving. Quite often the semantics of the program under study as well as the properties to be generated belong to difference logic, i.e., the fragment of linear arithmetic where atoms are inequalities of the form u v = k. However, so far constraint-based techniques have not exploited this fact: in general, Farkas’ Lemma is used to produce the constraints over template unknowns, which leads to non-linear SMT problems. Based on classical results of graph theory, in this paper we propose new encodings for generating these constraints when program semantics and templates belong to difference logic. Thanks to this approach, instead of a heavyweight non-linear arithmetic solver, a much cheaper SMT solver for difference logic or linear integer arithmetic can be employed for solving the resulting constraints. We present encouraging experimental results that show the high impact of the proposed techniques on the performance of the VeryMax verification systemPeer ReviewedPostprint (author's final draft
    • …
    corecore