Wildcard dimensions, coding theory and fault-tolerant meshes and hypercubes
Hypercubes, meshes, and tori are well-known interconnection networks for parallel computers. The sets of edges in those graphs can be partitioned into dimensions. It is well known that the hypercube can be extended by adding a wildcard dimension, resulting in a folded hypercube that has better fault-tolerance and communication capabilities. First, we prove that the folded hypercube is optimal in the sense that only a single wildcard dimension can be added to the hypercube. We then investigate the idea of adding wildcard dimensions to d-dimensional meshes and tori. Using techniques from error-correcting codes, we construct d-dimensional meshes and tori with wildcard dimensions. Finally, we show how these constructions can be used to tolerate edge and node faults in mesh and torus networks.
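As a rough illustration of the folded hypercube mentioned in this abstract, the sketch below builds neighbor lists under the usual definition: each node is an n-bit address, ordinary edges flip one bit, and the extra dimension joins every node to its bitwise complement. The function names are hypothetical and chosen for this example; nothing here reproduces the paper's constructions for meshes and tori.

```python
from collections import deque


def hypercube_neighbors(node: int, n: int) -> list[int]:
    """Neighbors in the ordinary n-dimensional hypercube: flip one address bit."""
    return [node ^ (1 << i) for i in range(n)]


def folded_hypercube_neighbors(node: int, n: int) -> list[int]:
    """Neighbors in the folded hypercube (assumes n >= 2).

    The extra edge flips all n bits at once, linking a node to its complement.
    """
    complement = node ^ ((1 << n) - 1)
    return hypercube_neighbors(node, n) + [complement]


def eccentricity(start: int, n: int, neighbors_fn) -> int:
    """Largest BFS distance from `start`; both graphs are vertex-transitive,
    so this equals the diameter."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in neighbors_fn(u, n):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())
```

Comparing `eccentricity(0, n, hypercube_neighbors)` with `eccentricity(0, n, folded_hypercube_neighbors)` for small n gives a quick feel for the communication benefit of the extra dimension.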
Simple and Effective Type Check Removal through Lazy Basic Block Versioning
Dynamically typed programming languages such as JavaScript and Python defer
type checking to run time. In order to maximize performance, dynamic language
VM implementations must attempt to eliminate redundant dynamic type checks.
However, type inference analyses are often costly and involve tradeoffs between
compilation time and resulting precision. This has led to the creation of
increasingly complex multi-tiered VM architectures.
This paper introduces lazy basic block versioning, a simple JIT compilation
technique which effectively removes redundant type checks from critical code
paths. This novel approach lazily generates type-specialized versions of basic
blocks on-the-fly while propagating context-dependent type information. This
does not require the use of costly program analyses, is not restricted by the
precision limitations of traditional type analyses and avoids the
implementation complexity of speculative optimization techniques.
We have implemented intraprocedural lazy basic block versioning in a
JavaScript JIT compiler. This approach is compared with a classical flow-based
type analysis. Lazy basic block versioning performs as well as or better on all
benchmarks. On average, 71% of type tests are eliminated, yielding speedups of
up to 50%. We also show that our implementation generates more efficient
machine code than TraceMonkey, a tracing JIT compiler for JavaScript, on
several benchmarks. The combination of implementation simplicity, low
algorithmic complexity and good run time performance makes basic block
versioning attractive for baseline JIT compilers.
Polynomial-Time Amoeba Neighborhood Membership and Faster Localized Solving
We derive efficient algorithms for coarse approximation of algebraic
hypersurfaces, useful for estimating the distance between an input polynomial
zero set and a given query point. Our methods work best on sparse polynomials
of high degree (in any number of variables) but are nevertheless completely
general. The underlying ideas, which we take the time to describe in an
elementary way, come from tropical geometry. We thus reduce a hard algebraic
problem to high-precision linear optimization, proving new upper and lower
complexity estimates along the way. Comment: 15 pages, 9 figures. Submitted to a conference proceeding
Type-II/III DCT/DST algorithms with reduced number of arithmetic operations
We present algorithms for the discrete cosine transform (DCT) and discrete
sine transform (DST), of types II and III, that achieve a lower count of real
multiplications and additions than previously published algorithms, without
sacrificing numerical accuracy. Asymptotically, the operation count is reduced
from ~ 2N log_2 N to ~ (17/9) N log_2 N for a power-of-two transform size N.
Furthermore, we show that a further N multiplications may be saved by a certain
rescaling of the inputs or outputs, generalizing a well-known technique for N=8
by Arai et al. These results are derived by considering the DCT to be a special
case of a DFT of length 4N, with certain symmetries, and then pruning redundant
operations from a recent improved fast Fourier transform algorithm (based on a
recursive rescaling of the conjugate-pair split radix algorithm). The improved
algorithms for DCT-III, DST-II, and DST-III follow immediately from the
improved count for the DCT-II. Comment: 9 pages
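The observation that a DCT-II of size N is a special case of a DFT of length 4N can be checked numerically. The sketch below assumes the unnormalized convention X_k = sum_n x_n cos(pi (n + 1/2) k / N) and uses a naive O(M^2) DFT purely for verification; it is not the fast algorithm of the paper, and the function names are hypothetical.

```python
import cmath
import math


def dct2_direct(x):
    """Unnormalized DCT-II straight from the definition (assumed convention)."""
    N = len(x)
    return [
        sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N) for n in range(N))
        for k in range(N)
    ]


def dft(y):
    """Naive discrete Fourier transform, used only to check the identity."""
    M = len(y)
    return [
        sum(y[m] * cmath.exp(-2j * cmath.pi * m * k / M) for m in range(M))
        for k in range(M)
    ]


def dct2_via_dft(x):
    """Embed x into a real, even-symmetric length-4N sequence and take its DFT.

    Inputs sit at the odd indices 2n+1 and are mirrored at 4N-(2n+1);
    even indices stay zero. The first N DFT outputs then equal 2 * DCT-II.
    """
    N = len(x)
    y = [0.0] * (4 * N)
    for n, value in enumerate(x):
        y[2 * n + 1] = value
        y[4 * N - 2 * n - 1] = value
    Y = dft(y)
    return [Y[k].real / 2 for k in range(N)]
```

The zeros and mirror symmetry in the length-4N input are the kind of redundancy that a pruned FFT can exploit.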
Fast and Accurate Multiclass Inference for MI-BCIs Using Large Multiscale Temporal and Spectral Features
Accurate, fast, and reliable multiclass classification of
electroencephalography (EEG) signals is a challenging task towards the
development of motor imagery brain-computer interface (MI-BCI) systems. We
propose enhancements to different feature extractors, along with a support
vector machine (SVM) classifier, to simultaneously improve classification
accuracy and execution time during training and testing. We focus on the
well-known common spatial pattern (CSP) and Riemannian covariance methods, and
significantly extend these two feature extractors to multiscale temporal and
spectral cases. The multiscale CSP features achieve 73.70±15.90% (mean ±
standard deviation across 9 subjects) classification accuracy that surpasses
the state-of-the-art method [1], 70.6±14.70%, on the 4-class BCI
competition IV-2a dataset. The Riemannian covariance features outperform the
CSP by achieving 74.27±15.5% accuracy and executing 9x faster in training
and 4x faster in testing. Using more temporal windows for Riemannian features
results in 75.47±12.8% accuracy with 1.6x faster testing than CSP. Comment: Published as a conference paper at the IEEE European Signal
Processing Conference (EUSIPCO), 201
On the Complexity of the F5 Gröbner basis Algorithm
We study the complexity of Gröbner bases computation, in particular in the
generic situation where the variables are in simultaneous Noether position with
respect to the system.
We give a bound on the number of polynomials of degree d in a Gröbner
basis computed by Faugère's F5 algorithm (Fau02) in this generic case for
the grevlex ordering (which is also a bound on the number of polynomials for a
reduced Gröbner basis, independently of the algorithm used). Next, we analyse
more precisely the structure of the polynomials in the Gröbner bases with
signatures that F5 computes and use it to bound the complexity of the
algorithm.
Our estimates show that the version of F5 we analyse, which uses only
standard Gaussian elimination techniques, outperforms row reduction of the
Macaulay matrix with the best known algorithms for moderate degrees, and even
for degrees up to the thousands if Strassen's multiplication is used. The
degree being fixed, the factor of improvement grows exponentially with the
number of variables. Comment: 24 pages
Speeding up the constraint-based method in difference logic
"The final publication is available at http://link.springer.com/chapter/10.1007%2F978-3-319-40970-2_18"
Over the years, the constraint-based method has been successfully applied to a wide range of problems in program analysis, from invariant generation to termination and non-termination proving. Quite often, the semantics of the program under study, as well as the properties to be generated, belong to difference logic, i.e., the fragment of linear arithmetic where atoms are inequalities of the form u − v ≤ k. However, so far constraint-based techniques have not exploited this fact: in general, Farkas' Lemma is used to produce the constraints over template unknowns, which leads to non-linear SMT problems. Based on classical results of graph theory, in this paper we propose new encodings for generating these constraints when program semantics and templates belong to difference logic. Thanks to this approach, instead of a heavyweight non-linear arithmetic solver, a much cheaper SMT solver for difference logic or linear integer arithmetic can be employed for solving the resulting constraints. We present encouraging experimental results that show the high impact of the proposed techniques on the performance of the VeryMax verification system. Peer Reviewed. Postprint (author's final draft)
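For readers unfamiliar with difference logic, the sketch below shows the textbook graph view of conjunctions of atoms u − v ≤ k: each atom becomes an edge from v to u with weight k, and the conjunction is satisfiable exactly when that graph has no negative cycle. This is a standard Bellman-Ford check, not the encoding proposed in the paper, and the function name and tuple layout are assumptions made for this example.

```python
def solve_difference_constraints(constraints):
    """Return a satisfying assignment for atoms u - v <= k, or None if none exists.

    `constraints` is an iterable of (u, v, k) triples (hypothetical layout).
    Each atom is an edge v -> u of weight k; starting all distances at 0
    plays the role of a virtual source connected to every variable.
    """
    constraints = list(constraints)
    variables = {name for u, v, _ in constraints for name in (u, v)}
    dist = {name: 0 for name in variables}
    # With no negative cycle, relaxation settles within len(variables) - 1
    # rounds, so one of these rounds must finish without any change.
    for _ in range(len(variables) + 1):
        changed = False
        for u, v, k in constraints:
            if dist[v] + k < dist[u]:
                dist[u] = dist[v] + k
                changed = True
        if not changed:
            return dist  # dist[u] - dist[v] <= k holds for every atom
    return None  # still changing: a negative cycle makes the atoms unsatisfiable
```

For instance, the atoms x − y ≤ 3, y − z ≤ −2, and z − x ≤ −2 sum to 0 ≤ −1 around the cycle, so the function returns `None` for them.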