Optimistic Parallelization of Floating-Point Accumulation

Authors: DeHon André, Kapre Nachiket
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007

Floating-point arithmetic is notoriously non-associative because its limited-precision representation requires intermediate values to be rounded to fit the available precision. The resulting cyclic dependency in floating-point accumulation inhibits parallelization of the computation, including efficient use of pipelining. In practice, however, we observe that floating-point operations are "mostly" associative. This observation can be exploited to parallelize floating-point accumulation using a form of optimistic concurrency. In this scheme, we first compute an optimistic associative approximation to the sum and then relax the computation by iteratively propagating errors until the correct sum is obtained. We map this computation to a network of 16 statically-scheduled, pipelined, double-precision floating-point adders on the Virtex-4 LX160 (-12) device, where each floating-point adder runs at 296 MHz and has a pipeline depth of 10. On this 16-PE design, we demonstrate an average speedup of 6× with randomly generated data and 3-7× with summations extracted from Conjugate Gradient benchmarks.
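The non-associativity that the abstract refers to can be seen in a few lines of Python. The sketch below is only an illustration of the underlying problem, not the authors' hardware scheme: the function names `sequential_sum` and `tree_sum` and the input values are assumptions chosen for the example. It compares a strictly sequential accumulation with a pairwise (tree-shaped) one, which is the kind of reordering a parallel adder network would perform.

```python
import math


def sequential_sum(values):
    """Accumulate left to right, rounding after every addition."""
    total = 0.0
    for v in values:
        total += v
    return total


def tree_sum(values):
    """Accumulate pairwise, as a parallel reduction tree would."""
    vals = list(values)
    if not vals:
        return 0.0
    while len(vals) > 1:
        # Add neighbouring pairs; an unpaired last element is carried over.
        nxt = [vals[i] + vals[i + 1] for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:
            nxt.append(vals[-1])
        vals = nxt
    return vals[0]


if __name__ == "__main__":
    # Hypothetical input chosen so that rounding depends on evaluation order.
    data = [1e16, 1.0, -1e16, 1.0]
    print("sequential:", sequential_sum(data))
    print("tree      :", tree_sum(data))
    print("math.fsum :", math.fsum(data))  # correctly rounded reference
```

For this input the two orderings disagree with each other and with the correctly rounded reference from `math.fsum`, which is why a parallel reordering cannot simply be assumed to reproduce the sequential result.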

CiteSeerX

Crossref

Caltech Authors

ScholarlyCommons@Penn

Representing a P-complete problem by small trellis automata

Author: Okhotin Alexander
Publication venue: 'Open Publishing Association'
Publication date: 01/06/2009

A restricted case of the Circuit Value Problem known as the Sequential NOR Circuit Value Problem was recently used to obtain very succinct examples of conjunctive grammars, Boolean grammars and language equations representing P-complete languages (Okhotin, http://dx.doi.org/10.1007/978-3-540-74593-8_23 "A simple P-complete problem and its representations by language equations", MCU 2007). In this paper, a new encoding of the same problem is proposed, and a trellis automaton (one-way real-time cellular automaton) with 11 states solving this problem is constructed.

arXiv.org e-Print Archive

CiteSeerX

Directory of Open Access Journals

Slot Games for Detecting Timing Leaks of Programs

Author: Dimovski Aleksandar S.
Publication venue: 'Open Publishing Association'
Publication date: 01/07/2013

In this paper we describe a method for verifying secure information flow of programs, where, apart from direct and indirect flows, secret information can also be leaked through covert timing channels. That is, we require that no two computations of a program that differ only on high-security inputs can be distinguished by low-security outputs and timing differences. We attack this problem by using slot-game semantics for a quantitative analysis of programs. We show how the slot-game model can be used to perform a precise security analysis of programs that takes into account both extensional and intensional properties of programs. The practicality of this approach for automated verification is also shown. Comment: In Proceedings GandALF 2013, arXiv:1307.416
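A timing leak of the kind described above can be illustrated with a toy program whose running time is counted in abstract steps. This is a minimal sketch under stated assumptions, not the slot-game analysis itself: the function `leaky`, its step counter, and the chosen costs are hypothetical and stand in for a real cost model.

```python
def leaky(high: bool, low: int):
    """Return (low-security output, abstract step count).

    The output depends only on `low`, but the number of steps
    depends on the secret `high`, which creates a timing channel.
    """
    steps = 0
    result = low + 1
    steps += 1
    if high:
        # Extra work performed only when the secret is True.
        for _ in range(10):
            steps += 1
    return result, steps


if __name__ == "__main__":
    out_a, time_a = leaky(True, 3)
    out_b, time_b = leaky(False, 3)
    print("same low output:", out_a == out_b)
    print("same timing    :", time_a == time_b)
```

The two runs differ only in the high-security input and produce the same low-security output, yet an observer who can measure the step count can tell them apart, so the program would violate the property stated in the abstract.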

arXiv.org e-Print Archive

Directory of Open Access Journals

On Decidable Growth-Rate Properties of Imperative Programs

Author: Ben-Amram Amir M.
Publication venue: 'Open Publishing Association'
Publication date: 01/05/2010

In 2008, Ben-Amram, Jones and Kristiansen showed that for a simple "core" programming language - an imperative language with bounded loops, and arithmetic limited to addition and multiplication - it was possible to decide precisely whether a program had certain growth-rate properties, namely polynomial (or linear) bounds on computed values, or on the running time. This work emphasized the role of the core language in mitigating the notorious undecidability of program properties, so that one deals with decidable problems. A natural and intriguing problem was whether more elements can be added to the core language, improving its utility, while keeping the growth-rate properties decidable. In particular, the method presented could not handle a command that resets a variable to zero. This paper shows how to handle resets. The analysis is given in a logical style (proof rules), and its complexity is shown to be PSPACE-complete (in contrast, without resets, the problem was PTIME). The analysis algorithm evolved from the previous solution in an interesting way: focus was shifted from proving a bound to disproving it, and the algorithm works top-down rather than bottom-up.
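The growth-rate distinction that the analysis decides can be illustrated with two bounded loops that use only addition. The sketch below is not the decision procedure from the paper; `run_loop`, `add_one`, and `double` are hypothetical helpers that simply execute a loop body a fixed number of times so the two growth behaviours can be compared.

```python
def run_loop(n, x, body):
    """Execute `body` exactly n times (a bounded loop) and return the final x."""
    for _ in range(n):
        x = body(x)
    return x


def add_one(x):
    # Models the command X := X + 1; the result grows linearly in n.
    return x + 1


def double(x):
    # Models the command X := X + X; the result grows exponentially in n.
    return x + x


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, run_loop(n, 1, add_one), run_loop(n, 1, double))
```

Both loop bodies are built from addition alone, yet the first keeps the computed value polynomially (here linearly) bounded in the loop count while the second does not, which is the kind of property the analysis classifies without running the program.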

arXiv.org e-Print Archive

Directory of Open Access Journals