14,900 research outputs found
Recommended from our members
Behavioral synthesis from VHDL using structured modeling
This dissertation describes work in behavioral synthesis involving the development of a VHDL Synthesis System VSS which accepts a VHDL behavioral input specification and performs technology independent synthesis to generate a circuit netlist of generic components. The VHDL language is used for input and output descriptions. An intermediate representation which incorporates signal typing and component attributes simplifies compilation and facilitates design optimization.A Structured Modeling methodology has been developed to suggest standard VHDL modeling practices for synthesis. Structured modeling provides recommendations for the use of available VHDL description styles so that optimal designs will be synthesized.A design composed of generic components is synthesized from the input description through a process of Graph Compilation, Graph Criticism, and Design Compilation. Experiments were performed to demonstrate the effects of different modeling styles on the quality of the design produced by VSS. Several alternative VHDL models were examined for each benchmark, illustrating the improvements in design quality achieved when Structured Modeling guidelines were followed
pocl: A Performance-Portable OpenCL Implementation
OpenCL is a standard for parallel programming of heterogeneous systems. The
benefits of a common programming standard are clear; multiple vendors can
provide support for application descriptions written according to the standard,
thus reducing the program porting effort. While the standard brings the obvious
benefits of platform portability, the performance portability aspects are
largely left to the programmer. The situation is made worse due to multiple
proprietary vendor implementations with different characteristics, and, thus,
required optimization strategies.
In this paper, we propose an OpenCL implementation that is both portable and
performance portable. At its core is a kernel compiler that can be used to
exploit the data parallelism of OpenCL programs on multiple platforms with
different parallel hardware styles. The kernel compiler is modularized to
perform target-independent parallel region formation separately from the
target-specific parallel mapping of the regions to enable support for various
styles of fine-grained parallel resources such as subword SIMD extensions, SIMD
datapaths and static multi-issue. Unlike previous similar techniques that work
on the source level, the parallel region formation retains the information of
the data parallelism using the LLVM IR and its metadata infrastructure. This
data can be exploited by the later generic compiler passes for efficient
parallelization.
The proposed open source implementation of OpenCL is also platform portable,
enabling OpenCL on a wide range of architectures, both already commercialized
and on those that are still under research. The paper describes how the
portability of the implementation is achieved. Our results show that most of
the benchmarked applications when compiled using pocl were faster or close to
as fast as the best proprietary OpenCL implementation for the platform at hand.Comment: This article was published in 2015; it is now openly accessible via
arxi
Expansion Trees with Cut
Herbrand's theorem is one of the most fundamental insights in logic. From the
syntactic point of view it suggests a compact representation of proofs in
classical first- and higher-order logic by recording the information which
instances have been chosen for which quantifiers, known in the literature as
expansion trees.
Such a representation is inherently analytic and hence corresponds to a
cut-free sequent calculus proof. Recently several extensions of such proof
representations to proofs with cut have been proposed. These extensions are
based on graphical formalisms similar to proof nets and are limited to prenex
formulas.
In this paper we present a new approach that directly extends expansion trees
by cuts and covers also non-prenex formulas. We describe a cut-elimination
procedure for our expansion trees with cut that is based on the natural
reduction steps. We prove that it is weakly normalizing using methods from the
epsilon-calculus
Enumerative Branching with Less Repetition
We can compactly represent large sets of solutions for problems with discrete decision variables by using decision diagrams. With them, we can efficiently identify optimal solutions for different objective functions. In fact, a decision diagram naturally arises from the branch-and-bound tree that we could use to enumerate these solutions if we merge nodes from which the same solutions are obtained on the remaining variables. However, we would like to avoid the repetitive work of finding the same solutions from branching on different nodes at the same level of that tree. Instead, we would like to explore just one of these equivalent nodes and then infer that the same solutions would have been found if we explored other nodes. In this work, we show how to identify such equivalences—and thus directly construct a reduced decision diagram—in integer programs where the left-hand sides of all constraints consist of additively separable functions. First, we extend an existing result regarding problems with a single linear constraint and integer coefficients. Second, we show necessary conditions with which we can isolate a single explored node as the only candidate to be equivalent to each unexplored node in problems with multiple constraints. Third, we present a sufficient condition that confirms if such a pair of nodes is indeed equivalent, and we demonstrate how to induce that condition through preprocessing. Finally, we report computational results on integer linear programming problems from the MIPLIB benchmark. Our approach often constructs smaller decision diagrams faster and with less branching
Beyond Worst-Case Analysis for Joins with Minesweeper
We describe a new algorithm, Minesweeper, that is able to satisfy stronger
runtime guarantees than previous join algorithms (colloquially, `beyond
worst-case guarantees') for data in indexed search trees. Our first
contribution is developing a framework to measure this stronger notion of
complexity, which we call {\it certificate complexity}, that extends notions of
Barbay et al. and Demaine et al.; a certificate is a set of propositional
formulae that certifies that the output is correct. This notion captures a
natural class of join algorithms. In addition, the certificate allows us to
define a strictly stronger notion of runtime complexity than traditional
worst-case guarantees. Our second contribution is to develop a dichotomy
theorem for the certificate-based notion of complexity. Roughly, we show that
Minesweeper evaluates -acyclic queries in time linear in the certificate
plus the output size, while for any -cyclic query there is some instance
that takes superlinear time in the certificate (and for which the output is no
larger than the certificate size). We also extend our certificate-complexity
analysis to queries with bounded treewidth and the triangle query.Comment: [This is the full version of our PODS'2014 paper.
Future value based single assignment program representations and optimizations
An optimizing compiler internal representation fundamentally affects the clarity, efficiency and feasibility of optimization algorithms employed by the compiler. Static Single Assignment (SSA) as a state-of-the-art program representation has great advantages though still can be improved. This dissertation explores the domain of single assignment beyond SSA, and presents two novel program representations: Future Gated Single Assignment (FGSA) and Recursive Future Predicated Form (RFPF). Both FGSA and RFPF embed control flow and data flow information, enabling efficient traversal program information and thus leading to better and simpler optimizations. We introduce future value concept, the designing base of both FGSA and RFPF, which permits a consumer instruction to be encountered before the producer of its source operand(s) in a control flow setting. We show that FGSA is efficiently computable by using a series T1/T2/TR transformation, yielding an expected linear time algorithm for combining together the construction of the pruned single assignment form and live analysis for both reducible and irreducible graphs. As a result, the approach results in an average reduction of 7.7%, with a maximum of 67% in the number of gating functions compared to the pruned SSA form on the SPEC2000 benchmark suite. We present a solid and near optimal framework to perform inverse transformation from single assignment programs. We demonstrate the importance of unrestricted code motion and present RFPF. We develop algorithms which enable instruction movement in acyclic, as well as cyclic regions, and show the ease to perform optimizations such as Partial Redundancy Elimination on RFPF
On Global Types and Multi-Party Session
Global types are formal specifications that describe communication protocols
in terms of their global interactions. We present a new, streamlined language
of global types equipped with a trace-based semantics and whose features and
restrictions are semantically justified. The multi-party sessions obtained
projecting our global types enjoy a liveness property in addition to the
traditional progress and are shown to be sound and complete with respect to the
set of traces of the originating global type. Our notion of completeness is
less demanding than the classical ones, allowing a multi-party session to leave
out redundant traces from an underspecified global type. In addition to the
technical content, we discuss some limitations of our language of global types
and provide an extensive comparison with related specification languages
adopted in different communities
- …