1,501 research outputs found
Program representation size in an intermediate language with intersection and union types
The CIL compiler for core Standard ML compiles whole programs using a novel typed intermediate language (TIL) with intersection and union types and flow labels on both terms and types. The CIL term representation duplicates portions of the program where intersection types are introduced and union types are eliminated. This duplication makes it easier to represent type information and to introduce customized data representations. However, duplication incurs compile-time space costs that are potentially much greater than are incurred in TILs employing type-level abstraction or quantification. In this paper, we present empirical data on the compile-time space costs of using CIL as an intermediate language. The data shows that these costs can be made tractable by using sufficiently fine-grained flow analyses together with standard hash-consing techniques. The data also suggests that non-duplicating formulations of intersection (and union) types would not achieve significantly better space complexity.National Science Foundation (CCR-9417382, CISE/CCR ESS 9806747); Sun grant (EDUD-7826-990410-US); Faculty Fellowship of the Carroll School of Management, Boston College; U.K. Engineering and Physical Sciences Research Council (GR/L 36963, GR/L 15685
Recommended from our members
Intensional Polymorphism in Type-erasure Semantics
Intensional polymorphism, the ability to dispatch to different routines based on types at run time, enables a variety of advanced implementation techniques for polymorphic languages, including tag-free garbage collection, unboxed function arguments, polymorphic marshalling, and flattened data structures. To date, languages that support intensional polymorphism have required a type-passing (as opposed to type-erasure) interpretation where types are constructed and passed to polymorphic functions at run time. Unfortunately, type-passing suffers from a number of drawbacks: it requires duplication of constructs at the term and type levels, it prevents abstraction, and it severely complicates polymorphic closure conversion.We present a type-theoretic framework that supports intensional polymorphism, but avoids many of the disadvantages of type passing. In our approach, run-time type information is represented by ordinary terms. This avoids the duplication problem, allows us to recover abstraction, and avoids complications with closure conversion. In addition, our type system provides another improvement in expressiveness; it allows unknown types to be refined in place thereby avoiding certain beta-expansions required by other frameworksEngineering and Applied Science
Stream Fusion, to Completeness
Stream processing is mainstream (again): Widely-used stream libraries are now
available for virtually all modern OO and functional languages, from Java to C#
to Scala to OCaml to Haskell. Yet expressivity and performance are still
lacking. For instance, the popular, well-optimized Java 8 streams do not
support the zip operator and are still an order of magnitude slower than
hand-written loops. We present the first approach that represents the full
generality of stream processing and eliminates overheads, via the use of
staging. It is based on an unusually rich semantic model of stream interaction.
We support any combination of zipping, nesting (or flat-mapping), sub-ranging,
filtering, mapping-of finite or infinite streams. Our model captures
idiosyncrasies that a programmer uses in optimizing stream pipelines, such as
rate differences and the choice of a "for" vs. "while" loops. Our approach
delivers hand-written-like code, but automatically. It explicitly avoids the
reliance on black-box optimizers and sufficiently-smart compilers, offering
highest, guaranteed and portable performance. Our approach relies on high-level
concepts that are then readily mapped into an implementation. Accordingly, we
have two distinct implementations: an OCaml stream library, staged via
MetaOCaml, and a Scala library for the JVM, staged via LMS. In both cases, we
derive libraries richer and simultaneously many tens of times faster than past
work. We greatly exceed in performance the standard stream libraries available
in Java, Scala and OCaml, including the well-optimized Java 8 streams
Trading inference effort versus size in CNF Knowledge Compilation
Knowledge Compilation (KC) studies compilation of boolean functions f into
some formalism F, which allows to answer all queries of a certain kind in
polynomial time. Due to its relevance for SAT solving, we concentrate on the
query type "clausal entailment" (CE), i.e., whether a clause C follows from f
or not, and we consider subclasses of CNF, i.e., clause-sets F with special
properties. In this report we do not allow auxiliary variables (except of the
Outlook), and thus F needs to be equivalent to f.
We consider the hierarchies UC_k <= WC_k, which were introduced by the
authors in 2012. Each level allows CE queries. The first two levels are
well-known classes for KC. Namely UC_0 = WC_0 is the same as PI as studied in
KC, that is, f is represented by the set of all prime implicates, while UC_1 =
WC_1 is the same as UC, the class of unit-refutation complete clause-sets
introduced by del Val 1994. We show that for each k there are (sequences of)
boolean functions with polysize representations in UC_{k+1}, but with an
exponential lower bound on representations in WC_k. Such a separation was
previously only know for k=0. We also consider PC < UC, the class of
propagation-complete clause-sets. We show that there are (sequences of) boolean
functions with polysize representations in UC, while there is an exponential
lower bound for representations in PC. These separations are steps towards a
general conjecture determining the representation power of the hierarchies PC_k
< UC_k <= WC_k. The strong form of this conjecture also allows auxiliary
variables, as discussed in depth in the Outlook.Comment: 43 pages, second version with literature updates. Proceeds with the
separation results from the discontinued arXiv:1302.442
Efficient enumeration of solutions produced by closure operations
In this paper we address the problem of generating all elements obtained by
the saturation of an initial set by some operations. More precisely, we prove
that we can generate the closure of a boolean relation (a set of boolean
vectors) by polymorphisms with a polynomial delay. Therefore we can compute
with polynomial delay the closure of a family of sets by any set of "set
operations": union, intersection, symmetric difference, subsets, supersets
). To do so, we study the problem: for a set
of operations , decide whether an element belongs to the closure
by of a family of elements. In the boolean case, we prove that
is in P for any set of boolean operations
. When the input vectors are over a domain larger than two
elements, we prove that the generic enumeration method fails, since
is NP-hard for some . We also study the
problem of generating minimal or maximal elements of closures and prove that
some of them are related to well known enumeration problems such as the
enumeration of the circuits of a matroid or the enumeration of maximal
independent sets of a hypergraph. This article improves on previous works of
the same authors.Comment: 30 pages, 1 figure. Long version of the article arXiv:1509.05623 of
the same name which appeared in STACS 2016. Final version for DMTCS journa
A resolution principle for clauses with constraints
We introduce a general scheme for handling clauses whose variables are constrained by an underlying constraint theory. In general, constraints can be seen as quantifier restrictions as they filter out the values that can be assigned to the variables of a clause (or an arbitrary formulae with restricted universal or existential quantifier) in any of the models of the constraint theory. We present a resolution principle for clauses with constraints, where unification is replaced by testing constraints for satisfiability over the constraint theory. We show that this constrained resolution is sound and complete in that a set of clauses with constraints is unsatisfiable over the constraint theory if we can deduce a constrained empty clause for each model of the constraint theory, such that the empty clauses constraint is satisfiable in that model. We show also that we cannot require a better result in general, but we discuss certain tractable cases, where we need at most finitely many such empty clauses or even better only one of them as it is known in classical resolution, sorted resolution or resolution with theory unification
Generalizing input-driven languages: theoretical and practical benefits
Regular languages (RL) are the simplest family in Chomsky's hierarchy. Thanks
to their simplicity they enjoy various nice algebraic and logic properties that
have been successfully exploited in many application fields. Practically all of
their related problems are decidable, so that they support automatic
verification algorithms. Also, they can be recognized in real-time.
Context-free languages (CFL) are another major family well-suited to
formalize programming, natural, and many other classes of languages; their
increased generative power w.r.t. RL, however, causes the loss of several
closure properties and of the decidability of important problems; furthermore
they need complex parsing algorithms. Thus, various subclasses thereof have
been defined with different goals, spanning from efficient, deterministic
parsing to closure properties, logic characterization and automatic
verification techniques.
Among CFL subclasses, so-called structured ones, i.e., those where the
typical tree-structure is visible in the sentences, exhibit many of the
algebraic and logic properties of RL, whereas deterministic CFL have been
thoroughly exploited in compiler construction and other application fields.
After surveying and comparing the main properties of those various language
families, we go back to operator precedence languages (OPL), an old family
through which R. Floyd pioneered deterministic parsing, and we show that they
offer unexpected properties in two fields so far investigated in totally
independent ways: they enable parsing parallelization in a more effective way
than traditional sequential parsers, and exhibit the same algebraic and logic
properties so far obtained only for less expressive language families
- …