Sidekick compilation with xDSL
Traditionally, compiler researchers either conduct experiments within an
existing production compiler or develop their own prototype compiler; both
options come with trade-offs. On one hand, prototyping in a production compiler
can be cumbersome, as such compilers are often optimized for compilation speed at
the expense of software simplicity and development speed. On the other hand,
the transition from a prototype compiler to production requires significant
engineering work. To bridge this gap, we introduce the concept of sidekick
compiler frameworks, an approach that uses multiple frameworks that
interoperate with each other by leveraging textual interchange formats and
declarative descriptions of abstractions. Each such compiler framework is
specialized for specific use cases, such as performance or prototyping.
Abstractions are by design shared across frameworks, simplifying the transition
from prototyping to production. We demonstrate this idea with xDSL, a sidekick
for MLIR focused on prototyping and teaching. xDSL interoperates with MLIR
through a shared textual IR and the exchange of IRs through an IR Definition
Language. The benefits of sidekick compiler frameworks are evaluated by showing
on three use cases how xDSL impacts their development: teaching, DSL
compilation, and rewrite system prototyping. We also investigate the trade-offs
that xDSL offers, and demonstrate how we simplify the transition between
frameworks using the IRDL dialect. With sidekick compilation, we envision a
future in which engineers minimize the cost of development by choosing a
framework built for their immediate needs, and later transitioning to
production with minimal overhead.
Pac-Learning Recursive Logic Programs: Efficient Algorithms
We present algorithms that learn certain classes of function-free recursive
logic programs in polynomial time from equivalence queries. In particular, we
show that a single k-ary recursive constant-depth determinate clause is
learnable. Two-clause programs consisting of one learnable recursive clause and
one constant-depth determinate non-recursive clause are also learnable, if an
additional "basecase" oracle is assumed. These results immediately imply the
pac-learnability of these classes. Although these classes of learnable
recursive programs are very constrained, it is shown in a companion paper that
they are maximally general, in that generalizing either class in any natural
way leads to a computationally difficult learning problem. Thus, taken together
with its companion paper, this paper establishes a boundary of efficient
learnability for recursive logic programs.
Comment: See http://www.jair.org/ for any accompanying file
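To make the query model in this abstract concrete, here is a minimal sketch of learning from equivalence queries. It is not the algorithm from the paper: it learns a plain monotone conjunction over Boolean variables, a much simpler class than recursive logic programs. The names `make_equivalence_oracle` and `learn_conjunction` are hypothetical, and the brute-force oracle is only practical for a handful of variables.

```python
from itertools import product

def make_equivalence_oracle(target_vars, n):
    """Hypothetical teacher for a hidden monotone conjunction over n variables.

    Given a hypothesis (a set of variable indices), it returns an assignment
    on which hypothesis and target disagree, or None if they are equivalent.
    The search is brute force, so this is only usable for small n.
    """
    def oracle(hypothesis_vars):
        for x in product((0, 1), repeat=n):
            target_value = all(x[i] for i in target_vars)
            hypothesis_value = all(x[i] for i in hypothesis_vars)
            if target_value != hypothesis_value:
                return x
        return None
    return oracle

def learn_conjunction(n, oracle):
    """Learn the hidden conjunction using equivalence queries only."""
    # Start with the most specific hypothesis: the conjunction of all variables.
    hypothesis = set(range(n))
    queries = 0
    while True:
        queries += 1
        counterexample = oracle(hypothesis)
        if counterexample is None:
            return hypothesis, queries
        # The hypothesis always contains every target variable, so each
        # counterexample satisfies the target but not the hypothesis.
        # Dropping the variables it sets to 0 removes at least one variable.
        hypothesis = {i for i in hypothesis if counterexample[i]}

learned, queries = learn_conjunction(5, make_equivalence_oracle({1, 3}, 5))
print(sorted(learned), queries)
```

Each counterexample removes at least one variable, so the learner asks at most n + 1 queries. A polynomial bound of this kind on the number of queries is what lets an equivalence-query learner be converted into a PAC learner.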
Cognitive control in written word production
Introduction: Cognitive control processes have been extensively studied in spoken word production; however, relevant investigations of written word production are scarce. Using data from a group of post-stroke individuals, we studied, for the first time, the neural substrates of cognitive control in written word production. We addressed three questions: Are control mechanisms (1) shared by language and non-language domains; (2) shared by lexical and segmental levels of word production within the word production system; (3) related to both interference and facilitation effect types?
Methods: To address these questions, for each participant we calculated cognitive control indices that reflected the interference and facilitation effects observed in written Blocked Cyclic Naming (written language production) and Simon (visuo-spatial processing) tasks. These behavioral cognitive control indices were studied both on their own and in relation to the distribution of structural (gray matter) lesions.
Results: For Question 1, we provide strong evidence of domain-specific control mechanisms used in written word production, as, among other findings, distinct regions within Broca's Area were associated with control in written word production vs. control in visuo-spatial processing. For Question 2, our results provide no strong evidence of shared control mechanisms for lexical and segmental levels of written word production, while they highlight the role of BA45 in instantiating control mechanisms that are specific to the two levels. For Question 3, we found evidence that BA45 supports distinct mechanisms associated with facilitation and interference, while orbital frontal cortex supports control process(es) associated with both.
Discussion: These findings significantly advance our understanding of the cognitive control mechanisms involved in written language production, as well as of the role of Broca's Area in cognitive control and language production more generally.
Bridging Python to Silicon: The SODA Toolchain
Systems performing scientific computing, data analysis, and machine learning tasks have a growing demand for application-specific accelerators that can provide high computational performance while meeting strict size and power requirements. However, the algorithms and applications that need to be accelerated are evolving at a rate that is incompatible with manual design processes based on hardware description languages. Agile hardware design tools based on compiler techniques can help by quickly producing an application-specific integrated circuit (ASIC) accelerator starting from a high-level algorithmic description. We present the SODA Synthesizer, a modular and open-source hardware compiler that provides automated end-to-end synthesis from high-level software frameworks to ASIC implementation, relying on multi-level representations to progressively lower and optimize the input code. Our approach does not require the application developer to write register-transfer level code, and it is able to reach up to 364 GFLOPS/W efficiency (32-bit precision) on typical convolutional neural network operators.
SHACL Satisfiability and Containment
The Shapes Constraint Language (SHACL) is a recent W3C recommendation language for validating RDF data. Specifically, SHACL documents are collections of constraints that enforce particular shapes on an RDF graph. Previous work on the topic has provided theoretical and practical results for the validation problem, but did not consider the standard decision problems of satisfiability and containment, which are crucial for verifying the feasibility of the constraints and important for design and optimization purposes. In this paper, we undertake a thorough study of the different features of SHACL by providing a translation to a new first-order language, called, that precisely captures the semantics of SHACL w.r.t. satisfiability and containment. We study the interaction of SHACL features in this logic and provide the detailed map of decidability and complexity results of the aforementioned decision problems for different SHACL sublanguages. Notably, we prove that both problems are undecidable for the full language, but we present decidable combinations of interesting features.
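The two decision problems named in this abstract can be illustrated on a deliberately tiny fragment. The sketch below assumes shapes that use only per-property cardinality bounds, in the spirit of SHACL's `sh:minCount` and `sh:maxCount`. The `Shape` representation and both function names are hypothetical; this is neither the paper's first-order translation nor a SHACL implementation.

```python
from typing import Dict, Optional, Tuple

# Toy shape: property name -> (min_count, max_count).
# max_count of None means "unbounded"; min_count is assumed non-negative.
Shape = Dict[str, Tuple[int, Optional[int]]]

def is_satisfiable(shape: Shape) -> bool:
    """A shape is satisfiable if some node could conform to it, which here
    means that no property demands more values than it allows."""
    return all(hi is None or lo <= hi for lo, hi in shape.values())

def is_contained(a: Shape, b: Shape) -> bool:
    """True if every node conforming to shape a also conforms to shape b."""
    if not is_satisfiable(a):
        return True  # vacuously contained: nothing conforms to a
    for prop, (b_lo, b_hi) in b.items():
        # A property that a does not mention is unconstrained in a.
        a_lo, a_hi = a.get(prop, (0, None))
        if a_lo < b_lo:
            return False  # a permits fewer values than b requires
        if b_hi is not None and (a_hi is None or a_hi > b_hi):
            return False  # a permits more values than b allows
    return True

exactly_one_name = {"name": (1, 1)}
at_least_one_name = {"name": (1, None)}
print(is_contained(exactly_one_name, at_least_one_name))
print(is_contained(at_least_one_name, exactly_one_name))
```

In this fragment both problems reduce to interval comparisons because the properties do not interact. The abstract's undecidability result concerns the full language, where features do interact.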
Automatically Harnessing Sparse Acceleration
Sparse linear algebra is central to many scientific programs, yet compilers
fail to optimize it well. High-performance libraries are available, but
adoption costs are significant. Moreover, libraries tie programs into
vendor-specific software and hardware ecosystems, creating non-portable code.
In this paper, we develop a new approach based on our specification Language
for implementers of Linear Algebra Computations (LiLAC). Rather than requiring
the application developer to (re)write every program for a given library, the
burden is shifted to a one-off description by the library implementer. The
LiLAC-enabled compiler uses this to insert appropriate library routines without
source code changes.
LiLAC provides automatic data marshaling, maintaining state between calls and
minimizing data transfers. Appropriate places for library insertion are
detected in compiler intermediate representation, independent of source
languages.
We evaluated on large-scale scientific applications written in FORTRAN;
standard C/C++ and FORTRAN benchmarks; and C++ graph analytics kernels. Across
heterogeneous platforms, applications and data sets we show speedups of
1.1x to over 10x without user intervention.
Comment: Accepted to CC 202
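For readers unfamiliar with the kind of code this abstract targets, the sketch below shows a hand-written sparse matrix-vector product over the compressed sparse row (CSR) format. A loop nest of this shape is a typical candidate for replacement by a tuned library routine. Whether LiLAC matches this exact form is an assumption, and the function name `spmv_csr` is hypothetical.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]:row_ptr[i + 1] delimits the nonzeros of row i;
    col_idx and values hold their column indices and numeric values.
    """
    num_rows = len(row_ptr) - 1
    y = [0.0] * num_rows
    for row in range(num_rows):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # Indirect access x[col_idx[k]] is what makes this loop hard
            # for a general-purpose compiler to optimize.
            y[row] += values[k] * x[col_idx[k]]
    return y

# The 3x3 matrix [[1, 0, 2], [0, 3, 0], [4, 0, 5]] in CSR form.
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
print(spmv_csr(row_ptr, col_idx, values, [1.0, 2.0, 3.0]))  # [7.0, 6.0, 19.0]
```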
SPARTA: High-Level Synthesis of Parallel Multi-Threaded Accelerators
This paper presents a methodology for the Synthesis of PARallel multi-Threaded Accelerators (SPARTA) from OpenMP-annotated C/C++ specifications. SPARTA extends an open-source HLS tool, enabling the generation of accelerators that provide latency tolerance for irregular memory accesses through multithreading, support fine-grained memory-level parallelism through a hot-potato deflection-based network-on-chip (NoC), support synchronization constructs, and can instantiate memory-side caches. Our approach is based on a custom runtime OpenMP library, providing flexibility and extensibility. Experimental results show high scalability when synthesizing irregular graph kernels. The accelerators generated with our approach are, on average, 2.29x faster than state-of-the-art HLS methodologies.