CapablePtrs: Securely Compiling Partial Programs using the Pointers-as-Capabilities Principle
Capability machines such as CHERI provide memory capabilities that can be
used by compilers to provide security benefits for compiled code (e.g., memory
safety). The C to CHERI compiler, for example, achieves memory safety by
following a principle called "pointers as capabilities" (PAC). Informally, PAC
says that a compiler should represent a source language pointer as a machine
code capability. But the security properties of PAC compilers are not yet well
understood. We show that memory safety is only one aspect, and that PAC
compilers can provide significant additional security guarantees for partial
programs: the compiler can provide guarantees for a compilation unit, even if
that compilation unit is later linked to attacker-controlled machine code. This
paper is the first to study the security of PAC compilers for partial programs
formally. We prove for a model of such a compiler that it is fully abstract.
The proof uses a novel proof technique (dubbed TrICL, read trickle), which is
of broad interest because it reuses and extends the compiler correctness
relation in a natural way, as we demonstrate. We implement our compiler on top
of the CHERI platform and show that it can compile legacy C code with minimal
code changes. We provide performance benchmarks showing that the overhead is
proportional to the number of cross-compilation-unit function calls.
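The "pointers as capabilities" idea can be illustrated with a toy model (not the CHERI ISA itself, whose capabilities are hardware-enforced encoded values): a capability pairs a cursor with base, length, and permission metadata, and every dereference is checked against them.

```python
from dataclasses import dataclass

# Toy model of a memory capability: a pointer fattened with bounds
# and permissions, checked on every access. This only illustrates
# the PAC representation; it is not the CHERI encoding.

@dataclass
class Capability:
    memory: list        # backing store
    base: int           # lowest address this capability may touch
    length: int         # size of the addressable region
    offset: int = 0     # current cursor within [base, base + length)
    writable: bool = True

    def _check(self, need_write=False):
        addr = self.base + self.offset
        if not (self.base <= addr < self.base + self.length):
            raise MemoryError("capability bounds violation")
        if need_write and not self.writable:
            raise PermissionError("capability is read-only")
        return addr

    def load(self):
        return self.memory[self._check()]

    def store(self, value):
        self.memory[self._check(need_write=True)] = value

mem = [0] * 16
cap = Capability(mem, base=4, length=4)   # may only touch mem[4..7]
cap.store(42)
assert cap.load() == 42
cap.offset = 4                            # one past the end
try:
    cap.load()
except MemoryError:
    print("out-of-bounds access trapped")
```

In a PAC compiler, every source-level pointer is lowered to such a capability, so even linked attacker code cannot forge access outside the bounds it was handed.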
Variable-based multi-module data caches for clustered VLIW processors
Memory structures consume a significant fraction of the total processor energy. One way to reduce the energy consumed by cache memories is to lower their supply voltage and/or raise their threshold voltage, at the expense of access time. We propose to divide the L1 data cache into two cache modules for a clustered VLIW processor consisting of two clusters. The division is done on a variable basis, so that the address of a datum determines its location. Each cache module is assigned to a cluster and can be configured either as a fast, power-hungry module or as a slow, power-aware module. We also present compiler techniques to distribute variables between the two cache modules and to generate code accordingly. We have explored several cache configurations using the Mediabench suite and observed that the best distributed cache organization outperforms traditional cache organizations by 19%-31% in energy-delay and by 11%-29% in energy-delay squared. In addition, we explore a reconfigurable distributed cache, where the cache can be reconfigured on a context switch. This reconfigurable scheme further outperforms the best previous distributed organization by 3%-4%.
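The abstract does not detail the paper's distribution techniques; as a hypothetical illustration only, a compiler pass might greedily place the hottest variables in the fast module until its capacity is exhausted and relegate the rest to the slow, power-aware module:

```python
def partition_variables(access_counts, sizes, fast_capacity):
    """Greedy sketch (assumed heuristic, not the paper's): place hot
    variables in the fast (power-hungry) cache module, the rest in
    the slow (power-aware) one. Dicts are keyed by variable name."""
    fast, slow, used = [], [], 0
    # Rank variables by accesses per byte, hottest first.
    ranked = sorted(access_counts,
                    key=lambda v: access_counts[v] / sizes[v],
                    reverse=True)
    for v in ranked:
        if used + sizes[v] <= fast_capacity:
            fast.append(v)
            used += sizes[v]
        else:
            slow.append(v)
    return fast, slow

counts = {"a": 1000, "b": 10, "c": 500, "d": 5}
sizes  = {"a": 64,   "b": 64, "c": 64,  "d": 64}
print(partition_variables(counts, sizes, fast_capacity=128))
# → (['a', 'c'], ['b', 'd'])
```

The address of each variable then follows from its module, matching the paper's premise that a datum's address determines its location.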
On the Complexity of Spill Everywhere under SSA Form
Compilation for embedded processors can be either aggressive (time consuming
cross-compilation) or just in time (embedded and usually dynamic). The
heuristics used in dynamic compilation are highly constrained by limited
resources, time and memory in particular. Recent results on the SSA form open
promising directions for the design of new register allocation heuristics for
embedded systems and especially for embedded compilation. In particular,
heuristics based on tree scan with two separated phases -- one for spilling,
then one for coloring/coalescing -- seem good candidates for designing
memory-friendly, fast, and competitive register allocators. Still, not least
because of its impact on power consumption, minimizing the overhead of loads
and stores (the spilling problem) remains an important issue. This paper
provides an exhaustive study of the complexity of the "spill everywhere"
problem in the context of the SSA form. Unfortunately, contrary to our
initial hopes, many of the questions we raised lead to NP-completeness
results. We identify some polynomial cases, but they are impractical in a JIT
context. Nevertheless, they can give hints to simplify formulations for the
design of aggressive allocators.
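One reason SSA form is attractive here: the interference graph of an SSA program is chordal, so for a straight-line region register sufficiency reduces to checking that MaxLive (the maximum number of simultaneously live values) does not exceed the register count. A minimal sketch of that check, assuming live ranges are given as half-open [def, last use) intervals:

```python
def max_live(live_ranges):
    """Compute MaxLive from half-open [start, end) live intervals
    (SSA value definition point to last use) via an endpoint sweep."""
    events = []
    for start, end in live_ranges:
        events.append((start, 1))   # value becomes live
        events.append((end, -1))    # value dies
    live = peak = 0
    # At equal program points, deaths (-1) sort before births (+1),
    # so a value dying where another is defined frees its slot first.
    for _, delta in sorted(events):
        live += delta
        peak = max(peak, live)
    return peak

ranges = [(0, 4), (1, 3), (2, 6), (5, 7)]
k = 2  # available registers
need = max_live(ranges)
print(need, "registers needed;",
      "spilling required" if need > k else "fits")
# → 3 registers needed; spilling required
```

Deciding *that* spilling is needed is thus easy under SSA; as the paper shows, deciding *which* values to spill everywhere at minimum cost is where NP-completeness strikes.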
Survey on Combinatorial Register Allocation and Instruction Scheduling
Register allocation (mapping variables to processor registers or memory) and
instruction scheduling (reordering instructions to increase instruction-level
parallelism) are essential tasks for generating efficient assembly code in a
compiler. In the last three decades, combinatorial optimization has emerged as
an alternative to traditional, heuristic algorithms for these two tasks.
Combinatorial optimization approaches can deliver optimal solutions according
to a model, can precisely capture trade-offs between conflicting decisions, and
are more flexible at the expense of increased compilation time.
This paper provides an exhaustive literature review and a classification of
combinatorial optimization approaches to register allocation and instruction
scheduling, with a focus on the techniques that are most applied in this
context: integer programming, constraint programming, partitioned Boolean
quadratic programming, and enumeration. Researchers in compilers and
combinatorial optimization can benefit from identifying developments, trends,
and challenges in the area; compiler practitioners may discern opportunities
and grasp the potential benefit of applying combinatorial optimization.
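As a toy illustration of the combinatorial style (a caricature, not any surveyed system): register allocation for a handful of variables can be posed as exhaustive search over assignments, where each variable gets a register or is spilled, interfering variables may not share a register, and the objective is minimum total spill cost.

```python
from itertools import product

def optimal_allocation(variables, interference, spill_cost, registers):
    """Enumerate every assignment of each variable to a register or
    to memory ("spill"), discard assignments where interfering
    variables share a register, and return the cheapest survivor.
    Exponential in the number of variables, like the exact
    combinatorial models it caricatures."""
    choices = registers + ["spill"]
    best, best_cost = None, float("inf")
    for assign in product(choices, repeat=len(variables)):
        alloc = dict(zip(variables, assign))
        if any(alloc[a] == alloc[b] != "spill" for a, b in interference):
            continue  # two interfering variables share a register
        cost = sum(spill_cost[v] for v in variables
                   if alloc[v] == "spill")
        if cost < best_cost:
            best, best_cost = alloc, cost
    return best, best_cost

vars_ = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c"), ("a", "c")]  # pairwise interference
costs = {"a": 10, "b": 1, "c": 5}
print(optimal_allocation(vars_, edges, costs, registers=["r0", "r1"]))
```

With three mutually interfering variables and two registers, one variable must spill, and the optimum spills only the cheapest (`b`, cost 1). The surveyed approaches replace this brute force with integer programming, constraint programming, PBQP, or clever enumeration, which is what makes optimality affordable at realistic sizes.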