Software trace cache
We explore compiler optimizations that optimize the layout of instructions in memory. The goal is to let code make better use of the underlying hardware resources, regardless of the specific details of the processor or architecture, in order to increase fetch performance. The Software Trace Cache (STC) is a code layout algorithm with a broader target than previous layout optimizations: we target not only an improvement in the instruction cache hit rate, but also an increase in the effective fetch width of the fetch engine. The STC algorithm organizes basic blocks into chains, trying to make sequentially executed basic blocks reside in consecutive memory positions, then maps the basic block chains in memory to minimize conflict misses in the important sections of the program. We evaluate and analyze in detail the impact of the STC, and of code layout optimizations in general, on the three main aspects of fetch performance: the instruction cache hit rate, the effective fetch width, and the branch prediction accuracy. Our results show that layout-optimized codes have some special characteristics that make them more amenable to high-performance instruction fetch. They have a very high rate of not-taken branches and execute long chains of sequential instructions; they also make very effective use of instruction cache lines, mapping only useful instructions that will execute close in time, which increases both spatial and temporal locality. Peer Reviewed. Postprint (published version)
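The chain-building idea in this abstract can be illustrated with a small greedy sketch. This is a simplification under stated assumptions, not the algorithm from the paper: the function name `build_chains`, the profile dictionaries, and the rule of always following the hottest unplaced successor are all hypothetical choices made for illustration, and the later step of mapping chains to memory to avoid conflict misses is not shown.

```python
from collections import defaultdict


def build_chains(edge_counts, block_counts):
    """Greedily group basic blocks into chains along the hottest edges.

    edge_counts:  {(src, dst): times the edge was taken}  (hypothetical profile data)
    block_counts: {block: times the block executed}       (hypothetical profile data)
    """
    successors = defaultdict(list)
    for (src, dst), count in edge_counts.items():
        successors[src].append((count, dst))

    placed = set()
    chains = []
    # Start chains from the most frequently executed blocks first.
    for seed in sorted(block_counts, key=block_counts.get, reverse=True):
        if seed in placed:
            continue
        chain = []
        block = seed
        while block is not None and block not in placed:
            chain.append(block)
            placed.add(block)
            # Follow the hottest edge to a block that has not been placed yet,
            # so the likely fall-through path ends up contiguous in memory.
            candidates = [(c, d) for c, d in successors[block] if d not in placed]
            block = max(candidates)[1] if candidates else None
        chains.append(chain)
    return chains


# Toy control-flow graph: A branches mostly to B, rarely to C; both reach D.
edges = {("A", "B"): 90, ("A", "C"): 10, ("B", "D"): 90, ("C", "D"): 10}
blocks = {"A": 100, "B": 90, "C": 10, "D": 100}
layout = build_chains(edges, blocks)
print(layout)
```

In this toy graph the hot path A, B, D becomes one chain and the rarely executed block C is placed separately, which is the property the abstract associates with a high rate of not-taken branches.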
Asynchronous processing of Coq documents: from the kernel up to the user interface
The work described in this paper improves the reactivity of the Coq system by
completely redesigning the way it processes a formal document. By subdividing
such work into independent tasks, the system can give precedence to the ones of
immediate interest to the user and postpone the others. On the user side, a
modern interface based on the PIDE middleware aggregates and presents the
output of the prover in a consistent way. Finally, postponed tasks are processed
on modern parallel hardware to offer better scalability. Comment: in Proceedings of ITP, Aug 2015, Nanjing, China
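The scheduling idea in this abstract, giving precedence to tasks of immediate interest and postponing the rest, can be sketched with a priority queue. This is only an illustration under assumptions: the function `schedule`, the task names, and the two-level priority are hypothetical and do not reflect Coq's actual implementation or the PIDE protocol.

```python
import heapq


def schedule(tasks, focused):
    """Return tasks in processing order: focused ones first, the rest postponed.

    tasks:   task names in document order (hypothetical)
    focused: names the user is currently looking at (hypothetical)
    """
    heap = []
    for order, name in enumerate(tasks):
        priority = 0 if name in focused else 1  # 0 = immediate, 1 = postponed
        # The document position breaks ties, so order is kept within each group.
        heapq.heappush(heap, (priority, order, name))
    result = []
    while heap:
        result.append(heapq.heappop(heap)[2])
    return result


plan = schedule(["lemma_a", "lemma_b", "lemma_c", "lemma_d"], focused={"lemma_c"})
print(plan)
```

Because the tasks are independent, the postponed group could also be handed to a pool of worker processes instead of being run in sequence.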
A new 3-DOF 2T1R parallel mechanism: Topology design and kinematics
This article presents a new three-degree-of-freedom (3-DOF) parallel
mechanism (PM) with two translations and one rotation (2T1R), designed based on
the topological design theory of the parallel mechanism using position and
orientation characteristics (POC). The PM is primarily intended for use in
package sorting and delivery. The mobile platform of the PM moves along a
translation axis, picks up objects from a conveyor belt, and tilts them to
either side of the axis. We first calculate the PM's topological
characteristics, such as the degree of freedom (DOF) and the degree of
coupling, and provide its topological analytical formula to represent the
topological information of the PM. Next, we solve the direct and inverse
kinematic models based on the kinematic modelling principle using the
topological features. The models are purely analytic and are broken down into a
series of quadratic equations, making them suitable for use in an industrial
robot. We also study the singular configurations to identify the serial and
parallel singularities. Using the decoupling properties, we size the mechanism
to address the package sorting and depositing problem using an algebraic
approach. To determine the smallest segment lengths, we use a cylindrical
algebraic decomposition to solve a system with inequalities. Comment: IDETC-CIE 2023 International Design Engineering Technical Conferences
& Computers and Information in Engineering Conference, ASME, Aug 2023,
Boston, Franc