Network Virtual Machine (NetVM): A New Architecture for Efficient and Portable Packet Processing Applications
A challenge facing network device designers, besides increasing the speed of network gear, is improving its programmability in order to simplify the implementation of new applications (for example, active networks and content networking). This paper presents our work on designing and implementing a virtual network processor, called NetVM, whose instruction set is optimized for packet processing applications, i.e., for handling network traffic. Much as a Java Virtual Machine virtualizes a CPU, a NetVM virtualizes a network processor. The NetVM is expected to provide a compatibility layer for networking tasks (e.g., packet filtering, packet counting, string matching) performed by various packet processing applications (firewalls, network monitors, intrusion detectors) so that they can be executed on any network device, ranging from expensive routers to small appliances (e.g., smart phones). Moreover, the NetVM will efficiently map the elementary functionalities used to realize the above-mentioned networking tasks onto the specific hardware functional units (e.g., ASICs, FPGAs, and network processing elements) included in the special-purpose hardware systems that may be deployed to implement network devices.
A Machine-Independent Debugger--Revisited
Most debuggers are notoriously machine-dependent, but some recent research
prototypes achieve varying degrees of machine-independence with novel designs.
Cdb, a simple source-level debugger for C, is completely independent of its
target architecture. This independence is achieved by embedding symbol tables
and debugging code in the target program, which costs both time and space. This
paper describes a revised design and implementation of cdb that reduces the
space cost by nearly one-half and the time cost by 13% by storing symbol tables
in external files. A symbol table is defined by a 31-line grammar in the
Abstract Syntax Description Language (ASDL). ASDL is a domain-specific language
for specifying tree data structures. The ASDL tools accept an ASDL grammar and
generate code to construct, read, and write these data structures. Using ASDL
automates implementing parts of the debugger, and the grammar documents the
symbol table concisely. Using ASDL also suggested simplifications to the
interface between the debugger and the target program. Perhaps most important,
ASDL emphasizes that symbol tables are data structures, not file formats. Many
of the pitfalls of working with low-level file formats can be avoided by
focusing instead on high-level data structures and automating the
implementation details.
Comment: 12 pages; 6 figures; 3 tables
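The point that symbol tables are data structures rather than file formats can be illustrated with a minimal sketch. It assumes two hypothetical node types (`Module` and `Symbol`) and uses JSON as a stand-in for the reader and writer that the ASDL tools would generate; it does not reproduce cdb's actual 31-line grammar or its generated code.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


# Hypothetical node types for illustration only; not cdb's real grammar.
@dataclass
class Symbol:
    name: str
    type_name: str
    line: int


@dataclass
class Module:
    file: str
    symbols: List[Symbol] = field(default_factory=list)


def write_module(module: Module) -> str:
    """Serialize the tree; stands in for a tool-generated writer."""
    return json.dumps(asdict(module), sort_keys=True)


def read_module(text: str) -> Module:
    """Rebuild the tree; stands in for a tool-generated reader."""
    data = json.loads(text)
    return Module(
        file=data["file"],
        symbols=[Symbol(**entry) for entry in data["symbols"]],
    )


# The debugger side works only with Module and Symbol objects;
# the on-disk representation stays behind write_module/read_module.
table = Module("main.c", [Symbol("argc", "int", 3), Symbol("argv", "char **", 3)])
restored = read_module(write_module(table))
```

In this sketch the external file is an implementation detail: changing the encoding would touch only the two helper functions, not the code that consumes the symbol table.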
Retargetable Compilers for Embedded DSPs
Programmable devices are a key technology for the design of embedded systems, for example in the consumer electronics market. Processor cores are used as building blocks for more and more embedded system designs, since they provide a unique combination of features: flexibility and reusability. Processor-based design implies that compilers capable of generating efficient machine code are necessary. However, highly efficient compilers for embedded processors are scarcely available. This holds in particular for digital signal processors (DSPs). This contribution outlines different aspects of DSP compiler technology. First, we cover demands on compilers for embedded DSPs, which partly stand in sharp contrast to traditional compiler construction. Second, we present recent advances in DSP code optimization techniques, which explore a comparatively large search space in order to achieve high code quality. Finally, we discuss the different approaches to compiler retargetability, that is, techniques for automatically generating compilers from processor models.
Design Space Exploration for Sobel Application using OpenIMPACT (Opensource Retargetable Compilation for VLIW Architecture)
Retargetable compilation infrastructure contributes to the growth of application-specific programmable
systems by directly supporting different target architectures and design space exploration (DSE) for the
instruction set architecture and microarchitecture of the processor under development. Compilers in this
area fall into three categories: customized, semi-retargetable, and retargetable. In DSE, a retargetable
compilation methodology makes it possible to determine the optimal combination of hardwired components
(for example, IALU, FALU, memory, and branch units) and programmable elements for better performance,
measured by cycle count/total execution. With a TI DSP processor model implemented as the target
architecture, we simulated a Sobel application on a VLIW architecture to observe the optimal hardwired
components needed in an embedded system. With the compiler's optimization facility, simulation results
across the model variants defined on the system indicate that the Superblock and Hyperblock types can
generate code that the processor executes better than the Classical type. Loop unrolling in the
optimization improved simulated performance by up to 50%, except for the Classical type.
t|ket> : A retargetable compiler for NISQ devices
We present t|ket>, a quantum software development platform produced by Cambridge Quantum Computing Ltd. The heart of t|ket> is a language-agnostic optimising compiler designed to generate code for a variety of NISQ devices, which has several features designed to minimise the influence of device error. The compiler has been extensively benchmarked and outperforms most competitors in terms of circuit optimisation and qubit routing.
Designing a CPU model: from a pseudo-formal document to fast code
For validating low-level embedded software, engineers use simulators that
take the real binary as input. Like the real hardware, these full-system
simulators are organized as a set of components. The main component is the CPU
simulator (ISS), because it is the usual bottleneck for the simulation speed,
and its development is a long and repetitive task. Previous work showed that an
ISS can be generated from an Architecture Description Language (ADL). In the
work reported in this paper, we generate a CPU simulator directly from the
pseudo-formal descriptions of the reference manual. For each instruction, we
extract the information describing its behavior, its binary encoding, and its
assembly syntax. Next, after automatically applying many optimizations to the
extracted information, we generate a SystemC/TLM ISS. We also generate tests
for the decoder and a formal specification in Coq. Experiments show that the
generated ISS is as fast and stable as our previous hand-written ISS.
Comment: 3rd Workshop on: Rapid Simulation and Performance Evaluation: Methods
and Tools (2011)
Hardware/Software Codesign
Current state-of-the-art integrated circuit technology allows multiple processor cores and memory arrays, in addition to application-specific hardware, to be incorporated on a single substrate. As silicon technology has become more advanced, allowing the implementation of more complex designs, systems have begun to incorporate considerable amounts of embedded software [3]. It is therefore increasingly necessary for system designers to have knowledge of both hardware and software in order to make efficient design tradeoffs. This is where hardware/software codesign comes in.
cphVB: A System for Automated Runtime Optimization and Parallelization of Vectorized Applications
Modern processor architectures, in addition to having ever more cores, also
require ever more attention to memory layout in order to run at full
capacity. The usefulness of most languages is diminishing, as their
abstractions, structures, or objects are hard to map efficiently onto modern
processor architectures.
The work in this paper introduces a new abstract machine framework, cphVB,
that enables vector-oriented high-level programming languages to map onto a
broad range of architectures efficiently. The idea is to close the gap between
high-level languages and hardware-optimized low-level implementations. By
translating high-level vector operations into an intermediate vector bytecode,
cphVB enables specialized vector engines to efficiently execute the vector
operations.
The primary success criteria are to maintain a complete abstraction from
low-level details and to provide efficient code execution across different
modern processors. We evaluate the presented design through a setup that
targets multi-core CPU architectures. We evaluate the performance of the
implementation using Python implementations of well-known algorithms: a Jacobi
solver, a kNN search, a shallow water simulation, and a synthetic stencil
simulation. All demonstrate good performance.
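The general idea of translating high-level vector operations into an intermediate bytecode that a separate engine executes can be sketched as follows. This is an illustration under stated assumptions, not cphVB's actual bytecode or API: the names `Recorder`, `LazyVector`, and `run_serial`, the two opcodes, and the register-style instruction tuples are all hypothetical.

```python
import operator
from typing import Dict, List, Tuple

# Hypothetical instruction layout: (opcode, result register, left register, right register).
Instruction = Tuple[str, int, int, int]


class Recorder:
    """Front end: records vector operations as bytecode instead of running them."""

    def __init__(self) -> None:
        self.inputs: Dict[int, List[float]] = {}
        self.program: List[Instruction] = []
        self._next_reg = 0

    def _new_reg(self) -> int:
        reg = self._next_reg
        self._next_reg += 1
        return reg

    def vector(self, values: List[float]) -> "LazyVector":
        """Bind input data to a fresh register."""
        reg = self._new_reg()
        self.inputs[reg] = list(values)
        return LazyVector(self, reg)

    def emit(self, opcode: str, left: int, right: int) -> "LazyVector":
        """Append one instruction and return a handle to its result register."""
        reg = self._new_reg()
        self.program.append((opcode, reg, left, right))
        return LazyVector(self, reg)


class LazyVector:
    """Handle used by high-level code; arithmetic only emits bytecode."""

    def __init__(self, recorder: Recorder, reg: int) -> None:
        self.recorder = recorder
        self.reg = reg

    def __add__(self, other: "LazyVector") -> "LazyVector":
        return self.recorder.emit("ADD", self.reg, other.reg)

    def __mul__(self, other: "LazyVector") -> "LazyVector":
        return self.recorder.emit("MUL", self.reg, other.reg)


OPS = {"ADD": operator.add, "MUL": operator.mul}


def run_serial(recorder: Recorder, result: LazyVector) -> List[float]:
    """A trivial serial engine: interprets the bytecode element by element."""
    regs = dict(recorder.inputs)
    for opcode, dest, left, right in recorder.program:
        fn = OPS[opcode]
        regs[dest] = [fn(a, b) for a, b in zip(regs[left], regs[right])]
    return regs[result.reg]


rec = Recorder()
a = rec.vector([1.0, 2.0, 3.0])
b = rec.vector([4.0, 5.0, 6.0])
c = (a + b) * a                    # records two instructions, computes nothing yet
print(rec.program)
print(run_serial(rec, c))
```

Because the high-level code only produces `rec.program`, a different engine (for example, a multi-core one) could execute the same instructions without any change to the code that built them.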