
    Network Virtual Machine (NetVM): A New Architecture for Efficient and Portable Packet Processing Applications

    A challenge facing network device designers, besides increasing the speed of network gear, is improving its programmability in order to simplify the implementation of new applications (for example, active networks and content networking). This paper presents our work on designing and implementing a virtual network processor, called NetVM, which has an instruction set optimized for packet processing applications, i.e., for handling network traffic. Just as a Java Virtual Machine virtualizes a CPU, a NetVM virtualizes a network processor. The NetVM is expected to provide a compatibility layer for networking tasks (e.g., packet filtering, packet counting, string matching) performed by various packet processing applications (firewalls, network monitors, intrusion detectors), so that they can be executed on any network device, ranging from expensive routers to small appliances (e.g., smart phones). Moreover, the NetVM will provide efficient mapping of the elementary functionalities used to realize the above-mentioned networking tasks onto specific hardware functional units (e.g., ASICs, FPGAs, and network processing elements) included in the special-purpose hardware systems possibly deployed to implement network devices.
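    To make the idea concrete, here is a minimal sketch of a stack-based packet-processing virtual machine in Python; the opcode names, bytecode layout, and filter below are hypothetical illustrations, not the actual NetVM instruction set.

    # Minimal sketch of a NetVM-style packet-filtering VM (opcodes invented for illustration).
    def run_filter(program, packet):
        """Interpret a tiny stack-based bytecode over a raw packet (bytes)."""
        stack, pc = [], 0
        while pc < len(program):
            op, arg = program[pc]
            pc += 1
            if op == "PUSH":             # push an immediate value
                stack.append(arg)
            elif op == "LOAD_U8":        # push the byte at a fixed packet offset
                stack.append(packet[arg])
            elif op == "EQ":             # compare the two topmost values
                stack.append(stack.pop() == stack.pop())
            elif op == "JFALSE":         # jump ahead if the comparison failed
                if not stack.pop():
                    pc = arg
            elif op == "ACCEPT":
                return True
            elif op == "REJECT":
                return False
        return False

    # Accept packets whose IPv4 protocol byte (offset 9, assuming no link-layer header) is 6 (TCP).
    tcp_filter = [
        ("LOAD_U8", 9), ("PUSH", 6), ("EQ", None),
        ("JFALSE", 5), ("ACCEPT", None), ("REJECT", None),
    ]
    print(run_filter(tcp_filter, bytes(9) + b"\x06" + bytes(10)))   # -> True

    A compatibility layer such as the one described could then map these elementary operations onto whichever hardware units a particular device provides.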

    A Machine-Independent Debugger--Revisited

    Most debuggers are notoriously machine-dependent, but some recent research prototypes achieve varying degrees of machine-independence with novel designs. Cdb, a simple source-level debugger for C, is completely independent of its target architecture. This independence is achieved by embedding symbol tables and debugging code in the target program, which costs both time and space. This paper describes a revised design and implementation of cdb that reduces the space cost by nearly one-half and the time cost by 13% by storing symbol tables in external files. A symbol table is defined by a 31-line grammar in the Abstract Syntax Description Language (ASDL). ASDL is a domain-specific language for specifying tree data structures. The ASDL tools accept an ASDL grammar and generate code to construct, read, and write these data structures. Using ASDL automates the implementation of parts of the debugger, and the grammar documents the symbol table concisely. Using ASDL also suggested simplifications to the interface between the debugger and the target program. Perhaps most important, ASDL emphasizes that symbol tables are data structures, not file formats. Many of the pitfalls of working with low-level file formats can be avoided by focusing instead on high-level data structures and automating the implementation details. Comment: 12 pages; 6 figures; 3 tables
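    As a rough illustration of the approach (the constructors and field names below are invented for this sketch and are not cdb's actual 31-line ASDL grammar), a symbol table can be treated as an ordinary tree data structure and written to an external file by generated, or here hand-written, serialization code:

    # Symbol table as a tree data structure, in the spirit of the ASDL approach.
    from dataclasses import dataclass, field, asdict
    from typing import List
    import json

    @dataclass
    class Type:
        name: str            # e.g. "int" or "struct point"
        size: int            # size in bytes

    @dataclass
    class Symbol:
        name: str            # source-level identifier
        type: Type
        offset: int          # frame or section offset

    @dataclass
    class Scope:
        symbols: List[Symbol] = field(default_factory=list)
        children: List["Scope"] = field(default_factory=list)

    def write_symtab(root: Scope, path: str) -> None:
        """Serialize the tree to an external file; the paper uses ASDL's
        generated picklers, JSON merely stands in here."""
        with open(path, "w") as f:
            json.dump(asdict(root), f)

    Keeping the focus on the data structure rather than on a file format is exactly the point the abstract makes.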

    Retargetable Compilers for Embedded DSPs

    Programmable devices are a key technology for the design of embedded systems, such as in the consumer electronics market. Processor cores are used as building blocks for more and more embedded system designs, since they provide a unique combination of features: flexibility and reusability. Processor-based design implies that compilers capable of generating efficient machine code are necessary. However, highly efficient compilers for embedded processors are hardly available. In particular, this holds for digital signal processors (DSPs). This contribution is intended to outline different aspects of DSP compiler technology. First, we cover demands on compilers for embedded DSPs, which are partially in sharp contrast to traditional compiler construction. Second, we present recent advances in DSP code optimization techniques, which explore a comparatively large search space in order to achieve high code quality. Finally, we discuss the different approaches to retargetability of compilers, that is, techniques for automatic generation of compilers from processor models.
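    The retargetability idea can be caricatured in a few lines: the code generator consumes a processor model instead of hard-coding one target, so switching DSPs means switching models. The model format and mnemonics below are invented for illustration and do not correspond to any real retargetable compiler's description language.

    # Toy retargetable instruction selector driven by a processor model.
    DSP_MODEL = {
        "add": "ADD {d}, {a}, {b}",      # register-register add
        "mul": "MPY {d}, {a}, {b}",      # DSP-style multiply
        "mac": "MAC {d}, {a}, {b}",      # fused multiply-accumulate, if the target has one
    }

    def select(op, dst, a, b, model):
        """Emit one target instruction for an IR operation, if the model provides a pattern."""
        if op not in model:
            raise NotImplementedError(f"target has no pattern for {op}")
        return model[op].format(d=dst, a=a, b=b)

    print(select("mul", "r2", "r0", "r1", DSP_MODEL))   # -> "MPY r2, r0, r1"

    Real DSP code optimizers must additionally search over register assignment, addressing modes, and instruction-level parallelism, which is where the large search spaces mentioned above come from.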

    Design Space Exploration for Sobel Application using OpenIMPACT (Open-source Retargetable Compilation for VLIW Architecture)

    Retargetable compilation infrastructure has driven the growth of application-specific programmable systems by directly supporting different target architectures and enabling design space exploration (DSE) of the instruction set architecture and microarchitecture of the processor under development. Such compilers fall into three categories: customized, semi-retargetable, and fully retargetable. A DSE methodology based on retargetable compilation makes it possible to determine the optimal combination of hardwired components (for example IALU, FALU, memory, and branch units) and programmable elements, with performance measured by total execution cycle count. Using a TI DSP processor model as the target architecture, we simulated the Sobel application on a VLIW architecture to observe the optimal set of hardwired components needed in an embedded system. Using the compiler's optimization facilities, simulation of the variant models defined on the system shows that Superblock and Hyperblock formation generates code that the processor executes better than the Classical type, and loop unrolling improves simulated performance by up to 50%, except in the Classical type.
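    A hypothetical sketch of such a DSE loop is shown below: enumerate combinations of functional units, obtain a cycle count for the Sobel kernel from the retargeted simulator, and keep the best configuration. The cost model here is a placeholder standing in for the actual OpenIMPACT simulation.

    # Exhaustive design space exploration over functional-unit counts (illustrative cost model).
    from itertools import product

    def simulate_cycles(ialu, falu, mem, branch):
        """Placeholder for the cycle count the retargeted simulator would report."""
        work = {"ialu": 8000, "falu": 1000, "mem": 6000, "branch": 2000}
        return max(work["ialu"] / ialu, work["falu"] / falu,
                   work["mem"] / mem, work["branch"] / branch)

    candidates = product(range(1, 5), repeat=4)          # 1-4 units of each kind
    best = min(candidates, key=lambda cfg: simulate_cycles(*cfg))
    print("best (IALU, FALU, MEM, BRANCH) =", best)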

    t|ket>: A retargetable compiler for NISQ devices

    We present t|ket>, a quantum software development platform produced by Cambridge Quantum Computing Ltd. The heart of t|ket> is a language-agnostic optimising compiler designed to generate code for a variety of NISQ devices; it has several features designed to minimise the influence of device error. The compiler has been extensively benchmarked and outperforms most competitors in terms of circuit optimisation and qubit routing.
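    For flavour only, the sketch below shows a single peephole rule of the kind such a compiler applies (cancelling adjacent self-inverse gates on the same qubits); it is not t|ket>'s implementation nor the pytket API.

    # One illustrative circuit-optimisation peephole: drop adjacent self-inverse gate pairs.
    SELF_INVERSE = {"H", "X", "Z", "CX"}

    def cancel_adjacent(circuit):
        """circuit: list of (gate_name, qubit_tuple) pairs, in program order."""
        out = []
        for gate in circuit:
            if out and out[-1] == gate and gate[0] in SELF_INVERSE:
                out.pop()               # the two gates compose to the identity
            else:
                out.append(gate)
        return out

    print(cancel_adjacent([("H", (0,)), ("H", (0,)), ("CX", (0, 1))]))   # -> [('CX', (0, 1))]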

    Designing a CPU model: from a pseudo-formal document to fast code

    For validating low-level embedded software, engineers use simulators that take the real binary as input. Like the real hardware, these full-system simulators are organized as a set of components. The main component is the CPU simulator (ISS), because it is the usual bottleneck for simulation speed and its development is a long and repetitive task. Previous work showed that an ISS can be generated from an Architecture Description Language (ADL). In the work reported in this paper, we generate a CPU simulator directly from the pseudo-formal descriptions of the reference manual. For each instruction, we extract the information describing its behavior, its binary encoding, and its assembly syntax. Next, after automatically applying many optimizations to the extracted information, we generate a SystemC/TLM ISS. We also generate tests for the decoder and a formal specification in Coq. Experiments show that the generated ISS is as fast and stable as our previous hand-written ISS. Comment: 3rd Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools (2011)
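    A rough sketch of that flow, with an invented toy ISA rather than the real manual data: each record holds what is extracted for one instruction (binary encoding, assembly syntax, behaviour), and the decoder is generated from the same table that would also drive the SystemC/TLM code, the decoder tests, and the Coq specification.

    # Instruction records as extracted from a (hypothetical) reference manual.
    INSTRUCTIONS = [
        # name,   fixed-bit mask, value,       assembly syntax,     behaviour (manual pseudo-code)
        ("ADDI",  0xFF000000,     0x10000000,  "addi rd, rs, imm",  "R[rd] <- R[rs] + sign_extend(imm)"),
        ("JMP",   0xFF000000,     0x20000000,  "jmp target",        "PC <- target"),
    ]

    def generate_decoder(table):
        """Return a decode function closed over the extracted encoding table."""
        def decode(word):
            for name, mask, value, syntax, behaviour in table:
                if word & mask == value:
                    return name, syntax, behaviour
            return None                  # undefined instruction
        return decode

    decode = generate_decoder(INSTRUCTIONS)
    print(decode(0x10000042)[0])         # matches ADDI's fixed bits -> "ADDI"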

    Hardware/Software Codesign

    The current state-of-the-art technology in integrated circuits allows the incorporation of multiple processor cores and memory arrays, in addition to application-specific hardware, on a single substrate. As silicon technology has become more advanced, allowing the implementation of more complex designs, systems have begun to incorporate considerable amounts of embedded software [3]. Thus it becomes increasingly necessary for system designers to have knowledge of both hardware and software in order to make efficient design trade-offs. This is where hardware/software codesign comes in.

    cphVB: A System for Automated Runtime Optimization and Parallelization of Vectorized Applications

    Modern processor architectures, in addition to having ever more cores, also require ever more attention to memory layout in order to run at full capacity. The usefulness of most languages is diminishing, as their abstractions, structures, or objects are hard to map efficiently onto modern processor architectures. The work in this paper introduces a new abstract machine framework, cphVB, that enables vector-oriented high-level programming languages to map efficiently onto a broad range of architectures. The idea is to close the gap between high-level languages and hardware-optimized low-level implementations. By translating high-level vector operations into an intermediate vector bytecode, cphVB enables specialized vector engines to execute the vector operations efficiently. The primary success parameters are to maintain a complete abstraction from low-level details and to provide efficient code execution across different modern processors. We evaluate the presented design through a setup that targets multi-core CPU architectures. We evaluate the performance of the implementation using Python implementations of well-known algorithms: a Jacobi solver, a kNN search, a shallow-water simulation, and a synthetic stencil simulation. All demonstrate good performance.
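    The core idea can be sketched as follows; the opcode set and the NumPy-backed engine are illustrative stand-ins, not cphVB's actual vector bytecode or its engines.

    # Execute a small vector bytecode over named arrays with a NumPy-backed engine.
    import numpy as np

    def execute(bytecode, env):
        """Run a list of (opcode, output_name, input_names) instructions."""
        for op, out, ins in bytecode:
            args = [env[name] for name in ins]
            if op == "ADD":
                env[out] = args[0] + args[1]
            elif op == "MUL":
                env[out] = args[0] * args[1]
            elif op == "SUM":
                env[out] = np.sum(args[0])
            else:
                raise ValueError(f"unknown opcode {op}")
        return env

    # sum(a * b), expressed as bytecode rather than eager array calls
    env = {"a": np.arange(4.0), "b": np.ones(4)}
    program = [("MUL", "t0", ("a", "b")), ("SUM", "result", ("t0",))]
    print(execute(program, env)["result"])   # -> 6.0

    A different engine could execute the same bytecode on a GPU or a cluster without changing the front-end language.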