249,886 research outputs found

    A low-power geometric mapping co-processor for high-speed graphics application

    No full text
    In this article we present a novel design of a low-power geometric mapping co-processor that can be used for high-performance graphics system. The processor can carry out any single or a combination of transformations belonging to affine transformation family ranging from 1-D to 3-D. It allows interactive operations which can be defined either by a user (allowing it to be a stand-alone geometric transformation processor) or by a host processor (allowing it to be a co-processor to accelerate certain graphics operations). It occupies a silicon area of 6 mm2 and consumes 40 mW power when synthesized with 0.25?m technology

    MORA - an architecture and programming model for a resource efficient coarse grained reconfigurable processor

    Get PDF
    This paper presents an architecture and implementation details for MORA, a novel coarse grained reconfigurable processor for accelerating media processing applications. The MORA architecture involves a 2-D array of several such processors, to deliver low cost, high throughput performance in media processing applications. A distinguishing feature of the MORA architecture is the co-design of hardware architecture and low-level programming language throughout the design cycle. The implementation details for the single MORA processor, and benchmark evaluation using a cycle accurate simulator are presented

    PGPG: An Automatic Generator of Pipeline Design for Programmable GRAPE Systems

    Get PDF
    We have developed PGPG (Pipeline Generator for Programmable GRAPE), a software which generates the low-level design of the pipeline processor and communication software for FPGA-based computing engines (FBCEs). An FBCE typically consists of one or multiple FPGA (Field-Programmable Gate Array) chips and local memory. Here, the term "Field-Programmable" means that one can rewrite the logic implemented to the chip after the hardware is completed, and therefore a single FBCE can be used for calculation of various functions, for example pipeline processors for gravity, SPH interaction, or image processing. The main problem with FBCEs is that the user need to develop the detailed hardware design for the processor to be implemented to FPGA chips. In addition, she or he has to write the control logic for the processor, communication and data conversion library on the host processor, and application program which uses the developed processor. These require detailed knowledge of hardware design, a hardware description language such as VHDL, the operating system and the application, and amount of human work is huge. A relatively simple design would require 1 person-year or more. The PGPG software generates all necessary design descriptions, except for the application software itself, from a high-level design description of the pipeline processor in the PGPG language. The PGPG language is a simple language, specialized to the description of pipeline processors. Thus, the design of pipeline processor in PGPG language is much easier than the traditional design. For real applications such as the pipeline for gravitational interaction, the pipeline processor generated by PGPG achieved the performance similar to that of hand-written code. In this paper we present a detailed description of PGPG version 1.0.Comment: 24 pages, 6 figures, accepted PASJ 2005 July 2

    Generic Pipelined Processor Modeling and High Performance Cycle-Accurate Simulator Generation

    Full text link
    Detailed modeling of processors and high performance cycle-accurate simulators are essential for today's hardware and software design. These problems are challenging enough by themselves and have seen many previous research efforts. Addressing both simultaneously is even more challenging, with many existing approaches focusing on one over another. In this paper, we propose the Reduced Colored Petri Net (RCPN) model that has two advantages: first, it offers a very simple and intuitive way of modeling pipelined processors; second, it can generate high performance cycle-accurate simulators. RCPN benefits from all the useful features of Colored Petri Nets without suffering from their exponential growth in complexity. RCPN processor models are very intuitive since they are a mirror image of the processor pipeline block diagram. Furthermore, in our experiments on the generated cycle-accurate simulators for XScale and StrongArm processor models, we achieved an order of magnitude (~15 times) speedup over the popular SimpleScalar ARM simulator.Comment: Submitted on behalf of EDAA (http://www.edaa.com/

    Using fast and accurate simulation to explore hardware/software trade-offs in the multi-core era

    Get PDF
    Writing well-performing parallel programs is challenging in the multi-core processor era. In addition to achieving good per-thread performance, which in itself is a balancing act between instruction-level parallelism, pipeline effects and good memory performance, multi-threaded programs complicate matters even further. These programs require synchronization, and are affected by the interactions between threads through sharing of both processor resources and the cache hierarchy. At the Intel Exascience Lab, we are developing an architectural simulator called Sniper for simulating future exascale-era multi-core processors. Its goal is twofold: Sniper should assist hardware designers to make design decisions, while simultaneously providing software designers with a tool to gain insight into the behavior of their algorithms and allow for optimization. By taking architectural features into account, our simulator can provide more insight into parallel programs than what can be obtained from existing performance analysis tools. This unique combination of hardware simulator and software performance analysis tool makes Sniper a useful tool for a simultaneous exploration of the hardware and software design space for future high-performance multi-core systems

    Design of Asynchronous Processor

    Get PDF
    There has been a resurgence of interest in asynchronous design recently. The renewed interest in asynchronous design results from its potential to address the problem faced by the synchronous design methodology. In asynchronous methodology, there is no global clock controlling the synchronization of a circuit; instead, the data communication between each functional unit is completed through local request-acknowledge handshake protocol. The growth in demand of high performance portable systems has accelerated asynchronous logic design technique which can offers better performance and lower power consumption especially in the development of the asynchronous processor for mobile and portable application. In this thesis, the design and verification of an 8-bit asynchronous pipelined processor is presented. The developed asynchronous processor is based on Harvard architecture and uses Reduced Instruction Set Computer (RISC) instruction set architecture. 24 instructions are supported by the processor including register, memory, branch and jump operations. The processor has three-stage pipelining i.e. fetch, decode and execution pipeline. Micropipelines framework with 2-phase signalling protocol and bundled-data approach is employed in designing complex and powerful asynchronous control circuits for the processor. Very High Speed Integrated Circuit Hardware Description Language (VHDL) is used to design and construct all parts of the asynchronous processor. Simulation, synthesis and verification of the processor are carried out using MAX +PLUS II software. The simulation results have demonstrated that the developed 8-bit asynchronous RISC processor is working correctly using current Field Programmable Gate Array (FPGA) technology. This processor employed 903 logic cells and has 6144 memory bits for instruction and data memory. Each of the processor subsystem can operates at different cycle time, thus enable an asynchronous processor achieving 11.95MHz average speed performance
    corecore