27 research outputs found

    A piecewise linear simulator for large scale integrated circuits

    The piecewise linear simulator

    No full text

    The Eindhoven schematic editor

    No full text

    3D-TV Rendering on a Multiprocessor System on a Chip

    No full text
    Abstract: In this paper we describe how two 3D-TV rendering algorithms have been mapped onto a chip multiprocessor named Wasabi. This platform contains several TriMedia processors that communicate via a shared memory, fast message-passing channels to support multi-chip systems, and some application-specific co-processors. By mapping 3D-TV rendering applications to Wasabi, performance figures are obtained not only to check the feasibility of the algorithms and mappings, but also to match the application requirements with the hardware architecture. The results show that both algorithms scale with the number of processors. With the first algorithm, the viewer sees a 2D image without glasses and must wear a pair of special glasses to get the 3D effect; this algorithm can be executed in real time on a single TriMedia processor. The 3D effect produced by the second algorithm can be seen without special glasses; executing it in real time requires 16 TriMedia processors. Keywords: 3D-TV rendering; optimization; mapping; multiprocessor; system on a chip

    Design of Multi-Tasking Coprocessor Control for Eclipse

    No full text
    Eclipse defines a heterogeneous multiprocessor architecture template for data-dependent stream processing. Intended as a scalable and flexible subsystem of forthcoming media-processing systems-on-a-chip, Eclipse combines application configuration flexibility with the efficiency of function-specific hardware, or coprocessors. To facilitate reuse, Eclipse separates coprocessor functionality from generic support that addresses multi-tasking, inter-task synchronization, and data transport. Five interface primitives accomplish this separation. The interface facilitates the design of coprocessors that require complex control to handle data-dependent I/O, saving and restoring task state upon task switches, and pipelined processing. This paper presents how this interface enables the design of such reusable yet cost-effective coprocessors.

    Eijndhoven. Hardwired MPEG-4 Repetitive Padding

    No full text
    Abstract: We consider two hardwired solutions for repetitive padding, a performance-restricting algorithm for real-time MPEG-4 execution. The first solution concerns application-specific implementations, the second general-purpose processing. For the application-specific implementations we propose a systolic array structure. To determine the chip area and speed, we have synthesized its VHDL models for two field-programmable gate array families, Xilinx and Altera. Depending on the implemented configuration, the unit can process between 77 K and 950 K macroblocks per second (MB/s) when mapped on FPGA chips containing fewer than 10 K logic gates and with frequency capabilities below 100 MHz. The second approach augments a general-purpose arithmetic logic unit with extra functionality to perform repetitive padding. At trivial hardware costs of a few hundred 2×2 AND-OR logic gates, we achieve an order of magnitude speed-up compared to non-augmented general-purpose processor padding. The proposed hardware solutions meet the requirements of all MPEG-4 visual profile levels. Both approaches have been proven to be scalable and fit into different architectural concepts and operand widths. Index Terms: Arithmetic-logical-unit (ALU) augmentation, field-programmable gate array (FPGA), hardwired repetitive padding, MPEG-4, systolic structure.
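    The abstract above does not spell out the padding rule itself. As a rough software illustration only, the sketch below pads a single row the way horizontal repetitive padding is commonly described for MPEG-4: a pixel outside the shape takes the value of the nearest shape pixel in its row, and the average when shape pixels lie on both sides. The function name `pad_row`, the boolean mask representation, and the floor-rounded average are assumptions for illustration; this is not the paper's systolic or ALU-based design.

```python
from typing import List, Optional


def pad_row(pixels: List[int], mask: List[bool]) -> List[int]:
    """Horizontally pad one row (hypothetical helper, illustration only).

    mask[i] is True where pixel i belongs to the object shape.
    Rows without any shape pixel are returned unchanged; a vertical
    pass would normally handle them.
    """
    n = len(pixels)
    if not any(mask):
        return list(pixels)

    # Nearest shape pixel value seen when scanning left to right.
    left: List[Optional[int]] = [None] * n
    last: Optional[int] = None
    for i in range(n):
        if mask[i]:
            last = pixels[i]
        left[i] = last

    # Nearest shape pixel value seen when scanning right to left.
    right: List[Optional[int]] = [None] * n
    last = None
    for i in range(n - 1, -1, -1):
        if mask[i]:
            last = pixels[i]
        right[i] = last

    out = list(pixels)
    for i in range(n):
        if mask[i]:
            continue  # shape pixels keep their value
        l, r = left[i], right[i]
        if l is not None and r is not None:
            out[i] = (l + r) // 2  # assumed rounding: floor of the mean
        elif l is not None:
            out[i] = l
        else:
            out[i] = r
    return out
```

    For example, a row with shape pixels 10 and 20 at positions 2 and 5 has the gap between them filled with their average, and the outer pixels filled with the nearest of the two values.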

    An 8-Point IDCT Computing Resource Implemented on a TriMedia/CPU64 Reconfigurable Functional Unit

    No full text
    This paper presents the implementation of an 8-point Inverse Discrete Cosine Transform (IDCT) computing resource on a TriMedia/CPU64 FPGA-based Reconfigurable Functional Unit (RFU). TriMedia/CPU64 is a 64-bit 5 issue-slot VLIW processor launching a long instruction every clock cycle. The RFU consists mainly of an FPGA core, and is embedded into the TriMedia as any other hardwired functional unit, i.e., it receives instructions from the instruction decoder, reads its input arguments from the register file, and writes the computed values back to it. To reduce the computational complexity of the IDCT, we used a modified version of the Loeffler algorithm which requires 14 multiplications. Since each multiplicand is a 16-bit signed number represented in 2's complement notation, while each multiplier is a positive constant of 15 bits or less, we employed a "multiplication-by-constant" scheme which was optimized against the multiplier. To increase the throughput of the IDCT computing resource, we propose a pipeline implementation. When mapped on an ACEX EP1K100 FPGA-based RFU, our 8-point IDCT computing resource exhibits a latency of 16 TriMedia cycles, a recovery of 2 cycles, and occupies 42% of the logic cells of the device.
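    The "multiplication-by-constant" idea mentioned above can be illustrated in software with plain shift-and-add: because the multiplier is a known positive constant, the product is a sum of shifted copies of the multiplicand, one per set bit of the constant. The sketch below is only that generic technique; the function name `multiply_by_constant` is hypothetical, and the paper's actual optimization against the multiplier is not described in the abstract and is not reproduced here.

```python
def multiply_by_constant(x: int, c: int) -> int:
    """Multiply a signed integer x by a non-negative constant c
    using only shifts and additions (generic shift-and-add sketch)."""
    if c < 0:
        raise ValueError("constant multiplier must be non-negative")
    acc = 0
    shift = 0
    while c:
        if c & 1:
            # Bit `shift` of the constant is set: add x * 2**shift.
            acc += x << shift
        c >>= 1
        shift += 1
    return acc
```

    In hardware, each set bit of the constant corresponds to one shifted addend, so constants with few set bits need fewer adders; this is the usual motivation for tailoring the circuit to the constant.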