434 research outputs found

    Maximizing resource utilization by slicing of superscalar architecture

    Superscalar architectural techniques increase instruction throughput from one instruction per cycle to more than one instruction per cycle. Modern processors use several processing resources to achieve this throughput. Control units perform various functions to minimize stalls and to ensure a continuous feed of instructions to the execution units. It is vital that instructions ready for execution do not encounter a bottleneck in the execution stage. This thesis proposes a dynamic scheme, called block slicing, to increase the efficiency of the execution stage. Implementing this concept in a wide, superscalar pipelined architecture introduces minimal additional hardware and delay in the pipeline. The hardware required to implement the proposed scheme is designed and assessed in terms of cost and delay. Speed-up, throughput, and efficiency have been evaluated and analyzed for the resulting pipeline.

    A C++-embedded Domain-Specific Language for programming the MORA soft processor array

    MORA is a novel platform for high-level FPGA programming of streaming vector and matrix operations, aimed at multimedia applications. It consists of a soft array of pipelined, low-complexity SIMD processors-in-memory (PIM). We present a Domain-Specific Language (DSL) for high-level programming of the MORA soft processor array. The DSL is embedded in C++, which gives designers a familiar language framework and lets them compile designs with a standard compiler for functional testing before generating the FPGA bitstream with the MORA toolchain. The paper discusses the MORA-C++ DSL and the compilation route into assembly for the MORA machine, and provides examples that illustrate the programming model and performance.

    High Speed Low Power Cyclic Redundancy Check-32 using FPGA

    Cyclic Redundancy Check (CRC) is an error-detection technique used to verify data integrity. CRC takes a block of a message's bits and divides it by a binary number called the polynomial; the remainder of this division is the checksum that is appended to the message. On the receiver side, the same division is performed and the resulting remainder is compared with the transmitted checksum; if the two do not differ, no errors were detected. This paper aims to design the CRC32 used in the Ethernet frame on a Virtex-7 Field Programmable Gate Array (FPGA). Lookup tables and the slicing-by-16 algorithm are used together to calculate the CRC32 in parallel. Xilinx ISE is used as the IDE and synthesis tool, and I-Sim is used for simulation. The design achieves a processing time of 1.250 ns and a throughput of 102.4 Gbps; furthermore, both the power consumption and the device utilization are very low.
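    The division and the slicing-by-16 lookup described in this abstract can be sketched in software. The following is a minimal illustration, not the paper's FPGA design: it assumes the reflected Ethernet CRC-32 polynomial (0xEDB88320) with the usual initial value and final XOR, and the function names are hypothetical. It also checks that the reported figures are mutually consistent under the assumption that one 16-byte block is consumed per 1.250 ns.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial used by Ethernet (IEEE 802.3)


def crc32_bitwise(data: bytes) -> int:
    """Reference CRC-32: polynomial division one bit at a time."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def make_tables(n: int = 16) -> list:
    """Build n lookup tables; table k holds the CRC of a byte followed by k zero bytes."""
    base = []
    for i in range(256):
        c = i
        for _ in range(8):
            c = (c >> 1) ^ (POLY if c & 1 else 0)
        base.append(c)
    tables = [base]
    for _ in range(1, n):
        prev = tables[-1]
        tables.append([(v >> 8) ^ base[v & 0xFF] for v in prev])
    return tables


TABLES = make_tables(16)


def crc32_slicing16(data: bytes) -> int:
    """Slicing-by-16: consume 16 bytes per step using 16 independent table lookups."""
    crc = 0xFFFFFFFF
    full = len(data) - len(data) % 16
    for off in range(0, full, 16):
        block = bytearray(data[off:off + 16])
        # Fold the running CRC into the first four bytes (little-endian order).
        for j in range(4):
            block[j] ^= (crc >> (8 * j)) & 0xFF
        # The 16 lookups do not depend on each other, so hardware can do them in parallel.
        crc = 0
        for j in range(16):
            crc ^= TABLES[15 - j][block[j]]
    # Any leftover bytes fall back to the ordinary byte-at-a-time table method.
    for byte in data[full:]:
        crc = (crc >> 8) ^ TABLES[0][(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF


def throughput_gbps(bytes_per_step: int, step_ns: float) -> float:
    """Bits per nanosecond equals gigabits per second."""
    return bytes_per_step * 8 / step_ns


if __name__ == "__main__":
    msg = bytes(range(256)) * 3 + b"tail"
    assert crc32_slicing16(msg) == crc32_bitwise(msg) == zlib.crc32(msg)
    # Assumption: one 16-byte block per 1.250 ns step, which matches the reported 102.4 Gbps.
    print(throughput_gbps(16, 1.250))
```

    Both functions agree with `zlib.crc32` from the Python standard library. Under the stated assumption, 128 bits per 1.250 ns works out to 102.4 Gbps, the throughput quoted in the abstract.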

    Efficient Fault Injection based on Dynamic HDL Slicing Technique

    This work proposes a fault injection methodology in which Hardware Description Language (HDL) code slicing is exploited to prune fault injection locations, thus enabling more efficient campaigns for evaluating safety mechanisms. In particular, the dynamic HDL slicing technique yields a highly collapsed critical fault list and avoids injections at redundant locations or time-steps. Experimental results show that the proposed methodology, integrated into a commercial tool flow, doubles the simulation speed compared to state-of-the-art industrial-grade EDA tool flows. Comment: arXiv admin note: substantial text overlap with arXiv:2001.0998