152 research outputs found

    Programmable flexible cores for SoC applications

    Get PDF
    Tese de mestrado. Engenharia Electrotécnica e de Computadores. Faculdade de Engenharia. Universidade do Porto. 200

    Technology mapping algorithms for hybrid fpgas containing lookup tables and plas

    Full text link

    Design, performance, and energy consumption of eDRAM/SRAM macrocells for L1 data caches

    Full text link
    (c) 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other worksSRAM and DRAM have been the predominant technologies used to implement memory cells in computer systems, each one having its advantages and shortcomings. SRAM cells are faster and require no refresh since reads are not destructive. In contrast, DRAM cells provide higher density and minimal leakage energy since there are no paths within the cell from Vdd to ground. Recently, DRAM cells have been embedded in logic-based technology (eDRAM), thus overcoming the speed limit of typical DRAM cells. In this paper, we propose a hybrid n-bit macrocell that implements one SRAM cell and n-1 eDRAM cells. This cell is aimed at being used in an n-way set-associative first-level data cache. Architectural mechanisms (e.g., special writeback policies) have been devised to completely avoid refresh logic. Performance, energy, and area have been analyzed in detail. Experimental results show that using typical eDRAM capacitors, and compared to a conventional cache, a 4-way set-associative hybrid cache reduces both energy consumption and area up to 54 and 29 percent, respectively, while having negligible impact on performance (less than 2 percent).This work was supported by the Spanish Ministerio de Ciencia e Innovacion (MICINN), and jointly financed with Plan E funds under Grant TIN2009-14475-C04-01 and by Consolider-Ingenio 2010 under Grant CSD2006-00046.Valero Bresó, A.; Petit Martí, SV.; Sahuquillo Borrás, J.; López Rodríguez, PJ.; Duato Marín, JF. (2012). Design, performance, and energy consumption of eDRAM/SRAM macrocells for L1 data caches. IEEE Transactions on Computers. 61(9):1231-1242. https://doi.org/10.1109/TC.2011.138S1231124261

    A survey of DA techniques for PLD and FPGA based systems

    Full text link
    Programmable logic devices (PLDs) are gaining in acceptance, of late, for designing systems of all complexities ranging from glue logic to special purpose parallel machines. Higher densities and integration levels are made possible by the new breed of complex PLDs and FPGAs. The added complexities of these devices make automatic computer aided tools indispensable for achieving good performance and a high usable gate-count. In this article, we attempt to present in an unified manner, the different tools and their underlying algorithms using an example of a vending machine controller as an illustrative example. Topics covered include logic synthesis for PLDs and FPGAs along with an in-depth survey of important technology mapping, partitioning and place and route algorithms for different FPGA architectures.Peer Reviewedhttp://deepblue.lib.umich.edu/bitstream/2027.42/31206/1/0000108.pd

    An hybrid eDRAM/SRAM macrocell to implement first-level data caches

    Full text link
    SRAM and DRAM cells have been the predominant technologies used to implement memory cells in computer systems, each one having its advantages and shortcomings. SRAM cells are faster and require no refresh since reads are not destructive. In contrast, DRAM cells provide higher density and minimal leakage energy since there are no paths within the cell from Vdd to ground. Recently, DRAM cells have been embedded in logic-based technology, thus overcoming the speed limit of typical DRAM cells. In this paper we propose an n-bit macrocell that implements one static cell, and n-1 dynamic cells. This cell is aimed at being used in an n-way set-associative first-level data cache. Our study shows that in a four-way set-associative cache with this macrocell compared to an SRAM based with the same capacity, leakage is reduced by about 75% and area more than half with a minimal impact on performance. Architectural mechanisms have also been devised to avoid refresh logic. Experimental results show that no performance is lost when the retention time is larger than 50K processor cycles. In addition, the proposed delayed writeback policy that avoids refreshing performs a similar amount of writebacks than a conventional cache with the same organization, so no power wasting is incurred

    The Fifth NASA Symposium on VLSI Design

    Get PDF
    The fifth annual NASA Symposium on VLSI Design had 13 sessions including Radiation Effects, Architectures, Mixed Signal, Design Techniques, Fault Testing, Synthesis, Signal Processing, and other Featured Presentations. The symposium provides insights into developments in VLSI and digital systems which can be used to increase data systems performance. The presentations share insights into next generation advances that will serve as a basis for future VLSI design

    A configurable vector processor for accelerating speech coding algorithms

    Get PDF
    The growing demand for voice-over-packer (VoIP) services and multimedia-rich applications has made increasingly important the efficient, real-time implementation of low-bit rates speech coders on embedded VLSI platforms. Such speech coders are designed to substantially reduce the bandwidth requirements thus enabling dense multichannel gateways in small form factor. This however comes at a high computational cost which mandates the use of very high performance embedded processors. This thesis investigates the potential acceleration of two major ITU-T speech coding algorithms, namely G.729A and G.723.1, through their efficient implementation on a configurable extensible vector embedded CPU architecture. New scalar and vector ISAs were introduced which resulted in up to 80% reduction in the dynamic instruction count of both workloads. These instructions were subsequently encapsulated into a parametric, hybrid SISD (scalar processor)–SIMD (vector) processor. This work presents the research and implementation of the vector datapath of this vector coprocessor which is tightly-coupled to a Sparc-V8 compliant CPU, the optimization and simulation methodologies employed and the use of Electronic System Level (ESL) techniques to rapidly design SIMD datapaths
    • …
    corecore