4 research outputs found

    A low-latency modular switch for CMP systems

    Full text link
    [EN] As technology advances, the number of cores in Chip MultiProcessor systems and MultiProcessor Systems-on-Chips keeps increasing. The network must provide sustained throughput and ultra-low latencies. In this paper we propose new pipelined switch designs focused in reducing the switch latency. We identify the switch components that limit the switch frequency: the arbiter. Then, we simplify the arbiter logic by using multiple smaller arbiters, but increasing greatly the switch area. To solve this problem, a second design is presented where the routing traversal and arbitrations tasks are mixed. Results demonstrate a switch latency reduction ranging from 10% to 21%. Network latency is reduced in a range from 11% to 15%. © 2011 Elsevier B.V. All rights reserved.This work was supported by the Spanish MEC and MICINN, as well as European Commission FEDER funds, under Grants CSD2006-00046 and TIN2009-14475-C04. It was also partly supported by the project NaNoC (Project Label 248972) which is funded by the European Commission within the Research Programme FP7.Roca Pérez, A.; Flich Cardo, J.; Silla Jiménez, F.; Duato Marín, JF. (2011). A low-latency modular switch for CMP systems. Microprocessors and Microsystems. 35(8):742-754. https://doi.org/10.1016/j.micpro.2011.08.011S74275435

    Fault-tolerant vertical link design for effective 3D stacking

    Full text link
    [EN] Recently, 3D stacking has been proposed to alleviate the memory bandwidth limitation arising in chip multiprocessors (CMPs). As the number of integrated cores in the chip increases the access to external memory becomes the bottleneck, thus demanding larger memory amounts inside the chip. The most accepted solution to implement vertical links between stacked dies is by using Through Silicon Vias (TSVs). However, TSVs are exposed to misalignment and random defects compromising the yield of the manufactured 3D chip. A common solution to this problem is by over-provisioning, thus impacting on area and cost. In this paper, we propose a fault-tolerant vertical link design. With its adoption, fault-tolerant vertical links can be implemented in a 3D chip design at low cost without the need of adding redundant TSVs (no over-provision). Preliminary results are very promising as the fault-tolerant vertical link design increases switch area only by 6.69% while the achieved interconnect yield tends to 100%.This work was supported by the Spanish MEC and MICINN, as well as European Comission FEDER funds, under Grants CSD2006-00046 and TIN2009-14475-C04. It was also partly supported by the project NaNoC (project label 248972) which is funded by the European Commission within the Research Programme FP7.Hernández Luz, C.; Roca Pérez, A.; Flich Cardo, J.; Silla Jiménez, F.; Duato Marín, JF. (2011). Fault-tolerant vertical link design for effective 3D stacking. IEEE Computer Architecture Letters. 10(2):41-44. https://doi.org/10.1109/L-CA.2011.17S414410

    Efficient Routing in Heterogeneous SoC Designs with Small Implementation Overhead

    Get PDF
    In application-specific SoCs, the irregularity of the topology ends up in a complex and customized implementation of the routing algorithm, usually relying on routing tables implemented with memory structures at source end nodes. As system size increases, the routing tables also increase in size with nonnegligible impact on power, area, and latency overheads. In this paper, we present a routing implementation for application-specific SoCs able to implement in an efficient manner (with no routing tables and using a small logic block in every switch) a deadlock-free routing algorithm in these irregular networks. The mechanism relies on a tool that maps the initial irregular topology of the SoC system into a logical regular structure where the mechanism can be applied. We provide details for both the mapping tool and the proposed routing mechanism. Evaluation results show the effectiveness of the mapping tool as well as the low area and timing requirements of the mechanism. With the mapping tool and the routing mechanism, complex irregular SoC topologies can now be supported without the need for routing tables

    Floorplan-Aware High Performance NoC Design

    Full text link
    Las actuales arquitecturas de m�ltiples n�cleos como los chip multiprocesadores (CMP) y soluciones multiprocesador para sistemas dentro del chip (MPSoCs) han adoptado a las redes dentro del chip (NoC) como elemento -ptimo para la inter-conexi-n de los diversos elementos de dichos sistemas. En este sentido, fabricantes de CMPs y MPSoCs han adoptado NoCs sencillas, generalmente con una topolog'a en malla o anillo, ya que son suficientes para satisfacer las necesidades de los sistemas actuales. Sin embargo a medida que los requerimientos del sistema -- baja latencia y alto rendimiento -- se hacen m�s exigentes, estas redes tan simples dejan de ser una soluci-n real. As', la comunidad investigadora ha propuesto y analizado NoCs m�s complejas. No obstante, estas soluciones son m�s dif'ciles de implementar -- especialmente los enlaces largos -- haciendo que este tipo de topolog'as complejas sean demasiado costosas o incluso inviables. En esta tesis, presentamos una metodolog'a de dise-o que minimiza la p�rdida de prestaciones de la red debido a su implementaci-n real. Los principales problemas que se encuentran al implementar una NoC son los conmutadores y los enlaces largos. En esta tesis, el conmutador se ha hecho modular, es decir, formado como uni-n de m-dulos m�s peque-os. En nuestro caso, los m-dulos son id�nticos, donde cada m-dulo es capaz de arbitrar, conmutar, y almacenar los mensajes que le llegan. Posteriormente, flexibilizamos la colocaci-n de estos m-dulos en el chip, permitiendo que m-dulos de un mismo conmutador est�n distribuidos por el chip. Esta metodolog'a de dise-o la hemos aplicado a diferentes escenarios. Primeramente, hemos introducido nuestro conmutador modular en NoCs con topolog'as conocidas como la malla 2D. Los resultados muestran como la modularidad y la distribuci-n del conmutador reducen la latencia y el consumo de potencia de la red. En segundo lugar, hemos utilizado nuestra metodolog'a de dise-o para implementar un crossbar distribuidRoca Pérez, A. (2012). Floorplan-Aware High Performance NoC Design [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/17844Palanci
    corecore