728 research outputs found

    On the design of a high-performance adaptive router for CC-NUMA multiprocessors

    Copyright © 2003 IEEE. This work presents the design and evaluation of an adaptive packet router aimed at supporting CC-NUMA traffic. We exploit a simple and efficient packet injection mechanism to avoid deadlock, which leads to fully adaptive routing that employs only three virtual channels. In addition, we selectively use output buffers for implementing the most utilized virtual paths in order to reduce head-of-line blocking. The careful implementation of these features has resulted in a good trade-off between network performance and hardware cost. The outcome of this research is a High-Performance Adaptive Router (HPAR), which adequately balances the needs of parallel applications: minimal network latency at low loads and high throughput at heavy loads. The paper includes an evaluation process in which HPAR is compared with other adaptive routers using FIFO input buffering, with or without additional virtual channels to reduce head-of-line blocking. This evaluation considers both the VLSI costs of each router and their performance under synthetic and real application workloads. To make the comparison fair, all the routers use the same efficient deadlock avoidance mechanism. In all the experiments, HPAR exhibited the best response among all the routers tested. The throughput gains ranged from 10 percent to 40 percent with respect to its most direct rival, which employs more hardware resources. Other results showed that HPAR achieves up to 83 percent of its theoretical maximum throughput under random traffic and up to 70 percent when running real applications. Moreover, the observed packet latencies were comparable to those exhibited by simpler routers. Therefore, HPAR can be considered a suitable candidate to implement packet interchange in the next generations of CC-NUMA multiprocessors. Valentín Puente, José-Ángel Gregorio, Ramón Beivide, and Cruz Iz

    Design and performance evaluation of switching architectures for high-speed Internet

    The motivation for this thesis is the desire to build faster and scalable routers that efficiently handle the exponential traffic growth in the Internet. The Internet forwards information through a mesh of routers and switches, which has to keep up with the increasing demands of traffic. Shared-memory based switches are known to provide the best throughput-delay performance for a given memory size. In this thesis, the performance of commonly used memory-sharing schemes for shared-memory switches is evaluated under balanced and unbalanced bursty traffic. The scalability of shared-memory switches has been a research issue for quite some time. One approach is to employ multiple memory modules and use them in parallel to enhance the capacity. The two well-known architectures in this category are (i) the shared-multibuffer (SMB) switch architecture invented by Yamanaka et al. of Mitsubishi Electric Corporation, Japan; and (ii) the sliding-window (SW) switch architecture invented by Dr. Kumar of UTPA, Texas, USA. In this thesis, the performance of these two architectures is evaluated and compared. Furthermore, the SW switch architecture is extended to enable priority switching to provide differentiated Quality of Service (QoS) for different traffic classes

    On-board B-ISDN fast packet switching architectures. Phase 1: Study

    The broadband integrated services digital network (B-ISDN) is an emerging telecommunications technology that will meet most of the telecommunications networking needs in the mid-1990s to early next century. The satellite-based system is well positioned for providing B-ISDN service with its inherent capabilities of point-to-multipoint and broadcast transmission, virtually unlimited connectivity between any two points within a beam coverage, short deployment time of communications facilities, flexible and dynamic reallocation of space segment capacity, and distance-insensitive cost. On-board processing satellites, particularly in a multiple spot beam environment, will provide enhanced connectivity, better performance, optimized access and transmission link design, and lower user service cost. The following are described: the user and network aspects of broadband services; the current development status in broadband services; various satellite network architectures including system design issues; and various fast packet switch architectures and their detailed designs

    Doctor of Philosophy

    Portable electronic devices will be limited to the available energy of existing battery chemistries for the foreseeable future. However, systems-on-chip (SoCs) used in these devices are under demand to offer more functionality and increased battery life. A difficult problem in SoC design is providing energy-efficient communication between its components while maintaining the required performance. This dissertation introduces a novel energy-efficient network-on-chip (NoC) communication architecture. A NoC is used within complex SoCs due to its superior performance, energy usage, modularity, and scalability over traditional bus and point-to-point methods of connecting SoC components. This is the first academic research that combines asynchronous NoC circuits, a focus on energy-efficient design, and a software framework to customize a NoC for a particular SoC. Its key contribution is demonstrating that a simple, asynchronous NoC concept is a good match for low-power devices, and is a fruitful area for additional investigation. The proposed NoC is energy-efficient in several ways: simple switch and arbitration logic, low port radix, latch-based router buffering, a topology with the minimum number of 3-port routers, and the asynchronous advantages of zero dynamic power consumption while idle and the lack of a clock tree. The tool framework developed for this work uses novel methods to optimize the topology and router floorplan based on simulated annealing and force-directed movement. It studies link pipelining techniques that yield improved throughput in an energy-efficient manner. A simulator is automatically generated for each customized NoC, and its traffic generators use a self-similar message distribution, as opposed to Poisson, to better match application behavior. Compared to a conventional synchronous NoC, this design is superior by achieving comparable message latency with half the energy

    Driving the Network-on-Chip Revolution to Remove the Interconnect Bottleneck in Nanoscale Multi-Processor Systems-on-Chip

    The sustained demand for faster, more powerful chips has been met by the availability of chip manufacturing processes allowing for the integration of increasing numbers of computation units onto a single die. The resulting outcome, especially in the embedded domain, has often been called SYSTEM-ON-CHIP (SoC) or MULTI-PROCESSOR SYSTEM-ON-CHIP (MP-SoC). MPSoC design brings to the foreground a large number of challenges, one of the most prominent of which is the design of the chip interconnection. With a number of on-chip blocks presently ranging in the tens, and quickly approaching the hundreds, the novel issue of how to best provide on-chip communication resources is clearly felt. NETWORKS-ON-CHIPS (NoCs) are the most comprehensive and scalable answer to this design concern. By bringing large-scale networking concepts to the on-chip domain, they guarantee a structured answer to present and future communication requirements. The point-to-point connection and packet switching paradigms they involve are also of great help in minimizing wiring overhead and physical routing issues. However, as with any technology of recent inception, NoC design is still an evolving discipline. Several main areas of interest require deep investigation for NoCs to become viable solutions: • The design of the NoC architecture needs to strike the best tradeoff among performance, features and the tight area and power constraints of the on-chip domain. • Simulation and verification infrastructure must be put in place to explore, validate and optimize the NoC performance. • NoCs offer a huge design space, thanks to their extreme customizability in terms of topology and architectural parameters. Design tools are needed to prune this space and pick the best solutions. • Even more so given their global, distributed nature, it is essential to evaluate the physical implementation of NoCs to evaluate their suitability for next-generation designs and their area and power costs. 
This dissertation performs a design space exploration of network-on-chip architectures, in order to point out the trade-offs associated with the design of each individual network building block and with the design of the network topology overall. The design space exploration is preceded by a comparative analysis of state-of-the-art interconnect fabrics with themselves and with early network-on-chip prototypes. The ultimate objective is to point out the key advantages that NoC realizations provide with respect to state-of-the-art communication infrastructures and to point out the challenges that lie ahead in order to make this new interconnect technology come true. Among the latter, technology-related challenges are emerging that call for dedicated design techniques at all levels of the design hierarchy. These include, in particular, leakage power dissipation and the containment of process variations and of their effects. The achievement of the above objectives was enabled by means of a NoC simulation environment for cycle-accurate modelling and simulation and by means of a back-end facility for the study of NoC physical implementation effects. Overall, all the results provided by this work have been validated on actual silicon layout

    Floorplan-Aware High Performance NoC Design

    Current multi-core architectures such as chip multiprocessors (CMPs) and multiprocessor systems-on-chip (MPSoCs) have adopted networks-on-chip (NoCs) as the optimal element for interconnecting the various components of these systems. In this regard, CMP and MPSoC manufacturers have adopted simple NoCs, generally with a mesh or ring topology, since these suffice to meet the needs of current systems. However, as system requirements (low latency and high throughput) become more demanding, such simple networks cease to be a real solution. The research community has therefore proposed and analyzed more complex NoCs. Nevertheless, these solutions are harder to implement, especially the long links, which makes such complex topologies too costly or even unfeasible. In this thesis, we present a design methodology that minimizes the loss of network performance caused by its real implementation. The main problems encountered when implementing a NoC are the switches and the long links. In this thesis, the switch has been made modular, that is, built as a union of smaller modules. In our case, the modules are identical, and each module is able to arbitrate, switch, and store the messages that reach it. We then make the placement of these modules on the chip flexible, allowing modules of the same switch to be distributed across the chip. We have applied this design methodology to different scenarios. First, we introduced our modular switch into NoCs with well-known topologies such as the 2D mesh. The results show how the modularity and distribution of the switch reduce the latency and power consumption of the network. 
Second, we used our design methodology to implement a distributed crossbar. Roca Pérez, A. (2012). Floorplan-Aware High Performance NoC Design [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/17844 Palanci

    Contention resolution in optical packet-switched cross-connects

    Towards all-optical label switching nodes with multicast

    Fiber optics has developed so rapidly during the last decades that it has become the backbone of our communication systems. Evolved from initially static single-channel point-to-point links, the current advanced optical backbone network consists mostly of wavelength-division multiplexed (WDM) networks with optical add/drop multiplexing nodes and optical cross-connects that can switch data in the optical domain. However, the commercially implemented optical network nodes are still performing optical circuit switching using wavelength routing. The dedicated use of wavelength and infrequent reconfiguration result in relatively poor bandwidth utilization. The success of electronic packet switching has inspired researchers to improve the flexibility, efficiency, granularity and network utilization of optical networks by introducing optical packet switching using short, local optical labels for forwarding decision making at intermediate optical core network nodes, a technique that is referred to as optical label switching (OLS). Various research demonstrations on OLS systems have been reported with transparent optical packet payload forwarding based on electronic packet label processing, taking advantage of the mature technologies of electronic logical circuitry. This approach requires optic-electronic-optic (OEO) conversion of the optical labels, a costly and power consuming procedure particularly for high-speed labels. As optical packet payload bit rate increases from gigabit per second (Gb/s) to terabit per second (Tb/s) or higher, the increased speed of the optical labels will eventually face the electronic bottleneck, so that the OEO conversion and the electronic label processing will be no longer efficient. OLS with label processing in the optical domain, namely, all-optical label switching (AOLS), will become necessary. Different AOLS techniques have been proposed in the last five years. 
In this thesis, AOLS node architectures based on optical time-serial label processing are presented for WDM optical packets. The unicast node architecture, where each optical packet is to be sent to only one output port of the node, has been investigated and partially demonstrated in the EU IST-LASAGNE project. This thesis contributes to the multicast aspects of the AOLS nodes, where the optical packets can be forwarded to multiple or all output ports of a node. Multicast capable AOLS nodes are becoming increasingly interesting due to the exponential growth of the emerging multicast Internet and modern data services such as video streaming, high definition TV, multi-party online games, and enterprise applications such as video conferencing and optical storage area networks. Current electronic routers implement multicast in the Internet protocol (IP) layer, which requires not only the OEO conversion of the optical packets, but also exhaustive routing table lookup of the globally unique IP addresses. Despite that, there have been no extensive studies on AOLS multicast nodes, technologies and traffic performance, apart from a few proof-of-principle experimental demonstrations. In this thesis, three aspects of the multicast capable AOLS nodes are addressed: 1. Logical design of the AOLS multicast node architectures, as well as functional subsystems and interconnections, based on state-of-the-art literature research of the field and the subject. 2. Computer simulations of the traffic performance of different AOLS unicast and multicast node architectures, using a custom-developed AOLS simulator AOLSim. 3. Experimental demonstrations in laboratory and computer simulations using the commercially available simulator VPItransmissionMaker™, to evaluate the physical layer performance of the required all-optical multicast technologies. A few selected multi-wavelength conversion (MWC) techniques are particularly looked into. 
MWC is an essential subsystem of the AOLS node for realizing optical packet multicast by making multiple copies of the optical packet all-optically onto different wavelength channels. In this thesis, the MWC techniques based on cross-phase modulation and four-wave mixing are extensively investigated. The former technique offers more wavelength flexibility and good conversion efficiency, but it is only applicable to intensity modulated signals. The latter technique, on the other hand, offers strict transparency in data rate and modulation format, but its working wavelengths are limited by the device or component used, and the conversion efficiency is considerably lower. The proposals and results presented in this thesis show feasibility of all-optical packet switching and multicasting at line speed without any OEO conversion and electronic processing. The scalability and the costly optical components of the AOLS nodes have been so far two of the major obstacles for commercialization of the AOLS concept. This thesis also introduced a novel, scalable optical labeling concept and a label processing scheme for the AOLS multicast nodes. The proposed scheme makes use of the spatial positions of each label bit instead of the total absolute value of all the label bits. Thus for an n-bit label, the complexity of the label processor is determined by n instead of 2^n
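
The scaling claim at the end of this abstract can be illustrated with a minimal Python sketch. It rests on an assumption the abstract does not confirm: that each label bit position stands for one output port, so a multicast label is read as a bitmap. The function names are hypothetical and are not taken from the thesis. Under that reading, a positional decoder inspects n bits, while a decoder that matches the label's absolute value may need to distinguish up to 2^n values.

```python
def ports_from_positions(label_bits):
    """Positional decoding: bit i set means 'forward to port i'.

    The bit-to-port mapping is an assumption made for this sketch only.
    """
    return [i for i, bit in enumerate(label_bits) if bit]


def value_table_size(n):
    """Number of distinct values an n-bit label can take (2**n)."""
    return 2 ** n


label = [1, 0, 1, 1]                  # hypothetical 4-bit multicast label
ports = ports_from_positions(label)   # one check per bit position
print("output ports:", ports)
print("positional checks:", len(label))
print("value-matching entries:", value_table_size(len(label)))
```

The positional work grows linearly with the label length, whereas the value-matching table doubles with every added bit, which is the n versus 2^n contrast the abstract states.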

    Node design in optical packet switched networks
