Search CORE

429 research outputs found

Multi-chip multicast schedulers in input queued switches

Author: Bianco Andrea
Scicchitano A.
Publication venue: IEEE
Publication date: 01/01/2008
Field of study

Crossref

PORTO Publications Open Repository TOrino

Multicast Support in Multi-Chip Centralized Schedulers in Input Queued Switches

Author: Ajmone Marsan
Ajmone Marsan
Alessandra Scicchitano
Andrea Bianco
Fraleigh
McKeown
Prabhakar
Smiljanic
Publication venue: Elsevier
Publication date: 01/01/2009
Field of study

Crossref

PORTO Publications Open Repository TOrino

Recommended from our members

Optically-Connected Memory: Architectures and Experimental Characterizations

Author: Brunina Daniel
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2012
Field of study

Growing demands on future data centers and high-performance computing systems are driving the development of processor-memory interconnects with greater performance and flexibility than can be provided by existing electronic interconnects. A redesign of the systems' memory devices and architectures will be essential to enabling high-bandwidth, low-latency, resilient, energy-efficient memory systems that can meet the challenges of exascale systems and beyond. By leveraging an optics-based approach, this thesis presents the design and implementation of an optically-connected memory system that exploits both the bandwidth density and distance-independent energy dissipation of photonic transceivers, in combination with the flexibility and scalability offered by optical networks. By replacing the electronic memory bus with an optical interconnection network, novel memory architectures can be created that are otherwise infeasible. With remote optically-connected memory nodes accessible to processors as if they are local, programming models can be designed to utilize and efficiently share greater amounts of data. Processors that would otherwise be idle, being starved for data while waiting for scarce memory resources, can instead operate at high utilizations, leading to drastic improvements in the overall system performance. This work presents a prototype optically-connected memory module and a custom processor-based optical-network-aware memory controller that communicate transparently and all-optically across an optical interconnection network. The memory modules and controller are optimized to facilitate memory accesses across the optical network using a packet-switched, circuit-switched, or hybrid packet-and-circuit-switched approach. The novel memory controller is experimentally demonstrated to be compatible with existing processor-memory access protocols, with the memory controller acting as the optics-computing interface to render the optical network transparent. Additionally, the flexibility of the optical network enables additional performance benefits including increased memory bandwidth through optical multicasting. This optically-connected architecture can further enable more resilient memory system realizations by expanding on current error dectection and correction memory protocols. The integration of optics with memory technology constitutes a critical step for both optics and computing. The scalability challenges facing main memory systems today, especially concerning bandwidth and power consumption, complement well with the strengths of optical communications-based systems. Additionally, ongoing efforts focused on developing low-cost optical components and subsystems that are suitable for computing environments may benefit from the high-volume memory market. This work therefore takes the first step in merging the areas of optics and memory, developing the necessary architectures and protocols to interface the two technologies, and demonstrating potential benefits while identifying areas for future work. Future computing systems will undoubtedly benefit from this work through the deployment of high-performance, flexible, energy-efficient optically-connected memory architectures

Columbia University Academic Commons

Design and analysis of a 3-dimensional cluster multicomputer architecture using optical interconnection for petaFLOP computing

Author: Okorafor Ekpe Apia
Publication venue: Texas A&M University
Publication date: 25/04/2007
Field of study

In this dissertation, the design and analyses of an extremely scalable distributed multicomputer architecture, using optical interconnects, that has the potential to deliver in the order of petaFLOP performance is presented in detail. The design takes advantage of optical technologies, harnessing the features inherent in optics, to produce a 3D stack that implements efficiently a large, fully connected system of nodes forming a true 3D architecture. To adopt optics in large-scale multiprocessor cluster systems, efficient routing and scheduling techniques are needed. To this end, novel self-routing strategies for all-optical packet switched networks and on-line scheduling methods that can result in collision free communication and achieve real time operation in high-speed multiprocessor systems are proposed. The system is designed to allow failed/faulty nodes to stay in place without appreciable performance degradation. The approach is to develop a dynamic communication environment that will be able to effectively adapt and evolve with a high density of missing units or nodes. A joint CPU/bandwidth controller that maximizes the resource allocation in this dynamic computing environment is introduced with an objective to optimize the distributed cluster architecture, preventing performance/system degradation in the presence of failed/faulty nodes. A thorough analysis, feasibility study and description of the characteristics of a 3-Dimensional multicomputer system capable of achieving 100 teraFLOP performance is discussed in detail. Included in this dissertation is throughput analysis of the routing schemes, using methods from discrete-time queuing systems and computer simulation results for the different proposed algorithms. A prototype of the 3D architecture proposed is built and a test bed developed to obtain experimental results to further prove the feasibility of the design, validate initial assumptions, algorithms, simulations and the optimized distributed resource allocation scheme. Finally, as a prelude to further research, an efficient data routing strategy for highly scalable distributed mobile multiprocessor networks is introduced

Texas A&M Repository

PAM Performance Analysis in Multicast-Enabled Wavelength-Routing Data Centers

Author: Agrell Erik
Rastegarfar Houman
Szczerba Krzysztof
Yan Li
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017
Field of study

Multilevel pulse amplitude modulation (M-PAM) is gaining momentum for high-capacity and power-efficient cloud computing. Compared to the classic on-off keying (OOK) modulation, high-order PAM yields better spectral efficiency but is also more susceptible to physical layer degradation effects. We develop a cross-layer analysis framework to examine the PAM transmission performance in data center network environments supporting both optical multicasting and wavelength routing. Our analysis is conducted on a switch architecture based on an arrayed-waveguide grating (AWG) core and distributed broadcast domains, exhibiting different physical paths, and random, uncontrolled crosstalk noise. Reed-Solomon coding with rate adaptation is incorporated into PAM transceivers to compensate for impairments. Our Monte Carlo simulations point to the significant impact of AWG crosstalk on higher order PAM in wavelength-reuse architectures and the importance of code rate adaptation for signals traversing multiple routing stages. According to our study, 8-PAM offers the highest effective bit rates for signals terminating in one broadcast domain and performs poorly when considering interdomain connectivity. On the other hand, the impairment-induced degradation of interdomain capacity for 4-PAM can be limited to 20.7%, making it better suited for connections spanning two broadcast domains and a crosstalk-rich stage. Our results call for software-defined PAM transceiver designs in support of both modulation order and code rate adaptation

Chalmers Research

Chalmers Publication Library

Wireless Communication in Data Centers: A Survey

Author: Alexander Dennis R
Deogun Jitender
Hamza Abdelbaset S.
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 26/01/2016
Field of study

Data centers (DCs) is becoming increasingly an integral part of the computing infrastructures of most enterprises. Therefore, the concept of DC networks (DCNs) is receiving an increased attention in the network research community. Most DCNs deployed today can be classified as wired DCNs as copper and optical fiber cables are used for intra- and inter-rack connections in the network. Despite recent advances, wired DCNs face two inevitable problems; cabling complexity and hotspots. To address these problems, recent research works suggest the incorporation of wireless communication technology into DCNs. Wireless links can be used to either augment conventional wired DCNs, or to realize a pure wireless DCN. As the design spectrum of DCs broadens, so does the need for a clear classification to differentiate various design options. In this paper, we analyze the free space optical (FSO) communication and the 60 GHz radio frequency (RF), the two key candidate technologies for implementing wireless links in DCNs. We present a generic classification scheme that can be used to classify current and future DCNs based on the communication technology used in the network. The proposed classification is then used to review and summarize major research in this area. We also discuss open questions and future research directions in the area of wireless DCs

Broadcast-oriented wireless network-on-chip : fundamentals and feasibility

Author: Abadal Cavallé Sergi
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2016
Field of study

Premi extraordinari doctorat UPC curs 2015-2016, àmbit Enginyeria de les TICRecent years have seen the emergence and ubiquitous adoption of Chip Multiprocessors (CMPs), which rely on the coordinated operation of multiple execution units or cores. Successive CMP generations integrate a larger number of cores seeking higher performance with a reasonable cost envelope. For this trend to continue, however, important scalability issues need to be solved at different levels of design. Scaling the interconnect fabric is a grand challenge by itself, as new Network-on-Chip (NoC) proposals need to overcome the performance hurdles found when dealing with the increasingly variable and heterogeneous communication demands of manycore processors. Fast and flexible NoC solutions are needed to prevent communication become a performance bottleneck, situation that would severely limit the design space at the architectural level and eventually lead to the use of software frameworks that are slow, inefficient, or less programmable. The emergence of novel interconnect technologies has opened the door to a plethora of new NoCs promising greater scalability and architectural flexibility. In particular, wireless on-chip communication has garnered considerable attention due to its inherent broadcast capabilities, low latency, and system-level simplicity. Most of the resulting Wireless Network-on-Chip (WNoC) proposals have set the focus on leveraging the latency advantage of this paradigm by creating multiple wireless channels to interconnect far-apart cores. This strategy is effective as the complement of wired NoCs at moderate scales, but is likely to be overshadowed at larger scales by technologies such as nanophotonics unless bandwidth is unrealistically improved. This dissertation presents the concept of Broadcast-Oriented Wireless Network-on-Chip (BoWNoC), a new approach that attempts to foster the inherent simplicity, flexibility, and broadcast capabilities of the wireless technology by integrating one on-chip antenna and transceiver per processor core. This paradigm is part of a broader hybrid vision where the BoWNoC serves latency-critical and broadcast traffic, tightly coupled to a wired plane oriented to large flows of data. By virtue of its scalable broadcast support, BoWNoC may become the key enabler of a wealth of unconventional hardware architectures and algorithmic approaches, eventually leading to a significant improvement of the performance, energy efficiency, scalability and programmability of manycore chips. The present work aims not only to lay the fundamentals of the BoWNoC paradigm, but also to demonstrate its viability from the electronic implementation, network design, and multiprocessor architecture perspectives. An exploration at the physical level of design validates the feasibility of the approach at millimeter-wave bands in the short term, and then suggests the use of graphene-based antennas in the terahertz band in the long term. At the link level, this thesis provides an insightful context analysis that is used, afterwards, to drive the design of a lightweight protocol that reliably serves broadcast traffic with substantial latency improvements over state-of-the-art NoCs. At the network level, our hybrid vision is evaluated putting emphasis on the flexibility provided at the network interface level, showing outstanding speedups for a wide set of traffic patterns. At the architecture level, the potential impact of the BoWNoC paradigm on the design of manycore chips is not only qualitatively discussed in general, but also quantitatively assessed in a particular architecture for fast synchronization. Results demonstrate that the impact of BoWNoC can go beyond simply improving the network performance, thereby representing a possible game changer in the manycore era.Avenços en el disseny de multiprocessadors han portat a una àmplia adopció dels Chip Multiprocessors (CMPs), que basen el seu potencial en la operació coordinada de múltiples nuclis de procés. Generacions successives han anat integrant més nuclis en la recerca d'alt rendiment amb un cost raonable. Per a que aquesta tendència continuï, però, cal resoldre importants problemes d'escalabilitat a diferents capes de disseny. Escalar la xarxa d'interconnexió és un gran repte en ell mateix, ja que les noves propostes de Networks-on-Chip (NoC) han de servir un tràfic eminentment variable i heterogeni dels processadors amb molts nuclis. Són necessàries solucions ràpides i flexibles per evitar que les comunicacions dins del xip es converteixin en el pròxim coll d'ampolla de rendiment, situació que limitaria en gran mesura l'espai de disseny a nivell d'arquitectura i portaria a l'ús d'arquitectures i models de programació lents, ineficients o poc programables. L'aparició de noves tecnologies d'interconnexió ha possibilitat la creació de NoCs més flexibles i escalables. En particular, la comunicació intra-xip sense fils ha despertat un interès considerable en virtut de les seva baixa latència, simplicitat, i bon rendiment amb tràfic broadcast. La majoria de les Wireless NoC (WNoC) proposades fins ara s'han centrat en aprofitar l'avantatge en termes de latència d'aquest nou paradigma creant múltiples canals sense fils per interconnectar nuclis allunyats entre sí. Aquesta estratègia és efectiva per complementar a NoCs clàssiques en escales mitjanes, però és probable que altres tecnologies com la nanofotònica puguin jugar millor aquest paper a escales més grans. Aquesta tesi presenta el concepte de Broadcast-Oriented WNoC (BoWNoC), un nou enfoc que intenta rendibilitzar al màxim la inherent simplicitat, flexibilitat, i capacitats broadcast de la tecnologia sense fils integrant una antena i transmissor/receptor per cada nucli del processador. Aquest paradigma forma part d'una visió més àmplia on un BoWNoC serviria tràfic broadcast i urgent, mentre que una xarxa convencional serviria fluxos de dades més pesats. En virtut de la escalabilitat i del seu suport broadcast, BoWNoC podria convertir-se en un element clau en una gran varietat d'arquitectures i algoritmes poc convencionals que milloressin considerablement el rendiment, l'eficiència, l'escalabilitat i la programabilitat de processadors amb molts nuclis. El present treball té com a objectius no només estudiar els aspectes fonamentals del paradigma BoWNoC, sinó també demostrar la seva viabilitat des dels punts de vista de la implementació, i del disseny de xarxa i arquitectura. Una exploració a la capa física valida la viabilitat de l'enfoc usant tecnologies longituds d'ona milimètriques en un futur proper, i suggereix l'ús d'antenes de grafè a la banda dels terahertz ja a més llarg termini. A capa d'enllaç, la tesi aporta una anàlisi del context de l'aplicació que és, més tard, utilitzada per al disseny d'un protocol d'accés al medi que permet servir tràfic broadcast a baixa latència i de forma fiable. A capa de xarxa, la nostra visió híbrida és avaluada posant èmfasi en la flexibilitat que aporta el fet de prendre les decisions a nivell de la interfície de xarxa, mostrant grans millores de rendiment per una àmplia selecció de patrons de tràfic. A nivell d'arquitectura, l'impacte que el concepte de BoWNoC pot tenir sobre el disseny de processadors amb molts nuclis no només és debatut de forma qualitativa i genèrica, sinó també avaluat quantitativament per una arquitectura concreta enfocada a la sincronització. Els resultats demostren que l'impacte de BoWNoC pot anar més enllà d'una millora en termes de rendiment de xarxa; representant, possiblement, un canvi radical a l'era dels molts nuclisAward-winningPostprint (published version

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa