1,033 research outputs found

    Lower-Bound on Blocking Probability of A Class of Crosstalk-free Optical Cross-connects (OXCs)

    Novel Cache Hierarchies with Photonic Interconnects for Chip Multiprocessors

    Current multicores face the challenge of sharing resources among the different processor cores. Two main shared resources act as major performance bottlenecks in current designs: the off-chip main memory bandwidth and the last level cache. Additionally, as the core count grows, the network on chip is also becoming a potential performance bottleneck, since traditional designs may run into scalability issues in the near future. Almost every current design implements a memory hierarchy connected through fast interconnects, since this organization reduces the number of off-chip accesses and the overall memory latency. Main memory, caches, and interconnection resources, together with other widely used techniques like prefetching, help alleviate the huge memory access latencies and limit the impact of the core-memory speed gap. However, sharing these resources raises several concerns, one of the most challenging being the management of inter-application interference. Since almost every running application needs to access main memory, all of them are exposed to interference from co-runners on their way to the memory controller. For this reason, making efficient use of the available cache space, together with achieving fast and scalable interconnects, is critical to sustain performance in current and future designs. This dissertation analyzes and addresses the most important shortcomings of two major shared resources: the Last Level Cache (LLC) and the Network on Chip (NoC). First, we study the scalability of both electrical and optical NoCs for future multicores and many-cores. To perform this study, we model optical interconnects in a cycle-accurate multicore simulation framework. A proper model is required; otherwise, important performance deviations may be observed in the evaluation results. The study reveals that, as the core count grows, the effect of distance on the end-to-end latency can negatively impact processor performance. The study also shows that silicon nanophotonics is a viable solution to these latency problems. This dissertation is also motivated by important design concerns related to current memory hierarchies, such as the oversizing of private cache space, data replication overheads, and the lack of flexibility in sharing cache structures. These issues, which can be overcome in high performance processors by virtue of huge LLCs, can compromise performance in low power processors. To address them, we propose a more efficient cache hierarchy organization that leverages optical interconnects. The proposed architecture is conceived as an optically interconnected two-level cache hierarchy composed of multiple cache modules that can be dynamically turned on and off independently. Experimental results show that, compared to conventional designs, static energy consumption is reduced by up to 60% while achieving similar performance. Finally, we extend the proposal to support both sequential and parallel applications. This extension is required because the proposal adapts to the dynamic cache space needs of the running applications, and the behavior of multithreaded applications differs widely from that of single-threaded programs. Coherence management is also addressed, which is challenging because, in the proposed approach, each cache module can be assigned to any core at a given time. For parallel applications, the evaluation shows that the proposal achieves up to 78% static energy savings. In summary, this thesis tackles major challenges arising from the sharing of on-chip caches and communication resources in current multicores, and proposes new cache hierarchy organizations that leverage optical interconnects to address them. The proposed organizations reduce both static and dynamic energy consumption compared to conventional approaches while achieving similar performance, which results in better energy efficiency.
    Puche Lara, J. (2021). Novel Cache Hierarchies with Photonic Interconnects for Chip Multiprocessors [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/165254
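
The abstract's point that distance hurts latency as the core count grows can be illustrated with a small calculation. The sketch below is not taken from the thesis: it assumes a square k x k mesh, minimal (Manhattan) routing, and uniformly distributed traffic between distinct nodes, and the function name `mean_hops` is hypothetical.

```python
from itertools import product


def mean_hops(k: int) -> float:
    """Mean Manhattan distance between distinct nodes of a k x k mesh.

    Assumes minimal routing and uniform traffic (illustrative assumptions).
    """
    nodes = list(product(range(k), repeat=2))
    total = 0
    pairs = 0
    for (x1, y1), (x2, y2) in product(nodes, repeat=2):
        if (x1, y1) != (x2, y2):
            total += abs(x1 - x2) + abs(y1 - y2)
            pairs += 1
    return total / pairs


if __name__ == "__main__":
    # The average hop count keeps growing as the mesh gets larger.
    for k in (2, 4, 8, 16):
        print(f"{k * k:4d} cores: {mean_hops(k):.2f} hops on average")
```

Under these assumptions the average hop count grows roughly linearly with the mesh side, which is one way to see why per-hop latency becomes more significant at higher core counts.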

    Center for Aeronautics and Space Information Sciences

    This report summarizes the research done during 1991/92 under the Center for Aeronautics and Space Information Sciences (CASIS) program. The topics covered are computer architecture, networking, and neural nets.

    Exploiting Properties of CMP Cache Traffic in Designing Hybrid Packet/Circuit Switched NoCs

    Chip multiprocessors with few to tens of processing cores are already commercially available. Increased scaling of technology is making it feasible to integrate even more cores on a single chip. Providing the cores with fast access to data is vital to overall system performance. When a core requires access to a piece of data, the core's private cache memory is searched first. If a miss occurs, the data is looked up in the next level(s) of the memory hierarchy, where often one or more levels of cache are shared between two or more cores. Communication between the cores and the slices of the on-chip shared cache is carried through the network-on-chip (NoC). Interestingly, the cache and the NoC mutually affect each other's operation: communication over the NoC affects the access latency of cache data, while the cache organization generates the coherence and data messages, thus affecting the communication patterns and latency over the NoC. This thesis considers hybrid packet/circuit switched NoCs, i.e., packet switched NoCs enhanced with the ability to configure circuits. The communication and performance benefits that come from using circuits are predicated on amortizing the time cost incurred for configuring the circuits. To address this challenge, NoC designs are proposed that take advantage of properties of the cache traffic, namely temporal locality and predictability, to amortize or hide the circuit configuration time cost. First, a coarse-grained circuit configuration policy is proposed that exploits the temporal locality in the cache traffic to periodically configure circuits for the heavily communicating nodes. This allows the design of a locality-aware cache that promotes temporal communication locality through data placement, while designing suitable data replacement and migration policies. Next, a fine-grained configuration policy, called Déjà Vu switching, is proposed for leveraging the predictability of data messages by initiating a circuit configuration as soon as a cache hit is detected and before the data becomes available. Its benefit is demonstrated for saving interconnect energy in multi-plane NoCs. Finally, a more proactive configuration policy is proposed for fast caches, where circuit reservations are initiated by request messages, which can greatly improve communication latency and system performance.
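
The amortization argument above can be made concrete with a break-even calculation. This is a minimal sketch under a deliberately simple latency model that is not from the thesis: a packet pays a router delay plus a link delay at every hop, a configured circuit pays only the link delay, and the setup cost is paid once. All function names and parameter values are hypothetical.

```python
import math


def packet_latency(hops, router_cycles, link_cycles=1):
    """Cycles for one message over the packet-switched path (assumed model)."""
    return hops * (router_cycles + link_cycles)


def circuit_latency(hops, link_cycles=1):
    """Cycles for one message over an already configured circuit (assumed model)."""
    return hops * link_cycles


def break_even_messages(hops, router_cycles, setup_cycles, link_cycles=1):
    """Smallest message count at which the circuit is no slower overall.

    Returns None when the circuit saves nothing per message.
    """
    saving = packet_latency(hops, router_cycles, link_cycles) - circuit_latency(hops, link_cycles)
    if saving <= 0:
        return None
    return math.ceil(setup_cycles / saving)


if __name__ == "__main__":
    # Hypothetical numbers: 4 hops, 3-cycle routers, 30-cycle circuit setup.
    print(break_even_messages(hops=4, router_cycles=3, setup_cycles=30))
```

In this toy model a circuit only pays off when enough messages reuse it, which is why exploiting temporal locality or hiding the setup behind predictable events matters.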

    Proximity coherence for chip-multiprocessors

    Many-core architectures provide an efficient way of harnessing the growing numbers of transistors available in modern fabrication processes; however, the parallel programs run on these platforms are increasingly limited by the energy and latency costs of communication. Existing designs provide a functional communication layer but do not necessarily implement the most efficient solution for chip-multiprocessors, placing limits on the performance of these complex systems. In an era of increasingly power limited silicon design, efficiency is now a primary concern that motivates designers to look again at the challenge of cache coherence. The first step in the design process is to analyse the communication behaviour of parallel benchmark suites such as Parsec and SPLASH-2. This thesis presents work detailing the sharing patterns observed when running the full benchmarks on a simulated 32-core x86 machine. The results reveal considerable locality of shared data accesses between threads with consecutive operating system assigned thread IDs. This pattern, although of little consequence in a multi-node system, corresponds to strong physical locality of shared data between adjacent cores on a chip-multiprocessor platform. Traditional cache coherence protocols, although often used in chip-multiprocessor designs, have been developed in the context of older multi-node systems. By redesigning coherence protocols to exploit new patterns such as the physical locality of shared data, improving the efficiency of communication, specifically in chip-multiprocessors, is possible. This thesis explores such a design – Proximity Coherence – a novel scheme in which L1 load misses are optimistically forwarded to nearby caches via new dedicated links rather than always being indirected via a directory structure. EPSRC DTA research scholarshi
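
The idea of trying nearby caches before the directory can be sketched as a toy lookup. This is only an illustration and not the protocol from the thesis: it ignores coherence states, permissions, and timing, assumes cores laid out on a 2D grid with links to the four adjacent cores, and uses hypothetical names (`Core`, `neighbours`, `handle_load_miss`).

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    core_id: int
    l1: set = field(default_factory=set)  # addresses currently held in L1


def neighbours(core_id, width, height):
    """IDs of the cores directly adjacent on an assumed width x height grid."""
    x, y = core_id % width, core_id // width
    result = []
    for dx, dy in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            result.append(ny * width + nx)
    return result


def handle_load_miss(cores, requester, address, width, height):
    """Probe adjacent L1 caches first; fall back to the directory otherwise."""
    for n in neighbours(requester, width, height):
        if address in cores[n].l1:
            return f"neighbour {n}"
    return "directory"


if __name__ == "__main__":
    cores = [Core(i) for i in range(4)]  # hypothetical 2 x 2 chip
    cores[1].l1.add(0x40)
    print(handle_load_miss(cores, 0, 0x40, 2, 2))
    print(handle_load_miss(cores, 0, 0x80, 2, 2))
```

A real scheme would also have to keep the directory consistent with any data supplied by a neighbour; that bookkeeping is deliberately left out here.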

    The effect of an optical network on-chip on the performance of chip multiprocessors

    Optical networks on-chip (ONoCs) have been proposed to reduce power consumption and increase bandwidth density in high performance chip multiprocessors (CMPs), compared to electrical NoCs. However, as buffering in an ONoC is not viable, the end-to-end message path needs to be acquired in advance, during which time the message is buffered at the network ingress. This waiting latency is therefore a combination of path setup latency and contention, and forms a significant part of the total message latency. Many proposed ONoCs, such as Single Writer, Multiple Reader (SWMR), avoid path setup latency at the expense of more optical components. In contrast, this thesis investigates a simple circuit-switched ONoC with a lower component count, where nodes need to request a channel before transmission. To hide the path setup latency, a coherence-based message predictor is proposed to set up circuits before message arrival. Firstly, the effect of latency and bandwidth on application performance is thoroughly investigated using full-system simulations of shared memory CMPs. It is shown that the latency of an ideal NoC affects the CMP performance more than the NoC bandwidth. Increasing the number of wavelengths per channel decreases the serialisation latency and improves the performance of both ONoC types. With 2 or more wavelengths modulating at 25 Gbit/s, the ONoCs outperform a conventional electrical mesh (maximal speedup of 20%). The SWMR ONoC outperforms the circuit-switched ONoC. Next, coherence-based prediction techniques are proposed to reduce the waiting latency. The ideal coherence-based predictor reduces the waiting latency by 42%. A more streamlined predictor (smaller than an L1 cache) reduces the waiting latency by 31%. Without prediction, the message latency in the circuit-switched ONoC is 11% larger than in the SWMR ONoC. Applying the realistic predictor reverses this: the message latency in the SWMR ONoC is now 18% larger than in the predictive circuit-switched ONoC.
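
The relation between wavelength count and serialisation latency mentioned above can be sketched numerically. The 25 Gbit/s per-wavelength rate comes from the abstract; the 512-bit message size (a 64-byte cache line) and the function name are assumptions made for illustration, and the model ignores headers, encoding overhead, and time of flight.

```python
def serialisation_latency_ns(message_bits, wavelengths, gbit_per_s_per_wavelength=25.0):
    """Time to push a message onto a channel, in nanoseconds.

    Bits divided by Gbit/s gives nanoseconds directly. Assumes the message
    is striped evenly across all wavelengths (illustrative assumption).
    """
    if wavelengths < 1:
        raise ValueError("at least one wavelength is required")
    return message_bits / (wavelengths * gbit_per_s_per_wavelength)


if __name__ == "__main__":
    # Hypothetical 512-bit message over 1, 2, 4 and 8 wavelengths.
    for w in (1, 2, 4, 8):
        print(f"{w} wavelength(s): {serialisation_latency_ns(512, w):.2f} ns")
```

Under this model, doubling the number of wavelengths halves the serialisation latency, with diminishing absolute gains as the count grows.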

    Towards Cache-Coherent Chiplet-Based Architectures with Wireless Interconnects

    Cache-coherent chiplet-based architectures have gained significant attention due to their potential for scalability and improved performance in modern computing systems. However, the interconnects in such architectures often pose challenges in maintaining cache coherence across chiplets, leading to increased latency and energy consumption. This thesis explores the feasibility and advantages of integrating wireless interconnects into cache-coherent chiplet-based architectures. Through extensive simulations of 16- and 64-core systems segmented into 4 and 8 chiplets with multiple inter-chiplet latencies, we debug and obtain traffic data. By studying the inter-chiplet traffic for different chiplet-based configurations and analyzing it in terms of spatial, temporal and time variance, we find that chiplet scaling degrades performance. Further, we formulate the impact of hybrid wired and wireless interconnects and assess the potential performance benefits they offer. The findings from this research will contribute to the design and optimization of cache-coherent chiplet-based architectures, shedding light on the practicality and advantages of utilizing wireless interconnects in future computing systems.

    Evaluation of temperature-performance trade-offs in wireless network-on-chip architectures

    Continued scaling of device geometries according to Moore's Law is enabling complete end-user systems on a single chip. Massive multicore processors are enablers for many information and communication technology (ICT) innovations spanning various domains, including healthcare, defense, and entertainment. In the design of high-performance massive multicore chips, power and heat are dominant constraints. Temperature hotspots witnessed in multicore systems exacerbate the problem of reliability in deep submicron technologies. Hence, there is a great need to explore holistic power and thermal optimization and management strategies for massive multicore chips. High power consumption not only raises chip temperature and cooling cost, but also decreases chip reliability and performance. Thus, addressing thermal concerns at different stages of design and operation is critical to the success of future generation systems. The performance of a multicore chip is also influenced by its overall communication infrastructure, which is predominantly a Network-on-Chip (NoC). The existing method of implementing a NoC with planar metal interconnects is deficient due to high latency, significant power consumption, and temperature hotspots arising from the long, multi-hop wireline links used in data exchange. On-chip wireless networks are envisioned as an enabling technology for designing low power and high bandwidth massive multicore architectures. However, optimizing wireless NoCs for best performance does not necessarily guarantee a thermally optimal interconnection architecture. The wireless links, being highly efficient, attract very high traffic densities, which in turn result in temperature hotspots. Therefore, while the wireless links deliver better performance and energy efficiency, they can also cause temperature hotspots and undermine the reliability of the system. Consequently, the location and utilization of the wireless links are important factors in the thermal optimization of high performance wireless Networks-on-Chip. Architectural innovation in conjunction with suitable power and thermal management strategies is the key to designing high performance yet energy-efficient massive multicore chips. This work contributes to the exploration of various design methodologies for establishing wireless NoC architectures that achieve the best trade-offs between temperature, performance and energy efficiency. It further demonstrates that incorporating Dynamic Thermal Management (DTM) on a multicore chip designed with such temperature- and performance-optimized wireless Network-on-Chip architectures improves the thermal profile while simultaneously providing lower latency and reduced network energy dissipation compared to its conventional counterparts.

    Center for Space Microelectronics Technology 1988-1989 technical report

    The 1988 to 1989 Technical Report of the JPL Center for Space Microelectronics Technology summarizes the technical accomplishments, publications, presentations, and patents of the center. Listed are 321 publications, 282 presentations, and 140 new technology reports and patents

    Modeling and Analysis of the Performance of Exascale Photonic Networks

    "This is the peer reviewed version of the following article: Duro, José, Jose A. Pascual, Salvador Petit, Julio Sahuquillo, and María E. Gómez. 2018. Modeling and Analysis of the Performance of Exascale Photonic Networks. Concurrency and Computation: Practice and Experience 31 (21). Wiley. doi:10.1002/cpe.4773, which has been published in final form at https://doi.org/10.1002/cpe.4773. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving."
    Photonics technology has become a promising and viable alternative for both on-chip and off-chip interconnection networks of future Exascale systems. Nevertheless, this technology is not yet mature enough in this context, so research efforts focusing on photonic networks are still required to achieve realistic and suitable network implementations. In this regard, system-level photonic network simulators can help designers assess the multiple design choices. Most current research is done on electrical network simulators, whose components work very differently from photonic components. In this work, we summarize and compare the working behavior of both technologies, which includes the use of optical routers, wavelength-division multiplexing and circuit switching, among others. After implementing them in a well-known simulation framework, an extensive simulation study has been carried out using realistic photonic network configurations with synthetic and realistic traffic. Experimental results show that, compared to electrical networks, optical networks can reduce the execution time of the studied real workloads by almost one order of magnitude. Our study also reveals that the photonic configuration strongly affects network performance, with the bandwidth per channel and the message length being the most important parameters.
    This work was supported by the ExaNeSt project, funded by the European Union's Horizon 2020 Research and Innovation Program under grant 671553, and by the Spanish Ministerio de Economía y Competitividad (MINECO) and Plan E funds under grant TIN2015-66972-C5-1-R. Pascual was supported by a HiPEAC Collaboration Grant.
    Duro-Gómez, J.; Pascual Pérez, JA.; Petit Martí, SV.; Sahuquillo Borrás, J.; Gómez Requena, ME. (2019). Modeling and Analysis of the Performance of Exascale Photonic Networks. Concurrency and Computation: Practice and Experience. 31(21):1-12. https://doi.org/10.1002/cpe.4773