Architectural Exploration and Design Methodologies of Photonic Interconnection Networks
Photonic technology is becoming an increasingly attractive solution to the problems facing today's electronic chip-scale interconnection networks. Recent progress in silicon photonics research has enabled the demonstration of all the necessary optical building blocks for creating extremely high-bandwidth density and energy-efficient links for on- and off-chip communications. From the feasibility and architecture perspective however, photonics represents a dramatic paradigm shift from traditional electronic network designs due to fundamental differences in how electronics and photonics function and behave. As a result of these differences, new modeling and analysis methods must be employed in order to properly realize a functional photonic chip-scale interconnect design. In this work, we present a methodology for characterizing and modeling fundamental photonic building blocks which can subsequently be combined to form full photonic network architectures. We also describe a set of tools which can be utilized to assess the physical-layer and system-level performance properties of a photonic network. The models and tools are integrated in a novel open-source design and simulation environment called PhoenixSim. Next, we leverage PhoenixSim for the study of chip-scale photonic networks. We examine several photonic networks through the synergistic study of both physical-layer metrics and system-level metrics. This holistic analysis method enables us to provide deeper insight into architecture scalability since it considers insertion loss, crosstalk, and power dissipation. In addition to these novel physical-layer metrics, traditional system-level metrics of bandwidth and latency are also obtained. Lastly, we propose a novel routing architecture known as wavelength-selective spatial routing. This routing architecture is analogous to electronic virtual channels since it enables the transmission of multiple logical optical channels through a single physical plane (i.e. 
the waveguides). The available wavelength channels are partitioned into separate groups, and each group is routed independently in the network. Each partition is spectrally multiplexed, as opposed to temporally multiplexed in the electronic case. The wavelength-selective spatial routing technique benefits network designers by providing lower contention and increased path diversity.
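The partitioning step described above can be illustrated with a minimal sketch. It assumes wavelength channels are identified by integer indices and split into contiguous groups of near-equal size; the function name `partition_wavelengths` and the contiguous-split policy are illustrative assumptions, not the authors' implementation.

```python
def partition_wavelengths(num_channels, num_groups):
    """Split channel indices 0..num_channels-1 into contiguous groups.

    Hypothetical helper: the abstract only states that channels are
    partitioned into groups that are routed independently. The
    contiguous, near-equal split used here is an assumption.
    """
    if num_groups <= 0 or num_groups > num_channels:
        raise ValueError("num_groups must be between 1 and num_channels")
    base, extra = divmod(num_channels, num_groups)
    groups = []
    start = 0
    for g in range(num_groups):
        # The first `extra` groups receive one additional channel.
        size = base + (1 if g < extra else 0)
        groups.append(list(range(start, start + size)))
        start += size
    return groups


if __name__ == "__main__":
    # Eight channels split into three independently routed groups.
    print(partition_wavelengths(8, 3))
```

Each returned group would then act as one logical optical channel sharing the same physical waveguides, which is where the analogy to electronic virtual channels comes from.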
An Energy and Performance Exploration of Network-on-Chip Architectures
In this paper, we explore the designs of a circuit-switched router, a wormhole router, a quality-of-service (QoS) supporting virtual channel router, and a speculative virtual channel router, and accurately evaluate the energy-performance tradeoffs they offer. Power results from the designs placed and routed in a 90-nm CMOS process show that all the architectures dissipate significant idle state power. The additional energy required to route a packet through the router is then shown to be dominated by the data path. This leads to the key result that, if this trend continues, the use of more elaborate control can be justified and will not be immediately limited by the energy budget. A performance analysis also shows that dynamic resource allocation leads to the lowest network latencies, while static allocation may be used to meet QoS goals. Combining the power and performance figures then allows an energy-latency product to be calculated to judge the efficiency of each of the networks. The speculative virtual channel router was shown to have an efficiency very similar to that of the wormhole router while providing better performance, supporting its use for general purpose designs. Finally, area metrics are also presented to allow a comparison of implementation costs.
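The energy-latency product mentioned above is simply the product of the two figures, with lower values indicating a more efficient design. The sketch below shows how such a ranking could be computed; the router names and numbers in the usage section are placeholders invented for illustration and are not results from the paper.

```python
def energy_latency_product(energy_per_packet, avg_latency):
    """Return the energy-latency product (lower is more efficient)."""
    return energy_per_packet * avg_latency


def rank_routers(measurements):
    """Sort router names by ascending energy-latency product.

    `measurements` maps a router name to a tuple
    (energy_per_packet, avg_latency). Units are left to the caller
    and must be consistent across entries.
    """
    return sorted(
        measurements,
        key=lambda name: energy_latency_product(*measurements[name]),
    )


if __name__ == "__main__":
    # Placeholder values only, NOT measured data from the paper.
    example = {"router_a": (10.0, 20.0), "router_b": (12.0, 15.0)}
    print(rank_routers(example))
```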
A scalable multi-core architecture with heterogeneous memory structures for Dynamic Neuromorphic Asynchronous Processors (DYNAPs)
Neuromorphic computing systems comprise networks of neurons that use
asynchronous events for both computation and communication. This type of
representation offers several advantages in terms of bandwidth and power
consumption in neuromorphic electronic systems. However, managing the traffic
of asynchronous events in large scale systems is a daunting task, both in terms
of circuit complexity and memory requirements. Here we present a novel routing
methodology that employs both hierarchical and mesh routing strategies and
combines heterogeneous memory structures for minimizing both memory
requirements and latency, while maximizing programming flexibility to support a
wide range of event-based neural network architectures, through parameter
configuration. We validated the proposed scheme in a prototype multi-core
neuromorphic processor chip that employs hybrid analog/digital circuits for
emulating synapse and neuron dynamics together with asynchronous digital
circuits for managing the address-event traffic. We present a theoretical
analysis of the proposed connectivity scheme, describe the methods and circuits
used to implement such a scheme, and characterize the prototype chip. Finally, we
demonstrate the use of the neuromorphic processor with a convolutional neural
network for the real-time classification of visual symbols being flashed to a
dynamic vision sensor (DVS) at high speed. Comment: 17 pages, 14 figures
Modeling and Analysis of the Performance of Exascale Photonic Networks
"This is the peer reviewed version of the following article: Duro, José, Jose A. Pascual, Salvador Petit, Julio Sahuquillo, and María E. Gómez. 2018. Modeling and Analysis of the Performance of Exascale Photonic Networks. Concurrency and Computation: Practice and Experience 31 (21). Wiley. doi:10.1002/cpe.4773, which has been published in final form at https://doi.org/10.1002/cpe.4773. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving."
Photonics technology has become a promising and viable alternative for both on-chip and off-chip interconnection networks of future Exascale systems. Nevertheless, this technology is not yet mature enough in this context, so research efforts focusing on photonic networks are still required to achieve realistic and suitable network implementations. In this regard, system-level photonic network simulators can help designers assess the multiple design choices. Most current research is done on electrical network simulators, whose components work very differently from photonic components. In this work, we summarize and compare the working behavior of both technologies, which includes the use of optical routers, wavelength-division multiplexing, and circuit switching, among others. After implementing them in a well-known simulation framework, we carried out an extensive simulation study using realistic photonic network configurations with synthetic and realistic traffic. Experimental results show that, compared to electrical networks, optical networks can reduce the execution time of the studied real workloads by almost one order of magnitude. Our study also reveals that the photonic configuration strongly affects network performance, with the bandwidth per channel and the message length being the most important parameters.
This work was supported by the ExaNeSt project, funded by the European Union's Horizon 2020 Research and Innovation Program under grant 671553, and by the Spanish Ministerio de Economía y Competitividad (MINECO) and Plan E funds under grant TIN2015-66972-C5-1-R. Pascual was supported by a HiPEAC Collaboration Grant.
Duro-Gómez, J.; Pascual Pérez, JA.; Petit Martí, SV.; Sahuquillo Borrás, J.; Gómez Requena, ME. (2019). Modeling and Analysis of the Performance of Exascale Photonic Networks. Concurrency and Computation Practice and Experience. 31(21):1-12. https://doi.org/10.1002/cpe.4773
Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine
Deep neural networks have achieved impressive results in computer vision and
machine learning. Unfortunately, state-of-the-art networks are extremely
compute and memory intensive, which makes them unsuitable for mW-devices such as
IoT end-nodes. Aggressive quantization of these networks dramatically reduces
the computation and memory footprint. Binary-weight neural networks (BWNs)
follow this trend, pushing weight quantization to the limit. Hardware
accelerators for BWNs presented up to now have focused on core efficiency,
disregarding I/O bandwidth and system-level efficiency that are crucial for
deployment of accelerators in ultra-low power devices. We present Hyperdrive, a
BWN accelerator that dramatically reduces I/O bandwidth by exploiting a novel
binary-weight streaming approach. It can be used for arbitrarily sized
convolutional neural network architectures and input resolutions by exploiting
the natural scalability of the compute units at both chip level and
system level: Hyperdrive chips are arranged systolically in a 2D mesh and
process the entire feature map together in parallel. Hyperdrive achieves 4.3
TOp/s/W system-level efficiency (i.e., including I/Os), 3.1x higher than
state-of-the-art BWN accelerators, even though its core uses resource-intensive
FP16 arithmetic for increased robustness.
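To make the idea of binary weights concrete, the sketch below computes a dot product in which every weight is restricted to +1 or -1, so each multiply reduces to an add or a subtract while activations stay in floating point. The boolean sign encoding and the function name `binary_weight_dot` are assumptions chosen for illustration; this is not a description of the Hyperdrive datapath.

```python
def binary_weight_dot(activations, weight_signs):
    """Dot product with weights restricted to +1 / -1.

    `weight_signs[i]` is True for a weight of +1 and False for -1
    (an assumed encoding). No multiplications are needed: each
    activation is either added to or subtracted from the accumulator.
    """
    if len(activations) != len(weight_signs):
        raise ValueError("activations and weight_signs must have equal length")
    acc = 0.0
    for a, positive in zip(activations, weight_signs):
        acc = acc + a if positive else acc - a
    return acc


if __name__ == "__main__":
    # 1.0 - 2.0 + 3.0
    print(binary_weight_dot([1.0, 2.0, 3.0], [True, False, True]))
```

Storing one bit per weight instead of a full-precision value is what shrinks the memory footprint that the abstract refers to.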
An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration
We empirically evaluate an undervolting technique, i.e., underscaling the
circuit supply voltage below the nominal level, to improve the power-efficiency
of Convolutional Neural Network (CNN) accelerators mapped to Field Programmable
Gate Arrays (FPGAs). Undervolting below a safe voltage level can lead to timing
faults due to excessive circuit latency increase. We evaluate the
reliability-power trade-off for such accelerators. Specifically, we
experimentally study the reduced-voltage operation of multiple components of
real FPGAs, characterize the corresponding reliability behavior of CNN
accelerators, propose techniques to minimize the drawbacks of reduced-voltage
operation, and combine undervolting with architectural CNN optimization
techniques, i.e., quantization and pruning. We investigate the effect of
environmental temperature on the reliability-power trade-off of such
accelerators. We perform experiments on three identical samples of modern
Xilinx ZCU102 FPGA platforms with five state-of-the-art image classification
CNN benchmarks. This approach allows us to study the effects of our
undervolting technique for both software and hardware variability. We achieve
more than 3X power-efficiency (GOPs/W) gain via undervolting. 2.6X of this gain
is the result of eliminating the voltage guardband region, i.e., the safe
voltage region below the nominal level that is set by the FPGA vendor to ensure
correct functionality in worst-case environmental and circuit conditions. 43%
of the power-efficiency gain is due to further undervolting below the
guardband, which comes at the cost of accuracy loss in the CNN accelerator. We
evaluate an effective frequency underscaling technique that prevents this
accuracy loss, and find that it reduces the power-efficiency gain from 43% to
25%. Comment: To appear at the DSN 2020 conference
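The gain figures quoted in the abstract above can be cross-checked with a short calculation. The sketch assumes the two contributions compose multiplicatively, i.e. the further undervolting adds its percentage on top of the 2.6X guardband gain; that reading is an assumption, since the abstract does not state how the figures combine.

```python
def combined_gain(guardband_gain, extra_gain_fraction):
    """Overall power-efficiency gain under a multiplicative assumption.

    guardband_gain: gain factor from removing the voltage guardband.
    extra_gain_fraction: additional fractional gain from undervolting
    below the guardband (0.43 means 43%).
    """
    return guardband_gain * (1.0 + extra_gain_fraction)


if __name__ == "__main__":
    # Figures taken from the abstract: 2.6X, then 43% or 25% extra.
    print(combined_gain(2.6, 0.43))  # without frequency underscaling
    print(combined_gain(2.6, 0.25))  # with frequency underscaling
```

Under this reading, both cases come out above 3X (roughly 3.7X and 3.25X), which is consistent with the "more than 3X" claim.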