I. INTRODUCTION
The pace of the semiconductor industry has slowed notably, especially since the 14 nm technology node. This is driven by physical limitations relating to leakage current, thermal dissipation density, and fabrication process control (e.g. gate-oxide thickness control) that can no longer be circumvented. As a compromise, 'dark silicon', i.e. the part of the on-chip circuitry that must remain powered off because of the power and thermal budget, has emerged as a consequence of both extremely scaled devices and the adoption of parallelism in computing, i.e. multi-core processing [1] [2] [3].
These front-end challenges ripple through to the data-communication back-end; for instance, the electrical capacitance of a wire at the 14 nm technology node is 1.65 pF/cm. Thus over 800 fJ of energy is dissipated to charge a 1 cm metallic wire given VDD = 1 V. With rapidly rising machine performance the communicate-to-compute overhead is increasing, making a case to use the bosonic nature of photons to enter a flatter scaling regime for communication technologies [4]. That is, silicon photonics and possibly plasmonics may be integrated on-chip while mitigating waveguide-to-waveguide spacings of several micrometers [5] [6]. As such, the often-stated diffraction limit of light (DLL) is not so much of a limit by itself, since the high refractive index of semiconductors reduces the modal cutoff of a waveguide to about 200 nm at NIR wavelengths. In fact, it is not obvious why the on-chip operating wavelength ought to be a telecommunication wavelength; visible frequencies are conceivable for intra-chip applications operating on silicon-nitride-on-insulator substrates, reducing the DLL to <100 nm, which is smaller than the width of a modern transistor. The actual challenge of photonics is the fundamentally weak light-matter interaction (LMI), originating from the small dipole moment of the optical wave acting on the atoms of the material, which leads to interaction lengths of tens to hundreds of micrometers for optoelectronic devices. However, making the photon more polaritonic (matter-like), such as in plasmonics, enables strong LMIs and hence short devices, which benefits device performance [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23]. Positive effects of wavelength-scale active optoelectronics are a) low electrical capacitance, b) short photon lifetimes allowing rapid re-excitation of the device (e.g. modulation, small-signal gain modulation of lasers), and c) high energy efficiency due to the small capacitance and voltage, enabling dense and high-performing devices. Nevertheless, a polaritonic waveguide has naturally high optical losses, limiting signal propagation to less than a hundred micrometers. Thus, by combining low-propagation-loss silicon photonic links with ultra-fast plasmonic active devices, a hybrid interconnect pairs high-LMI active optoelectronics with low-loss passive photonic elements, enabling high-performance hybrid photonic-plasmonic interconnects (HyPPI) [5].
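As a quick check of the wire-charging figure quoted above, the following sketch evaluates E = (1/2) C V^2 for the stated capacitance per unit length and supply voltage. The choice of the (1/2) C V^2 expression and the function name are assumptions made for illustration; the text itself only states the capacitance, the wire length, VDD, and the "over 800 fJ" result.

```python
# Energy needed to charge an on-chip wire, assuming E = 1/2 * C * V^2.
CAP_PER_CM_F = 1.65e-12  # wire capacitance per unit length (F/cm), from the text
VDD_V = 1.0              # supply voltage (V), from the text


def wire_charge_energy_j(length_cm, cap_per_cm_f=CAP_PER_CM_F, vdd_v=VDD_V):
    """Return the energy (J) associated with charging a wire of the given length."""
    total_capacitance_f = cap_per_cm_f * length_cm
    return 0.5 * total_capacitance_f * vdd_v ** 2


energy_fj = wire_charge_energy_j(1.0) / 1e-15
print(f"1 cm wire: {energy_fj:.0f} fJ")  # 825 fJ, consistent with "over 800 fJ"
```

Under this assumption the energy scales linearly with wire length and quadratically with the supply voltage, which is why long electrical links dominate the communication energy budget.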
In this work, we introduce a universal figure-of-merit (FOM), Capability-to-Latency-Energy-Amount-Resistance (CLEAR). This FOM covers both physical and economic factors related to the evolution rate of different technology options across multiple hardware hierarchy levels: from the device building-block level, over the interconnect link level, to the network compute-system level. By comparing the FOM value at different interconnect lengths, CLEAR can identify the best technology option and enable application-driven dynamic reconfigurability if the network offers the built-in overhead to do so. As such, CLEAR can be regarded as a universal guideline for emerging technology options in on-chip computing and communications, since it incorporates fundamental device performance and economic models. The rest of this paper is structured into three major parts: 1) we establish a multi-factor FOM that tracks compute system evolution more accurately than conventional FOMs; 2) we break CLEAR down into energy efficiency and computational efficiency to show the compute system evolution in these two aspects; 3) we extend CLEAR to the device, link and network levels for performance comparison, building on its ability to track the actual performance evolution accurately.
II. RULES OF THE SEMICONDUCTORS
Moore's Law has been taken as the 'golden rule' of the semiconductor industry. However, with emerging technologies such as silicon photonics and plasmonics, simply counting the number of on-chip components as a stand-alone metric does not accurately reflect the actual performance evolution (Fig. 1).
Indeed, the International Technology Roadmap for Semiconductors (ITRS) adjusted its predictions several times in an attempt to match the development pace with the observed trend, which shows a decelerating rate. In fact, recent reports hypothesize that transistor scaling might stop sooner than originally anticipated [24]. Notably, Moore's Law itself was amended several times during the past decades in order to fit the actual evolution rate [25]; originally it counted the number of components of an integrated circuit, then it shifted to a model dominated by transistor size and speed scaling, and after the clock frequency saturated around 2006, driven by current leakage and heat dissipation, parallel heterogeneous architectures emerged. Accordingly, the original doubling period of 12 months shifted to 18 and then 24 months. Dennard Scaling and Koomey's Law introduced single-factor development models that include power density and computation efficiency, respectively, to evaluate performance [26] [27]. Despite accurate compute performance tracking for the 1950s and 1960s, the Koomey's Law metric eventually deviates when the driving factors become either complex or obsolete.
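To make the effect of the shifting doubling period concrete, the short sketch below converts a doubling period in months into an annual growth factor and into the cumulative growth over a decade. The function names and the ten-year horizon are illustrative assumptions; only the 12-, 18- and 24-month doubling periods come from the text.

```python
def annual_growth_factor(doubling_months):
    """Growth factor per year for a quantity that doubles every `doubling_months`."""
    return 2.0 ** (12.0 / doubling_months)


def growth_over_years(years, doubling_months):
    """Cumulative growth factor after `years` at the given doubling period."""
    return 2.0 ** (12.0 * years / doubling_months)


for months in (12, 18, 24):
    per_year = annual_growth_factor(months)
    per_decade = growth_over_years(10, months)
    print(f"doubling every {months} months: x{per_year:.3f}/year, x{per_decade:.1f}/decade")
```

Moving from a 12-month to a 24-month doubling period reduces the growth over a decade from 1024x to 32x, which illustrates how strongly the amended versions of the law differ from the original one.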
Furthermore, Makimoto's Wave tracks the periodic variations between standardization and customization, and uses a four-factor FOM to quantify the evolution of the computer towards popularization: performance (MIPS) divided by cost (power, volume and price) [28]. However, it also deviates after tracking its original growth rate for the first few decades. The deviation is attributed to the slow saturation of the clock speed, power density scaling, and the emergence of multi-core parallel computing. Common to all four FOMs is the eventual deviation from the actual development rate, due to the limited number of factors considered during the technology evolution [29].
III. HOLISTIC FIGURE-OF-MERIT FOR COMPUTE SYSTEMS
The analysis and comparison above show that an
We derive the linear relation between the logarithm of the unit price and time; this relation is confirmed by historical transistor data [30]. We note that while MIPS as a measure of performance is being replaced by metrics such as floating-point operations per second (FLOPS) due to its susceptibility to the underlying instruction set, in this work CLEAR is applied to many historical processors for which other performance metrics are not available under known benchmarking suites (for example SPEC or LINPACK). Towards making MIPS a representative performance metric, however, we weighted (i.e.
A. Computer System Evolution Trend
The results show that the five-factor FOM CLEAR is able to accurately track the entire computer system evolution, which displays a constant growth rate when measured with the holistic FOM (Fig. 1). Moreover, the actual observed evolution rate is consistently held at 2x every 12 months. In addition, applying this multi-factor FOM, we are able to classify compute systems by their relative position to the 2x/year trend line (Fig. 1). For instance, the additional overhead of supercomputers (i.e. physical space, parallelism, heat removal, low economy-of-scale, manufacturing costs) results in an inferior CLEAR relative to all other computer types, despite their higher performance (dashed circles, Fig. 1). The high parallelism of the multi-core technologies used in supercomputers is challenged by the compute-to-energy returns described by Amdahl's law [32]. We observe that while supercomputers deliver peta-FLOP performance, their entire infrastructure resembles that of computers 5-30 years back, thus questioning future scale-up.
B. Computational and Energy Efficiency Tradeoff
As mentioned before, supercomputers trade their energy efficiency for extra computing power. However, their actual computational efficiency, which represents the data capacity that a system is able to handle per unit latency, area and price, is slowly saturating. Computational efficiency is a key quantifier describing capability improvement with respect to i) processing data as a function of energy consumption, and ii) achieving higher computing efficiency per unit resource (Fig. 2)
reconfigurable optical networks, and metamaterial-based computing [34] [35]. Comparing Fig. 2 with Fig. 1, we conclude that photonics has higher compute potential than traditional electronics in terms of both energy and compute efficiency.
IV. APPLICATIONS OF CLEAR AT OTHER HIERARCHY LEVELS
Above we showed that CLEAR is an appropriate FOM to trace the information processing capability of a compute system.
Next we show that this metric can also be used for device- and link-level comparisons provided small amendments are made, thus making CLEAR multi-hierarchical. Our discussion includes a comparison between traditional electronics, emerging photonics, plasmonics, and hybrid photonic-plasmonics at the device, interconnect and network levels.
A. CLEAR Comparison at Device Level
In order to amend CLEAR for device-level usage, a few adjustments need to be made to capture the characteristics of device components. Here CLEAR becomes Capability-to-Length-Energy-Area-Ratio, which breaks down as follows: i) the device operating frequency is the capability (C); ii) the scaling efficiency, which is the reciprocal of the critical scaling length (L) of the device, describes the interaction length needed to provide functionality; iii) the energy consumption (E), i.e. the energy 'cost' per bit, is the reciprocal of the energy efficiency; iv) the on-chip footprint is the area (A); and v) the economic resistance (R), in units of dollars ($), is the reciprocal of the device cost efficiency. Here the critical scaling length in the denominator does not conflict with the area factor, but indicates the scaling level, or the ability of the device to deliver functionality given its length. For instance, the critical scaling length of the CMOS transistor is the length of its logic gate, which controls the ON/OFF states. For photonic and plasmonic devices, it can be regarded as the ring diameter and the side length of the active layer, respectively.
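The five device-level factors listed above can be combined in a minimal sketch. It assumes that device-CLEAR is the capability divided by the product of the four remaining factors, which is one reading of the "Capability-to-Length-Energy-Area-Ratio" name; the exact functional form, the class and function names, and all numerical values below are hypothetical placeholders, not data from this work.

```python
from dataclasses import dataclass


@dataclass
class DeviceFactors:
    capability_hz: float      # C: device operating frequency
    length_m: float           # L: critical scaling length
    energy_j_per_bit: float   # E: energy 'cost' per bit
    area_m2: float            # A: on-chip footprint
    cost_usd: float           # R: economic resistance


def device_clear(d: DeviceFactors) -> float:
    """Assumed form: C / (L * E * A * R). Larger values are better."""
    denominator = d.length_m * d.energy_j_per_bit * d.area_m2 * d.cost_usd
    return d.capability_hz / denominator


# Hypothetical placeholder values, NOT measured device data.
baseline = DeviceFactors(1e9, 1e-6, 1e-15, 1e-12, 1e-6)
shorter = DeviceFactors(1e9, 0.5e-6, 1e-15, 1e-12, 1e-6)

# Halving the critical scaling length doubles the assumed device-CLEAR value.
print(device_clear(shorter) / device_clear(baseline))
```

Because each denominator term is the reciprocal of an efficiency, improving any single efficiency raises the FOM proportionally under this assumed form, while the capability enters linearly in the numerator.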
We represent the device-CLEAR results as five merit factors in a radar plot (Fig. 3). Note, each factor is represented in such a way that the larger the colored area in Fig. 3
However, emerging technologies, such as photonics and plasmonics, are gradually catching up with the electronic
Fig. 3. The CLEAR comparison at device level. Each axis of the radar plot represents one factor of the device-CLEAR and is scaled to the actual physical limit of each factor. The four devices compared from different technology options are: 1) the conventional CMOS transistor at the 14 nm process; 2) the photonic microdisk silicon modulator [36]; 3) the MOS field-effect plasmonic modulator [37]; and 4) the photonic-plasmonic hybrid ITO modulator [38].
The colored area of each device also demonstrates the relative CLEAR value of each device.
B. CLEAR Comparison at Link Level
Here, we compare the link-CLEAR among the four aforementioned technologies for three different link lengths (100 μm, 1 mm, 1 cm) to study how performance changes with signal length (Fig. 4). Note that the chip scale (i.e. die size) is about 1 cm.
Fig. 4. Link-CLEAR breakdown comparison among 1) electronic, 2) photonic, 3) plasmonic and 4) hybrid photonic-plasmonic interconnects. The areas enclosed by solid lines with square nodes represent the link-CLEAR for 100 μm interconnects, while the dashed lines with triangular nodes and the dotted lines with circular nodes represent the 1 mm and 1 cm interconnects, respectively. All five link-CLEAR factors are shown on the same axes of the four radar plots and scaled to the same range, constrained by the physical limit. The numbers at the end of each axis of the second radar plot show the actual physical limits of each factor. All four radar plots share the same axis titles and ticks, and certain titles and ticks are omitted for conciseness.
All five axes of Fig. 4 are normalized to the physical limit, similar to the device analysis. For instance, the capacity of the link is restricted by Bremermann's Limit, which is derived from mass-energy equivalence and the Heisenberg uncertainty principle and gives a maximum bit rate per unit mass of the system [41]. Assuming, as the 'ultimate' case, that the link consists only of the smallest devices of 1.5 nm each for the sender and receiver, this system is still able to provide a data rate of over 10^16 bps, as shown on the capacity axis. The physical limit for the P2P latency can essentially be regarded as the light propagation time through the link. The maximum P2P frequency for a 100 μm link is approximately 1 THz, and for 1 mm and 1 cm distances the frequency limit is one and two orders of magnitude lower, respectively. Energy-wise, Landauer's limit applies as in the device-CLEAR section; the energy efficiency of this ultimate link is half of the device-level energy efficiency, since each bit of information is manipulated twice. Moreover, the area efficiency limit at the link level is also half of that at the device level. However, for the economic part, there is no actual physical limit, since fabrication processes are still being scaled. Thus, here we take the cost efficiency axis limit to be 10^10, but note that it might still improve with time.
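The latency and energy limits discussed above can be reproduced with a short sketch. It assumes that the P2P frequency limit is the inverse of the light propagation time through the link, with a group index of 3 chosen here only because it roughly matches the quoted 1 THz at 100 μm, and that the device-level energy limit is Landauer's k_B T ln 2 at an assumed temperature of 300 K. The group index, the temperature, and the function names are assumptions for illustration and are not stated in the text.

```python
import math

C_M_PER_S = 299_792_458.0   # speed of light in vacuum (m/s)
K_B_J_PER_K = 1.380649e-23  # Boltzmann constant (J/K)


def p2p_frequency_limit_hz(length_m, group_index=3.0):
    """Inverse of the propagation time through a link (group_index is assumed)."""
    propagation_time_s = group_index * length_m / C_M_PER_S
    return 1.0 / propagation_time_s


def landauer_limit_j(temperature_k=300.0):
    """Minimum energy to manipulate one bit, k_B * T * ln(2)."""
    return K_B_J_PER_K * temperature_k * math.log(2.0)


def link_energy_limit_j(temperature_k=300.0, manipulations=2):
    """A bit is handled at the sender and at the receiver, hence twice."""
    return manipulations * landauer_limit_j(temperature_k)


for length_m in (100e-6, 1e-3, 1e-2):
    print(f"{length_m:.0e} m: {p2p_frequency_limit_hz(length_m):.2e} Hz")
print(f"link energy limit: {link_energy_limit_j():.2e} J/bit")
```

With these assumptions the frequency limit drops by one order of magnitude for each tenfold increase in link length, matching the trend stated for the 100 μm, 1 mm and 1 cm cases.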
For all four interconnect options we find that there is significant room for development. For the P2P frequencies, all three optical interconnects show higher performance than electronic interconnects due to low RC delay times, especially HyPPI, which uses passive low-loss SOI waveguides for signal propagation and LMI-enhanced active optoelectronics. Although HyPPI is able to deliver about 10-100 times higher capacity than the other options, it still falls several orders of magnitude short of the ultimate limits. In addition, the energy efficiency of the
In addition, reconfigurable networks are conceivable, which allow the network to select among a variety of link options depending on the application demand.
C. CLEAR Comparison at Network Level
We next apply CLEAR to the network level and compare the different link technology options for a 16×16 mesh network-on-chip (NoC). Adjusting the definitions of the individual factors in CLEAR to apply to a network gives:
Ci is the bandwidth capacity of link i, and N is the number of
is an estimated economic cost based on the wafer costs and the area occupied by the electronic and photonic components on their respective dies. Note that we adopt a NoC that uses electronic routers and point-to-point links between the routers.
In this evaluation, we study the effect of using different technologies for these point-to-point links. [4]. For the energy and area estimates, we used the DSENT tool to analyze the links and routers, adopting the 11 nm technology node [43]. For HyPPI, we modified DSENT based on previously published component parameters [5]. For plasmonics, we repeated the link every 100 µm because of its losses.
The latency for electronic links is 1 clock cycle. For optical links, however, an additional clock cycle is added to account for the O/E conversion at the receiver, because the routers are electronic. The link propagation delay is bounded within the clock cycle of the 1.5625 GHz router clock used. Thus, all optical links exhibit a latency of 2 clock cycles. The router pipeline latency is three clock cycles.
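The cycle counts above translate into absolute per-hop latencies as sketched below. The sketch assumes that one hop consists of one router pipeline traversal followed by one link traversal; this composition and the names used are illustrative assumptions, while the clock frequency and the cycle counts are taken from the text.

```python
ROUTER_CLOCK_HZ = 1.5625e9   # router clock, from the text
ROUTER_PIPELINE_CYCLES = 3   # router pipeline latency, from the text
LINK_CYCLES = {"electronic": 1, "optical": 2}  # optical adds 1 cycle for O/E conversion


def hop_latency_cycles(link_type):
    """Cycles for one hop, assumed to be one router traversal plus one link."""
    return ROUTER_PIPELINE_CYCLES + LINK_CYCLES[link_type]


def hop_latency_ns(link_type, clock_hz=ROUTER_CLOCK_HZ):
    """Per-hop latency in nanoseconds at the given router clock."""
    return hop_latency_cycles(link_type) / clock_hz * 1e9


for link_type in LINK_CYCLES:
    print(f"{link_type}: {hop_latency_cycles(link_type)} cycles, "
          f"{hop_latency_ns(link_type):.2f} ns")
```

At a 1.5625 GHz clock one cycle lasts 0.64 ns, so under this assumption an electronic hop takes 2.56 ns and an optical hop 3.20 ns; the extra O/E cycle is a fixed per-hop penalty for every optical link option.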
Evaluation Methodology:
We use synthetic traffic statistics to model the input traffic, based on Soteriou et al. [44]. We then estimate the activity on each link in the mesh network. Subsequently, we compute the total dynamic energy per bit, accounting for all links and routers, based on the injection rate at each link. The throttled laser model from DSENT is adopted, and thus the laser power is accounted for as part of the dynamic energy consumption. The total area is obtained by summing the individual component areas obtained from DSENT, which include routers, links, drivers, and SERDES.
Results: As expected, the higher economic costs of all the optical link options reduce the CLEAR value, indicating that an electronic mesh is the most viable option at this point in time. However, it will be surpassed by the HyPPI mesh for increased flit sizes (128 and larger), owing to HyPPI's lower energy and area requirements. Furthermore, HyPPI shows a significant advantage over conventional silicon photonics because of its lower area and energy requirements, whereas plasmonics is the least suitable option due to its higher energy costs given the assumed 1 mm core-to-core link lengths. As photonic wafer costs fall along economic learning curves, we expect HyPPI to eventually become viable for NoC interconnects. On the path towards HyPPI-enabled NoCs, we envision that the industry will witness a gradual transition from electronic links to HyPPI. For example, we repeated our experiment for the same 16×16 electronic mesh, augmented with HyPPI "express links". The CLEAR value of this HyPPI-augmented electronic network is higher, and it is thus a good first step towards fully optical NoCs.
V. CONCLUSION
We introduce a novel Figure-of-Merit, CLEAR, incorporating a holistic set of performance parameters. This FOM is universal since it covers both physical and economic factors known to date that contribute to the evolution of computer systems. As such it is applicable to a variety of
