We study the case of a 2048-core chip, where the cores are spread over 32 tilesets of 16 tiles, each tile containing 4 cores. Each of the 32 tilesets has an RF access point to the serpentine transmission line that crosses the chip. Inside a tileset, a 2-D mesh connects the 16 tiles, and within each tile a crossbar switch joins 4 cores, one RAM block and one DMA unit. A total of 1 Terabyte of memory is physically distributed over all tiles but logically shared between all cores and managed by a distributed hybrid cache coherency protocol (DHCCP). From the RF NoC point of view, a 20 GHz bandwidth between 20 and 40 GHz is divided into 1024 carriers shared between the 32 RF access nodes. The novelty of our work is that we have derived, in previous publications, algorithms able to dynamically share the RF resources between the 32 nodes. Simulations have shown that the channel transfer function is flat in the 20-40 GHz frequency band and depends only on the distance between nodes. The scope of this paper is to carry out a capacity analysis of the different links between nodes and to derive a mean capacity evaluation of the RF NoC. We show that only -42 dBm of transmission power on the RF line is necessary to reach a spectral efficiency of 6 bits/s/Hz.
INTRODUCTION
In order to meet the ever-increasing computational demands of the 21st century, multiprocessors are increasingly preferred over single processors, which have reached their limits due to thermal and physical issues. With developing lithographic techniques and semiconductor technologies, the number of cores on a single chip is expected to reach the thousands before the end of the next decade (Borkar, 2007). These architectures consist of a sea of processors, each of which is a simpler, lower-frequency core, and together they provide higher computational power by exploiting parallelism, along with power efficiency and robustness. They are referred to as Chip Multiprocessors (CMPs) or as manycore processors in the community (Olukotun et al., 2007).
Traditionally, on-chip processing elements were connected by relatively simple communication media such as buses or crossbars. However, with the increasing number of cores, it has become impractical to lay out dedicated point-to-point wires, and congestion problems have arisen with buses. Researchers had to introduce a new framework known as Network-on-Chip (NoC), where the communication layer is detached from the data generated by on-chip nodes and packetized transmission is performed via buffered routers, as is done in large-scale telecommunication networks (Cota et al., 2011). This modular approach not only increases bandwidth, but also gives designers a wider spectrum of choices, as various topologies and routing and arbitration algorithms can be applied. NoC has changed the approach of the on-chip community to the problem, and brand new research interests have emerged, such as the optimization and dimensioning of these router-based architectures. However, with hundreds or thousands of cores on the horizon, conventional electrical networks are not able to sustain the communication demands of these massive chips in terms of latency, bandwidth and power efficiency.
In order to provide the necessary breakthrough, designers have recently focused on developing optical and RF interconnects (Pasricha & Dutt, 2010). These interconnects serve as high-bandwidth, low-latency communication highways between individual cores or groups of several cores. Photonic interconnects are considered an effective technology for reducing latency; however, they require constantly operated on-chip or off-chip laser sources and they are incompatible with CMOS technology. On the other hand, the proposed RF interconnects are based on fully CMOS-compatible components and a mature technology (Deb, 2012).
In this paper, we first present our hierarchical 2048-core CMP with its Orthogonal Frequency Division Multiple Access (OFDMA) based RF interconnect. State-of-the-art optical and RF on-chip interconnects rely on a large number of components, such as microring resonators and local oscillators, to generate orthogonal communication channels, which limits their scalability. In contrast, the OFDMA interconnect proposed in this paper has the potential to overcome the scalability issue by encoding data digitally in the frequency domain without requiring a large amount of circuitry, while providing broadcast capability and high bandwidth reconfigurability, thanks to the cutting-edge components being designed within the project. After presenting the multiprocessor architecture, we introduce certain details of the RF interconnect connecting the 32 tilesets (groups of several cores) and the RF frontends attached to each of these tilesets. The main aim of this paper is to study the information theoretic capacity and determine the associated minimum transmission powers for this on-chip RF interconnect.
CMP ARCHITECTURE

OFDMA Based RF Interconnect
In the WiNoCoD chip proposed in (Briere et al., 2015), 32 tilesets are interconnected via a serpentine, U-shaped, state-of-the-art RF microstrip transmission line for inter-tileset communication.
A closed, circular loop shape is avoided for the transmission line so as not to cause self-interference. Packets that are generated inside a tile of a tileset and are destined for a tile in another tileset traverse the electrical mesh network and reach the RF access point of their tileset. Fig. 2 illustrates the architecture of the RF frontends. The up-conversion mixers combine a baseband signal with a local oscillator signal. Mixing occurs in a MOSFET, whose gate and drain are fed by the local oscillator and the baseband signal, respectively. As the local oscillator frequency is 30 GHz, which is the middle of our 20 GHz bandwidth, it needs to be suppressed. Thanks to the differential outputs of the DAC, two IQ-modulators can work together to do so. Besides avoiding interference caused by image frequencies, they can reduce the LO level at the output. As we use the same local oscillator for both IQ-modulators and opposite I-Q signals, the outputs of the IQ-modulators are subtracted in a differential amplifier to perform this suppression. This signal is then amplified by a Low-Noise Amplifier (LNA) and transmitted on the waveguide.
Like transmission, reception is performed synchronously every T = 51.2 ns. The signal received from the transmission line is amplified and fed to a separator circuit, mixers and a 30 GHz local oscillator to obtain the in-phase and quadrature components. Low Pass Filters (LPF) are used for down-conversion. The I and Q components are then converted to the digital domain by Analog-to-Digital Converters (ADC). After serial-to-parallel conversion, this vector of I and Q values is converted to the frequency domain by an FFT block. The FFT/IFFT processors are expected to be manufactured in 120 nm CMOS technology. We estimate the area of each of these modules as 0.31 mm² and their power consumption as 67.5 mW. Each of the ADCs and DACs is designed in 120 nm technology and has an estimated surface area of 0.12 mm² and a power consumption of 81 mW (Briere et al., 2015). It was shown that the WiNoCoD interconnect has an attenuation of 0.2-0.3 dB/mm with distance over the 20-40 GHz band. These results were derived in the scope of the WiNoCoD project (Briere et al., 2015; Hamieh et al., 2014).
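The figures quoted so far can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, assuming an OFDM symbol without cyclic prefix or guard interval (the text does not mention one); the function and constant names are illustrative only and do not come from the WiNoCoD project.

```python
# Sketch: OFDM numerology and core count implied by the figures in the text.
# Assumption: no cyclic prefix or guard interval, so T = 1 / subcarrier spacing.

TOTAL_BANDWIDTH_HZ = 20e9   # 20-40 GHz band
NUM_SUBCARRIERS = 1024
NUM_TILESETS = 32
TILES_PER_TILESET = 16
CORES_PER_TILE = 4


def ofdm_numerology(bandwidth_hz, num_subcarriers):
    """Return (subcarrier spacing in Hz, symbol duration in s)."""
    spacing_hz = bandwidth_hz / num_subcarriers
    symbol_duration_s = 1.0 / spacing_hz
    return spacing_hz, symbol_duration_s


spacing_hz, symbol_s = ofdm_numerology(TOTAL_BANDWIDTH_HZ, NUM_SUBCARRIERS)
total_cores = NUM_TILESETS * TILES_PER_TILESET * CORES_PER_TILE

print(f"subcarrier spacing: {spacing_hz / 1e6:.2f} MHz")
print(f"symbol duration:    {symbol_s * 1e9:.1f} ns")
print(f"total cores:        {total_cores}")
```

Under this assumption, the symbol duration obtained from 1024 carriers over 20 GHz agrees with the T = 51.2 ns stated above, and the tileset hierarchy gives the 2048 cores of the chip.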
INFORMATION THEORETIC ANALYSIS OF THE PROPOSED RF INTERCONNECT
RF Interconnect Capacity Derivation
In this section, we present a brief analysis of the achievable communication capacities between tilesets and the associated minimum transmission powers based on information theory. Information theory, which was founded by C. Shannon's seminal paper (Shannon, 2001), provides the bound on the maximum achievable transmission rate over a communication channel for a given signal power, such that the probability of error approaches zero. This theoretical bound is independent of the signal protection or correction mechanisms used and may give designers good insight for dimensioning reliable communication on a channel. The information theoretic capacity of a channel can be written in bits/s as:
$$C = B \log_2\left(1 + \mathrm{SNR}\right)$$
where $B$ is the bandwidth in Hz and SNR is the Signal-to-Noise power Ratio in linear scale. The SNR can be written as the ratio of the received signal power to the noise power, $P_R / P_N$, where $P_N$ is the Additive White Gaussian Noise (AWGN) power in the bandwidth, which depends on the temperature and the bandwidth. The AWGN power spectral density at standard room temperature is -174 dBm/Hz, and we adopt this value in our calculations (Shankar, 2002).
As we have an immobile and minuscule environment in contrast with general wireless communications, we can assume that the only loss of transmitted signal power is due to the distance between tilesets. As the frequency response is relatively flat, we can assume a single value of attenuation per unit distance over the whole bandwidth. For our calculations we assume an attenuation of 0.25 dB/mm on the transmission line, which is the average of the minimal and maximal values of 0.2 and 0.3 dB/mm. Hence, the received signal power can be written as the ratio of the transmitted signal power $P_T$ to the attenuation $A_{ij}$ due to the distance between tileset-i and tileset-j:
$$P_R = \frac{P_T}{A_{ij}}$$
With $d_{ij}$ being the distance in mm between tileset-i and tileset-j, the resulting attenuation in dB is $0.25\,d_{ij}$. Converting this expression to linear scale, we can rewrite the received signal power as a function of distance:
$$P_R = P_T \cdot 10^{-0.25\,d_{ij}/10}$$
Inserting this received power into the capacity expression and solving for $P_T$ gives the minimum transmission power for a target rate. This calculated minimum transmission power provides a good rationale for users on the requirements of error-free, reliable communication.
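As a worked version of this step, the minimum transmission power can be obtained by inverting the Shannon capacity expression. The sketch below assumes the 0.25 dB/mm attenuation stated above and a noise power $P_N$ over the allocated bandwidth; the symbol $\eta = C/B$ for the target spectral efficiency is introduced here for illustration only and is not the paper's notation.

```latex
\begin{align*}
\eta = \frac{C}{B}
  &= \log_2\!\left(1 + \frac{P_T \cdot 10^{-0.025\, d_{ij}}}{P_N}\right) \\
P_{T,\min}
  &= \left(2^{\eta} - 1\right) P_N \cdot 10^{0.025\, d_{ij}} \\
P_{T,\min}\,[\mathrm{dBm}]
  &= P_N\,[\mathrm{dBm}] + 10 \log_{10}\!\left(2^{\eta} - 1\right) + 0.25\, d_{ij}
\end{align*}
```

In dB form the three contributions separate: the noise floor, the SNR needed for the target spectral efficiency, and the attenuation, which grows linearly with distance.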
In this paper, we analyze the information theoretic channel capacities and the associated minimum transmission power values for our RF-based on-chip interconnect. In addition, the transmission powers required for different bit error rates are evaluated for different modulation orders under uncoded communication. These two types of indicators can be compared in order to dimension the communication energy required for the proposed on-chip RF interconnect. Adjacent tileset access points are 8 mm apart, and we assume that the access points of vis-a-vis tilesets are 1 mm apart. We also assume that each tileset is evenly allocated 32 subcarriers for transmission, which corresponds to 640 MHz of bandwidth. Hence, the noise power in this bandwidth can be calculated by multiplying the bandwidth by the assumed AWGN power spectral density, which gives approximately -86 dBm. Note that the corresponding figures for other bandwidths can be derived simply by scaling linearly. Spectral capacity densities can be regarded as good indicators of the modulation orders usable with error-free communication, such as 1 bits/s/Hz corresponding to BPSK, 2 bits/s/Hz corresponding to QPSK, etc.
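The noise power calculation described above can be sketched as follows. This is a minimal illustration that uses only the -174 dBm/Hz density and the 640 MHz allocation stated in the text; the helper names and the optional `noise_figure_db` parameter are assumptions made for this example.

```python
import math

THERMAL_NOISE_DBM_PER_HZ = -174.0  # AWGN density at room temperature (from the text)


def noise_power_dbm(bandwidth_hz, noise_figure_db=0.0):
    """Noise power in dBm over a bandwidth.

    noise_figure_db is an optional receiver penalty (hypothetical parameter).
    """
    return THERMAL_NOISE_DBM_PER_HZ + 10.0 * math.log10(bandwidth_hz) + noise_figure_db


def dbm_to_mw(power_dbm):
    """Convert a power from dBm to milliwatts."""
    return 10.0 ** (power_dbm / 10.0)


per_tileset_dbm = noise_power_dbm(640e6)   # 32 subcarriers, as stated in the text
full_band_dbm = noise_power_dbm(20e9)      # whole 20-40 GHz band, for comparison

print(f"noise over 640 MHz: {per_tileset_dbm:.1f} dBm")
print(f"noise over 20 GHz:  {full_band_dbm:.1f} dBm")

# Linear scaling with bandwidth: the noise power in mW is proportional to B.
ratio = dbm_to_mw(full_band_dbm) / dbm_to_mw(per_tileset_dbm)
print(f"linear ratio: {ratio:.2f} (bandwidth ratio: {20e9 / 640e6:.2f})")
```

Working in dBm turns the multiplication by the bandwidth into an addition, while the last lines confirm that the noise power in linear units scales in proportion to the bandwidth.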
However, we also investigate the transmission powers required for various bit error rates for different modulation orders under uncoded transmission. Note that the information theoretic capacity defines the maximum achievable rate with a bit error rate approaching 0, under a hypothesized perfect channel coding mechanism. Therefore, even though the transmission power required for a desired channel capacity density is defined for an error rate of 0, it may be lower than the power required to reach a given nonzero bit error rate without coding. This is due to the robustness of channel coding. Next, we develop the expressions for the required transmission powers for different modulation orders. For BPSK and QPSK, the bit error rate (BER) as a function of the SNR per bit, $\mathrm{SNR}_b$, is:
$$P_b = Q\left(\sqrt{2\,\mathrm{SNR}_b}\right)$$
For higher-order modulations, b denotes the number of bits per constellation symbol, such as 4 bits for 16-QAM and 8 bits for 256-QAM. Let us first calculate the SNR per bit. As we did for the capacity formula in (2), we can write the received SNR as the ratio of the transmission power to the product of the ambient noise power and the attenuation due to distance. Following the assumption made in (Hamieh et al., 2014), a noise factor F of 3 dB is used, which is approximately 2 in linear scale. Therefore, the linear noise power density can be calculated from the -174 dBm/Hz density and the additional 3 dB noise factor as:
$$N_0 = 10^{(-174 + 3)/10}\ \mathrm{mW/Hz} \approx 7.9 \times 10^{-18}\ \mathrm{mW/Hz}$$

The placement of the tilesets with their ID numbers is shown in Fig. 3. For instance, one of the maximum distances among tileset connections is between tileset-1 and tileset-32, which is 120 mm (15 spacings between adjacent tilesets). We have analyzed the information theoretic limits for each of the 32 × 31 unicast communication combinations between tilesets. Fig. 4 shows the distances between tilesets for each of these combinations according to tileset ID numbers.
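The uncoded BER requirement discussed above can be turned into a required transmission power as in the following sketch. Several points here are assumptions and not taken from the paper: the square M-QAM expression is the common textbook nearest-neighbour approximation with Gray mapping (it becomes loose at high BER such as 10^-1), the bit rate is taken as b times the allocated bandwidth, and all function names are hypothetical.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ber_bpsk_qpsk(snr_b):
    """BER of BPSK/QPSK over AWGN; snr_b is the linear SNR per bit."""
    return q_function(math.sqrt(2.0 * snr_b))


def ber_square_qam(snr_b, b):
    """Textbook approximation for square 2**b-QAM (b even), Gray mapped."""
    m = 2 ** b
    scale = (4.0 / b) * (1.0 - 1.0 / math.sqrt(m))
    return scale * q_function(math.sqrt(3.0 * b * snr_b / (m - 1.0)))


def required_snr_b_db(ber_fn, target_ber, lo_db=-10.0, hi_db=60.0):
    """Bisection on SNR per bit (dB); ber_fn must decrease with SNR."""
    for _ in range(100):
        mid_db = 0.5 * (lo_db + hi_db)
        if ber_fn(10.0 ** (mid_db / 10.0)) > target_ber:
            lo_db = mid_db   # BER still too high: more SNR is needed
        else:
            hi_db = mid_db
    return hi_db


def required_tx_power_dbm(snr_b_db, b, bandwidth_hz, distance_mm,
                          n0_dbm_per_hz=-171.0, atten_db_per_mm=0.25):
    """Assumes bit rate = b * bandwidth, so P_R = SNR_b * N0 * b * B.

    n0_dbm_per_hz = -174 dBm/Hz plus the 3 dB noise factor from the text.
    """
    bit_rate_db = 10.0 * math.log10(b * bandwidth_hz)
    return snr_b_db + n0_dbm_per_hz + bit_rate_db + atten_db_per_mm * distance_mm


for target in (1e-3, 1e-5, 1e-7):
    bpsk_db = required_snr_b_db(ber_bpsk_qpsk, target)
    qam64_db = required_snr_b_db(lambda s: ber_square_qam(s, 6), target)
    p_bpsk = required_tx_power_dbm(bpsk_db, 1, 640e6, 120.0)
    p_qam64 = required_tx_power_dbm(qam64_db, 6, 640e6, 120.0)
    print(f"BER {target:.0e}: BPSK {p_bpsk:.1f} dBm, 64-QAM {p_qam64:.1f} dBm at 120 mm")
```

The absolute values depend on the bandwidth, noise factor and bit-rate convention assumed here, so they are not meant to reproduce the figures of the paper, only to show how a BER target maps to a transmission power.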
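The capacity of each transmission combination can be written in matrix form, assuming that a transmission power $P_T$ is allocated to each of them, where $d_{ij}$ is the distance between tileset-i and tileset-j in mm:
$$C_{ij} = B \log_2\left(1 + \frac{P_T \cdot 10^{-0.25\,d_{ij}/10}}{P_N}\right), \quad i \neq j$$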
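A capacity matrix of this kind can be sketched in a few lines. Since Fig. 3 is not reproduced here, the layout below is a hypothetical simplification chosen only to match the three distances quoted in the text (1 mm vis-a-vis, 8 mm adjacent, 120 mm between tileset-1 and tileset-32); the actual ID placement may differ, and the names are illustrative.

```python
import math

NUM_TILESETS = 32
ADJACENT_SPACING_MM = 8.0    # from the text
VIS_A_VIS_SPACING_MM = 1.0   # from the text
ATTEN_DB_PER_MM = 0.25       # from the text
BANDWIDTH_HZ = 640e6         # 32 subcarriers per tileset, as stated in the text
NOISE_DBM = -174.0 + 10.0 * math.log10(BANDWIDTH_HZ)


def distance_mm(i, j):
    """Hypothetical layout: 0-based IDs 2k and 2k+1 face each other at
    position k along the line; positions are 8 mm apart."""
    if i == j:
        return 0.0
    col_i, col_j = i // 2, j // 2
    if col_i == col_j:
        return VIS_A_VIS_SPACING_MM
    return ADJACENT_SPACING_MM * abs(col_i - col_j)


def capacity_matrix(tx_power_dbm):
    """C[i][j] in bits/s for every ordered pair i != j; diagonal is None."""
    matrix = []
    for i in range(NUM_TILESETS):
        row = []
        for j in range(NUM_TILESETS):
            if i == j:
                row.append(None)
                continue
            snr_db = tx_power_dbm - ATTEN_DB_PER_MM * distance_mm(i, j) - NOISE_DBM
            row.append(BANDWIDTH_HZ * math.log2(1.0 + 10.0 ** (snr_db / 10.0)))
        matrix.append(row)
    return matrix


capacities = capacity_matrix(tx_power_dbm=-50.0)  # arbitrary example power
links = [c for row in capacities for c in row if c is not None]
print(f"unicast links: {len(links)}")
print(f"closest pair:  {capacities[0][1] / BANDWIDTH_HZ:.2f} bits/s/Hz")
print(f"farthest pair: {capacities[0][31] / BANDWIDTH_HZ:.2f} bits/s/Hz")
```

The matrix is symmetric in this model because the attenuation depends only on distance; replacing `distance_mm` with the actual placement would leave the rest of the computation unchanged.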
Results
Combining (2) and (3), we derive the required minimum transmission powers in dBm for spectral capacity densities from 1 bits/s/Hz to 6 bits/s/Hz, corresponding to modulation orders between BPSK and 64-QAM. One can see from Fig. 5 that achieving a spectral efficiency of 1 bits/s/Hz requires approximately -40 dBm of transmission power between the most distant tilesets and approximately -70 dBm between the closest tilesets. Achieving a spectral efficiency of 6 bits/s/Hz requires approximately -20 dBm of transmission power between the most distant tilesets and approximately -50 dBm between the closest tilesets. Note that the transmission power in dBm varies linearly with the distance between tilesets and approximately linearly with spectral efficiency, consistent with the equations above. Fig. 6 shows the total required transmission power, considering that each of the unicast communication combinations has a capacity density of 1-6 bits/s/Hz. We can see that achieving a spectral efficiency of 1 bits/s/Hz requires approximately -57 dBm total transmission power, and a spectral efficiency of 6 bits/s/Hz requires approximately -42 dBm average transmission power. Finally, Fig. 7 shows the minimum required overall transmission power from each tileset to its 31 destinations, for bit error rates of $10^{-1}$, $10^{-3}$, $10^{-5}$ and $10^{-7}$, for the lowest and highest modulation orders, BPSK and 64-QAM. As a reference, the overall required transmission powers for information theoretic capacity densities of 1 bits/s/Hz and 6 bits/s/Hz (which are associated with BPSK and 64-QAM, respectively) are also shown. Note that the minimum required overall powers for bit error rates of $10^{-3}$, $10^{-5}$ and $10^{-7}$ for BPSK and 64-QAM are higher than the powers required for information theoretic capacity densities of 1 bits/s/Hz and 6 bits/s/Hz, respectively. This is because capacity is defined under perfect channel coding, as mentioned previously.
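The remark on how the power in dBm scales with spectral efficiency can be checked numerically. The sketch below evaluates only the SNR term of the Shannon bound, which is independent of bandwidth and distance; the function name is illustrative.

```python
import math


def min_snr_db(spectral_efficiency):
    """Minimum SNR in dB for a spectral efficiency in bits/s/Hz (Shannon bound)."""
    return 10.0 * math.log10(2.0 ** spectral_efficiency - 1.0)


efficiencies = list(range(1, 7))
snr_db = [min_snr_db(eta) for eta in efficiencies]
steps_db = [b - a for a, b in zip(snr_db, snr_db[1:])]

for eta, value in zip(efficiencies, snr_db):
    print(f"{eta} bits/s/Hz -> minimum SNR {value:6.2f} dB")
print("step per extra bit/s/Hz (dB):", [round(s, 2) for s in steps_db])
```

The step per additional bit/s/Hz shrinks toward 10·log10(2) ≈ 3.01 dB as the efficiency grows, so the dependence is close to linear in dB except at the lowest efficiencies, whereas the distance term of 0.25 dB/mm is exactly linear.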
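Fig. 7. Minimum required overall transmission power for bit error rates of $10^{-1}$, $10^{-3}$, $10^{-5}$ and $10^{-7}$ under BPSK and 64-QAM, and for information theoretic capacity densities of 1 bits/s/Hz and 6 bits/s/Hz.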
CONCLUSIONS
We have briefly presented the architecture of a 2048-core generic multiprocessor that is being developed in the scope of the WiNoCoD project, which, to the best of our knowledge, attempts to employ an OFDMA-based on-chip RF interconnect for the first time. With their mature manufacturing techniques, full CMOS compatibility and ever-increasing transistor frequencies, RF interconnects are considered viable candidates for future massive on-chip platforms. We have given details of this 20 GHz OFDMA infrastructure to be used by 32 tilesets, each of which incorporates 64 cores. Information theory defines the limits on the achievable transmission rate for a given power budget and noise characteristics. The information theoretic capacities and the associated minimum transmission powers for this interconnect have been evaluated in this paper.
