Abstract: Advanced Ultra-Low Power (ULP) computing platforms can be affected by large performance variations. These variations are mainly caused by process and ambient temperature variations, and they are magnified by the strong Temperature Effect Inversion (TEI) that characterizes devices operating Near-Threshold (NT) in highly scaled nodes. 28nm UTBB FD-SOI technology supports an extended range of both forward and reverse Body-Bias (BB) voltage. This feature can be used to reduce margins at design time and to compensate for variations at runtime. In this paper we propose a BB voltage controller that independently probes the maximum frequency of P and N transistors and adjusts the BB voltage to achieve a user-specified target frequency while minimizing the leakage current. Compared to the case where zero BB is applied to the transistors, the controller achieves up to 23% power reduction by exploiting the performance increase caused by TEI, and it reduces power by a further 12% with respect to a symmetric BB approach.
I. INTRODUCTION
Maximizing the energy efficiency of IoT nodes is a major challenge because the environment in which such devices operate cannot always be predicted at design time. To achieve timing closure over wide temperature and voltage ranges, designers tend to assume large margins [1]. In deep sub-micrometer technology nodes, transistor performance is highly sensitive to process and ambient temperature variations, and in devices operating Near-Threshold (NT) these phenomena are further amplified. As demonstrated in [2] and [3], on FD-SOI technologies the performance variations caused by process and ambient temperature variations can be efficiently compensated by applying a Body-Bias (BB) voltage to the transistors to dynamically adjust their threshold voltage. Additionally, as reported in [3], compensation through BB voltage adaptation achieves better results than compensation with Adaptive Voltage Scaling (AVS). In this paper we propose a mixed hardware-software BB voltage controller that independently regulates the voltage applied to the wells of P and N transistors to achieve a user-specified target frequency with minimum power. When a temperature increase improves performance, the controller applies a reverse body-bias voltage until the performance returns to the target, thereby reducing the leakage current.
II. PULPV3 SOC
The proposed BB voltage control strategy has been implemented and validated on the Parallel Ultra-Low-Power platform [4] version 3 (PULPv3) [5], a multi-core near-threshold processor for Internet of Things (IoT) applications.
Fine-grain power management is possible thanks to three isolated power domains: i) the "SoC Body-Bias Domain", hosting the IO peripherals and the L2 memory; ii) the "Cluster Body-Bias Domain", hosting the cores; iii) the "Safe Voltage Domain", containing the Frequency Locked Loop generators and the two Body-Bias Generators for the SoC and Cluster regions. In our tests, we focus only on the Cluster Domain, which dominates both static and dynamic power. We apply V_bb = 0 V to the other body-bias domains. The main features of the BB voltage generators and of the PULPv3 SoC are reported in [5].
Every power domain features a Process Monitoring Box (PMB) capable of probing the maximum achievable switching frequency of both P and N ring oscillators. As demonstrated in [6], the maximum frequency returned by these modules changes consistently with the maximum frequency of the entire power domain in which they are placed, so they can be used as on-chip frequency meters.
III. BODY-BIAS VOLTAGE CONTROLLER
The BB controller proposed in this paper performs a closed-loop regulation of the voltage applied to the P and N wells of the transistors. The regulation relies on the observation of the maximum frequency achievable by both types of transistors at a given operating point (temperature and core cluster domain supply voltage). As the first step of the asymmetric BB voltage regulation loop, the PMBs probe the maximum frequency. Then, the measured frequency is compared to the set-point frequency specified by the user. Once the mismatch between the current frequency and the target frequency is determined, a PID controller computes the BB voltage for the P and N transistor wells that closes the performance gap. Finally, the voltage computed by the software controller is applied by the on-chip BB voltage generators to both P and N transistor wells. Fig. 1 illustrates the controller architecture. Fig. 2 reports the BB voltage range available for the regulation, and how the voltage is applied to P and N wells.
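The probe, compare, compute, apply loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the controller implemented on PULPv3: the linear frequency model standing in for the PMB readout, the PID gains, the ±1 V clamp, and all names (`PID`, `make_toy_pmb`, `regulate`) are hypothetical. The bias is expressed as an abstract forward-bias amount (positive means forward, negative means reverse), not as the actual well voltages of Fig. 2.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PID:
    """Positional PID with output clamping and a simple anti-windup rule."""
    kp: float
    ki: float
    kd: float
    out_min: float
    out_max: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float) -> float:
        candidate = self.integral + error
        derivative = error - self.prev_error
        self.prev_error = error
        raw = self.kp * error + self.ki * candidate + self.kd * derivative
        out = min(self.out_max, max(self.out_min, raw))
        if out == raw:  # accumulate only while the output is not saturated
            self.integral = candidate
        return out


def make_toy_pmb(f0_mhz: float, gain_mhz_per_v: float) -> Callable[[float], float]:
    """Hypothetical stand-in for a PMB readout: f_max grows linearly with bias."""
    return lambda bias_v: f0_mhz + gain_mhz_per_v * bias_v


def regulate(read_fmax: Callable[[float], float], target_mhz: float,
             pid: PID, iterations: int = 200) -> float:
    """Run the closed loop for one transistor type and return the final bias."""
    bias_v = 0.0  # start from zero body bias
    for _ in range(iterations):
        error = target_mhz - read_fmax(bias_v)  # probe, then compare to set-point
        bias_v = pid.step(error)                # compute the bias to apply next
    return bias_v


TARGET_MHZ = 40.0  # set-point used in the experiments
# Assumed toy devices: N is slower than the target, P is faster.
bias_n = regulate(make_toy_pmb(36.0, 20.0), TARGET_MHZ, PID(0.01, 0.02, 0.0, -1.0, 1.0))
bias_p = regulate(make_toy_pmb(46.0, 20.0), TARGET_MHZ, PID(0.01, 0.02, 0.0, -1.0, 1.0))
print(f"N bias: {bias_n:+.3f} V, P bias: {bias_p:+.3f} V")
```

With these assumed numbers the loop settles on a forward bias of about 0.2 V for the slower N devices and a reverse bias of about 0.3 V for the faster P devices. Running one loop per transistor type is what makes the regulation asymmetric.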
IV. EXPERIMENTAL SETUP
The measurements presented in this paper have been obtained under the following conditions:
• Power measurements performed by means of the Keysight N6705B power analyzer.
• Chip temperature enforced with a Peltier cell controlled by the Meerstetter-1091-PT100 Thermo Electric Controller.
• Cores in the cluster domain were clock-gated during the leakage current measurement.
• The total power consumption refers to the execution of a benchmark that triggers the most critical paths of the circuit, which, according to the post-layout simulation, are located in the core datapath.
• Clock frequency set-point equal to 40 MHz, which is close to the maximum frequency of the device at 25 °C with zero BB.
• Core cluster power domain supply voltage equal to 0.5 V.
• Transistor wells body-biased as reported in Fig. 2.
V. RESULTS
In this section we present the performance of the BB controller and we discuss the results.
A. Leakage Current
As a first analysis, we measured the leakage current in three different conditions: i) when the BB voltage is zero; ii) when the BB voltage is regulated to compensate the slower transistors and applied symmetrically to the faster ones; iii) when the BB voltage is independently regulated for P and N transistors.
As shown in Table I, independent body biasing outperforms symmetric BB voltage regulation at all temperatures, maximizing the leakage reduction. In digital circuits, the maximum frequency is limited by the slowest transistors. From the on-chip frequency measurements performed by the PMBs, we found that in our design the N transistors are significantly slower than the P transistors. Because of this discrepancy, when the controller applies a symmetric BB voltage to both types of devices, it properly boosts the slower N transistors to achieve the target frequency, but it also boosts the faster P transistors, applying an unnecessary forward BB voltage. As a result, the symmetric approach allows the circuit to achieve the desired frequency, but the over-compensation of the P transistors causes an unnecessary increase in leakage current. In contrast, when the controller applies an independent BB voltage, it boosts only the slower N transistors to achieve the target frequency. Additionally, if the maximum frequency achievable by the faster P transistors is higher than the target, the controller applies a reverse BB voltage to them, further reducing the leakage current.
Table I reports the BB voltage applied by the controller in both the symmetric and the independent approach. The body biasing of the N transistors is the same in both cases: because the N transistors are slower than the P transistors and in practice limit the maximum frequency of the circuit, the controller boosts them with the same amount of BB voltage regardless of the body-biasing approach. On the P transistors, if the body biasing is performed symmetrically, the BB voltage is symmetric to the one applied to the N transistors. In contrast, when the body biasing of P and N transistors is independent, the BB voltage on the P transistors is chosen according to their maximum frequency as probed by the PMBs.
In our experiments, the P transistors were already capable of switching faster than the target frequency (40 MHz) at 15 °C, so the controller applies a reverse BB voltage to reduce their frequency and minimize the leakage current.
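The difference between the two biasing policies can be made concrete with a small sketch. It is an illustration under assumed numbers, not a reproduction of Table I: the zero-bias frequencies, the linear sensitivity of 20 MHz/V, and the function names are hypothetical, and the bias is again an abstract forward-bias amount (positive means forward, negative means reverse) rather than a physical well voltage.

```python
def required_forward_bias(f0_mhz: float, target_mhz: float,
                          gain_mhz_per_v: float) -> float:
    """Bias needed to reach the target under an assumed linear model.

    A negative result means the device is already faster than the target
    and can be reverse-biased.
    """
    return (target_mhz - f0_mhz) / gain_mhz_per_v


def symmetric_policy(need_n: float, need_p: float) -> tuple[float, float]:
    """The slower device type dictates one shared bias amount for both wells."""
    shared = max(need_n, need_p)
    return shared, shared


def independent_policy(need_n: float, need_p: float) -> tuple[float, float]:
    """Each device type receives exactly the bias it needs."""
    return need_n, need_p


# Assumed zero-bias maximum frequencies: N slower, P faster than the 40 MHz target.
need_n = required_forward_bias(36.0, 40.0, 20.0)
need_p = required_forward_bias(46.0, 40.0, 20.0)

sym_n, sym_p = symmetric_policy(need_n, need_p)
ind_n, ind_p = independent_policy(need_n, need_p)

# Forward bias spent on the P devices without any frequency benefit.
p_overbias = sym_p - ind_p
print(f"symmetric:   N {sym_n:+.2f} V, P {sym_p:+.2f} V")
print(f"independent: N {ind_n:+.2f} V, P {ind_p:+.2f} V")
print(f"P over-bias under the symmetric policy: {p_overbias:.2f} V")
```

In this toy case the N bias is identical under both policies, as the text notes, while the symmetric policy forward-biases the P devices even though they could be reverse-biased. That excess bias is the source of the additional leakage.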
B. Power Consumption
The second analysis, whose results are presented in Fig. 3, shows the effect of the BB voltage regulation on the total power consumption of the cluster power domain. When the chip is supplied at 0.5 V and the clock frequency is close to the maximum, leakage accounts for 30% to 45% of the total power consumption, depending on the temperature. Therefore, minimizing the leakage current reduces the total power consumption. We observed that, in this scenario, applying an independent BB voltage to the P and N transistor wells reduces the power consumption by 23% at high temperatures and extends the operating range of the chip below ambient temperature. Additionally, we demonstrated that the independent BB reduces power by 12% more, on average, than the symmetric BB.
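The link between leakage reduction and total power can be written as simple arithmetic. The sketch below assumes that dynamic power is unchanged because frequency and supply voltage are fixed, so the fractional saving in total power is the leakage share multiplied by the fractional leakage reduction. The 30% and 45% leakage shares come from the text. The 50% leakage reduction is a hypothetical input chosen only for illustration, not a measured value, and the function name is an assumption.

```python
def total_power_saving(leak_fraction: float, leak_reduction: float) -> float:
    """Fractional total-power saving when only the leakage component changes.

    With P_total = P_dyn + P_leak and P_leak scaled by (1 - leak_reduction),
    the saving relative to P_total is leak_fraction * leak_reduction.
    """
    if not (0.0 <= leak_fraction <= 1.0 and 0.0 <= leak_reduction <= 1.0):
        raise ValueError("both arguments must lie in [0, 1]")
    return leak_fraction * leak_reduction


HYPOTHETICAL_LEAK_REDUCTION = 0.5  # illustrative input, not a measurement

for leak_fraction in (0.30, 0.45):  # leakage share range reported in the text
    saving = total_power_saving(leak_fraction, HYPOTHETICAL_LEAK_REDUCTION)
    print(f"leakage share {leak_fraction:.0%}: total saving {saving:.1%}")
```

The same leakage reduction therefore yields a larger total saving at the operating points where leakage makes up a larger share of the power.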
VI. CONCLUSION
In this paper we proposed a closed-loop body-bias voltage regulation for a SoC implemented in 28nm FD-SOI technology. The BB controller independently modulates the BB voltage on P and N transistors to minimize the leakage current while guaranteeing the target frequency. Compared to symmetric body biasing, the independent approach presented in this work provides a further 12% power reduction, achieving a 23% power saving with respect to the case where zero BB is applied to P and N transistors.
