An ultra-low-power tunable bump circuit is presented. It incorporates a novel wide-input-range tunable pseudo-differential transconductor linearised using the drain resistances of saturated transistors. Measurement results show that the transconductor has a 5 V differential input range with <20% linearity error. The bump circuit demonstrates tunability of the centre, width and height, consumes 18.9 nW from a 3 V supply and occupies 988 μm² in a 0.13 μm CMOS process.
Introduction: Circuits with bell-shaped transfer functions are widely used to provide similarity measures in analogue signal processing systems such as pattern classifiers [1, 2], support vector machines [3] and deep learning engines [4]. Such nonlinear radial basis functions can be realised with the classic bump circuit [5]. However, the original implementation cannot change the width of its transfer function. Variable width can be obtained by pre-scaling the input voltage before it is applied to the bump generator, but a pre-scaling circuit based on multi-input floating-gate transistors [1] or a digital-to-analogue converter [3] consumes area and increases the power overhead. In [2, 6], the widths of bump-like circuits are varied by switching binary-sized transistors, but the number of possible widths is limited. A Gaussian function can be directly synthesised by exponentiating the Euclidean distance [7], but this approach can lead to a complex circuit and a large area.
In this Letter, we propose implementing a bump circuit by preceding the current correlator [5] with a tunable transconductor to achieve variable width and height. The design of linear transconductors in subthreshold CMOS is challenging, as the linear range of a conventional differential pair diminishes with the gate overdrive and reaches its minimum in the subthreshold region [8]. Common linearisation techniques such as source degeneration [8], bias offset [9], source coupling [10] and the triode transconductor [11] become either less effective or less practical due to the nano-amp biasing current and exponential transfer function of the transistors. The novel transconductor proposed in this Letter exploits the drain resistance of saturated transistors to obtain a wide input range and tunable transconductance. The pseudo-differential structure allows operation with a low supply voltage.
Circuit design: The schematic of the proposed bump circuit with a wide-input-range pseudo-differential transconductor is shown in Fig. 1. In subthreshold operation, the current correlator M5-M10 [5] computes a measure of the correlation of its two inputs (with a current scaling factor of 4)

$$I_{out} = \frac{4 I_1 I_2}{I_1 + I_2} \qquad (1)$$
The tunable transconductor (M1-M4 and $I_W$) converts the differential inputs $V_{in1}$ and $V_{in2}$ to current outputs $I_1$ and $I_2$. The input transistors M1 and M2 act as source followers. In subthreshold operation and assuming saturation, their source voltages are given by

$$V_{s1,2} = k V_{in1,2} - U_T \ln\frac{I_{1,2}}{I_0} \qquad (2)$$
where $k \simeq 0.7$ is the gate coupling factor, $U_T \simeq 26$ mV is the thermal voltage and $I_0$ is the pre-exponential current factor, which depends on the process and the device dimensions. In (2), the first term gives a linear relationship between $V_{in1,2}$ and $V_{s1,2}$, whereas the second term causes nonlinearity. This nonlinearity is mild because it enters through a logarithm. M3 (M4) serves as the current source for follower M1 (M2); its gate length is intentionally made short to exploit channel length modulation (CLM). To a first-order approximation, the drain current of M3 is

$$I_D = I_{D0}\,(1 + \lambda V_{s1}) \qquad (3)$$
where $I_D = I_W + I_1$, $\lambda$ is its CLM coefficient and $I_{D0}$ is the drain current without CLM, which is equal for M3 and M4. We utilise this dependence of $I_D$ on $V_{s1}$ to implement a large-value resistor tunable by the current $I_W$. A common-mode feedback circuit (M11-M14) controls the gates of M3 and M4 to provide common-mode rejection for the pseudo-differential structure and ensures that $I_1 + I_2 = I_H$. Combining this with (3), the output currents are
Assuming a balanced input, $V_{in1} + V_{in2} = 2V_{cm}$, and that the second term in (2) can be neglected, the transconductance is given by
It can be seen that the transconductance is controlled by both $I_W$ and $I_H$. The $V_{cm}$ term in $\delta$ causes slight asymmetry in the bump transfer function, which is tolerable in typical machine learning applications. The pseudo-differential structure allows a wide differential input range, and the circuit can operate at a supply voltage as low as $V_{GS5} + 6U_T$.
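The dependence on $I_W$, $I_H$ and $V_{cm}$ can be traced with a short derivation. This is a sketch under stated assumptions, not necessarily the Letter's exact expressions. It assumes the first-order CLM model $I_D = I_{D0}(1 + \lambda V_s)$ for both M3 and M4, one bias current $I_W$ per branch so that each current-source transistor carries $I_W + I_{1,2}$, and $V_{s1,2} \approx k V_{in1,2}$ (logarithmic term neglected). The differential transconductance is taken here as the slope of $I_1 - I_2$ against $V_{in1} - V_{in2}$; the grouping of the final factor is introduced for illustration and need not coincide with the Letter's $\delta$.

```latex
\begin{align*}
I_W + I_1 &= I_{D0}\,(1 + \lambda V_{s1}), \qquad
I_W + I_2 = I_{D0}\,(1 + \lambda V_{s2}) \\[4pt]
\text{sum, using } I_1 + I_2 = I_H:\quad
I_{D0} &= \frac{2 I_W + I_H}{2 + \lambda (V_{s1} + V_{s2})} \\[4pt]
\text{difference:}\quad
I_1 - I_2 &= I_{D0}\,\lambda\,(V_{s1} - V_{s2})
           = \frac{(2 I_W + I_H)\,\lambda\,(V_{s1} - V_{s2})}{2 + \lambda (V_{s1} + V_{s2})} \\[4pt]
\text{with } V_{s1,2} \approx k V_{in1,2},\ V_{in1} + V_{in2} = 2 V_{cm}:\quad
I_1 - I_2 &\approx \left(I_W + \tfrac{1}{2} I_H\right)
             \frac{k \lambda}{1 + k \lambda V_{cm}}\,(V_{in1} - V_{in2})
\end{align*}
```

Under these assumptions the slope scales with $I_W + I_H/2$, and $V_{cm}$ enters only through the denominator. For a balanced input that denominator is constant, whereas sweeping one input alone changes $V_{cm}$ along the sweep, which is consistent with the slight asymmetry noted above.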
When $V_{in1} = V_{in2}$, $I_1 = I_2 = 0.5 I_H$ and the maximum bump current output (bump height) is given by $I_{out,max} = I_H$. With $I_H$ fixed, changing $I_W$ varies the transconductance of the transconductor and therefore changes the width of the bump. As $I_1$ and $I_2$ are linearly related to the input voltages, the shape of the bump output is quadratic.
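The quadratic shape can be illustrated numerically. The sketch below is an idealised behavioural model, not the measured circuit: it assumes the correlator output has the form 4·I1·I2/(I1 + I2), that the transconductor is perfectly linear with I1 + I2 = I_H, and that g_m denotes the slope of I1 − I2 against the differential input. The function names, the clipping used to mimic saturation and the example values (currents in nA, g_m in nA/V) are all hypothetical.

```python
import numpy as np

def correlator(i1, i2, scale=4.0):
    """Assumed current-correlator form: scale * i1 * i2 / (i1 + i2)."""
    return scale * i1 * i2 / (i1 + i2)

def bump(v_diff, i_h, g_m):
    """Idealised bump output for a linear transconductor with i1 + i2 = i_h.

    g_m is taken as d(i1 - i2)/d(v_diff), which is an assumption of this sketch.
    The differential current is clipped to [-i_h, i_h] so that neither branch
    current goes negative; this is a crude stand-in for saturation.
    """
    i_diff = np.clip(g_m * np.asarray(v_diff, dtype=float), -i_h, i_h)
    i1 = 0.5 * (i_h + i_diff)
    i2 = 0.5 * (i_h - i_diff)
    return correlator(i1, i2)

# Hypothetical example: I_H = 1 nA, g_m = 0.5 nA/V, differential input swept over +/-3 V
v = np.linspace(-3.0, 3.0, 13)
i_h, g_m = 1.0, 0.5
out = bump(v, i_h, g_m)

# Closed form inside the linear range: I_H - (g_m * v)^2 / I_H
inside = np.abs(g_m * v) <= i_h
closed_form = i_h - (g_m * v[inside]) ** 2 / i_h
print(np.allclose(out[inside], closed_form))
```

In this model the peak equals I_H at zero differential input, and the output reaches zero at |ΔV| = I_H / g_m, so raising g_m narrows the bump while I_H sets its height.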
Measurement results: The proposed bump circuit is fabricated in a 0.13 μm CMOS process; thick-oxide I/O FETs are used to extend $V_{DD}$ and therefore the input dynamic range. The active area is 26 × 38 μm², as shown in Fig. 2. Biased at $I_W = I_H = 1$ nA and $V_{in1} = V_{in2}$, it draws 6.3 nA from a 3 V supply. The circuit is functional with $V_{DD}$ down to 0.5 V; however, the input range is limited at such a low supply. The transconductor outputs $I_1$ and $I_2$ are mirrored off chip by two additional PMOS transistors at nodes a and b, omitted from Fig. 1. The differential output currents for different $I_W$ are plotted in Fig. 3a, with $I_H = 2$ nA and a balanced input voltage with $V_{cm} = 1.5$ V. The normalised $g_m$ for $I_W = 0$ is plotted in Fig. 3b, showing an input range of 5 V with $g_m$ error below 20%, covering almost the entire input common-mode range. The nonlinearity can be attributed to the second term in (2), as well as to second-order effects such as the dependence of $\lambda$ on $V_{DS}$. It is tolerable in bump generator applications, as the bump output itself is an approximation of a highly nonlinear function [1-3, 5, 6]. The offset of about 100 mV is due to device mismatch and can be calibrated out using floating-gate techniques such as that in [12].
The transfer functions of the bump circuit with respect to one input, $V_{in2}$, are plotted in Fig. 4, showing variable centre, width and height obtained by varying $V_{in1}$, $I_W$ and $I_H$, respectively. Fig. 4 also demonstrates that the circuit works properly with unbalanced input.
The one-dimensional (1D) bump output can be extended to higher dimensions to represent a multivariate probability distribution by cascading multiple bump circuits, i.e. connecting $I_{out}$ of one circuit to the $I_H$ input of the next circuit. The measured 2D bump output is plotted in Fig. 5. Just as in the 1D case, each dimension's parameters are individually tunable.
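The cascading scheme can be sketched behaviourally as follows. This is an illustrative model only: each stage is idealised as a quadratic bump of height I_H that falls to zero at a fixed half-width, and the half-width is treated as independent of I_H, which is a simplification because the transconductance in the circuit depends on both I_W and I_H. The names `bump_stage`, `bump_2d`, `width_x` and `width_y` are hypothetical, and currents are in nA.

```python
import numpy as np

def bump_stage(v_diff, i_h, width):
    """One idealised bump stage: height i_h, quadratic roll-off, zero beyond +/-width.

    'width' (in volts) stands in for i_h / g_m and is held fixed here,
    which is a simplification of the real circuit.
    """
    x = np.clip(np.asarray(v_diff, dtype=float) / width, -1.0, 1.0)
    i1 = 0.5 * i_h * (1.0 + x)          # branch currents, i1 + i2 = i_h
    i2 = 0.5 * i_h * (1.0 - x)
    # Guard the division for stages that receive zero height current
    return 4.0 * i1 * i2 / np.where(i_h > 0, i_h, 1.0)

def bump_2d(vx, vy, i_h, width_x, width_y):
    """Cascade two stages: the output of the first sets the height of the second."""
    stage_x = bump_stage(vx, i_h, width_x)
    return bump_stage(vy, stage_x, width_y)

# Hypothetical example: 2 nA height, 1 V half-width in x, 0.5 V half-width in y
vx, vy = np.meshgrid(np.linspace(-1.5, 1.5, 7), np.linspace(-1.0, 1.0, 9))
surface = bump_2d(vx, vy, 2.0, 1.0, 0.5)
print(surface.shape, surface.max())
```

In this idealisation the 2D output is the product of the two 1D shapes scaled by the first stage's I_H, so the width of each dimension can be set without affecting the other.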
To evaluate the computational throughput of the circuit, the step response time is measured. With $I_W = I_H = 1$ nA, the 95% settling time of the output current is 45 μs when the differential input steps from 0 to 1 V. Table 1 summarises the measured performance of the proposed bump circuit. Compared with other recently reported works, the proposed circuit occupies a smaller area and consumes significantly lower power.
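The headline figures can be cross-checked with simple arithmetic on the values quoted in the text. The evaluation rate and energy per evaluation below are rough estimates, not reported results: they assume one bump evaluation per 95% settling time and that the supply current measured at balanced input also applies during the step.

```python
# Values quoted in the text
supply_voltage_v = 3.0        # V
supply_current_a = 6.3e-9     # A, at I_W = I_H = 1 nA and balanced input
settling_time_s = 45e-6       # s, 95% settling for a 0 to 1 V differential step
active_area_um2 = 26 * 38     # um^2

power_w = supply_voltage_v * supply_current_a

# Assumption: one evaluation per settling time, at the quoted static power
evaluations_per_s = 1.0 / settling_time_s
energy_per_evaluation_j = power_w * settling_time_s

print(f"power  = {power_w * 1e9:.1f} nW")
print(f"area   = {active_area_um2} um^2")
print(f"rate   = {evaluations_per_s / 1e3:.1f} k evaluations/s (estimate)")
print(f"energy = {energy_per_evaluation_j * 1e12:.2f} pJ per evaluation (estimate)")
```

The power and area reproduce the 18.9 nW and 988 μm² stated in the abstract; the rate and energy figures are only as good as the one-evaluation-per-settling-time assumption.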
Conclusion: We present an ultra-low-power tunable bump circuit to provide similarity measures in analogue signal processing. It incorporates a novel transconductor linearised using the drain resistances of saturated transistors. We show in the analysis that the proposed transconductor can achieve tunable $g_m$ with a wide input range. Measurement results demonstrate a 5 V differential input range of the transconductor with <20% linearity error and bump transfer functions with tunable centre, width and height. We also demonstrate 2D bump outputs by cascading two bump circuits on the same chip.
