ABSTRACT Computational scaling beyond silicon electronics based on Moore's law requires the adoption of alternate state variables such as electronic spin. Multiple research efforts are underway exploring both Boolean and non-Boolean design spaces using spin devices in order to make their energy and delay benefits competitive with CMOS. In this paper, we propose spin channel networks (SCN), where the exponential decay property of spin current along the spin channel is exploited to achieve energy-efficient dot product implementation for inference applications. As the use of exponentially decaying spin current for analog computation enforces severe locality constraints, we employ adaptive boosting to design an ensemble of tiny SCNs that work in unison to solve any binary classification task. Such boosted SCNs achieve up to 112× and 14× higher energy efficiency over conventional all-spin-logic-based and 20-nm CMOS designs, respectively.
I. INTRODUCTION
Exponential scaling of CMOS-based logic devices in accordance with Moore's law has enabled tremendous improvement in computational efficiency. However, as channel lengths shrink below a few tens of nanometers, the energy and delay benefits achievable via scaling have stagnated. On the other hand, emerging big data applications, such as recognition, mining, and synthesis, require a significant amount of information processing and therefore demand energy and delay improvements in computing systems. This has led to much interest in exploring the use of alternative state variables such as electron spin [1], [2] for computing. Spin torque devices store information in terms of aligned magnetic moments (spins) of unpaired electrons in nanomagnets and rely upon spin diffusion in the nonmagnetic metallic channel connecting two nanomagnets for information transfer.
One example of spin-based devices proposed for digital logic computation is the all spin logic (ASL) device [3]. ASL devices offer certain unique advantages such as nonvolatility, high logic efficiency, and ultralow operating voltages, and are considered a promising beyond-CMOS alternative when combined with material improvements [4]. However, ASL has been found to be noncompetitive with CMOS in terms of the energy consumption and delay of digital logic implementations [5], [6], mainly because of the large energy and delay required for deterministic nanomagnet switching, the exponential decay of spin alignment during propagation along the spin interconnect, and the static power consumed by ASL gates [6], [7].
A few works have exploited the stochastic nature of nanomagnetic switching to achieve energy efficiency. In [8], the switching behavior of magnets in the super-paramagnetic regime was shown to resemble the dynamics of a Boltzmann machine, and thus, a nanomagnetic network was trained to implement certain inference tasks. Stochastic magnets were also employed for energy-efficient random number generation [9] in stochastic computing, as well as for spike generation [10] in spiking neural network implementation.
Several research efforts have explored the neuromorphic design space using spin devices in order to achieve energy efficiency. While it is clear that the switching of a nanomagnet naturally implements the thresholding function of a neuron, these approaches use additional devices (resistive memory [11], domain wall magnets [12]-[14]) to achieve synaptic weighing of the spin currents feeding into the nanomagnet. In [15], multiple binary weighted CMOS drivers along with a clever nanomagnet configuration were employed to obtain spin-current weighing in a cellular neural network implementation.
There is also work at the architectural level that aims to fully exploit the advantages of emerging spin-device configurations such as racetrack memory [16], [17]. For example, the high area efficiency and serial access of racetrack memory were exploited to achieve reconfigurable precision [18] and efficient logic operations [19]. In [20], a novel data converter design was proposed by exploiting the serial structure of racetrack memory devices.
Recently, Ganguly et al. [21] took a physics-based approach to examining the power dissipation in spintronic switches. They found that nanomagnetic switching requires a $10^{3}\times$ to $10^{4}\times$ higher switching charge ($Q_{sw}$) than a CMOS inverter of comparable size. Such a large gap in the switching charge requirements underscores the fundamentally expensive nature of nanomagnetic switching. ASL networks [Fig. 1(a)] use nanomagnetic switching at the output of every gate to implement digital logic. Hence, they require switching of a large number of intermediate nanomagnets, leading to high energy consumption. In this paper, we propose spin channel networks (SCN) [Fig. 1(b)], where all intermediate nanomagnets are eliminated and all input nanomagnets contribute to the charge required to switch a single output nanomagnet that represents the final decision, thereby amortizing the energy consumed in switching it. This approach is particularly suited for inference implementations. While the elimination of intermediate nanomagnetic switching is expected to enhance energy efficiency, it also presents the following two key challenges.
1) How does one realize arbitrary computation while accumulating analog spin currents from multiple nanomagnets?
2) Will this approach scale with the input vector dimensionality (complexity) of the inference kernel?
To address challenge 1) we show that the exponential decay property of spin current along the spin channel, a disadvantage in digital ASL networks, can be exploited to achieve energy efficient analog dot product implementation. To circumvent challenge 2) we employ Adaptive Boosting (AdaBoost) [22] framework to design multiple isolated tiny SCNs (t-SCNs) that work in unison to solve an arbitrary binary classification task. Such boosted t-SCNs achieve 112× to 22.5× and 14× to 2.5× higher energy-efficiency over conventional ASL-based and 20-nm CMOS designs, respectively, when realizing 10 to 100-D binary classifiers.
The rest of this paper is organized as follows. Section II gives the relevant background about ASL, support vector machine (SVM), and classifier ensemble designs via AdaBoost, while Section III focuses on the design of SCNs. Section IV describes the SCN-based SVM and the boosted t-SCN implementations. Section V presents the simulation results, and Section VI concludes this paper.
II. BACKGROUND
A. ALL SPIN LOGIC DEVICE
Fig. 2 shows the schematic of an ASL device. It consists of two nanomagnets separated by a conducting channel of length $L$. The input magnet ($M_{in}$) polarizes the supply current passing through it. This creates a spin concentration gradient and propagates a spin current along the channel. This spin current, in turn, exerts a torque on the magnetization of the output magnet ($M_{out}$), forcing it to switch. Since the magnets are nonvolatile, they retain their magnetization state when the supply current is switched off. An electrical current on the order of 10-100 µA is required to generate sufficient spin current to switch the output magnet. Since the nanomagnets and the spin channel are metallic, the equivalent electrical resistance across the nanomagnetic stack is small (a few ohms), enabling these devices to operate at ultralow supply voltages. However, the electrical current through the input magnet flows irrespective of output activity, causing high static energy consumption. Hence, Pajouhi et al. [6] and Calayir et al. [23] propose to clock these devices via a MOSFET operating in the linear region, which acts as a switch that turns on the ASL device only when it needs to process information, as shown in Fig. 2.
VOLUME 5, NO. 1, JUNE 2019
Fig. 3 defines two SCN primitives (derived from ASL) and their functionalities, which are used to design SCNs in Section III. The primitives are a nanomagnet with a spin channel, and a spin channel. In particular, the nanomagnet takes an input charge current $I_c$ and injects a proportional spin current $I_{s,o} = \beta_m I_c$ into the channel, where $\beta_m$ is a proportionality constant that depends upon the device material and geometry, including the channel length $L_c$. The input spin current $I_{s,in}$ into a spin channel is reduced by a factor of $e^{L/\lambda}$ to generate an output spin current $I_{s,o} = I_{s,in}\,e^{-L/\lambda}$, where $L$ denotes the channel length and $\lambda$ is the spin flip length [7], [24]. The layouts are obtained by following the λ-rules in [5].
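The two primitives can be mimicked numerically with a short sketch. It is only an idealized illustration: the function names and the values chosen for $\beta_m$, $\lambda$, and $I_c$ are assumptions made here for demonstration, not device parameters from the paper.

```python
import math

def nanomagnet_injection(i_c, beta_m):
    """Spin current injected by a nanomagnet carrying charge current i_c."""
    return beta_m * i_c

def channel_attenuation(i_s_in, length, lam):
    """Spin current at the far end of a channel (length and lam in the same units)."""
    return i_s_in * math.exp(-length / lam)

# Hypothetical values for illustration only.
BETA_M = 0.5      # injection proportionality constant (assumed)
LAMBDA = 400e-9   # spin flip length in metres (assumed)
I_C = 100e-6      # charge current in amperes

i_s_in = nanomagnet_injection(I_C, BETA_M)
# A channel of length lambda * ln(2) halves the spin current.
i_s_out = channel_attenuation(i_s_in, LAMBDA * math.log(2), LAMBDA)
```

The halving length $\lambda \ln 2$ is the reason binary weights map naturally onto channel lengths.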
B. SUPPORT VECTOR MACHINE
A linear SVM [26] is a simple and popular machine learning (ML) algorithm for binary classification. The SVM learns a hyperplane to separate the training feature vectors into two regions as per the following equation:
$$\hat{y} = \mathrm{sgn}\left(\mathbf{w}^{T}\mathbf{x} + b\right) \tag{1}$$
where $\mathbf{w}$ and $b$ denote the trained weight vector and bias representing the separating hyperplane, respectively, $\mathbf{x}$ denotes the $N$-dimensional input feature vector, and $\hat{y}$ denotes its label predicted by the SVM. If the true label is denoted by $y$, the accuracy of the SVM is characterized by the probability of classification error $p_e = \Pr\{\hat{y} \neq y\}$, which can be empirically estimated for a given data set.
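As a minimal sketch of the decision rule and of the empirical estimate of $p_e$, consider the following. The weights, bias, and samples are made-up toy values, and mapping a zero score to $+1$ is a convention assumed here.

```python
def svm_predict(w, b, x):
    """Linear SVM decision: sign of w.x + b (a zero score is mapped to +1 here)."""
    score = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    return 1 if score >= 0 else -1

def error_rate(w, b, samples):
    """Empirical estimate of p_e = Pr{y_hat != y} over labelled samples."""
    wrong = sum(1 for x, y in samples if svm_predict(w, b, x) != y)
    return wrong / len(samples)

# Toy hyperplane and data (hypothetical).
w, b = [1, -1], 0
samples = [([2, 1], 1), ([0, 1], -1), ([1, 3], -1), ([3, 0], 1), ([1, 2], 1)]
p_e = error_rate(w, b, samples)  # the last sample is misclassified
```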
C. CLASSIFIER ENSEMBLE VIA ADAPTIVE BOOSTING
A classifier ensemble consists of multiple weak classifiers. Each weak classifier is computationally simple but inaccurate, i.e., with $p_e$ close to 0.5. However, the decisions of the weak classifiers can be combined to obtain a highly accurate final decision. AdaBoost [27] is a technique to train these weak classifiers sequentially. Each weak classifier is specifically trained to correct errors made by the weak classifiers trained before it (see [27] for the training algorithm). Let the output label of the $i$th weak classifier be denoted as
$$\hat{y}_i = f_{\mathbf{w}_i}(\mathbf{x})$$
where $f_{\mathbf{w}_i}(\cdot)$ denotes the $i$th weak classifier function parametrized by the weight vector $\mathbf{w}_i$, which is computed during training. The final decision $\hat{y}_f$ is computed by linearly combining the weak classifier decisions $\hat{y}_i$, followed by thresholding as per the following equation:
$$\hat{y}_f = \mathrm{sgn}\left(\sum_{i} \alpha_i \hat{y}_i\right) \tag{2}$$
where the output weights $\alpha_i$ of the linear combiner are also learned during the training phase.
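The combining step can be sketched in a few lines. The helper `alpha_from_error` uses the textbook discrete AdaBoost rule $\alpha = \tfrac{1}{2}\ln((1-\epsilon)/\epsilon)$, which is an assumption here because the paper defers the training details to [27]. The decisions and weights in the example are hypothetical.

```python
import math

def alpha_from_error(weighted_error):
    """Textbook discrete AdaBoost output weight for a weak classifier (assumed rule)."""
    return 0.5 * math.log((1.0 - weighted_error) / weighted_error)

def ensemble_decision(weak_decisions, alphas):
    """Weighted vote over +/-1 weak decisions, followed by thresholding."""
    total = sum(a * y for a, y in zip(alphas, weak_decisions))
    return 1 if total >= 0 else -1

# One confident weak classifier can outvote two less reliable ones.
decisions = [1, -1, -1]
alphas = [0.9, 0.3, 0.4]
final = ensemble_decision(decisions, alphas)
```

A weak classifier with weighted error 0.5 receives zero weight under this rule, which matches the intuition that a coin flip carries no information.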
III. SPIN CHANNEL NETWORKS
A. BASIC CONCEPT
SCNs exploit the exponential decay of spin current along spin channels for efficient computation. They compute via weighted analog accumulation of spin currents, with the weights set by a careful choice of spin channel lengths. SCNs are composed of the two primitives defined in Fig. 3. The most basic SCN consists of two nanomagnets connected using spin channels of different lengths, as shown in Fig. 4(a). The resulting output spin current $I_{s,o}$ is approximately given by
$$I_{s,o} \approx \beta_m I_c \left(a_1 m_1 e^{-L_1/\lambda} + a_2 m_2 e^{-L_2/\lambda}\right) \tag{3}$$
where $a_1, a_2 \in \{0, 1\}$ are the digital Boolean inputs, $m_1, m_2 \in \{-1, 1\}$ denote the directions of the magnetization vectors of the two nanomagnets, $L_1$ and $L_2$ denote the two channel lengths, $\lambda$ denotes the spin flip length, and $I_c$ denotes the ON-current of the NMOS transistors. Each bit $a_i$ controls the charge current through one nanomagnet, and the corresponding spin current is weighed by a factor exponential in its channel length. More complex SCNs can be designed to achieve weighted accumulation of spin currents from $M$ nanomagnets placed at lengths $L_i$, where $i \in \{0, \ldots, M-1\}$.
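The weighted accumulation can be illustrated with an idealized sketch that neglects spin current branching at channel junctions. The function name and the normalized values are assumptions for illustration only.

```python
import math

def scn_output(a, m, lengths, i_c, beta_m, lam):
    """Idealized SCN output: sum of a_i * m_i * beta_m * i_c * exp(-L_i / lam)."""
    total = 0.0
    for a_i, m_i, length in zip(a, m, lengths):
        total += a_i * m_i * beta_m * i_c * math.exp(-length / lam)
    return total

LAM = 1.0  # normalized spin flip length (assumed)
# Channel lengths of 1x and 2x lambda*ln(2) give weights 1/2 and 1/4.
lengths = (LAM * math.log(2), 2 * LAM * math.log(2))
# Both inputs ON, opposite magnetization directions: 1/2 - 1/4 = 1/4.
out = scn_output((1, 1), (1, -1), lengths, i_c=1.0, beta_m=1.0, lam=LAM)
```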
B. SCN TOPOLOGIES
For $M > 2$, multiple circuit topologies can achieve the same input-to-output transfer function (up to a scaling constant) depending upon how the nanomagnets and spin channels are interconnected. For example, conceptual diagrams of two extreme topologies, namely the ladder and the star topology, are shown for $M = 4$ in Fig. 4(b) and (c), respectively. In the ladder topology, all nanomagnets share a single spin channel that connects them to the output node. The star topology, on the other hand, consists of a unique channel connecting each nanomagnet to the output node.
Weighted accumulation of spin currents in SCNs can be used to efficiently implement multiplication in analog. The $M \times N$ bit SCN multiplier (SCNM) shown in Fig. 4(d) takes two charge-domain digital operands $A$ and $B$ having bitwidths $M$ and $N$, respectively, and generates an output spin current proportional to their product $A \times B$. The SCNM consists of $M \times N$ input nanomagnets, each contributing a spin current corresponding to one partial product. Individual partial products are computed by AND gates operating on the bits $a_i$ and $b_j$ of operands $A$ and $B$, respectively. Each AND gate drives the gate of an NMOS, thereby controlling the input charge current through the corresponding SCN nanomagnet. Fig. 4(d) shows a ladder-of-ladders topology of a 4 × 4 bit SCNM, where four vertical ladders are connected horizontally in a ladder topology. Note that all the spin channel lengths are multiples of $\lambda \ln 2$ in order to achieve appropriate weighing (in powers of two) of the spin currents corresponding to individual partial products. The output spin current of this 4 × 4 bit SCNM is given by
$$I_{s,o} = I_{s,lsb} \sum_{i=0}^{3} \sum_{j=0}^{3} 2^{i+j}\, a_i b_j = I_{s,lsb}\,(A \times B) \tag{4}$$
where the unit spin current $I_{s,lsb}$ corresponds to the least significant partial product for a given NMOS ON current $I_c$. For the SCNM shown in Fig. 4(d), $I_{s,lsb} = \beta_m I_c / 2^{7}$. The signs of these operands can be accounted for by changing the magnetization vector directions of the corresponding magnets (for $A$) and by using a differential supply [15] (for $B$). The energy consumption of such an SCNM is given by
where $R_{spin}$ denotes the series resistance of the nanomagnet and channel, $E_{and}$ denotes the energy consumed in switching the AND gate, and $R_{mos}$ and $C_g$ denote the ON-resistance and gate capacitance of the transistor, respectively. The gate voltage $V_g$ is applied to switch ON the NMOS for a duration $T_g$.
Other topologies, such as star-of-ladders, star-of-stars, and ladder-of-stars, are also possible, and this topological degree of freedom is explored while identifying the energy-efficient layout in Section III-C.
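An idealized sketch of the SCNM transfer function follows. It assumes that the partial product $a_i b_j$ travels an effective channel length of $(M + N - 1 - i - j)\,\lambda \ln 2$, which reproduces $I_{s,lsb} = \beta_m I_c / 2^{7}$ for the 4 × 4 case. That length assignment and the function names are assumptions, and junction branching is ignored.

```python
import math

def to_bits(value, width):
    """Little-endian bit list of a non-negative integer."""
    return [(value >> k) & 1 for k in range(width)]

def scnm_output(a_val, b_val, m_bits, n_bits, i_c, beta_m, lam):
    """Idealized SCNM output spin current for unsigned operands a_val and b_val."""
    a, b = to_bits(a_val, m_bits), to_bits(b_val, n_bits)
    span = m_bits + n_bits - 1  # 7 for a 4 x 4 multiplier
    out = 0.0
    for i in range(m_bits):
        for j in range(n_bits):
            if a[i] & b[j]:  # AND gate switches this nanomagnet ON
                length = (span - (i + j)) * lam * math.log(2)
                out += beta_m * i_c * math.exp(-length / lam)
    return out

# With normalized beta_m = i_c = lam = 1, the output is (A x B) / 2**7.
product_current = scnm_output(13, 11, 4, 4, i_c=1.0, beta_m=1.0, lam=1.0)
```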
C. HIERARCHICAL LAYOUT CONSTRUCTION
The topological schematic diagrams shown in Fig. 4 are idealized and convey the SCN functionality at a very high level. They account neither for spin current branching at the spin channel junctions nor for the physical constraints of component placement and spin channel length. We address both of these issues by developing precise layouts for SCN circuits and obtaining the input-to-output transfer function from these layouts.
For layouts, we choose $F = 15$ nm, where $F$ denotes the DRAM half-pitch [5], [25]. All SCN layouts need to satisfy the λ-rules described (in terms of $F$) in [25]. For example, the layout pitch between any two contacts needs to be at least $4F$. The value of the channel length $L_c$ turns out to be $5F$ (in Fig. 3) as a direct consequence of the λ-rule constraints. Similarly, the λ-rules impose constraints on the minimum distance between a nanomagnet and a spin channel, and between two parallel spin channels.
The layouts of more complex SCNs are particularly challenging since there is a tradeoff between satisfying the λ-rule constraints and the magnitude of the output spin current, and hence the energy consumption. We propose a hierarchical construction of SCN layouts. We define nine primitive topologies referred to as clusters [see Fig. 5(a)]. Each nanomagnet in a cluster contributes an identical spin current to the output node, referred to as the cluster centroid. The layouts of the clusters are fixed per the λ-rules. These clusters can be connected in various topologies, such as a ladder, a star, or a ladder-of-stars, to generate layouts of more complex SCNs in a hierarchical manner. An illustrative ladder topology of three clusters is shown in Fig. 5(b). Once the clusters are connected, only the lengths between their centroids need to be adjusted to achieve appropriate weighing of the corresponding spin currents while simultaneously satisfying the λ-rules. Fig. 5(c) shows a star-of-ladders layout topology of a 4 × 4 bit SCNM. Each cluster centroid $J_k$ generates a spin current corresponding to $p_k$, where $p_k$ is defined as the sum of the partial products having identical binary weight $k$ as follows:
$$p_k = \sum_{i+j=k} a_i b_j, \quad k \in \{0, \ldots, 6\}. \tag{6}$$
The final output spin current in (4) can be computed as
$$I_{s,o} = I_{s,lsb} \sum_{k=0}^{6} 2^{k} p_k \tag{7}$$
where the binary weighing of $2^{k}$ among the spin current contributions is achieved by adjusting the spin channel lengths between them. The clusters are sequentially placed in a spiral order along three ladders, as shown in Fig. 5(c). Thus, along a single ladder, the minimum channel length between any two consecutive clusters corresponds to a spin current weighing of $2^{-3}$, allowing sufficient spacing to satisfy the λ-rules. Hence, this layout topology effectively spreads out the nanomagnets radially. The actual channel lengths in the layout are chosen via extensive simulations using SPICE-based circuit models of spin current injection and propagation in spin devices [24] in order to account for spin current branching at the spin channel junctions.
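The regrouping of partial products into cluster sums can be checked with a brief sketch. Assigning cluster $k$ to ladder $k \bmod 3$ is one reading of the spiral placement and is an assumption made here. Under it, consecutive clusters on a ladder differ by three binary weights, which is consistent with the $2^{-3}$ weighing mentioned above.

```python
def to_bits(value, width):
    """Little-endian bit list of a non-negative integer."""
    return [(value >> k) & 1 for k in range(width)]

def cluster_sums(a_bits, b_bits):
    """p_k = sum of partial products a_i * b_j with i + j == k."""
    p = [0] * (len(a_bits) + len(b_bits) - 1)
    for i, a_i in enumerate(a_bits):
        for j, b_j in enumerate(b_bits):
            p[i + j] += a_i & b_j
    return p

def recombine(p):
    """Binary-weighted sum of cluster contributions, in units of I_s,lsb."""
    return sum((2 ** k) * p_k for k, p_k in enumerate(p))

# Assumed spiral placement: cluster k sits on ladder k mod 3.
ladders = {r: [k for k in range(7) if k % 3 == r] for r in range(3)}

value = recombine(cluster_sums(to_bits(13, 4), to_bits(11, 4)))
```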
D. STOCHASTIC SLICER
The output of SCN circuits is an analog spin current. We use a nanomagnet as the final decision device. It acts as a sink for the spin current and thresholds it to produce the final decision, represented by its magnetization vector. Nanomagnetic switching is stochastic due to the dominant thermal noise in the nanomagnet [28]. Hence, we refer to such a decision-generating nanomagnet as a stochastic slicer. The stochastic slicer switches when its magnetization direction flips due to the input spin current. In this paper, we operate the stochastic slicer for a fixed duration $T_g$. If the slicer switches during this duration, it corresponds to the final decision $\hat{y} = 1$; otherwise, $\hat{y} = -1$. The slicer is reset after every decision. For a given duration $T_g$, the probability $p_{sw}$ that the slicer switches can be approximated as a function of its input spin current $I_s$ as follows:
where $\gamma = (I_s - I_{s,T_g})/I_{s,T_g}$, $I_{s,T_g}$ denotes the spin current for which $p_{sw}(I_{s,T_g}) \approx 0.5$, and $\beta_1 = \pi^{2} E_b/(4kT)$. In particular, $E_b$ denotes the energy barrier of the nanomagnet, while $k$ and $T$ denote Boltzmann's constant and absolute temperature, respectively. The spin current value $I_{s,T_g}$ is a device-dependent constant for a given switching duration $T_g$. Equation (8) is a good approximation of the $p_{sw}$ expression in [28] when $|I_s|, I_{s,T_g} \gg I_{crit}$, where $I_{crit}$ denotes the minimum spin current required for the nanomagnet to switch with probability 1 as $T_g \to \infty$. The stochastic slicer strives to realize the thresholding operation
$$\hat{y} = \mathrm{sgn}\left(I_s - I_{s,T_g}\right). \tag{9}$$
However, due to its stochastic nature, the stochastic slicer probabilistically makes switching errors, i.e., it switches with a certain nonzero probability even when $I_s < I_{s,T_g}$, and vice versa, as shown in Fig. 6. Thus, there exists a tradeoff between the input spin current magnitude $I_s$ (proportional to energy consumption) and the switching probability $p_{sw}$ [see (8)]. There exists a minimum energy operating point (MEOP) for a target switching probability. For example, as shown in Fig. 6, $|I_s| > 1.5\,I_{s,T_g}$ is required to achieve a slicer switching accuracy of 99%. This MEOP of the slicer dictates the minimum charge current $I_{c,min}$ through each nanomagnet required for an SCN-based binary classifier to achieve a certain classification error probability $p_e$. For a fixed decision delay and error probability, the MEOP for SCN-based binary classifiers is uniquely defined by the value of $I_{c,min}$, as shown in Section V.
E. CMOS DRIVER
Fig. 7(a) and (b) shows the abstract model and transistor-level schematic of the CMOS input driver in a 14-nm technology, respectively. The nanomagnet is controlled by an NMOS, which should switch on only when $a_i = 1$ and $b_j = 1$. This is achieved via a CMOS NOR gate driving the gate of the NMOS $N_1$, as shown in Fig. 7(b). The inputs to the NOR gate are driven by identical inverters, which receive ideal step inputs. The NMOS $N_1$ is sized to provide a charge current of $I_{c,min}$ while satisfying $V_{DS} < 10$ mV. The gate voltage of $N_1$ is raised to 600 mV, turning it ON in the linear region with an overdrive voltage ≥350 mV. The NOR gate is sized so that the CMOS driver switches within 50 ps while driving $N_1$, and the inverters are minimum sized. We simulate this schematic using 14-nm high performance (HP) FinFET Arizona State University (ASU) predictive technology models [29] to estimate its switching delay and energy consumption.
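The driver logic can be sanity-checked with a small sketch: a NOR gate fed by inverted inputs realizes the AND of $a_i$ and $b_j$ by De Morgan's law. The sketch also computes the ON-resistance bound implied by $V_{DS} < 10$ mV. The charge current of 100 µA used for that bound is a round illustrative value, and the function names are hypothetical.

```python
import math

def inverter(x):
    """Ideal CMOS inverter on a 0/1 logic value."""
    return 1 - x

def nor(x, y):
    """Ideal two-input NOR gate."""
    return 1 - (x | y)

def driver_gate(a_i, b_j):
    """NOR of inverted inputs: drives the NMOS gate high only when a_i = b_j = 1."""
    return nor(inverter(a_i), inverter(b_j))

truth_table = {(a, b): driver_gate(a, b) for a in (0, 1) for b in (0, 1)}

# Upper bound on the NMOS ON-resistance so that V_DS stays below 10 mV.
V_DS_MAX = 10e-3   # volts
I_C = 100e-6       # amperes (illustrative value)
r_on_max = V_DS_MAX / I_C  # about 100 ohms
```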

IV. DESIGN OF SCN-BASED CLASSIFIERS
A. LINEAR SUPPORT VECTOR MACHINE CLASSIFIER
A linear SVM classifier can be realized using the proposed SCNs by connecting multiple SCNMs in parallel [Fig. 8(c)] and using one stochastic slicer to generate the final classification decision. The SCNM and slicer symbols are defined in Fig. 8(a) and (b), respectively. Each multiplier generates a spin current corresponding to the product $x_i w_i$, where $x_i$ and $w_i$ denote the $i$th dimension of the input feature vector $\mathbf{x}$ and the weight vector $\mathbf{w}$, respectively. Both $x_i$ and $w_i$ are fixed-point binary numbers. The spin currents at the outputs of the SCNMs accumulate in a common channel (generating $I_{s,o}$) and feed into the stochastic slicer. The inputs are kept ON for the duration $T_g$ since the stochastic slicer requires that much time to produce its decision. Noting that the stochastic slicer thresholding involves a comparison of its input spin current with $I_{s,T_g}$, (1) of SVM can be realized as follows:
where $I_{s,bias}$ is an additional bias current. When $I_{s,o} = I_{s,T_g}$, the slicer switches with probability 0.5. In the SVM, this operating point occurs when the input feature vector $\mathbf{x}$ lies on the classifier hyperplane, i.e., when
$$\mathbf{w}^{T}\mathbf{x} + b = 0. \tag{11}$$
To avoid large bias currents, we modify the feature vector $\mathbf{x}$ to $(\mathbf{x} + d\mathbf{1})$, where $\mathbf{1}$ denotes the all-ones vector and $d$ is a constant, thereby transforming (10) into
where $I_{s,bias}$ is now given by
The bias current $I_{s,bias}$ is generated by an additional magnet with a supply current $I_{c,bias}$, as shown in Fig. 8. If the signed precision of $x_i$ is $M$ bits, we choose $d$ to be $2^{M-1}$. This makes $(x_i + d)$ an unsigned number, removing the need for a differential supply. The sign of $\mathbf{w}$ is accounted for by changing the magnetization vector directions of the corresponding spin-current-injecting nanomagnets appropriately.
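The role of the bias current and of the offset $d$ can be illustrated with a sketch built on a simplifying assumption: the slicer input is modeled as $I_{s,lsb}\,\mathbf{w}^{T}(\mathbf{x} + d\mathbf{1}) + I_{s,bias}$ and is compared against $I_{s,T_g}$. Under that assumed model, one bias choice that reproduces the SVM decision is $I_{s,bias} = I_{s,T_g} + I_{s,lsb}\,(b - d\,\mathbf{w}^{T}\mathbf{1})$. This is a reconstruction for illustration rather than the paper's own expression, and all numeric values are hypothetical.

```python
from itertools import product

def sgn(v):
    """Sign function with zero mapped to +1 (a convention assumed here)."""
    return 1 if v >= 0 else -1

def bias_current(w, b, d, i_lsb, i_thr):
    """Bias that aligns the slicer threshold with the SVM hyperplane (assumed model)."""
    return i_thr + i_lsb * (b - d * sum(w))

def slicer_input(w, x, d, i_lsb, i_bias):
    """Accumulated spin current for offset (unsigned) features x + d."""
    return i_lsb * sum(w_i * (x_i + d) for w_i, x_i in zip(w, x)) + i_bias

W, B = [3, -2], 1            # toy SVM parameters
M_BITS = 3                   # signed precision of each feature
D = 2 ** (M_BITS - 1)        # offset that makes x_i + d unsigned
I_LSB, I_THR = 2.0, 50.0     # arbitrary current units
I_BIAS = bias_current(W, B, D, I_LSB, I_THR)

# Count disagreements between the SVM rule and the slicer comparison.
mismatches = 0
for x in product(range(-D, D), repeat=2):
    svm = sgn(sum(w_i * x_i for w_i, x_i in zip(W, x)) + B)
    slicer = sgn(slicer_input(W, x, D, I_LSB, I_BIAS) - I_THR)
    mismatches += int(svm != slicer)
```

In this toy setting the two decisions agree for every representable feature vector, and every offset feature is non-negative.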
B. CLASSIFIER DIMENSIONALITY SCALING VIA BOOSTED TINY SCNs
The exponential decay of spin current exploited in the SCNM makes it very hard to route the output spin current to another block, as doing so inevitably incurs a significant loss in the spin current magnitude. This severely limits the ability to scale the classifier dimensionality [Fig. 8(c)], which requires the $N$ multiplier output spin currents to be routed to a single stochastic slicer. Assuming a circular layout of $N$ SCNMs with the slicer at its center, we estimate the loss in the magnitude of the spin current $I_{s,o}$ as a function of the classifier dimensionality $N$, as shown in Fig. 9, when $x_1 = c_1 \neq 0$, $w_1 = c_2 \neq 0$, and $x_i = w_i = 0$ for all $i \in \{2, \ldots, N\}$, where $c_1$ and $c_2$ are some constants. Recall from Fig. 6 that the stochastic slicer requires a minimum magnitude of the input spin current in order to operate accurately for a given switching delay. Thus, the exponential loss in the output spin current magnitude results in exponentially increasing classifier energy consumption in order to maintain classifier accuracy. To address this problem, we limit the dimensionality of the SCN-based linear SVM classifier to only two dimensions, and refer to the resulting design as a tiny SCN (t-SCN). We then employ AdaBoost to design an ensemble of multiple such t-SCNs to implement an arbitrary $N$-dimensional binary classification task. Fig. 10 shows the boosted t-SCN architecture. In particular, given an $N$-dimensional input feature vector, each t-SCN observes only two unique feature dimensions and computes its local decision $\hat{y}_i$. These local decisions may be inaccurate with high probability and, hence, are referred to as weak decisions. The final weighted sum block combines these weak decisions to obtain the final decision $\hat{y}_f$ as per (2). We restrict the number of weak classifiers to $N/2$ so that the computational complexity of the boosted t-SCN architecture is similar to that of the standard $N$-dimensional linear SVM implementation [Fig. 8(c)].
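The inference path of the boosted architecture can be sketched as follows, treating each t-SCN as an ideal, noise-free two-dimensional linear classifier. Assigning consecutive feature pairs to the weak classifiers, and every parameter value shown, are assumptions for illustration.

```python
def sgn(v):
    """Sign function with zero mapped to +1 (a convention assumed here)."""
    return 1 if v >= 0 else -1

def tscn_decision(w_pair, b, x_pair):
    """Ideal two-dimensional weak decision of one t-SCN."""
    return sgn(w_pair[0] * x_pair[0] + w_pair[1] * x_pair[1] + b)

def boosted_tscn(x, weak_params, alphas):
    """N/2 weak decisions on disjoint feature pairs, then a weighted vote."""
    decisions = []
    for i, (w_pair, b) in enumerate(weak_params):
        x_pair = x[2 * i: 2 * i + 2]  # assumed pairing: features (2i, 2i+1)
        decisions.append(tscn_decision(w_pair, b, x_pair))
    total = sum(a * y for a, y in zip(alphas, decisions))
    return sgn(total), decisions

# Hypothetical 6-D example with three weak classifiers.
x = [1, 2, -1, 0, 3, -2]
weak_params = [((1, 1), 0), ((1, 1), 0.5), ((1, 1), 0)]
alphas = [0.4, 1.0, 0.5]
final, decisions = boosted_tscn(x, weak_params, alphas)
```

Only the binary values in `decisions` leave each t-SCN, which mirrors the point that analog spin currents are consumed locally.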
It is important to note that, in the boosted architecture, the output spin current of the channel network is processed locally and only the binary weak decisions are routed to the final weighted sum block, thus requiring much shorter spin interconnect routing within each t-SCN. It is straightforward to convert the binary slicer decisions $\hat{y}_i$, $i \in \{1, \ldots, N/2\}$, to equivalent voltages [8] and then route them using charge interconnects. We designed the linear combiner in (2) in conventional digital 14-nm CMOS. Its complexity in terms of the full adder count is less than 5% of the total complexity of the $N/2$ t-SCNs. One can also employ other schemes, such as Boolean logic or winner-take-all, to efficiently combine the binary t-SCN outputs, achieving similar energy benefits.
V. SIMULATION RESULTS
We design and characterize a 6 × 6 bit SCNM using the SPICE-based spin device models [24] for the material and device parameters provided in supplementary information Section II. Fig. 11 shows the interpolated SCNM transfer function after carrying out detailed SPICE simulations for 289 different $A$ and $B$ values. The observed $\sigma/\mu$ of the deviations from the ideal output spin current is 2%. We employ this SCNM transfer function in our system-level simulations to estimate the accuracy of SCNM classifiers, and use the benchmarking methodology in [5] to estimate the energy and delay of all classifier implementations. The simulation methodology is described in supplementary information Section I. We demonstrate the effectiveness of the proposed approach for two classification tasks: 1) 10-D breast cancer detection (UCI repository data set [31], [32]) and 2) 100-D face detection (MIT CBCL data set [33]). We quantify the classification accuracy in terms of the detection rate $p_{det} = 1 - p_e$, where $p_e$ is the classification error probability.
For each t-SCN, there exists a tradeoff between the NMOS current $I_c$ and the weak decision delay $T_g$ for fixed $p_{sw}$. We choose $T_g = 2.5$ ns throughout this paper so that the CMOS driver switching energy is at most approximately 33% of the total energy. We compare the energy consumption and accuracy of the 10-D and 100-D classifier implementations at fixed final decision delays of 3 and 4 ns, respectively. The remaining duration accounts for the delay of CMOS driver switching, slicer reset, and weighted logic block operation. In particular, the CMOS driver can be switched within 50 ps. For the 10-D classifier, the weighted logic block operation can be approximated as a majority operation. We choose identical $I_c$ for all weak classifiers. Given $I_c$ and $T_g$, $I_{c,bias}$ is chosen according to (13) for each weak classifier. For a fixed final decision delay (of 3 ns), the tradeoff between the accuracy and total energy consumption of the 10-D boosted t-SCN classifier is shown in Fig. 12 as a function of $I_c$. As expected, both accuracy and total energy decrease as $I_c$ is reduced. The accuracy degradation occurs due to reductions in the switching probabilities $p_{sw}$ of the stochastic slicers. For an accuracy of 95.5%, the classifier MEOP (defined in Section III-D) is achieved at $I_{c,min} = 90$ µA. Hence, we size the NMOS $N_1$ to provide an $I_c$ of 90 µA at $V_{DD2} = 10$ mV (see Fig. 7). Fig. 13 shows the accuracy versus energy tradeoff for different 10-D classifier implementations. The boosted t-SCN classifier achieves at least 112× lower energy per decision compared to the conventional boosted ASL implementation while maintaining accuracy. Such large energy savings can be attributed to the elimination of all intermediate switching nanomagnets in the SCN implementation. It also achieves 14× lower energy compared to the boosted 20-nm LV CMOS digital implementation, while operating at an identical final decision delay.
We also observe that both boosted CMOS and boosted ASL implementations achieve energy consumption similar to the corresponding $N$-dimensional linear SVM implementations. For the 100-D classifier (Fig. 14), the energy benefits of the boosted t-SCN implementation reduce to 2.5× and 22.5× over the CMOS and ASL SVM implementations, respectively. This is primarily because of the higher $I_c$ requirements for its weak SVM classifiers, resulting from the lower class separability of the data set. Thus, the energy benefits are a function of the input data statistics as well. We only compare dynamic energy here, but leakage energy will be the lowest for SCN implementations because they have fewer transistors than both CMOS and ASL implementations. For all ASL implementations, we assume that the clocking transistors are shared across multiple nanomagnets [6], significantly amortizing their energy consumption. In Fig. 15, we observe that the CMOS driver conduction energy and switching energy are comparable, and together dominate the energy consumption of the 10-D boosted SCN classifier. The CMOS driver is expensive to switch due to the large size of NMOS $N_1$, which is necessary due to the large charge current requirements ($I_{c,min} \approx 100$ µA) of SCN classifiers. The conduction energies of the CMOS driver and nanomagnet add up to a constant $V_{DD2}\, I_{c,min}\, T_g$. These trends are similar to those of ASL implementations.
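As a back-of-the-envelope illustration of the conduction term $V_{DD2}\, I_{c,min}\, T_g$, the sketch below plugs in the operating point quoted for the 10-D classifier (10 mV, 90 µA, 2.5 ns). It covers only the conduction energy of one conducting nanomagnet branch. The `active_branches` parameter is a hypothetical convenience, and driver switching energy is not modeled.

```python
def conduction_energy(v_dd2, i_c, t_g, active_branches=1):
    """Conduction energy in joules: V_DD2 * I_c * T_g per conducting branch."""
    return active_branches * v_dd2 * i_c * t_g

V_DD2 = 10e-3   # volts
I_C_MIN = 90e-6 # amperes
T_G = 2.5e-9    # seconds

e_branch = conduction_energy(V_DD2, I_C_MIN, T_G)  # about 2.25 fJ
```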
VI. CONCLUSION
In this paper, we proposed SCNs where multiple input nanomagnets contribute to the spin current required to switch a single decision nanomagnet. These networks exploit exponential spin current decay for efficient local computation to achieve very high energy efficiency, and an ensemble of such isolated networks can solve any given classification task. Moving forward, one needs to evaluate the impact of process variations and temperature on the final decision accuracy of SCN-based classifiers. While the inherent robustness of ML classifiers will help in mitigating this impact, one can also employ system-level techniques, such as retraining [34] or Shannon-inspired error compensation [35], to achieve further robustness. We plan to explore this direction in the future.
This paper demonstrates how algorithmic techniques can be employed to take advantage of device characteristics, such as the exponential decay of spin current in ASL, that appear disadvantageous in conventional implementations.
