INTRODUCTION
Conversion of signals is fundamental to the interfacing of embedded systems [3], and particularly microcontrollers (µc) [9], [26], [33], microprocessors (µP) [32], and microcomputers (µC) [18]. The signal conversions include (i) analog-to-digital (A/D) in order to translate an analog form of the signal to its sampled and quantized form for digital signal processing, (ii) digital-to-analog (D/A) in order to translate the digital samples to a corresponding boxcar signal for further low-pass filtering and recovery of the original signal, and (iii) digital-to-digital (D/D) to achieve new desired properties of the data such as (a) elimination of a DC component, (b) self-clocking by embedding the transmission clock into the transmitted data, and (c) polarity independence to eliminate the need to colour the wires in a data-carrying cable.
A course on microcontroller interfacing must include all those conversions, in addition to many other pertinent topics [16], [17], [32], [4]. This paper describes such a course, with a focus on teaching the delta-sigma (ΔΣ) conversion that is often omitted because it appears to be the most difficult topic to comprehend and to teach (e.g., [12], [29], [2]).
The most important contribution of this paper is the approach to teaching the ΔΣ conversion in the context of the evolution of ideas related to the operation of other more common techniques such as parallel (flash and its variations) and serial, either linear or nonlinear. The serial signal A/D and D/A converters (ADC and DAC) are often covered in a course on interfacing, and include the following types: (i) counting (single ramp), (ii) tracking, (iii) successive approximation (SA), (iv) integrating (dual slope, quad slope), and (v) voltage-to-frequency (V/F)-based converters [16]. The first four types can be classified as voltage-to-timegate (V/T)-based converters. The V/F converter can be treated as a logical dual of the V/T converter, but with several new very desirable features [16].
Once these relations are established, we can combine the best features of the V/F and V/T converters, and add concepts from signal processing and control to achieve noise shaping through dithering (spectrum spreading). Such ΔΣ converters can attain unprecedented resolution. This paper presents the roots of the ΔΣ conversion as found in the differential pulse-coded modulation (PCM), as well as its theory, implementation, and benefits to embedded systems.
STANDARD ADC CONCEPTS

Data Acquisition System
Types of Signals:
An analog-to-digital (A/D) converter (ADC) is an electronic circuit that converts an analog signal (continuous in both time and magnitude) to a discrete signal (discrete in time, but continuous in magnitude) and to a digital signal (discrete in both time and magnitude), as shown in Fig. 1 .
While the discrete signal is the same as the analog signal at time j, the digital signal approximates the discrete signal to within one least-significant bit (LSB). The LSB equals the quantization step, Q, which is the interval of uncertainty in the ADC and is given by
$$Q = \frac{FS}{2^N}$$
where FS is the full-scale (peak-to-peak) value of the signal v(t), V_max-pp, and N is the number of bits in a sample.
CEEA Conf. 2014; Paper 107 Canmore, AB; June 8-11, 2014
Fig 1.
Impact of oversampling by a factor M.
Signal Acquisition and Conversion:
The standard way to obtain the discrete signal is to use a sample-and-hold (S&H) circuit that remembers the analog discrete
When the S&H device is not used (due to its expense) and the analog input signal is fed directly to the ADC, another constraint must be considered: the signal must change slowly enough that it does not move by more than one quantization step during the conversion time. This critical time is called the aperture time, t_a. It can be shown that the aperture time can be calculated from [Kins13a]
$$t_a = \frac{Q}{\left(dv/dt\right)_{max}}$$
where (dv/dt)_max denotes the maximum time derivative of the analog signal. Thus, the conversion time must be shorter than the minimum aperture time.
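As a worked illustration of the aperture-time constraint, the sketch below assumes a full-scale sine wave v(t) = (FS/2) sin(2πft), whose maximum slope is πf·FS, and takes the quantization step as Q = FS/2^N. The function name and the example values (a 3.3 kHz tone, 8 bits) are hypothetical choices, not taken from the paper.

```python
import math

def aperture_time_sine(freq_hz, n_bits):
    """Aperture time t_a = Q / max|dv/dt| for a full-scale sine, with FS normalized to 1."""
    full_scale = 1.0
    q_step = full_scale / 2**n_bits              # assumed Q = FS / 2^N
    max_slope = math.pi * freq_hz * full_scale   # max of d/dt[(FS/2) sin(2*pi*f*t)]
    return q_step / max_slope

t_a = aperture_time_sine(freq_hz=3300.0, n_bits=8)
print(f"aperture time: {t_a * 1e9:.0f} ns")
```

Under these assumptions the full scale cancels out, so t_a depends only on the signal frequency and the number of bits: each extra bit halves the time available for a conversion without an S&H.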
The actual sampling frequency is selected from the worst-case rolloff frequency. Notice that the sampling should never be done at exactly the Nyquist frequency, in order to avoid phantom DC reconstruction (when the samples are all zero at the zero crossings of the periodic signal). As shown in Fig. 1, f_S > f_N, or T_S < T_N. For example, telephone-quality speech has a bandwidth from 300 Hz to 3,300 Hz. Thus, the Nyquist frequency is f_N = 2 × 3300 = 6600 sps (samples per second). The sampling is done at f_S = 8 ksps (kilosamples per second), which is over 20% higher than f_N.
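The phantom-DC effect can be checked numerically. The following minimal sketch assumes a 3,300 Hz sine that starts at a zero crossing; the helper name is hypothetical. It samples the tone once at exactly the Nyquist frequency and once at 8 ksps.

```python
import math

def sample_sine(signal_hz, sample_hz, n_samples):
    """Sample sin(2*pi*f*t) at t = k / f_S, starting at a zero crossing."""
    return [math.sin(2 * math.pi * signal_hz * k / sample_hz)
            for k in range(n_samples)]

at_nyquist = sample_sine(3300.0, 6600.0, 8)     # f_S = f_N = 2 x 3300 sps
above_nyquist = sample_sine(3300.0, 8000.0, 8)  # f_S = 8 ksps

print(max(abs(x) for x in at_nyquist))     # numerically zero: the tone is lost
print(max(abs(x) for x in above_nyquist))  # clearly nonzero
```

With a different starting phase the samples taken at f_N would not all be zero, but their amplitude would still depend on that phase, which is why a margin above f_N is used.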
Implementation Issues:
Designing the LPF is not trivial, as steep-skirt filters are difficult to implement, and may become oscillators (e.g., [21, Fig. 6] ).
In high-speed systems, a buffer is also necessary between the LPF and the ADC to separate the filter from the converter in terms of driving capabilities and frequency independence (e.g., [22], [20]).
The sampling clock feeding the ADC must also be designed for stability and low jitter to reduce noise (e.g., [21]). Still another part of the circuit is a high-accuracy full-scale voltage reference required by the ADC. It can be designed using pulse-width modulation (e.g., [23], [24]).
QUEST TO REDUCE N_Q
The impact of the quantization noise, N_Q, on the number of bits is very pronounced [34]. Spreading of the noise over a wider bandwidth has worked in many signal processing areas, and has been applied in the ADC area.
Theoretical SNR
The signal-to-noise ratio, SNR, is defined as the ratio of the useful signal to the unwanted noise in the system. The ratio can be expressed in terms of power or magnitude. This measure is very important as it is linked to many performance characteristics of a communications or signal processing system. One of the noise components in a data acquisition system is its unavoidable quantization noise, N_Q.
The ADC can be linear or nonlinear [Kins13a], as defined by its signal transfer characteristic (STC). The ADC is said to be linear if all the quantization steps, Q, are equal. The maximum quantization error (noise) is N_Qmax = Q or N_Qmax = ±Q/2, depending on the ADC implementation. For that maximum quantization error, the root-mean-squared value of the error is [Kins13a]
Since all the frequencies in the bandwidth B are equally probable, the distribution of the quantization noise is uniform. For a full-scale sine-wave, the theoretical SNR is [15]
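The two expressions referred to above did not survive in this copy. The sketch below uses the standard textbook results instead, which are an assumption here and not a transcription of the paper's equations: an error spread uniformly over ±Q/2 has an RMS value of Q/√12, and a full-scale sine wave then gives an SNR of about 6.02N + 1.76 dB. Function names are hypothetical.

```python
import math

def quantization_rms(q_step, n_points=100_000):
    """Numerically estimate the RMS of an error spread uniformly over [-Q/2, +Q/2]."""
    errors = [(-0.5 + (i + 0.5) / n_points) * q_step for i in range(n_points)]
    return math.sqrt(sum(e * e for e in errors) / n_points)

def theoretical_snr_db(n_bits):
    """SNR of a full-scale sine over uniform quantization noise, with Q = FS / 2^N."""
    full_scale = 1.0
    q_step = full_scale / 2**n_bits
    signal_rms = full_scale / (2 * math.sqrt(2))  # RMS of a sine with peak-to-peak FS
    noise_rms = q_step / math.sqrt(12)            # assumed RMS of the quantization error
    return 20 * math.log10(signal_rms / noise_rms)

print(quantization_rms(1.0), 1 / math.sqrt(12))
print(theoretical_snr_db(16))
```

The numerical estimate agrees with Q/√12, and the SNR grows by roughly 6 dB for every added bit, which is the link between N_Q and the number of bits mentioned earlier.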
Oversampling to Reduce N_Q
Since the quantization noise, N_Q, affects the performance of a system, can its impact be reduced? Yes, the impact can be reduced if the signal is oversampled [5] by a factor M = f_S/(2f_B), as shown in Fig. 3.
The quantization noise is reduced because it is now spread over a wider bandwidth, and much of the noise is removed by an LPF, which can now be a digital filter. Most such digital filters are finite-impulse-response (FIR) LPFs because their phase response is linear (e.g., [Laco07, Fig. 7]). Digital filters with very steep skirts can be implemented reliably (e.g., [25], [30]).
The digital LPF is followed by a decimation stage to reduce the rate of data from the oversampled rate, M f_S, to the original sampling rate, f_S.
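The benefit of plain oversampling can be quantified with a short calculation. The sketch below assumes the usual model in which the quantization noise power is spread uniformly over the oversampled band, so that an ideal LPF keeps only 1/M of it; this model and the function names are assumptions for illustration.

```python
import math

def oversampling_gain_db(m):
    """SNR improvement when uniform noise is spread over M times the bandwidth."""
    return 10 * math.log10(m)

def extra_bits(m):
    """Equivalent resolution gain, using about 6.02 dB per bit."""
    return oversampling_gain_db(m) / (20 * math.log10(2))

for m in (4, 16, 64):
    print(m, round(oversampling_gain_db(m), 2), round(extra_bits(m), 2))
```

Under this model each quadrupling of M buys only one extra bit, which motivates looking for something more efficient than uniform spreading.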
An added advantage of oversampling is the much flatter rolloff required of the analog LPF, which can now be implemented as a lower-order filter.
In order to improve the SNR more efficiently, we should abandon the uniform spreading of the N_Q distribution in favour of a non-uniform N_Q distribution. This process is called noise shaping. If we could shape the distribution so that it is skewed to the right, then the LPF would remove a much larger portion of the unwanted noise. Figure 4 shows the noise shaping for different degrees of complexity. The uniform spectrum of the noise transfer function (NTF) represents the spreading due to the standard oversampling on any ADC [19]. The nonlinear first-order shaping skews the spectrum to the right, so less noise is retained in the bandwidth B of interest (where the curves intersect). Higher-order shaping provides even more gain.
In order to achieve this nonlinear noise shaping, oversampling alone will not be sufficient. We must introduce some sort of feedback so that past values of the noise can be reused not just once, but many times in an iterative cascade cycle, as discussed next.
ΔΣ CONCEPTS
Evolution of Basic Concepts
The delta-sigma (ΔΣ) analog-to-digital converter (ADC) is used in many applications such as voice, audio, and high-resolution measurements. Such devices are slower than flash and successive approximation (SA) converters, but have a much higher effective resolution.
The ΔΣ concept was developed in France in 1946 [7], US Bell Labs in 1952 [6], and Holland's Philips Labs in 1952 [11]. The key ideas and historical evolution are discussed by Hauser [10], Aziz et al. [1], Kester [12], [13], [14], and Wooley [35].
The objective of the ΔΣ ADC is the same as before: convert an analog input signal in the range [v_min, v_max] to its digital equivalent with a resolution of N bits per sample, while satisfying the Nyquist-sampling and aperture-time constraints.
The key concepts behind the ΔΣ ADC are: (i) oversampling, (ii) quantization noise spreading and shaping, (iii) digital filters, and (iv) decimation.
Classes of Converters
PWM-Modulator-Based ADC:
The majority of standard serial A/D converters change the sampled signal into a time gate whose duration is proportional to the magnitude of the sampled signal. The single time-gate-generation part of the ADC can be considered a pulse-width modulator (PWM). During the time-gate pulse, clock pulses are counted. Since the clock period, T_CK, is constant, the clock count is proportional to the sampled signal, v_j. The clock frequency is selected to reach the maximum count for the maximum duration of the time gate. This scheme is illustrated in Fig. 5a.
Frequency-Modulator-Based ADC:
Figure 5b shows the second class of ADC devices. They convert the sampled signal into a unipolar pulse stream whose frequency is proportional to the magnitude of the sampled signal. This charge-pump device can be considered a frequency modulator (FM), and is called a voltage-to-frequency (V/F) converter. The pulse stream has a low rate for a low voltage, and a high rate for a high voltage. The pulse stream can be considered to be a variable clock which is counted during a constant time gate. The duration of the gate is calculated so that the counter reaches its maximum for the highest frequency of the pulses.
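The duality between the two classes can be made concrete with an idealized numerical model. In the sketch below, the function names, the 1 ms maximum gate, and the 1 MHz maximum pulse rate are hypothetical values chosen only for illustration; real converters also have offsets and nonlinearity that are ignored here.

```python
def vt_count(v, v_max, n_bits, gate_max_s=1e-3):
    """V/T (PWM) class: gate width proportional to v, fixed clock counted inside the gate."""
    max_count = 2**n_bits - 1
    clock_hz = max_count / gate_max_s   # clock chosen so the longest gate gives max_count
    gate_s = (v / v_max) * gate_max_s   # variable gate, constant clock
    return int(gate_s * clock_hz)       # only whole clock pulses are counted

def vf_count(v, v_max, n_bits, f_max_hz=1e6):
    """V/F (FM) class: pulse rate proportional to v, counted during a fixed gate."""
    max_count = 2**n_bits - 1
    gate_s = max_count / f_max_hz       # gate chosen so the fastest stream gives max_count
    pulse_hz = (v / v_max) * f_max_hz   # variable clock, constant gate
    return int(pulse_hz * gate_s)       # only whole pulses are counted

print(vt_count(0.5, 1.0, 8), vf_count(0.5, 1.0, 8))
```

In both functions the count is the product of a time and a frequency; the two classes differ only in which of the two factors carries the signal.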
Delta-Sigma-Based ADC:
The ΔΣ scheme is an extension of this V/F converter in which the sampled input signal (scaled to [-1, 1] for convenience) is also converted to a pulse stream, but now bipolar, with a fixed frequency, a fixed amplitude of -1 or +1, and a fixed duration of τ. This part of the converter is called the delta-sigma (ΔΣ) modulator (DSM). The pulse stream is filtered by a digital LPF, and the filter output is decimated to obtain the desired data, as shown in Fig. 5c.
Selection of Pulse Polarity
The polarity of the pulses is selected in such a way that the time average of the pulse stream converges to the sampled input signal to an arbitrary degree of accuracy. The more pulses generated in each interval T_S, the greater the accuracy and precision. This is illustrated in Fig. 6. To demonstrate the simplicity of the idea, let us consider the desired equality between the input signal sample v_j and a measure µ_j of the bipolar pulse stream between two successive discrete samples
$$v_j = \mu_j$$
A simple measure is the average of the M pulses, each of duration τ, within each sample interval T_S. For example (Fig. 6), the oversampling is M = 24, with 15 positive pulses and 9 negative pulses. Since the arithmetic sum of the pulses is 6, the measure is µ_1 = 6/24 = ¼. In general, this can be written as
$$\mu_j = \frac{S_{jM}}{M} \tag{9}$$
where S_jM denotes the total sum of the pulses in the jth sample interval. It is seen from (9) that the duration of the pulses does not matter (i.e., the pulses can be as narrow as practicable, or can occupy the entire pulse period T_OS).
Sequence Generation
The key question is how the pulse sequence can be generated. Observe that at the end of the sampling period T_S, all M pulses are counted, and the following equality should hold
$$S_{jM} = M v_j$$
This expression provides a clue about the pulse stream formation during the sampling interval for k < M. That is, after counting k pulses, if the partial sum, S_jk, is too small with respect to k v_j, the next pulse should be +1. That is,
$$S_{jk} < k v_j \;\Rightarrow\; \text{next pulse} = +1$$
On the other hand, if the partial sum is too large (or equal), the next pulse should be -1:
$$S_{jk} \ge k v_j \;\Rightarrow\; \text{next pulse} = -1$$
This strategy should produce a pulse stream within each sample interval T_S that converges to the value of the corresponding input signal sample v_j, provided M is sufficiently large. This can be written as C-like pseudocode, as shown in Algorithm 1 (e.g., [27]).
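The selection rule described above can be sketched in a few lines of Python. This is an illustrative rendering of the rule as stated in the text, not a transcription of the paper's Algorithm 1; the function and variable names are hypothetical.

```python
def delta_sigma_pulses(v, m):
    """Generate m bipolar pulses (+1/-1) whose average approximates v in [-1, 1]."""
    pulses = []
    partial_sum = 0                    # S_jk: sum of the k pulses emitted so far
    for k in range(m):
        # Partial sum too small relative to k*v: emit +1; otherwise (too large or equal): -1.
        pulse = 1 if partial_sum < k * v else -1
        pulses.append(pulse)
        partial_sum += pulse
    return pulses

stream = delta_sigma_pulses(0.25, 24)
print(stream.count(1), stream.count(-1), sum(stream) / len(stream))
```

For v = 0.25 and M = 24 this rule yields 15 positive and 9 negative pulses, the same totals as in the example above, although the order of the pulses in Fig. 6 may differ. Because each pulse moves the difference between the partial sum and k·v back towards zero, that difference never exceeds 2 in magnitude, so the error of the average is at most 2/M.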
Block Diagram of ΔΣ Modulator
Algorithm 1 can be translated into a block diagram shown in Fig. 7 . It is clear why the modulator is called ΔΣ. Some literature considers this structure as ΣΔ (summer followed by the difference maker).
Fig 7.
Fundamental block diagram of the ΔΣ modulator.
Notice that the above diagram appears to be different from most ΔΣ modulators reported in the literature. The alternative diagrams substitute the summer block (Σ) with an analog integrator (∫), the 1-bit quantizer with a 1-bit A/D converter, and the sample delay (z^-1) with a 1-bit D/A converter. The alternative diagrams represent a more generalized ΣΔ modulator. Our diagram may be easier to understand, at least initially during teaching in a course for a diverse audience. Table 1 lists several steps that follow the ΔΣ algorithm and the block diagram. It is intended to reinforce the concept in a class.
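For comparison, the alternative structure mentioned above (difference, integrator, 1-bit quantizer, and 1-bit feedback) can be modelled in discrete time as follows. This is a minimal sketch of the commonly published first-order form, not a transcription of Fig. 7; the names and the zero initial state are assumptions.

```python
def first_order_dsm(v, m):
    """Discrete-time first-order modulator: integrate (input - fed-back output), then quantize."""
    integrator = 0.0
    feedback = 0.0                     # delayed 1-bit output (z^-1), assumed to start at 0
    bits = []
    for _ in range(m):
        integrator += v - feedback     # delta (difference) followed by sigma (accumulation)
        bit = 1 if integrator > 0 else -1   # 1-bit quantizer
        bits.append(bit)
        feedback = bit                 # ideal 1-bit D/A in the feedback path
    return bits

bits = first_order_dsm(0.25, 24)
print(sum(bits) / len(bits))
```

For a constant input of 0.25 and 24 pulses, the average of this stream is also 0.25, so both formulations converge to the same sample value even though the individual pulse orders may differ.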
Verification of the ΔΣ Modulator
DISCUSSION
Noise Averaging
The entire description assumed that the analog signal v(t) was sampled by an S&H device, and that the sample v_j was constant throughout the pulse generation in each sample period T_S. In practice, the value of the analog discrete sample v_j droops (like a leaky memory), and is also subject to external noise. Another source of noise is clock jitter. The strength of the ΔΣ ADC is that the many pulses tend to average out the noise through the integration process. Such systems have a comb filter transfer characteristic [Kins13a].
How Many Pulses per T_S?
The number of pulses per sample period T_S should be as large as possible in order to achieve accurate convergence to each sample v_j, and to smooth out the variability in v_j between the samples. The oversampling is often M = 64 or higher (sometimes 32,768). Since f_S = 2f_B and f_OS = M f_S, for M = 64 we have f_OS = 64 × 2f_B.
Are All the Pulses Needed?
The large number of bipolar pulses is needed to move the quantization noise out of the band of interest. To extract the signal from the pulse stream, we can use an LPF to reduce the bandwidth back to the original value of B. While the pulse stream is shifted into the filter at the oversampling frequency f_OS, the computations have to be performed at the sampling rate, f_S. For example, since telephone-quality speech has a sampling rate of f_S = 8 ksps, the oversampling frequency is f_OS = M f_S = 64 × 8 = 512 ksps.
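The filtering and rate reduction can be illustrated with the simplest possible choice, a boxcar average over each block of M pulses followed by keeping one value per block. This choice of filter is an assumption made for brevity (practical devices typically use sharper multi-stage filters), and the function name is hypothetical.

```python
def boxcar_decimate(pulses, m):
    """Average each consecutive block of m pulses, producing one output per block."""
    return [sum(pulses[i:i + m]) / m
            for i in range(0, len(pulses) - m + 1, m)]

# Two sample intervals with M = 4 pulses each.
pulse_stream = [1, 1, 1, -1, -1, -1, -1, 1]
print(boxcar_decimate(pulse_stream, 4))

# Rates for the speech example: pulses arrive at f_OS, results leave at f_S.
m, f_s = 64, 8_000
print(m * f_s)
```

The pulses enter at the oversampling rate, while each average is produced only once per sample interval, which is the rate relationship described above.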
Pulse Storage
Direct storage of the +1s and -1s is not necessary, as we can convert the bipolar stream into a unipolar stream of 1s and 0s. If necessary, the original stream can be reconstructed without any loss of information.
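The lossless mapping between the two representations is a simple affine transformation, as the sketch below shows. The function names are hypothetical.

```python
def to_unipolar(pulses):
    """Map bipolar pulses (+1/-1) to bits (1/0)."""
    return [(p + 1) // 2 for p in pulses]

def to_bipolar(bits):
    """Map bits (1/0) back to bipolar pulses (+1/-1)."""
    return [2 * b - 1 for b in bits]

original = [1, -1, -1, 1, 1]
stored = to_unipolar(original)
print(stored, to_bipolar(stored) == original)
```

Since each pulse maps to exactly one bit, the stream can be packed eight pulses per byte in a microcontroller's memory.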
Actual Devices
Many manufacturers produce ΔΣ ADCs. For example, Analog Devices AD7760 ADC provides 24 bit resolution at up to 2. 
CONCLUDING REMARKS
This paper provides a new approach to teaching the ΔΣ signal conversion. The key ideas behind the standard serial analog-to-digital converters (ADCs) are classified as pulse-width-modulation (PWM) and frequency-modulation (FM) ADCs. The discussion of various ADC issues includes oversampling and its ability to reduce the unavoidable quantization noise in an ADC. This prepares the ground for another strategy to reduce quantization noise through a nonlinear transformation that results from negative feedback (the ΔΣ ADC).
Oversampling leads to linear quantization noise reduction. Oversampling with feedback (either first-order or higher-order) leads to a nonlinear noise reduction by skewing most of the noise energy away from the baseband region to higher frequencies. This strategy is a special case of a larger dithering strategy to spread the noise over a much larger bandwidth, either linearly or nonlinearly (e.g., [8], [31], [36]).
Another advancement in ΔΣ converters is the shift from integer oversampling M to fractional oversampling. As with fractional calculus in control and signal processing, fractional ΔΣ converters can expand their flexibility considerably (e.g., [28]).
Based on the above setting, the main reason for this paper is the discovery of a simple, well-understandable principle behind the formation of the next bipolar pulse in the pulse stream. The principle is easy to verify on paper, easy to implement on a simple computer, and easy to implement on a microcontroller for embedded systems.
