Many studies have sought to improve digital filter realizations by resorting to intricate structures and by analyzing the error behavior probabilistically. The work presented in this paper analyzes the feasibility of fixed-point implementation of classical infinite impulse response notch filters: Butterworth, Chebyshev I and II, and elliptic. To scrutinize the deformations suffered for distinct design specifications, the effect of the quality factor and normalized cut-off frequency on the number of significant bits necessary to represent the filter's coefficients is assessed. The implications for FPGA implementation are also verified. The work focuses especially on the implementation of power line notch filters used to improve the signal-to-noise ratio in biomedical signals. The results obtained when quantizing the digital notch filters show that, by applying second-order sections decomposition, low-order digital filters may be designed using only part of double precision capabilities. High-order notch filters with harsh design constraints are implementable using double precision, but only in second-order sections. Thus, to optimize computation time in real-time applications, an optimal digital notch filter implementation platform should have variable arithmetic precision. Considering these implementation constraints, the peak operating performance is finally estimated for digital notch filters implemented in Xilinx Virtex-5 field-programmable gate arrays. The influence of several design specifications on the filter's behavior was evaluated, namely order, type, input and coefficient number of bits, quality factor and cut-off frequency. Finally, the implications and potential applications of such results are discussed.
Introduction
Notch filters are very important in a wide variety of instrumentation applications, from telecommunications to biomedical signal processing, where it is often necessary to remove a narrow band or even a single frequency from the measurement signal. Digital implementation of these filters is preferable to an analog implementation due to the absence of drift and the straightforward design of higher quality factors. Nevertheless, digital filter implementation has accuracy limitations due to the finite precision of the arithmetic [1−4], an issue that is much more significant in fixed-point arithmetic than in floating-point arithmetic.
Due to the ease of designing and calculating the coefficients of high-performance digital finite and infinite impulse response (FIR and IIR) filters, the filter outcome is taken for granted, but, particularly if dealing with limited capacity fixed-point platforms (such as microcontrollers, digital signal processors, and field-programmable gate arrays) or with very demanding design constraints, the filtering stage may have a pernicious effect on the signal, completely missing its purpose.
This problem has been studied [2−9] and, disregarding additional error sources originating from the A/D and D/A conversions, the three key error types are: I. Quantization of the input signal into a finite set of discrete levels; II. Representation of the filter coefficients by a short number of bits; III. Propagation of rounding errors occurring in arithmetic operations.
To evaluate the influence of these errors on the final filter output, several approaches have been proposed [2−9, 12−14]. If input quantization errors, denoted above as type I, are assumed to be random variables with a uniform probability distribution, a number of analysis tools are available to characterize their behaviour [10−14]. Type-III errors are continually being reduced through novel structure variations [1−2, 5, 15−17] based on state-space structures and direct form I with error feedback, also known as noise shaping or error spectrum shaping [5, 9, 18].
The representation of the filter's coefficients by a short number of bits, previously denoted as type-II errors, also has a comprehensive bibliography reporting studies on important implementation issues: instability thresholds due to these errors have been derived [6, 19], coefficient sensitivity approaches have been proposed [15, 18−20], and structural changes to minimize the impact of these errors have been presented [2, 5, 8, 17]. However, none of these approaches considered notch filters.
Considering specifically biomedical applications, some studies have analyzed the distortion effect of digital filters on the signal [15], but the feasibility and the outcome of the implementation have only recently been discussed [21]. Moreover, several biomedical studies ignore, to some extent, the higher-frequency components of the signals, implementing lowpass filters or wide band-stop filters. Ballistocardiograms, electrocardiograms and electroretinograms, which have sampling frequencies from 200 Hz to 2 kHz, and other high-resolution biomedical signal processing systems benefit from the use of power line notch filters. However, recent applications of biomedical systems tend to use wireless communications, limiting the maximum sampling frequency to about 200 Hz, so the fundamental power line frequency is the only concern. Hence notch filters are appropriate for such applications, instead of comb filters, which may be of interest at higher sampling rates.
Since acquisition systems work at distinct sampling rates, the analysis of IIR digital notch filters performance at different normalized cut-off frequencies allows ensuring that most biomedical signals fit in the tested range, and so the conclusions are applicable to a broad variety of digital biomedical signal processing systems. Although IIR filters are known for their phase distortion, the effect of notch filters with a very high quality factor is very limited in frequency and concentrated only in the stop band [22−26] , so it is no major concern. Furthermore, techniques for compensation of distortion have been published [27] . Thus the approach taken is valid.
Subsequently, MATLAB processing capabilities are used to evaluate the fixed-point arithmetic numerical accuracy requirements to realize several types of IIR notch filters, at different design specifications and structures. The performance of a Xilinx FPGA of the Virtex 5 family, when subject to these structural modifications, was assessed using Xilinx ISE 10.1. The discussion and conclusions on the FPGA behaviour complete this work.
Second order filters
Using dedicated filter design software, floating-point double precision coefficients were computed for the following filter types: Butterworth, Chebyshev types I and II, and elliptic. A normalized notch frequency vector Ω test was considered with 9 points per decade spaced from 10^-4 to 0.3 (totalling 30 points), together with a quality factor vector Q test , also with 9 points per decade, spaced from 1 to 10^4 (totalling 37 points). Filters of even orders from second to tenth were designed.
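The test grid can be sketched as follows. The text does not state how the 9 points per decade are placed; the sketch assumes mantissas 1 to 9 in each decade, which happens to reproduce the stated totals of 30 and 37 points. The function name `decade_grid` is illustrative only.

```python
def decade_grid(first_exp, last_exp, upper):
    """Values m * 10**e for m = 1..9, kept while they do not exceed `upper`.

    Assumption: 9 points per decade means mantissas 1, 2, ..., 9.
    """
    points = []
    for e in range(first_exp, last_exp + 1):
        for m in range(1, 10):
            value = m * 10.0 ** e
            # Small relative tolerance so that 3 * 0.1 still counts as 0.3.
            if value <= upper * (1 + 1e-12):
                points.append(value)
    return points


omega_test = decade_grid(-4, -1, 0.3)   # normalized notch frequencies
q_test = decade_grid(0, 4, 1e4)         # quality factors
grid = [(q, w) for q in q_test for w in omega_test]

print(len(omega_test), len(q_test), len(grid))  # 30 37 1110
```

Under this assumption the grid contains 1110 (Q, Ω) pairs in total.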
Since quantization induces pole movement, a filter that is stable in floating point may become unstable after quantization. Even if the quantized filter is confirmed to be stable, its outcome may be unacceptable: although the poles remain inside the unit circle, the quantization is too coarse and the movement of poles and zeros deforms the filter behavior.
To diminish the wandering of poles and zeros, one valuable method is to implement the filter in second-order sections (decomposing an Nth order filter into the product of N/2 second-order filters, provided that N is even), since coefficient quantization causes less pole movement in second-order sections than in higher-order sections. The impact of this option will also be evaluated.
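A minimal sketch of this effect follows. It is not the design routine used in this work: it assumes a textbook second-order notch parameterized by its -3 dB bandwidth, a signed fixed-point format with one integer bit, and round-to-nearest with saturation. The names `notch_sos`, `quantize` and `is_stable` are hypothetical. Stability of the second-order denominator 1 + a1 z^-1 + a2 z^-2 is checked with the usual stability triangle, |a2| < 1 and |a1| < 1 + a2.

```python
import math


def notch_sos(omega0, q):
    """Textbook second-order notch; omega0 in half-cycles per sample (0..1)."""
    w0 = math.pi * omega0
    beta = math.tan(math.pi * omega0 / q / 2.0)  # tan of half the -3 dB bandwidth
    gain = 1.0 / (1.0 + beta)
    b = (gain, -2.0 * gain * math.cos(w0), gain)
    a = (1.0, -2.0 * gain * math.cos(w0), 2.0 * gain - 1.0)
    return b, a


def quantize(c, n_bits, int_bits=1):
    """Round c to a signed n_bits word with int_bits integer bits (assumed format)."""
    frac_bits = n_bits - 1 - int_bits
    scale = 2 ** frac_bits
    code = math.floor(c * scale + 0.5)                # round to nearest
    lo, hi = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    return max(lo, min(hi, code)) / scale             # saturate, then rescale


def is_stable(a):
    """Stability triangle for the denominator 1 + a1 z^-1 + a2 z^-2."""
    _, a1, a2 = a
    return abs(a2) < 1.0 and abs(a1) < 1.0 + a2


b, a = notch_sos(omega0=1e-3, q=1000.0)
a_q10 = tuple(quantize(c, 10) for c in a)
print(is_stable(a), is_stable(a_q10))  # True False
```

In this example the pole radius is so close to one that a2 rounds to exactly 1.0 in the assumed 10 bit format, which places the quantized poles on the unit circle.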
Filter definitions
The normalized frequency Ω is defined as the ratio between the frequency and the Nyquist frequency (half the sampling rate), thus resulting in units of half-cycles per sample.
The quality factor Q is the ratio between Ω 0 and the bandwidth (difference between upper and lower cut-off frequencies Ω 1 and Ω 2 ), while the notch frequency Ω 0 , the centre of the stop band, is the geometric mean of Ω 1 and Ω 2 . Since results should be parameterized as functions of Ω 0 and Q, while filter design algorithms process Ω 1 and Ω 2 , (1) was used to obtain Ω 1 and Ω 2 from design specifications in Ω 0 and Q.
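The two definitions above (Q equal to Ω 0 divided by the bandwidth, and Ω 0 equal to the geometric mean of the band edges) can be solved for the band edges in closed form. The sketch below is derived only from those two definitions; it is a reconstruction, not necessarily the exact form of (1), and the function name `band_edges` is illustrative.

```python
import math


def band_edges(omega0, q):
    """Return (lower, upper) cut-off frequencies for notch frequency omega0 and Q.

    Solves upper - lower = omega0 / q together with lower * upper = omega0**2.
    """
    half = 1.0 / (2.0 * q)
    root = math.sqrt(1.0 + half * half)
    return omega0 * (root - half), omega0 * (root + half)


lower, upper = band_edges(omega0=0.05, q=40.0)
print(lower, upper)
print(math.sqrt(lower * upper), 0.05 / (upper - lower))  # recovers omega0 and Q
```

For large Q the edges tend to Ω 0 (1 ∓ 1/(2Q)), so the geometric and arithmetic centres of the band nearly coincide.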
The filters were implemented using both Direct-Form I and II, depicted in Fig. 1 for a second-order section. The respective transfer function, H(z), is presented in (2). Stability was assessed by searching for poles of the filter's transfer function, H(z), outside the unit circle. The deviations of the n bit fixed-point filter from the floating-point double precision design (about 16 decimal digits of precision in calculations, IEEE 754 binary64 format) [28] were measured through the root mean square error ε n bit (in dB) of its frequency response magnitude, |H n bit (jΩ)|, using (3)
where H n_bit (jΩ) and H float (jΩ) are the transfer functions of both filters, and Ω freq_resp is a vector with 221 points, ranging from Ω min = 31.8 µ half-cycles per sample up to Ω max = 1 half-cycle per sample, at which the transfer function discrepancies are evaluated. Note that ε n bit could have been defined in linear units, or using the phase or group delay difference, but since the magnitude in dB is the most widely employed means of assessing filter response, the parameter ε n bit was chosen to measure this dissimilarity directly in dB. Filter deviations are problematic both in the pass band and in the stop band, since deviations start to manifest in the stop band and afterwards spread to the pass band. The root mean square (rms) error, defined in (3), copes with this by weighting all frequencies equally.
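A minimal sketch of such an rms error measure for one second-order section is given below. Several details are assumptions, since they are not stated in the text: the 221 frequencies are taken as logarithmically spaced, the magnitude is floored before taking the logarithm to avoid log(0) exactly at the notch, and the coefficient values are made-up placeholders rather than results of this work.

```python
import cmath
import math


def magnitude_db(b, a, omega, floor=1e-12):
    """|H| in dB at normalized frequency omega (half-cycles per sample)."""
    z1 = cmath.exp(-1j * math.pi * omega)            # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(max(abs(num / den), floor))


def log_grid(lo, hi, n):
    """n logarithmically spaced points from lo to hi (assumed spacing)."""
    step = (math.log10(hi) - math.log10(lo)) / (n - 1)
    return [10.0 ** (math.log10(lo) + k * step) for k in range(n)]


def rms_error_db(b_q, a_q, b_ref, a_ref, freqs):
    """Root mean square of the dB magnitude difference over freqs."""
    sq = [(magnitude_db(b_q, a_q, w) - magnitude_db(b_ref, a_ref, w)) ** 2
          for w in freqs]
    return math.sqrt(sum(sq) / len(sq))


freqs = log_grid(31.8e-6, 1.0, 221)
# Placeholder reference coefficients and a coarse 6-fractional-bit rounding.
b_ref, a_ref = (0.8633, -1.6420, 0.8633), (1.0, -1.6420, 0.7265)
b_q = tuple(math.floor(c * 64 + 0.5) / 64 for c in b_ref)
a_q = tuple(math.floor(c * 64 + 0.5) / 64 for c in a_ref)
print(rms_error_db(b_q, a_q, b_ref, a_ref, freqs))
```

Because every frequency point enters the sum with the same weight, a deviation confined to the narrow stop band and one spread over the pass band both contribute to the same figure.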
Filter structures details
The Direct-Form II implementation of a digital filter is also known as the canonical form [22−26]. This filter realization method uses the minimal number of delay elements, which is equal to the order of the transfer function denominator. The second-order case was presented in Fig. 1 and the transfer function in (2). Instead of the usual difference equation (4), which is the basis of Direct-Form I, the implementation uses (5).
For Direct-Form I (DF I), since the delay elements are duplicated, the sequential depth of the implementation is halved. Therefore, on a parallel computation platform such as an FPGA, the result is computed in half the time of Direct-Form II (DF II). Another very appealing property of DF I over DF II is that it cannot overflow internally when two's complement fixed-point arithmetic is used and the output signal is in range [23].
However, for higher order direct-form filters, quantization errors in the filter coefficients grow, especially in notch filters, where the poles and zeros are very close [24], and DF I can hardly be expanded to higher orders, which is confirmed by the results of Section 4. One further remark is that when quantizing a floating-point filter implemented in second-order sections, DF I and DF II will have exactly the same behaviour, as both will have the same coefficients. Therefore, the choice of one over the other will depend only on the characteristics of the implementation platform, namely parallel processing capabilities and memory.
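The two structures can be sketched for a single second-order section as follows. The sketch assumes the common sign convention H(z) = (b0 + b1 z^-1 + b2 z^-2) / (1 + a1 z^-1 + a2 z^-2), which may differ from the one in (2), uses floating-point arithmetic, and takes placeholder coefficients; `df1` and `df2` are illustrative names.

```python
def df1(b, a, x):
    """Direct-Form I: separate delay lines for input and output (4 delays)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        out.append(yn)
    return out


def df2(b, a, x):
    """Direct-Form II: one shared delay line (2 delays, canonical form)."""
    w1 = w2 = 0.0
    out = []
    for xn in x:
        wn = xn - a[1] * w1 - a[2] * w2            # recursive part first
        yn = b[0] * wn + b[1] * w1 + b[2] * w2     # then the feed-forward part
        w2, w1 = w1, wn
        out.append(yn)
    return out


b, a = (0.8633, -1.6420, 0.8633), (1.0, -1.6420, 0.7265)  # placeholder values
impulse = [1.0] + [0.0] * 31
y1, y2 = df1(b, a, impulse), df2(b, a, impulse)
print(max(abs(p - q) for p, q in zip(y1, y2)))  # tiny floating-point difference only
```

In `df2` the feed-forward sum needs the freshly computed `wn`, so the recursive and feed-forward parts are evaluated one after the other, whereas in `df1` all five products depend only on stored samples and the current input.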
Second order filter results and discussion

Filter stability
Second-order band-stop filters of the Butterworth, Chebyshev I and II, and elliptic types were implemented using fixed-point arithmetic and the defined Q test and Ω test vectors. It was found that, for every filter type, only the 16 bit implementation was stable for all (Q test , Ω test ) pairs. The minimum quality factor yielding an unstable filter, Q u_min , is 40, found at the normalized notch frequency Ω 0 u_min = 8×10^-3. If the number of bits of the implementation changes, the filter stability may vary in some specific zones of the (Q test ,Ω test ) grid. Generally, designing a different filter type for the same quality factor, normalized cut-off frequency, and number of bits will not modify the stability. Regarding power line notch filter implementation in biomedical systems, the range of normalized notch frequencies where the filter is unstable represents an important drawback, because implementations with sampling rates from 2 kHz down to 200 Hz will cross the two main instability peaks found in Fig. 2. Despite this, if quality factors below 40 are tolerable, the implementation of 10 to 16-bit fixed-point IIR notch filters is straightforward.
Filter deviations
Second order band-stop filters of the stated types were implemented using the defined Ω test and Q test vectors. The results obtained for ε n bit in a second order fixed-point Butterworth filter, at a fixed Ω 0 of 0.05, thus situated in the more disturbing zone, are presented in Fig. 3 . A grid is displayed with n from 10 to 16 bits, and the vector Q test .
The smallest quality factors, namely Q = 1, show the largest differences, because the floating-point filter creates a deeper notch than the fixed-point conversion is able to reproduce faithfully. If the quality factor is above 400, the filters have very small notch frequency attenuation and a small amplitude resonance peak in both fixed and floating-point implementations. For quality factor values above 1000 this peak vanishes and the filter acts as an all-pass filter, showing no discrepancy between fixed and floating point. This behavior is exemplified in Fig. 4. Butterworth filters are presented as examples in the last two figures, but the other fixed-point filters have the same characteristics regarding the progression of the error and of the magnitude response with the quality factor, and all of them also generate resonance peaks at very high quality factors.
Filter optimization
In these implementations we searched for the minimum coefficient word length that guaranteed stability and for the optimal word length, considering the rms error defined in (3). The filter demanding the widest coefficient word lengths to guarantee stability was the elliptic filter. The Chebyshev type I was the most demanding when minimizing the rms error relative to the floating-point implementation. The dependency of the coefficient word length on quality factor and normalized cut-off frequency for these two cases is shown in Fig. 5.
Higher order filter results and discussion
Repeating the design procedure, 4th, 6th, 8th and 10th order filters were implemented in single section and in second-order sections. The results regarding filter stability and quantization effects are presented next.
Filter stability
Regarding the filter stability, Table 1 presents the number of stable filters designed for each order and each filter type, when using single section (SS) and second-order sections (SOS) implementations of both Direct Forms. The total number of pairs in the (Q test ,Ω test ) grid is 1110, thus 1110 is the maximum number of stable filters possible. The maximum coefficient word length allowed was 16 bits. When decomposing the filter structure into second-order sections, the rearrangement of the coefficients minimizes the deviations of the poles from their actual values in such a way that only five elliptic filters of 10th order are unstable, while the remaining 1105 elliptic filters of 10th order are stable. All the other filters of every type and order from 4th to 10th are stable. Table 1 shows that, for the 4th order, the single section implementation is no longer valid, since only 12 to 14.6 % of the filters implemented using this structure are stable. For higher orders, even fewer of the designed filter implementations are stable.
Another important result is the maximum and the average number of bits required to ensure that the SOS filters are stable for all the pairs in the (Q test ,Ω test ) grid. The results are presented in Table 2. The second-order section filters preserve the behavior presented in Fig. 2. Only a few tens of them require more than 10 bits. When the order increases, the requirements of this residual minority also increase, but only 10 bits are needed for almost every filter implementation.
SOS Filter deviations
The results of the previous Section 2.1 indicate the importance of analyzing not only the global (Q,Ω 0 ) mesh but also the zones that demand longer coefficient word lengths to ensure stability. Table 3 and Table 4 summarize some of the measurements made. The first shows the great increase in the number of bits needed to ensure stability for a normalized cut-off frequency of 0.05, thus in the most critical zone. The second presents the average, for n from 10 to 16 bits, of the root mean square error relative to the floating-point implementation, ε n bit , defined in (3), for the same normalized cut-off frequency of 0.05. The average, median and most frequent coefficient word lengths were taken as estimates of central tendency. These three quantities highlight the changes observed in this region, by comparison with Table 2. The averages are significantly larger than generally necessary. The most frequent number of bits increased in several cases, and fluctuations in the median are also noticeable, underlining the volatility of the designs in this region. The average root mean square error at the fixed normalized cut-off frequency of 0.05 has its minimum in the Chebyshev type II filter, which has the smallest deviations in every order. Chebyshev type II deviations from the floating-point implementation are presented in Fig. 6. Table 5 summarizes the coefficient word length that minimizes the root mean square error relative to the floating-point implementation over the (Q test ,Ω test ) grid. The average, median, and most frequent coefficient word lengths are displayed; these three estimates of central tendency illustrate the global behaviour. Although the most frequent value is always 10, the fluctuations in the median and in the average show that this is not the optimum value, and that changes occur among filter types.
SOS Filter optimization
Contrary to what one might expect a priori, the 4th order has the highest average. The same behavior was observed when ensuring stability in the critical zone, as Table 3 shows. The most demanding filter when minimizing the error for all the points in the (Q test ,Ω test ) grid is the 4th order Chebyshev type II filter. The dependence of the coefficient word length on quality factor and normalized cut-off frequency in these cases is shown in Fig. 7. According to these results, it is not possible to obtain an exact expression that determines the optimal number of bits from the normalized cut-off frequency and quality factor specifications. This was already suggested by the second-order results, as Fig. 5 exemplifies.
FPGA Performance
The sequential delay of a Direct-Form II second-order section consists of 2 multiplications and 2 additions, whereas in Direct-Form I it consists of merely 1 multiplication and 1 addition. Regarding the implementation of higher order filters, given the previous results, only the performance of the SOS realization will be estimated. This generates an increment in the number of operations proportional to the increase in filter order, e.g. a 4th order SOS filter will require 4 multiplications and 4 additions in DF II, and half of that in DF I.
The additional path delay necessary to link the second-order sections is disregarded, since it is not relevant. Top performance is therefore estimated, and the latency is taken to be proportional to the order increase from the 2nd order filter.
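This latency model can be sketched as follows. The multiplier and adder latencies used in the example are placeholders chosen for illustration, not measured values, and the function names are hypothetical.

```python
# Sequential operations per second-order section: (multiplications, additions).
OPS_PER_SECTION = {"DF1": (1, 1), "DF2": (2, 2)}


def sos_latency_ns(order, t_mult_ns, t_add_ns, form):
    """Latency of an even-order SOS filter, ignoring inter-section routing."""
    n_mult, n_add = OPS_PER_SECTION[form]
    sections = order // 2
    return sections * (n_mult * t_mult_ns + n_add * t_add_ns)


def f_max_mhz(latency_ns):
    """Maximum operating frequency when one output takes latency_ns."""
    return 1e3 / latency_ns


T_MULT_NS, T_ADD_NS = 5.0, 1.7   # placeholder latencies, for illustration only
for order in (2, 4, 10):
    l1 = sos_latency_ns(order, T_MULT_NS, T_ADD_NS, "DF1")
    l2 = sos_latency_ns(order, T_MULT_NS, T_ADD_NS, "DF2")
    print(order, f_max_mhz(l1), f_max_mhz(l2))
```

Under this model the DF I estimate is always twice the DF II estimate, and both scale inversely with the filter order.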
Operation latency
The input data will be considered to have 12 and 16 bits, which are familiar values in FPGA applications for digital filtering [29], and common in data acquisition boards and analogue to digital converters (ADCs). The results presented were obtained using a Xilinx Virtex 5 SX95T, speed grade -3, with timing performance as the optimization goal. Pipeline stages were not employed. The fixed-point architectures were signed, with maximum output precision.
The latency of the multiplications would increase perfectly linearly if the operands always had an even number of bits. However, as seen in Fig. 8, some combinations with one or two odd-length operands behave nonlinearly, differing from the expected values.
The additions have even simpler logic: the number of lookup tables varies linearly with the dimension of the operands, and the latency ranges from 1.552 to 1.842 ns. The latency of an addition is less than one third of that of the respective multiplication. All these latencies are very small (below 6 ns), due to the simplicity of the logic, and their behavior may be approximated by linear functions with reasonable errors, especially if both operands have an even number of bits. An even number of bits is standard in commercial ADCs, and from the previous results most filters are optimized with an even number of bits, hence the errors are in fact small.
Filter frequency of operation
Since the previous results and the considerations taken so far allow the linear interpolation of the delays with minor errors, it is possible to compute the ceiling performances regarding all the combinations of coefficient word length and ADC number of bits.
The estimates of the filter maximum frequency of operation, f MAX , are presented in Table 6 and Table 7 for a 12 bit ADC, based on the FPGA latencies determined in the previous section and on the implementation architecture of each Direct-Form after the place-and-route process. Table 6 presents the frequency of operation of the filters for the average and maximum coefficient number of bits that ensure stability at all points in the (Q test ,Ω test ) grid. Table 7 displays the frequency of operation associated with the optimum behavior of the filters. The corresponding coefficient numbers of bits were presented in Table 2 and Table 5 respectively. These tables show the improved performance of DF I over DF II. Moreover, the relative performance variation for different types of filter is very small. The impact of an order change on the maximum frequency of operation is coherent with the latency change mentioned above.
Table 8 and Table 9 present the estimates of the maximum frequency of operation for a 16 bit ADC, obtained with the same methodology as the previous results for a 12 bit ADC. The maximum frequency of operation for any other number of ADC bits may be computed by extrapolation, since it was observed that the latency of the operations increases linearly. It should be recalled that very few products deviate from the linear behaviour, so these points may be disregarded in the linear regressions without significant error when extrapolating the latency of other products.
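Such an extrapolation amounts to an ordinary least-squares line fitted to the measured latencies. The sketch below uses made-up data points purely to show the mechanics; `fit_line` is an illustrative helper, not a routine used in this work.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up (operand width in bits, latency in ns) pairs, even widths only.
widths = [10, 12, 14, 16]
latencies_ns = [4.0, 4.4, 4.8, 5.2]

slope, intercept = fit_line(widths, latencies_ns)
print(slope * 20 + intercept)  # extrapolated latency for a 20 bit operand
```

Points known to deviate from the linear trend, such as the odd-width products mentioned above, would simply be left out of `widths` and `latencies_ns` before fitting.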
The results of Table 8 and Table 9 show that the increase from 12 to 16 on the number of bits of the ADC is irrelevant for DF I, and of small impact on DF II. Hence, the high speed of computation of the Xilinx device is underlined, especially in lower orders. 
Filter performance analysis
The previous results show that SOS notch filters of 2nd to 10th order can be implemented with a frequency of operation of tens of MHz, even under extreme conditions of quality factor and normalized cut-off frequency. Furthermore, it was verified that the number of bits of the ADC has little influence on the performance of the FPGA, and the same holds for the choice of filter type. The reason for this insensitivity is that the Virtex-5 device employs DSP48E modules, which are composed of a 25 by 18 bit multiplier, an adder, and a 48 bit accumulator, elements with more than enough capacity to deal with operands of the dimensions considered in this application.
Changing from DF II to DF I doubles the working frequency. Therefore, after confirming that the coefficient number of bits is enough to ensure the required filter response, one should choose the Direct-Form I implementation. The filter type and the ADC number of bits are not constraints on the performance, and an SOS implementation is certain to have a steady frequency of operation and a stable filter. Stability concerns are minor, given that only five elliptic filters of 10th order are unstable. The filter order ends up being the only variable that influences the FPGA implementation performance.
Conclusions
In this work the effect of the design specifications was investigated, namely the quality factor and the normalized cut-off frequency, in the number of significant bits necessary to represent the coefficients of an infinite impulse response notch filter. Since implementing these filters using fixed-point arithmetic has much higher accuracy constraints than the common floating-point implementation, the deformations introduced with these specifications were also assessed. The type of structure to implement the filter was also evaluated, and a comprehensive assessment of performance in an FPGA for different ADCs was presented. Since such assessment has never been done, the paper presents an important contribution for the signal processing field, both in notch filter analysis and FPGA implementation.
The first important result is that it is not viable to increase the filter's order above the 2nd if the filter is implemented in a single section. However, the order increase is practically harmless if the filter is decomposed into second-order sections. The simulation results obtained provide a comprehensive understanding of the stability requirements. Two critical areas of quality factor and normalized cut-off frequency values were found in which filter stability is compromised for some coefficient word lengths, even for the 2nd order. These critical areas are especially problematic for biomedical signal processing, since the problematic values of normalized cut-off frequency are typical of these applications. They only matter, however, if the quality factor considered is very high, and so they do not affect standard applications where a lower quality factor may be tolerated, or may even be necessary because of slight frequency swings.
The filter deviations were measured and were found to increase considerably when going from 6th to 8th and 10th order. Among the classical families of IIR filters, Chebyshev type II is the one that suffers least from fixed-point implementation, and it is also the least demanding in terms of the average number of significant bits necessary to represent the coefficients. The filter deviations in the critical zone were measured, and were found to increase significantly above the 6th order. Regarding the minimization of deviations from the floating-point implementation, the average word length is near 12 bits for every filter order, but the optimal number of bits is very dependent on the actual values of quality factor and normalized cut-off frequency. The optimal number of bits, regarding root mean square error minimization, was seen to define extremely irregular surfaces.
The FPGA implementation of these filters was estimated to be very fast, below 80 ns in a Virtex 5 SX95T-3. The device's performance is insensitive to the number of bits of the ADC, the number of bits of the coefficients used in the fixed-point representation of the filter, and the IIR filter type (Butterworth, Chebyshev, or elliptic). Therefore the filter order and the structure chosen for the implementation (preferably Direct-Form I in second-order sections) are almost the only parameters that influence filter performance. The maximum frequency of operation is obtained for the DF I structure, as the parallel computation capability of the FPGA extracts the maximum benefit from the canonical number of operations performed to compute the output.
It is possible to implement dynamically reconfigurable filters, changing the filter specifications of type, coefficient number of bits, quality factor and normalized cut-off frequency, without disturbing the maximum operating frequency of the FPGA. Even changes in the hardware involved, for instance the input ADC and the output DAC, are possible without significant performance variations. This is a very powerful outcome of the study, since it reveals the possibility of creating a single-chip dynamically reconfigurable digital filter with variable precision and auto-adaptation properties to minimize numerical errors due to the fixed-point implementation, without significant changes in performance if the order is unchanged.

Fig. 9. Example of filtered ECG signal acquired with 50 Hz noise (top), and its amplitude spectrum (bottom). A 4th order Butterworth notch filter with Q = 100 was implemented (red), as well as a recursive 2nd order Butterworth notch filter with Q = 1000 and 100 repetitions. Real quality factors are very similar, but the recursive filter presents an attenuation of 37.18 dB while the other only attenuates 17.38 dB.

Moreover, given that, for instance, any 2nd order filter in a Xilinx Virtex 5 SX95T-3 requires less than 20 ns to filter the input signal, if the implementation requires a sampling frequency below 1 MHz, the filter could recursively filter its own output at least 50 times, thus virtually eliminating the notch frequency from the signal. Such values are perfectly acceptable in several signal acquisition tasks, namely those in biomedical engineering, where notch filters are often necessary, and will not introduce a noteworthy delay. Results for an example ECG signal acquired with a 12 bit ADC are presented in Fig. 9.
Despite the number of practical considerations and adjustments that still have to be introduced in this conceptual scenario, this study supports a number of potential new developments using FPGAs in the digital signal processing area, and confirms the FPGA as a very powerful solution in the analogue signal processing field as well.
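The figure of 50 passes follows from dividing the sampling period by the filter latency. A minimal sketch, using integer nanoseconds to avoid floating-point rounding and an illustrative function name:

```python
def max_passes(sampling_hz, latency_ns):
    """Number of complete filter passes that fit in one sampling period."""
    period_ns = 1_000_000_000 // sampling_hz
    return period_ns // latency_ns


print(max_passes(1_000_000, 20))  # 50 passes at 1 MHz with a 20 ns filter
print(max_passes(2_000, 20))      # 25000 passes at a 2 kHz sampling rate
```

At the 200 Hz to 2 kHz sampling rates mentioned for biomedical signals, the number of available passes is therefore far larger than at 1 MHz.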
