

# Design of LMS algorithm for noise canceller based on FPGA

Sheikh Md. Rabiul Islam, A. F. M. Nokib Uddin

Department of Electronics and Communication Engineering Khulna University of Engineering and Technology, Khulna-9203, Bangladesh E-mail: robi@ece.kuet.ac.bd, nokib.ece@gmail.com

## Abstract

This paper presents the design of an adaptive filtering method to remove noise from biomedical signal records. The major concern is the presence of various artifacts in ECG records and modular artifacts in EEG records caused by various noise factors. We propose a design based on the LMS (Least Mean Square) algorithm to remove the artifacts from biomedical signals; the design is written in Verilog HDL and mapped onto commercially available FPGAs (Field Programmable Gate Arrays). In this design the LMS algorithm is used as a noise canceller: the reference signal is adaptively filtered and subtracted from the primary signal to obtain the estimated biomedical signal. The original biomedical signal can be reconstructed by passing the digital bit stream through a low pass filter. The design is suitable for low-power biomedical instruments and reduces the overall system cost.

**Keywords:** LMS algorithm, noise canceller, Verilog HDL, artifacts, biomedical signal, Low power application.

## I. Introduction

In digital biomedical signal analysis, adaptive noise cancelling is a technique for removing unwanted noise that affects the desired signal within an electronic system. The use of adaptive noise cancellers has increased in modern communication because they offer easy implementation, low computational complexity and low power consumption. The technique can also be applied to low-voltage applications, high-frequency signals and low-power circuit design. In high-speed applications, however, an adaptive noise canceller usually relies on a high-speed DSP processor with a dedicated hardware implementation [Di Stefano A.; Scaglione, A.; Giaconia C(2005)]. Digital signal processors have a wide variety of applications in biomedical signal analysis. Nowadays they are becoming increasingly important in daily life, but they impose constraints on area, power, voltage, speed and cost. The most commonly used platforms for hardware implementation are Application Specific Integrated Circuits (ASIC), Digital Signal Processors (DSP) and FPGAs [Tian Lan,Jinlin Zhang(2008)].

Among the various filtering methods, the LMS (Least Mean Square) algorithm is one of the most widely used adaptive filtering algorithms. Its significance lies in its simplicity, robustness and good tracking capability, combined with a low computational load and ease of implementation. Moreover, it requires neither correlation functions nor matrix inversion. The LMS algorithm is a stochastic implementation of the steepest descent algorithm [Zheng-wei Hu; Zhi-yuan Xie(2009)]. LMS requires only a finite-impulse response filter and a first order weight update equation. Therefore, it has been successfully applied to a wide variety of adaptive filtering problems, including plant identification and noise cancellation applications [Boo-Shik Ryu; Jae-Kyun Lee; Joonwan Kim; Chae-Wook Lee(2008)].

The contributions of this research are the design of an adaptive filtering method to remove noise and the application of this method to enhance the performance of adaptive noise cancellation in biomedical signals. The focus of this paper is on tuning the LMS algorithm for noise removal on a XILINX SPARTAN XC2S150 FPGA board.

This paper is structured as follows. Section II describes the background of adaptive filtering methods such as the LMS algorithm. Section III shows the proposed architecture for the LMS algorithm. Section IV presents the synthesis results of the proposed architecture. Finally, Section V presents the conclusion.

## II. Background Study

Least mean squares (LMS) algorithms [Boo-Shik Ryu; Jae-Kyun Lee; Joonwan Kim; Chae-Wook Lee(2008), A. Elhossini, S. Areibi and R. Dony (2006)] are a class of adaptive filters used to mimic a desired filter by finding the filter coefficients that produce the least mean square of the error signal (the difference between the desired and the actual signal). LMS is a stochastic gradient descent method in that the filter is adapted based only on the error at the current time. An optimal (non-adaptive) solution cannot be computed in real time, because the entire signal must be available before the solution can be found. The LMS solution avoids this problem by adapting the filter weights sample by sample.



Fig.1 Block diagram of adaptive noise canceller.

As shown in Fig.1, an adaptive noise canceller is a dual-input, closed-loop feedback system. The two inputs are the primary input signal y(i) (the desired signal corrupted by noise) and the reference input signal x(i) (an interfering noise assumed to be uncorrelated with the desired signal but correlated, in an unknown way, with the noise affecting the desired signal).

The basic idea of the LMS algorithm is to adjust the filter parameters so that the mean square error between the filter output signal and the expected output signal is minimized, so that the system output is the best estimate of the useful signal. Based on the steepest descent of the mean square error, the iterative formula of the LMS algorithm is as follows:

Corrected biomedical signal,

$$e(i) = y(i) - \sum_{k=0}^{n-1} w_k(i)\, x(i-k)$$
(1)

In equation 1, n is the filter order and $w_k(i)$ is the k-th filter weight at time i. We use a second order adaptive filter in the proposed design of the LMS algorithm. The tap inputs x(i), x(i-1), ..., x(i-n+1) are formed from the elements of the reference signal x(i), where n-1 is the number of delay elements.

Coefficient updating equation,

$$w_k(i+1) = w_k(i) + \mu\, x(i-k)\, e(i), \quad k = 0, 1, \ldots, n-1$$
(2)

While adapting the weights, the step size $\mu$ must be chosen properly so that the algorithm converges. The step size should lie between 0 and 2/(tap input power). In the proposed design we use a step size of 1. In equation 2 the weights are updated in accordance with the estimation error.
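Equations 1 and 2 can be illustrated with a short software model. The following Python sketch is not the Verilog design of this paper; it is a floating-point illustration under stated assumptions. The function names, the synthetic sine "signal", the two-tap noise path (0.8 and 0.3) and the step size 0.02 are all hypothetical choices made for the demonstration, and the tap input power is assumed to be the number of taps times the mean square of the reference signal.

```python
import math
import random


def lms_noise_canceller(primary, reference, n_taps=2, mu=0.02):
    """Return e(i), the primary signal with the estimated noise removed."""
    weights = [0.0] * n_taps   # w_k(i), k = 0 .. n-1
    taps = [0.0] * n_taps      # x(i), x(i-1), ..., x(i-n+1)
    cleaned = []
    for y_i, x_i in zip(primary, reference):
        taps = [x_i] + taps[:-1]                      # shift the delay line
        noise_estimate = sum(w * x for w, x in zip(weights, taps))
        e_i = y_i - noise_estimate                    # equation (1)
        weights = [w + mu * x * e_i                   # equation (2)
                   for w, x in zip(weights, taps)]
        cleaned.append(e_i)
    return cleaned


def max_step_size(reference, n_taps=2):
    """Upper bound 2 / (tap input power); power assumed = n_taps * mean(x^2)."""
    mean_square = sum(x * x for x in reference) / len(reference)
    return 2.0 / (n_taps * mean_square)


def demo(num_samples=4000, seed=0):
    """Compare the noise power before and after cancelling (second half only)."""
    rng = random.Random(seed)
    signal = [math.sin(2 * math.pi * i / 50) for i in range(num_samples)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    # Hypothetical unknown path from the noise source to the primary input.
    primary = [s + 0.8 * noise[i] + 0.3 * (noise[i - 1] if i else 0.0)
               for i, s in enumerate(signal)]
    cleaned = lms_noise_canceller(primary, noise, n_taps=2, mu=0.02)
    tail = range(num_samples // 2, num_samples)
    mse_before = sum((primary[i] - signal[i]) ** 2 for i in tail) / len(tail)
    mse_after = sum((cleaned[i] - signal[i]) ** 2 for i in tail) / len(tail)
    return mse_before, mse_after


if __name__ == "__main__":
    before, after = demo()
    print("residual noise power before:", before)
    print("residual noise power after: ", after)
```

The step size in this sketch is deliberately small relative to the bound returned by `max_step_size`; the value 1 used in the hardware design presumably depends on the scaling of the fixed-point data, which this floating-point model does not reproduce.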

## III. Proposed Architecture For LMS Algorithm

In this section, the block diagram of our proposed architecture, which uses the LMS algorithm as a noise canceller, is presented. It has two parts, a data bus and a control bus, together with the internal connections, two input signals and one output signal. The two input signals are the primary input and the reference input.



Fig.2: Block diagram of the system of LMS algorithm.



In Fig.2 the $Pro\_PC$ block is used to control the output state of the PC (program counter) for the execution process. The device operates as follows: the primary input data and reference input data bits are saved in the internal register of the Input block. When the PC starts counting in the execution process, the Control block simultaneously starts to control the operation of the device automatically. The control unit also implements the specification of the proposed LMS algorithm architecture and controls all the subroutines of the system. *RAM* is the permanent memory of this architecture. The arithmetic and logic unit (*ALU*) performs all arithmetic and logic operations of the proposed architecture and saves data temporarily. Registers *A*, *B* and *C* are used to hold data for the operations of the *ALU*, and the output block is used to deliver the output. The remaining blocks are connected through the data and control buses. The proposed architecture uses a common bus design for the internal connection of the blocks.



Fig.3 Block diagram of proposed architecture including control bus.

## IV. Synthesis Result

The simulation results of the LMS algorithm for the proposed architecture comprise the block diagram and the timing diagram of each block. Fig.4 shows the basic devices of the noise canceller for biomedical signal processing.











Fig.5 shows the block diagram of *Pro\_PC*, and Table I shows the input-output relationship of the *Pro\_PC* block. The output of the *Pro\_PC* block goes to the *Pro\_In* input of the *PC* block. Fig.7 shows that the output of the *PC* remains unchanged until the corresponding input from *Pro\_PC* reaches 11.

TABLE. I OUTPUT OF PC ACCORDING TO THE CORRESPONDING OUTPUT OF PRO\_PC.

| Input of pro pc (main clk) | Output of pro pc / input of pc (binary) | Output of pc (binary) |
|----------------------------|-----------------------------------------|-----------------------|
| 0                          | 0                                       | 0                     |
| 1                          | 01                                      | 0                     |
| 0                          | 01                                      | 0                     |
| 1                          | 10                                      | 0                     |
| 0                          | 10                                      | 0                     |
| 1                          | 11                                      | 01                    |
| 0                          | 11                                      | 01                    |
| 1                          | 0                                       | 01                    |
| 0                          | 0                                       | 01                    |
| 1                          | 01                                      | 01                    |
| 0                          | 01                                      | 01                    |
| 1                          | 10                                      | 01                    |
| 0                          | 10                                      | 01                    |
| 1                          | 11                                      | 10                    |
| 0                          | 11                                      | 10                    |
| So on                      |                                         |                       |
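The pattern in Table I is consistent with a 2-bit pre-counter that advances on every rising edge of the main clock and a program counter that advances each time the pre-counter reaches 11. The following Python sketch models that reading of the table; it is an interpretation inferred from the tabulated values, not a transcription of the Verilog source, and the function name `simulate_pc` is hypothetical. The program counter is left unbounded here, whereas the hardware counter has a fixed width.

```python
def simulate_pc(clock_levels):
    """Return (clk, pro_pc, pc) rows for a sequence of main-clock levels."""
    pro_pc = 0        # 2-bit pre-counter (output of Pro_PC)
    pc = 0            # program counter
    previous_clk = 0
    rows = []
    for clk in clock_levels:
        if previous_clk == 0 and clk == 1:   # rising edge of the main clock
            pro_pc = (pro_pc + 1) % 4        # wrap after binary 11
            if pro_pc == 3:                  # pre-counter reached binary 11
                pc += 1
        previous_clk = clk
        rows.append((clk, format(pro_pc, "02b"), format(pc, "02b")))
    return rows


if __name__ == "__main__":
    # Same clock pattern as Table I: 0, 1, 0, 1, ...
    for row in simulate_pc([0, 1] * 7 + [0]):
        print(row)
```

With the alternating clock pattern of Table I, this model gives a PC output of 01 on the third rising edge and 10 on the seventh, matching the tabulated rows (the table writes binary 00 as 0).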



Fig.6: Block diagram of PC (Program counter).

Fig.8 and Fig.9 show the block diagram of the ALU and its timing diagram, in which the multiplication and subtraction of data are executed. The data are applied at pins a and b of the block; the multiplication result C appears at pin *out1*, and the result of subtracting data a and c appears at pin *out2*.

Fig.7 Timing diagram of the PC (traces: tb.clk1, tb.ena1, tb.pro\_in1[0:1], tb.clear1, tb.pc\_out1[0:4]; time axis 0 to 100 ns).





Fig.8: Block diagram of ALU.

Fig.9 Timing diagram of ALU (traces: tb.alu\_clk1, tb.aclr1, tb.a1[0:7], tb.b1[0:7], tb.c1[0:7], tb.ena1[0:2], tb.out11[0:7], tb.out22[0:7]; values visible: a1 = 4, b1 = 2, c1 = 3, out11 = C, out22 = 2; time axis 0 to 150 ns).

Fig.10 Timing diagram of Register A (traces: tb.ena, tb.out, tb.clk1, tb.cli, tb.in[0:7], tb.aout[0:7]; values visible: in = 3D, aout = 3D; time axis 0 to 150 ns).

In the timing diagram shown in Fig.10, the 8-bit data 3D is stored in register A and read back after some time interval. Registers B and C operate in the same way as register A. In the input block, the input signals of the algorithm are temporarily saved and then transferred to the *RAM*.



The block diagram and timing diagram of the *RAM* are shown in Fig.11 and Fig.12, where the resultant data 7 is saved in memory location 1 of the *RAM*, then shifted from location 1 to location 2, and some time later read from location 2.

After all blocks have been synthesized, the control unit performs the final execution of the processed data: it obtains the signal *A* in hexadecimal format and generates a 26-bit control signal, as shown in Fig.14. Table II shows the pin configuration of the control signal in the proposed design.







Fig.12 Timing diagram of RAM.



Fig.13 Block diagram of Control unit.



Fig.14 Timing diagram of Control unit (traces: tb.clk1, tb.in1[0:4], tb.out1[26:1]; values visible: in1 = 0 then 19, out1 = 0 then 48000A; time axis 0 to 150 ns).

TABLE. II CONTROL SIGNAL AND ITS ASSOCIATED PINS TO THE OTHER BLOCK

| Module Name          | Associated control pins with blocks signal                  |
|----------------------|-------------------------------------------------------------|
| PC (Program counter) | ena(1)                                                      |
| ALU                  | ena(2,3,4), aclr(5)                                         |
| RAM                  | Ram_clr(6), read(7), shift(8), write(9), addr(10,11,12)     |
| Device_input         | ena(13), ena1(14), out(15)                                  |
| Device_output        | write(16), out(17)                                          |
| Register_A           | clr_a(18), ena_a(19), out_a(20)                             |
| Register_B           | clr_b(21), ena_b(22), out_b(23)                             |
| Register_C           | clr_c(24), ena_c(25), out_c(26)                             |
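The pin assignment of Table II can be used to read a 26-bit control word by hand or in software. The following Python sketch decodes a control word into the pins that are set. It assumes that pin k of Table II corresponds to bit k-1 of the word (pin 1 is the least significant bit) and that a set bit means the pin is asserted; the paper states neither, so both are hypothetical. The names `CONTROL_PINS` and `decode_control_word` are likewise illustrative.

```python
# Pin numbers and signal names as listed in Table II.
CONTROL_PINS = {
    1: "PC.ena",
    2: "ALU.ena", 3: "ALU.ena", 4: "ALU.ena", 5: "ALU.aclr",
    6: "RAM.Ram_clr", 7: "RAM.read", 8: "RAM.shift", 9: "RAM.write",
    10: "RAM.addr", 11: "RAM.addr", 12: "RAM.addr",
    13: "Device_input.ena", 14: "Device_input.ena1", 15: "Device_input.out",
    16: "Device_output.write", 17: "Device_output.out",
    18: "Register_A.clr_a", 19: "Register_A.ena_a", 20: "Register_A.out_a",
    21: "Register_B.clr_b", 22: "Register_B.ena_b", 23: "Register_B.out_b",
    24: "Register_C.clr_c", 25: "Register_C.ena_c", 26: "Register_C.out_c",
}


def decode_control_word(word):
    """List the (pin, name) pairs whose bit is set in a 26-bit control word.

    Assumption: pin k maps to bit k-1 (pin 1 = least significant bit).
    """
    if not 0 <= word < (1 << 26):
        raise ValueError("control word must fit in 26 bits")
    return [(pin, CONTROL_PINS[pin])
            for pin in range(1, 27)
            if (word >> (pin - 1)) & 1]


if __name__ == "__main__":
    # 0x48000A is the value visible on tb.out1[26:1] in Fig.14.
    for pin, name in decode_control_word(0x48000A):
        print(pin, name)
```

Under these assumptions the value 48000A from Fig.14 sets pins 2, 4, 20 and 23. If the design uses the opposite bit order or active-low pins, the mapping in `decode_control_word` would have to be changed accordingly.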

The proposed model is loaded on a XILINX FPGA board; it is implemented on the XILINX SPARTAN XC2S150 FPGA. When it was implemented on the FPGA board, the clock frequency was 100 kHz. The proposed model has the same power consumption, signal bandwidth and CMOS technology as the XC2S150 device. Synthesis is the process of constructing a gate level netlist from a register-transfer level model of a circuit described in Verilog HDL. After synthesizing in Xilinx Project Navigator we obtained the RTL schematic diagram of our proposed design, which is shown in Fig.15. Fig.4 shows the synthesized RTL schematic of the top level of our proposed design.





Fig.15 RTL schematics of our proposed design.

## V. Conclusion

The architecture of a low-power LMS algorithm has been presented. Because biomedical signals have a low bandwidth, the algorithm is feasible for this application. The proposed architecture is well suited as a noise canceller (removal of artifacts) for biomedical signals because it uses digital circuits for the analysis, as the figures above show. By using Verilog HDL to design the LMS algorithm, not only can the constraints of analog circuits be relaxed, but the quantization noise can also be reduced more effectively than with other design techniques. Finally, the analog biomedical signal can be reconstructed from the digital bit stream at the algorithm output by a simple low pass filter.

### References

Di Stefano A.; Scaglione, A.; Giaconia C(2005). "Efficient FPGA implementation of an adaptive noise canceller", Computer Architecture for Machine Perception, 2005. CAMP 2005. Proceedings. Seventh International Workshop on, Digital Object Identifier: 10.1109/CAMP.2005.22.

Tian Lan; Jinlin Zhang(2008), "FPGA Implementation of an Adaptive Noise Canceller", Information Processing (ISIP), 2008, Symposiums on 23-25,Page(s):553 – 558

Zheng-wei Hu; Zhi-yuan Xie(2009) "Modification of Theoretical Fixed-point LMS Algorithm for Implementation in Hardware", Commerce and Security, 2009. ISECS '09. Second International Symposium on, Volume: 2, Digital Object Identifier: 10.1109/ISECS.2009.40, Page(s): 174 – 178

Boo-Shik Ryu; Jae-Kyun Lee; Joonwan Kim; Chae-Wook Lee(2008) "The Performance of an adaptive noise canceller with DSP System Theory". SSST 2008. 40th Southeastern Symposium on 16-18.

B. Widrow, J. R. Glover, J. M. McCool and et al (1975).,"Adaptive Noise Cancelling: Principles and Applications", Proc. IEEE, vol. 63, pp. 1692-1716.

D. Nicolae, R. Romulus(2004), "Noise canceling in Audio signal with adaptive filter", University of Oradea, Vol. 45, pp. 599-602

A. Elhossini, S. Areibi and R. Dony (2006), "An FPGA Implementation of the LMS Adaptive Filter for Audio Processing," in Proc. IEEE International Conference on Reconfigurable Computing and FPGAs, pp. 1-8.

S. Haykin(2002) "Adaptive Filter Theory", Prentice- Hall, third edition.

Kim, J., Poularikas, A.D. (2002) "Performance of noise canceller using adjusted step size LMS algorithm", System Theory, 2002.Proceedings of theThirty- Fourth Southeastern Symposium on, 18-19.

G. Long, F. Ling and J. G. Proakis(1989), "The LMS algorithm with delayed coefficient adaptation," IEEE Trans. on ASSP, vol. 37, Sept., pp. 1397-1405.

Widrow B. and Stearns S.D(1985), "Adaptive signal processing" Prentice-Hall, Englewood Cliffs, N.J 07632.

V. Zarzoso, J. Millet-Roig and A. K. Nandi(2000), "Fetal ECG Extraction from Maternal Skin Electrodes Using Blind Source Separation and Adaptive Noise Cancellation Techniques", in: *Computers in Cardiology*, Vol. 27, Boston, MA, pp. 431-434.

J. C. Liberti, T. S. Rappaport, J. G. Proakis (1991), "Evaluation of Several Adaptive Algorithms for Cancelling Acoustic Noise in Mobile Radio Environments," *Proc. of Vehicular Technology Conf.*, pp. 126-132.

Simon Haykin (2002), "Adaptive Filter Theory", 4<sup>th</sup> edition, Prentice Hall, New Jersey.

Oravec, R. Kadlec, J. Cocherova E, Simulation of RLS and LMS algorithms for adaptive noise cancellation in matlab" Department of Radioelectronics, FEI STU Bratislava, Slovak Republic UTIA, CAS Praha, Czech Republic.

Olga Shultseva, Johann Hauer (2008), "Implementation of Adaptive Filters for ECG Data Processing". IEEE REGION 8 SIBIRCON.

Carlos E. David (1994), Member IEEE, "An Efficient Recursive Total Least Squares Algorithm for FIR Adaptive Filtering", IEEE Transactions on Signal Processing, Volume: 42, Issue 2, Page(s): 268-280.

http://en.wikipedia.org/wiki/Adaptive\_filteradaptive

### **Author Biographies**

**Sheikh Md. Rabiul Islam** received the B.Sc. in Engg. (ECE) from Khulna University, Khulna, Bangladesh in December 2003, and the M.Sc. in Telecommunication Engineering from the University of Trento, Italy, in October 2009. He joined the Department of Electronics and Communication Engineering of Khulna University of Engineering & Technology, Khulna, as a Lecturer in 2004, and has been an Assistant Professor in the same department since 2008. He has published 12 journal papers and six international conference papers. His research interests include numerical analysis, VLSI, wireless communications, signal and image processing, and biomedical engineering.

**A. F. M. Nokib Uddin** received a B.Sc. Engg. (ECE) from the Department of Electronics and Communication Engineering at Khulna University of Engineering & Technology, Khulna-9203, Bangladesh. He joined the Department of Electronics and Communication Engineering of Khulna University of Engineering & Technology, Khulna, as a research assistant. He has published four journal papers and two international conference papers. His research interests include VLSI, wireless communications, signal processing and pattern recognition.

This academic article was published by The International Institute for Science, Technology and Education (IISTE). The IISTE is a pioneer in the Open Access Publishing service based in the U.S. and Europe. The aim of the institute is Accelerating Global Knowledge Sharing.

More information about the publisher can be found on the IISTE homepage: http://www.iiste.org


