VLSI Design of a neurohardware processor implementing the Kohonen Neural Network algorithm by Rajah, Avinash
  
 
 
 
 
 
 
 
 
 
 
 
PART ONE 
THESIS CONTENT 
 CHAPTER 1 
 
 
 
INTRODUCTION 
 
 
 
This thesis proposes the VLSI design and implementation of neural network 
hardware, or neurohardware, for pattern recognition. The aim is to produce 
neurohardware that executes the Kohonen Neural Network at massive parallelism for 
high-speed pattern classification and serves as a comprehensive computing platform 
for pattern recognition applications. This first chapter discusses the background of 
the domain and provides the rationale and focus points behind the research.  
 
 
 
1.1 Background and Motivation 
 
Although conventional logic-based computing has been successful in many 
applications, it has not been effective in solving a variety of critical and complex 
problems. At the same time, these perplexing problems are solved trivially and 
routinely in real-time by human beings. It is this intriguing predicament that has led 
to the study of information processing by the human brain and subsequently the 
emergence of artificial neural networks. Artificial neural networks, or simply neural 
networks, attempt to mimic the computational power of the mammalian brain by 
massively interconnecting very simple computational units called neurons (Misra 
1997). Adopting a similar design philosophy gives neural networks the 
ability to learn and to solve tasks that challenge conventional computing. 
 
Neural networks have found a wide range of applications, the majority of them 
in the pattern recognition domain. The pattern classification capability of 
neural networks has been instrumental in pattern recognition applications such 
as predictive and preventive maintenance, condition monitoring, character 
recognition, speech synthesis, intelligence-based medical diagnosis, and 
intrusion detection and predictive penetration services in computer 
networks.  
 
As artificial neural networks gain popularity for pattern recognition in a 
variety of application domains, it is critical that these models can be executed 
quickly and generate results in real time (Lindsey 1998). Although a number of 
implementations of neural networks are available on conventional general-purpose 
machines, most of these implementations require an inordinate amount of time to 
train or run neural networks and cannot provide real-time response, especially 
when the networks are large. Large networks are often required in real-world 
applications, and the execution performance of these machines is simply 
unacceptable. This drawback arises because general-purpose computers are 
traditionally based on the von Neumann architecture, which is sequential in 
nature (Schoenauer 1998). Artificial neural networks, on the other hand, 
have a parallel structure by conception. 
 
The most obvious solution to this problem is to accelerate the execution of 
artificial neural network algorithms through simulation on dedicated parallel 
hardware. The massively parallel and distributed processing nature of neural 
networks suggests massively parallel hardware as an obvious implementation choice 
to obtain appropriate algorithm-architecture matching and high execution speeds. 
When implemented in parallel hardware, neural networks can take full advantage of 
their inherent parallelism and run orders of magnitude faster than software 
simulations on sequential machines. Parallel processing with multiple simple 
processing elements working together can provide tremendous speed-ups in neural 
network execution, producing real-time responses and fast learning phases. 
 
Consequently, a new breed of hardware, termed neurohardware, has 
emerged. Neurohardware is typically defined as dedicated hardware designed to 
implement neural algorithms and take full advantage of the inherent parallelism in 
neural networks through parallel processing. At present, a wide spectrum of 
neurohardware implementations that primarily differ in terms of performance-space 
compromise, degree of parallelism and system architecture approaches are available. 
However, with the continuous burgeoning of neural paradigms and increasing 
applicability of neural networks in real-time based applications, there is a great 
demand and market for massively parallel and dedicated neurohardware.  
 
 
 
1.2 Problem Statement 
 
Neurohardware providing parallel execution platforms for neural networks 
can follow one of two architectural directions: general-purpose 
architectures that emulate a wide range of neural network models, and 
special-purpose architectures dedicated to a specific neural paradigm (Ruckert 2002). 
Dedicated implementations can be built at low hardware cost and execute the 
algorithm more quickly and efficiently than general-purpose architectures. 
The speed achieved by special-purpose architectures is unattainable by 
general-purpose architectures (Liao 2001). Therefore, special-purpose architectures 
are a viable choice for neurohardware targeting high-speed execution of specific 
neural paradigms. 
 
Neural networks can be grouped into three categories that are 
distinguishable by their learning approaches: supervised, reinforcement and 
unsupervised (Cheang 2003). Unsupervised learning possesses a number of 
advantages over the other types of learning, which include faster training and 
execution for large networks. This class of networks is suitable for pattern 
recognition problems, given its ability to detect structures and relations in data 
that are not readily apparent. One such neural paradigm that has been successful in 
pattern classification and recognition applications is the Kohonen neural network 
(NN) algorithm. The Kohonen NN is a proven algorithm and can be mapped onto 
hardware more easily than other neural paradigms (Glesner and Pochmuller 1994).  
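
As a rough orientation only, the following is a minimal Python sketch of the textbook Kohonen step on a one-dimensional map: find the neuron whose weight vector is closest to the input, then move that neuron and its neighbours toward the input. It is not the modified algorithm developed later in this thesis. The function names, the squared Euclidean distance, the fixed learning rate and the fixed neighbourhood radius are all assumptions made for illustration.

```python
import math


def find_winner(weights, x):
    """Return the index of the neuron whose weight vector is closest to x.

    Assumption: squared Euclidean distance is used as the similarity measure.
    """
    best_index, best_dist = 0, math.inf
    for i, w in enumerate(weights):
        # Each neuron's distance depends only on its own weights and the input,
        # so these computations are independent of one another.
        dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def train_step(weights, x, learning_rate=0.5, radius=1):
    """Apply one update on a 1-D map and return the winner index.

    The winner and every neuron within `radius` positions of it are moved
    toward x by a fraction `learning_rate` (both values are illustrative).
    """
    winner = find_winner(weights, x)
    for i, w in enumerate(weights):
        if abs(i - winner) <= radius:
            weights[i] = [wi + learning_rate * (xi - wi) for wi, xi in zip(w, x)]
    return winner


# Hypothetical three-neuron map with two-dimensional inputs.
weights = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
winner = train_step(weights, [0.9, 0.8], learning_rate=0.5, radius=0)
```

Because the distance computed for each neuron does not depend on any other neuron, the loop in `find_winner` is the part that parallel hardware can evaluate for all neurons at once.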
 
In developing the ASIC implementation of neurohardware, two mainstream 
technologies can be considered. FPGAs have evolved tremendously with the current 
advancements in VLSI process technologies. The flexibility and reconfigurability of 
FPGAs support parameterized designs and rapid prototyping of 
advanced hardware architectures. VLSI implementations, on the other hand, can 
achieve higher compactness and speed for hardware designs than 
FPGAs. However, both technologies can be utilized jointly and their strengths 
exploited for the implementation of neurohardware.  
 
 
 
1.3  Objectives 
 
 From the discussion in the preceding sections, the objectives of the work 
presented in this thesis are as follows: 
 
1) To design a neurohardware that provides high-speed pattern classification. 
The Kohonen NN algorithm is to be implemented, in a massively parallel and 
dedicated manner, to deliver high-speed pattern classification. 
   
2) To design a neurohardware that serves as a comprehensive computing 
platform for pattern recognition applications, based on the Kohonen NN 
algorithm. 
 
3) To propose a viable VLSI implementation approach for the designed 
neurohardware and to develop a prototype for demonstration of real-world 
pattern recognition applications. 
 
 
 
 
 
 
 
1.4 Scope of Work 
 
[Block diagram. Labelled blocks: Host PC, Neuroprocessor GUI, UART, Nios CPU, Nios Standard Peripherals, Nios Library Module, PCR Module, Avalon Bus, Avalon Bus Interface Controller, Neuro Co-processor, Neuro Core, Array Computation Engine]
 
Figure 1.1 : Top-Level Block Diagram of UTM-Neuroprocessor 
 
 
Based on the outlined objectives, the neurohardware design illustrated by 
Figure 1.1, the UTM-Neuroprocessor, is proposed in this work. The scope of work 
involved in producing the proposed neurohardware is as follows: 
 
1) Comprehensive study of the Kohonen NN algorithm and determining 
necessary algorithmic modifications for efficient hardware implementation of 
the algorithm.  
 
2) Design and FPGA prototyping of the UTM-Neuroprocessor, using the Altera 
Nios embedded system development kit. The Neuro co-processor that 
executes the Kohonen NN at massive parallelism is designed using VHDL. 
 
3) Development of the software components of the UTM-Neuroprocessor. The 
embedded software component, the PCR module, is developed in C, while the 
GUI program is developed using Visual Basic. 
 
4) VLSI implementation and fabrication of the array computation engine in the 
AMI 0.5 µm process technology, to realize implementation of the UTM-
Neuroprocessor in the proposed combined FPGA-VLSI approach. 
 
5) Demonstration of real-world pattern recognition datasets on the UTM-
Neuroprocessor. Datasets from selected application domains are used to 
verify the classification speed and viability of the neuroprocessor as a 
computing platform for pattern recognition applications. 
 
 
 
1.5 Research Contributions 
  
1) A systematic study and modification of the Kohonen NN algorithm for 
efficient hardware implementation. 
 
2) A comprehensive design and prototyping flow for FPGA-based embedded 
systems using the Altera Nios development kit. 
 
3) A viable ASIC design methodology for the UTM-ECAD VLSI Laboratory, 
based on the AMI 0.5 µm process technology. The design methodology 
incorporates industry standard EDA tools and is applicable from design entry 
stages to tape-out. 
 
 
 
1.6 Organization of Thesis 
 
 The work in this thesis is presented over nine chapters. The 
first chapter outlines the motivations and objectives of the thesis and subsequently 
presents the scope of work involved in meeting the research objectives. 
 
 The second chapter provides brief summaries of the literature reviewed prior 
to engaging in the mentioned scope of work. Review of literature on previously 
attempted efforts assists in achieving the research objectives. 
 
 Chapter three presents the VLSI design methods and tools adopted in 
producing the FPGA prototype and the ASIC implementation of the neurohardware in 
this work.  
 
 Chapter four provides a detailed discussion on the implemented Kohonen NN 
algorithm and outlines the necessary algorithmic modifications. Based on the 
modifications, the architectural and design specifications of the neurohardware are 
ascertained. 
 
 Chapter five describes the top-level architecture and behaviour 
of the UTM-Neuroprocessor. The focus then shifts to the design of the 
hardware module that implements the Kohonen NN algorithm at massive 
parallelism for the neuroprocessor, which is covered in detail in the chapter. 
 
 Chapter six delves into the embedded system design and prototyping of the 
UTM-Neuroprocessor using the Altera Nios development kit. The software 
components of the neuroprocessor are also discussed in the chapter. 
 
 Chapter seven focuses on the VLSI implementation of the array computation 
engine, for implementation of the UTM-Neuroprocessor in the combined FPGA-
VLSI approach. The chapter also presents ASIC design and fabrication of a 
prototype design of the computation engine, termed the Array_2x2 microchip, in the 
AMI 0.5 µm process technology. 
 
 Chapter eight details the application demonstration work on the 
UTM-Neuroprocessor, using real-world pattern recognition datasets. Performance 
evaluation and benchmarking of the neuroprocessor against previous works are also 
reported in the chapter. 
 
 In the final chapter of the thesis, the research work is summarized and 
deliverables of the research are stated. Potential extensions and improvements to the 
design are also given. 
 
 
 
1.7 Summary 
 
In this chapter, a brief introduction was given to the background and 
motivation of the research. The need for neurohardware that executes neural networks 
at massive parallelism for high-speed pattern classification was identified. 
Correspondingly, several objectives were outlined to meet this need. The 
UTM-Neuroprocessor was proposed to fulfill the objectives of the research. The 
UTM-Neuroprocessor executes the Kohonen Neural Network at massive parallelism for 
high-speed pattern classification and serves as a comprehensive computing platform 
for pattern recognition applications. The following chapter discusses literature 
relevant to producing the proposed neurohardware and covers previous work on 
the design of neurohardware with similar objectives. 
