In Xilinx ISE 8.2i and SPARTAN 3 FPGA board. 
INTRODUCTION
A microcontroller is a small, low-cost computer built to handle specific tasks, such as displaying information on a microwave oven's LED display or receiving information from a television's remote control. Microcontrollers are mainly used in products that require a degree of control to be exerted by the user.
Whether the realization is based on an ASIC, FPGA or CPLD, it is essential to incorporate the microcontroller module as an integral part of the system. A functional microcontroller has been developed in VHDL using a structural design of logic blocks, which generate the control and timing signals used for the data processing operations.
Present-day VLSI technology has led to the design and development of chips with millions of gates. Hardware designers create many VLSI modules for their research and development purposes. It is often important to re-use these modules to reduce product development time, thereby minimizing the time to market. It is therefore important to design hardware in a modular fashion, so that these modules can be included in the development of a complex system. A microcontroller designed in such a modular way helps other designers to incorporate it with minimal or no modifications to the hardware module [1].
LITERATURE SURVEY
Although many innovative methodologies have been devised in the past to handle more complex control problems and to achieve better performance, the great majority of control applications are still handled by simple microcontrollers. Compared with von Neumann processor architectures, the Harvard architecture improves bus bandwidth, because in von Neumann architectures both program and data memory are accessed through a shared bus.
Most previous work has used VHDL to describe all the modules in the design. VHDL is well suited to this because its concurrency matches the parallelism of digital hardware. VHDL tools reduce design complexity and also provide a graphical presentation of the system. The key advantage of VHDL for system design is that it allows the behaviour of the required system to be described (modelled) and verified (simulated) before synthesis tools translate the design into real hardware (gates and wires). The simulation software not only compiles the given VHDL code but also produces waveform results [2].
In designing a CPU, the instruction set must be defined first, together with how the instructions are encoded and executed. System development starts with a top-down planning approach, and the blocks are then designed using a bottom-up implementation. The programs are written and simulated using an Electronic Design Automation (EDA) tool such as ModelSim.
[2] The main advantage of using a state machine in embedded design is its flexibility: states can be added, deleted or reordered without affecting the overall structure of the system code.
The increasing performance and gate capacity of recent FPGA devices permit complex logic systems to be implemented on a single programmable device. Such growing complexity demands design approaches that can cope with designs containing hundreds of thousands of logic gates, memories, high-speed interfaces, and other high-performance components [4].
SYSTEM ARCHITECTURE
Compared with von Neumann processor architectures, the Harvard architecture improves bus bandwidth, because in von Neumann architectures both program and data memory are accessed through a shared bus. The architecture implemented here is therefore a Harvard architecture.
Block Diagram
The RISC processor core provides an 8-bit ALU. The ALU receives its inputs from two 8-bit registers, the Accumulator (Register A) and Register B. If an operation has only one operand, that operand is the Accumulator. The ALU supports simple arithmetic operations (addition, subtraction, increment and decrement), Boolean logic operations (AND and OR), data transfer instructions and branching instructions.
Fig -1: General block Diagram

MODULE DEVELOPMENT
Arithmetic Logic Unit
Fig -2: ALU
The ALU is a multi-operation combinational logic circuit that performs arithmetic and logical operations such as AND, OR, addition and subtraction. Its word length is determined by the internal data bus, which is 8 bits wide, and its operation is always controlled by the timing and control circuits.
The inputs to the ALU are "x" and "y", which are 8-bit data inputs, and a 4-bit control input "a". In addition there are clock, alu_reset and alu_enable inputs. The ALU is positive-edge triggered. "z" is the 8-bit output of the ALU. reg_a_move and reg_b_move are two outputs used for move instructions and for storing data to the accumulator after an instruction has been executed.
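As a rough behavioural illustration of the ALU interface described above, the following Python sketch models a single evaluation. The 4-bit control codes in `OPS` are hypothetical, since the encoding of "a" is not listed here, and the sketch ignores the clock, alu_reset and the two move outputs.

```python
# Behavioural sketch of the 8-bit ALU; control codes are assumed, not taken from the design.
MASK8 = 0xFF

# Hypothetical 4-bit control codes for input "a".
OPS = {
    0x0: lambda x, y: x + y,   # ADD
    0x1: lambda x, y: x - y,   # SUB
    0x2: lambda x, y: x & y,   # AND
    0x3: lambda x, y: x | y,   # OR
    0x4: lambda x, y: x + 1,   # INC (single operand: the accumulator)
    0x5: lambda x, y: x - 1,   # DEC (single operand: the accumulator)
}

def alu(x, y, a, alu_enable=True):
    """Return the 8-bit result z for operands x, y and control code a."""
    if not alu_enable:
        return None  # output is not updated while the ALU is disabled
    if a not in OPS:
        raise ValueError(f"unassigned control code {a:#x}")
    return OPS[a](x & MASK8, y & MASK8) & MASK8  # wrap the result to 8 bits
```

Masking with `0xFF` is one simple way to mimic the wrap-around of an 8-bit data path; a fuller model would also expose carry or borrow information if the design provides it.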
Accumulator
Fig -3: Accumulator
The accumulator is an 8-bit, positive-edge-triggered register. Its inputs are control, data_in and reg_enable. The control input decides whether data is read from or written to the register. The output of register A, "data out", is fed to the ALU as an input. "a_zero_status" is a flag that reflects the status of the accumulator: it is set when the content of the accumulator is zero.
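The register behaviour can be sketched in Python as below. This is a minimal model under stated assumptions: the polarity of the control input (1 = write, 0 = read/hold) is assumed, and `data_out` stands for the "data out" output named in the text.

```python
class Accumulator:
    """8-bit register with a zero-status flag (behavioural sketch)."""

    def __init__(self):
        self.data_out = 0

    @property
    def a_zero_status(self):
        # The flag is set whenever the stored value is zero.
        return self.data_out == 0

    def clock(self, data_in, control, reg_enable):
        """Model one rising clock edge and return the register output.

        Assumption: control == 1 writes data_in; control == 0 holds the value.
        """
        if reg_enable and control == 1:
            self.data_out = data_in & 0xFF
        return self.data_out
```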
Program Counter
Fig -4: Program Counter
The program counter is a special-purpose register that stores the address of the next instruction to be executed. The microcontroller increments the program counter whenever an instruction is executed, so that it points to the memory address of the next instruction. The inputs to the program counter are input (4 bit), opcode (4 bit), pc_enable, pc_reset and a_zero_status. The output of the program counter is the 4-bit address supplied to the ROM.
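One plausible reading of the next-address logic implied by these inputs is sketched below in Python. The opcode values for JMP, JZ and JNZ are hypothetical, and resetting to address 0 and wrapping at 16 are assumptions rather than stated properties of the design.

```python
# Hypothetical opcode values; the instruction encoding is not given in the text.
JMP, JZ, JNZ = 0x8, 0x9, 0xA

def next_pc(pc, addr_in, opcode, a_zero_status, pc_enable=True, pc_reset=False):
    """Return the next 4-bit program counter value."""
    if pc_reset:
        return 0          # assumed reset address
    if not pc_enable:
        return pc         # hold the current address
    taken = (
        opcode == JMP
        or (opcode == JZ and a_zero_status)
        or (opcode == JNZ and not a_zero_status)
    )
    # Load the jump target when the branch is taken, otherwise step to the next address.
    return (addr_in if taken else pc + 1) & 0xF
```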
Program Memory
Fig -5: Program ROM
The program is stored in a 16 x 8 ROM (16 locations of 8 bits each), which is positive-edge triggered. The inputs to the ROM are addr and rom_enable, where addr is a 4-bit address. The opcode is stored in the memory location given by addr. When rom_enable is high, this opcode is available at the output pin dout.
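A minimal Python sketch of this memory follows. It assumes that dout simply keeps its previous value while rom_enable is low, which the text does not state explicitly.

```python
class ProgramROM:
    """16 x 8 read-only program memory (behavioural sketch)."""

    def __init__(self, contents):
        if len(contents) != 16:
            raise ValueError("ROM must hold exactly 16 bytes")
        self._mem = tuple(b & 0xFF for b in contents)
        self.dout = 0

    def clock(self, addr, rom_enable):
        """Model one rising clock edge; dout updates only when enabled."""
        if rom_enable:
            self.dout = self._mem[addr & 0xF]
        return self.dout
```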
Instruction Register
Fig -6: Instruction Register
The instruction register is an 8-bit register, like the other registers of the microcontroller. An instruction may specify an operation such as adding two data values or moving data. When such an instruction is fetched from memory, it is directed to the instruction register. The instruction register therefore exists specifically to hold the instructions fetched from memory.
Timing and Control Unit
The timing and control unit is essential because it synchronizes the registers and the flow of data between the registers and the other units. It consists of an oscillator and a controller sequencer, which sends the control signals needed for internal and external control of data and of the other units.
I/O Interface
An I/O interface is required whenever an I/O device is driven by the processor. The interface must have the logic necessary to interpret the device address generated by the processor. If different data formats are being exchanged, the interface must be able to convert serial data to parallel form and vice versa.
There are direct and register MOV operations. In direct addressing, the content of the specified address is moved to the accumulator. In register addressing, the contents of the source register are moved to the destination register (either A to B or vice versa).
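The two MOV forms can be illustrated with the short Python sketch below. The `regs` dictionary and the `memory` list are hypothetical stand-ins for the register pair and the addressed storage; they are not names from the design.

```python
def mov_direct(regs, memory, addr):
    """Direct addressing: copy the content of memory[addr] into the accumulator A."""
    regs["A"] = memory[addr] & 0xFF

def mov_register(regs, dst, src):
    """Register addressing: copy src into dst (A to B or B to A only)."""
    if {dst, src} != {"A", "B"}:
        raise ValueError("only moves between A and B are modelled")
    regs[dst] = regs[src]
```

For example, with `regs = {"A": 0, "B": 7}` and a memory holding `0x5A` at address 3, `mov_direct(regs, memory, 3)` followed by `mov_register(regs, "B", "A")` leaves `0x5A` in both registers.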
STATE DIAGRAM
The IN instruction inputs a value and stores it into A. The IN state waits for the Enter key signal before looping back to the START state. In doing so, several values can be read in correctly by having multiple input statements in the program. Notice that after the Enter signal is asserted, there is a zero state that waits for the Enter signal to be de-asserted, i.e. for the Enter key to be released.
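The Enter-key handshake described above can be sketched as a two-state transition function in Python. The state name `WAIT_RELEASE` is hypothetical, and latching the input value on the clock at which Enter is first seen asserted is an assumption.

```python
def in_fsm_step(state, enter, input_value, a):
    """Advance the input handshake by one clock; returns (next_state, a)."""
    if state == "IN":
        if enter:
            # Enter pressed: latch the input into A, then wait for release.
            return "WAIT_RELEASE", input_value & 0xFF
        return "IN", a  # keep waiting for the Enter key
    if state == "WAIT_RELEASE":
        # Stay here until Enter is de-asserted, then return to START.
        return ("WAIT_RELEASE", a) if enter else ("START", a)
    raise ValueError(f"unexpected state {state!r}")
```

Waiting for the key to be released prevents a single long key press from being consumed by several consecutive IN instructions.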
The OUT instruction copies the content of the accumulator to the output port.
The JMP instruction loads the PC with the address specified in the IR. The JZ instruction loads the PC with the specified address if A is zero. Loading the PC with a new address simply causes the CPU to jump to this new memory location. The JNZ (Jump Not Zero) instruction tests whether the value in A is equal to 0. If A is equal to 0, nothing is done. If A is not equal to 0, the last four bits of the instruction, designated as addr in the encoding, are loaded into the PC. These four bits represent a memory address. Loading this value into the PC performs a jump to the new memory address, since the value stored in the PC is the location of the next fetch operation.
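The field extraction implied by this encoding can be sketched in Python as follows. The text states that addr occupies the last four bits; placing the opcode in the upper four bits is an assumption that is consistent with the 4-bit opcode input of the program counter.

```python
def decode(instruction):
    """Split an 8-bit instruction into its (opcode, addr) 4-bit fields."""
    opcode = (instruction >> 4) & 0xF  # assumed: upper four bits
    addr = instruction & 0xF           # last four bits, as described in the text
    return opcode, addr
```

For an instruction word `0b10100110`, this yields an opcode field of `0b1010` and an address field of `0b0110`; the address field is the value loaded into the PC when a jump is taken.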
The NOP instruction performs no operation.
The INC and DEC instructions increment and decrement the content of A by 1 respectively, and store the result back into the accumulator.
Once the FSM enters the HALT state, it unconditionally loops back to the HALT state, giving the impression that the CPU has halted.
Here two 8-bit numbers at the ALU input pins reg_a_data_in (00001110) and reg_b_data_in (00000001) are added together, and the result (00001111) appears at the output pin a_move. In decimal, this corresponds to 14 + 1 = 15.
Simulation Result for Addition
SYNTHESIS RESULTS
Design Summary
CONCLUSIONS
In this paper, the design and development of a basic 8-bit microcontroller have been discussed.
