JAVA-BASED MICROPROCESSOR by Md. Khuzaimah, Mohammad Faiz
JAVA-BASED MICROPROCESSOR
by
Mohammad Faiz bin Md. Khuzaimah
2836
Dissertation submitted in partial fulfilment of
the requirements for the
Bachelor of Engineering (Hons)









Mohammad Faiz bin Md. Khuzaimah
A project dissertation submitted to the
Electrical & Electronics Engineering Programme
Universiti Teknologi PETRONAS
in partial fulfilment of the requirement for the
BACHELOR OF ENGINEERING (Hons)








This is to certify that I am responsible for the work submitted in this
project, that the original work is my own except as specified in the references
and acknowledgments, and that the original work contained herein has not
been undertaken or done by unspecified sources or persons.
(MOHAMMAD FAIZ BIN MD. KHUZAIMAH)
Student ID: 2836
ABSTRACT
Java-based Microprocessor is a project that aims to develop a processor which
implements the Java virtual machine (JVM) instruction set in hardware. The objective
of the project is to enable a Java application to be executed without the need for a
JVM; more specifically, it is aimed to be an alternative non-commercial processor
serving as a supporting base for educational research and development of embedded
systems. At present, the Java Runtime Environment (JRE), an intermediary Java OS
layer, must be installed on every machine that is intended to execute Java bytecode.
This has proved to be inefficient, especially in embedded systems, where resources
are limited and upgrading is highly expensive.
The project was developed as easily comprehensible HDL, allowing others
to pursue further advancement without complications. Thus, the HDL design was coded
in behavioural style. To make the project development more transparent to others,
the entire design was developed using a bottom-up approach. Four
modules comprise the entire design: ALU, stacks, program counter and datapath.
These modules were designed individually, each with a separate test bench and test
parameters, which also provided a better perspective of the microprocessor design.
The project has progressed from an 8-bit processor concept towards a 32-bit
computer. The JVM has strict rules, allowing instructions to execute only
with proper operands of the right data type. The project was not planned to support
operations on floating-point numbers (float and double).
In conclusion, as a support for educational research and development, the
Java-based Microprocessor should provide a solid foundation for embedded systems,
although more enhancements would be needed before it can be utilized reliably.
ACKNOWLEDGMENT
First and foremost, all praises to Allah The Almighty that by His blessings I have been
able to complete my final year project, the Java-Based Microprocessor. I would like to
thank the following people who helped me in my final project.
Mr. Patrick Sebastian, my supervisor and Computer System Architecture
lecturer, who came up with the idea of this project and helped me with reference
projects and moral support all the way.
Mr. Lo Hai Hiung, a lecturer, who gave me a good insight into HDL and
Altera Quartus II.
My parents, Mr. Md. Khuzaimah and Mrs. Hasnah, who have been very
supportive, cared for my well-being and prayed for my success.
Mr. Faizan, a tutor, who taught me a good deal of HDL coding technique
and introduced me to Altera Quartus II.
Dr. Yap Vooi Voon, a lecturer, for his critique of my project development.
I would also like to thank Nadirah Khairul Anuar, for her loving support every





1.2 Problem Statement
1.3 Objectives & Scope of Study
2. Literature Review
2.1 Previous Work on Java Processor
2.1.1 Sun Microsystems' picoJava
2.1.2 Java Optimized Processor
2.1.3 Bernd Paysan's b16 Forth
2.2 Java Virtual Machine
2.2.1 Fundamentals of Bytecode
2.3 Stack Machine
2.3.1 JVM as Stack-based Machine
3. Project Work
3.1 Research and Design Approach
3.2 Development and Simulation
3.2.1 Using Behavioural Verilog
3.2.2 Using Extensive Test bench
3.3 Hardware Verification
4. Results & Discussion
4.1 Arithmetic & Logic Unit
4.2 Operands and Return Stacks
4.3 Program Counter
4.4 Datapath and Modules Integration
5. Conclusion & Recommendation
6. References
7. Appendices
Appendix A1: MJava ALU Verilog Code
Appendix A2: MJava Stacks Verilog Code
Appendix A3: MJava Program counter Verilog Code
Appendix A4: MJava Datapath Verilog Code
Appendix B: MJava Simulation results
Appendix C: MJava Stacks Synthesized Circuit
Appendix D1: JVM Instructions Hexadecimal Values
Appendix D2: JVM Instructions and Operands Description
LIST OF ILLUSTRATIONS & TABLES
Table 2.1: JVM primitive data types
Table 4.1: Ports in MJava ALU module
Table 4.2: Instructions executed within ALU module
Table 4.3: Ports in MJava stack module
Table 4.4: Ports in MJava program counter module
Table 4.5: Ports in MJava Datapath module
Table 4.6: Instructions implemented in MJava processor
Figure 2.1: Block diagram of the picoJava cores [P1]
Figure 2.2: The picoJava core employs the circular register file to support stack-based
processing [P2]
Figure 2.3: Block diagram of JOP cores
Figure 2.4: Block diagram of b16 cores
Figure 4.1: Status flag defined in ALU module
Figure 4.2: MJava stack module declaration, reg type stackmem[7:0] is the actual stack
memory array
Figure 4.3: MJava datapath module declaration. Many types of regs and wires were declared
and used
Figure 4.4: Lines of code fetched for testing datapath functionality
Figure 4.5: Sequential flow of MJava data path
NOMENCLATURE
ALU Arithmetic and Logic Unit
ASM Algorithmic State Machine
CAD Computer-aided Design
HDL Hardware Description Language
JRE Java Runtime Environment
JVM Java Virtual Machine
LIFO Last in First out
OS Operating System
VHSIC Very High Speed Integrated Circuit





1. INTRODUCTION
Java applications have stormed the mobile industry lately, with current
smartphones and mobile phones equipped with Java-enabled games and similar
applications. While the embedded systems industry is moving towards Java,
several technical issues prevent Java from being widely implemented in
embedded devices such as set-top boxes, automotive systems and smart
controllers.
The issues that prevent Java from being widely implemented are its
performance and runtime execution efficiency. In order to execute Java
bytecode, the JRE must run on top of the machine's original operating
system (OS), and this arrangement consumes considerable resources. This has led
to the development of several Java-based processors that are capable of executing
bytecodes without the need for a JRE. These developments have been around since
1997, and one of the most notable Java processors was picoJava, designed by Sun
Microsystems.
Java processors have been developed widely, and at the same time narrowly,
to support the embedded systems industry. Even in education and
research, there are many ongoing projects that require a non-commercial
processor to support their development.
1.2 PROBLEM STATEMENT
The current concept of executing Java bytecode requires the JRE to run on
top of a machine's OS. Besides consuming considerable resources, this also results
in slow program loading and unpredictable execution time. This drawback is
considered trivial on a personal computer, but in embedded systems and small
devices such as handhelds, the effect can be unacceptable.
Many Java processors are being developed, and many of them differ in
features and targeted media. Most of them were developed to suit medium-end
to high-end small devices. This project focuses on the very
basics of bytecode implementation and targets only embedded systems with
very limited resources.
Although the processor being developed in this project is a basic 32-bit
signed-integer design, it is important to note that, in embedded systems
applications, building a complex and powerful processor is very costly. As a
result, the processor in this project is devised to support fundamental features
only, dropping the complex features required by higher-performance systems, so
that it caters to embedded systems efficiently.
1.3 OBJECTIVES & SCOPE OF STUDY
In general, the Java-based Microprocessor (MJava) aims to implement the
JVM instruction set in a stand-alone hardware processor. More specifically,
it is aimed to be an alternative non-commercial processor serving as a supporting
base for educational research and development of embedded systems.
In order to achieve the objective, some parameters were refined and
redefined. Two of them are to implement the JVM instructions with
minimal use of external memory space, and to keep the final outcome as
simple as possible, with only the most basic requirements needed to execute a
Java class file properly and correctly.
Java-Based Microprocessor 2. Literature Review
2. LITERATURE REVIEW
2.1 PREVIOUS WORK ON JAVA PROCESSOR
Work on Java processors is not a new concept. It has been around since
picoJava was initiated in 1998, and it is increasing in popularity. Several
previous works were used as references for this project. Each provided a
different perspective on how to approach the solution.
2.1.1 Sun Microsystems' picoJava
picoJava was the first attempt at a Java processor, developed by Sun
Microsystems as the next step to popularize Java. Its design
suited consumer electronics manufacturers' need for a small processor core
with high performance. It has been licensed to at least four (4) major
companies [1].
Its commercial success lies mostly in its high-performance
computer architecture. The variable-sized cache, the choice of including or
omitting a floating-point unit, and the "stack register file" significantly
improved performance. Its ability to execute legacy C/C++ code as efficiently as
a comparable RISC CPU is also a big advantage. Figure 2.1 shows the architecture
of the picoJava cores, while the stack register file operation, which treats the
file as a circular buffer, is shown in Figure 2.2.










Figure 2.1: Block diagram of the picoJava cores [1]
Figure 2.2: The picoJava core employs the circular register
file to support stack-based processing [1]
2.1.2 Java Optimized Processor
Java Optimized Processor (JOP) was developed as part of a thesis
project focused on designing a processor for time-predictable execution of
real-time tasks. Its primary implementation is in a field programmable gate
array (FPGA), and the research demonstrates that a hardware implementation of
the Java virtual machine results in a small design for resource-constrained
devices. It was designed to implement only the most frequently used instructions
at the hardware level, while leaving the remainder to be executed at the software
level.
The JOP measurements show that loading local variables and
constants onto the stack accounts for more than 40% of instructions executed.
This shows that an efficient realization of the local variable memory area, the
stack and the transfer between these memory areas is mandatory. Accordingly,
the implementation of these three elements, especially the stack, is
critical to this project and thus required [2].
JOP's own native instruction set is named microcode. Java bytecodes are
translated into microcode during execution, and both instruction sets are
designed for an extended stack machine. In addition, JOP is a fully pipelined
architecture with single-cycle execution of microcode, and it uses a fresh
approach to mapping Java bytecode to these instructions. Figure 2.3 shows the
data path of JOP, where it can be observed that the stack architecture allows
for a short pipeline.
















Figure 2.3: Block diagram of JOP cores [2]
2.1.3 Bernd Paysan's b16 Forth
The b16 processor is being developed as a Forth processor in an FPGA
by Bernd Paysan. In brief, it has shown the most promise as a
better base for this project, the Java-Based Microprocessor (MJava), than
JOP. Not only is it fundamentally a stack-based processor, but its minimalist
design, which fits into a small FPGA, is also most suitable for embedded systems
applications.
This processor is inspired by the c18 from Chuck Moore, a popular Forth
processor, and is designed entirely in Verilog HDL, a most convincing
advantage from the MJava side. Its basic processor architecture proved to be very
simple and practical for small applications. Its stack machine was a radical
approach but still has room for improvement.
Figure 2.4: Block diagram of b16 cores [3]
2.2 JAVA VIRTUAL MACHINE
The Java Virtual Machine (JVM) is an abstract computing machine that acts
like a real computing machine but executes Java bytecode instead of a
native assembly language. It has an instruction set and is capable of
manipulating various memory areas at run time. The JVM is also a stack-based
machine in general, consisting of several stacks for operands and return
addresses. The stack-based JVM is further explained in subsection 2.3.1 JVM as
Stack-based Machine.
Java source code is compiled into Java bytecode, stored in a class file, which
the JVM then translates into the specific native machine language. In short, the
JVM is a second-layer operating system (OS) on top of the workstation's native
OS, used in order to execute Java bytecode. The basics of bytecode operation are
further explained in subsection 2.2.1 Fundamentals of Bytecode.
JVM instructions consist of an opcode, which specifies the operation to
be performed, followed by zero or more operands. This allows us to assume
that implementing a complete JVM instruction set will result in steeply
increasing complexity, depending on how many instructions are
implemented. Certain JVM instructions can embody up to 14 operands
each.
The JVM supports seven (7) primitive data types, listed in Table 2.1.
Currently the JVM consists of 202 instructions, although many of the instructions
perform similar operations on different data types. This was intended to
make the bytecodes compact, by forcing opcodes to identify the data types
involved instead of leaving it to the operands themselves as in many other
machine languages (refer to Appendix D1 for a list of JVM opcodes with their
corresponding hex values and Appendix D2 for JVM opcodes with their
relevant operand types) [4][5].
Table 2.1: JVM primitive data types
Data Type   Description
byte        1-byte signed two's complement integer
short       2-byte signed two's complement integer
int         4-byte signed two's complement integer
long        8-byte signed two's complement integer
float       4-byte IEEE 754 single-precision float
double      8-byte IEEE 754 double-precision float
char        2-byte unsigned Unicode character
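The way an opcode carries its data type can be illustrated with a short sketch. It is a software illustration only, not part of the MJava HDL: the table name ADD_OPCODES and the helper operand_type are hypothetical, while the four hexadecimal values are the add opcodes of the published JVM instruction set.

```python
# Illustrative only: the four JVM "add" opcodes differ solely in the data
# type they operate on, so the opcode itself identifies the operand type.
# ADD_OPCODES and operand_type are hypothetical names for this sketch.
ADD_OPCODES = {
    0x60: ("iadd", "int"),
    0x61: ("ladd", "long"),
    0x62: ("fadd", "float"),
    0x63: ("dadd", "double"),
}


def operand_type(opcode):
    """Return the data type implied by an add opcode, or None otherwise."""
    entry = ADD_OPCODES.get(opcode)
    return entry[1] if entry else None
```

Under the scope described in this dissertation, only the int variant (iadd) would be relevant to MJava; the other three are listed purely to show the pattern.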
2.2.1 Fundamentals of Bytecode
Bytecode is the machine language of the JVM. Since it was
designed to be compact, bytecodes are fetched in streams. When an opcode
reaches the JVM, it indicates whether zero or more operands are to be decoded
from the stream that immediately follows. Opcodes and operands in the
bytecode stream are aligned on byte boundaries, which means each
element of the stream is one byte in size. Operands of data types larger than a
byte are broken into several bytes, stored in big-endian order in the
bytecode stream.
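The stream layout described above can be sketched in a few lines of software. This is a minimal illustration under stated assumptions, not the MJava decoder: the table OPERAND_BYTES and the function decode are hypothetical names, and only four opcodes are included (their hexadecimal values and operand counts are those of the JVM instruction set).

```python
# Minimal sketch of byte-aligned decoding. OPERAND_BYTES and decode are
# hypothetical names; the table covers only a handful of opcodes.
OPERAND_BYTES = {
    0x00: ("nop", 0),
    0x10: ("bipush", 1),   # one-byte signed immediate
    0x11: ("sipush", 2),   # two-byte signed immediate, big-endian
    0x60: ("iadd", 0),
}


def decode(stream):
    """Split a bytecode stream into (mnemonic, operand) pairs.

    Multi-byte operands are assembled in big-endian order and
    interpreted as signed two's complement values.
    """
    pc = 0
    decoded = []
    while pc < len(stream):
        name, count = OPERAND_BYTES[stream[pc]]
        raw = bytes(stream[pc + 1 : pc + 1 + count])
        operand = int.from_bytes(raw, "big", signed=True) if count else None
        decoded.append((name, operand))
        pc += 1 + count
    return decoded
```

For instance, the six bytes 10 05 11 01 00 60 (hexadecimal) would decode as bipush 5, sipush 256 and iadd, the opcode alone determining how many of the following bytes belong to it.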
2.3 STACK MACHINE
Two major types of computer stack are Last-in First-out (LIFO) and
First-in First-out (FIFO). While the latter acts like a buffer, the former is
used widely in mainstream computing as temporary storage, mainly to
improve performance and to favour compact machine code. A LIFO stack is
conceptually the simplest way of storing information temporarily
for use in common computations such as mathematical expression evaluation
and recursive subroutine calling.
LIFO stacks can be constructed in software easily by allocating an array
in memory and a variable holding the array index to keep track of the
array position, known as the stack pointer. The significant operations of a LIFO
stack are push and pop. A push stores information in the topmost
location (as defined by the stack pointer), while a pop extracts information
from the topmost location for central processing (after which it is deleted from
the stack).
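The array-plus-pointer construction described above can be sketched as follows. It is a generic software illustration, assuming a fixed depth and error signalling on overflow and underflow; the class name LifoStack and its members are hypothetical and are not taken from the MJava code.

```python
class LifoStack:
    """Software LIFO stack: a fixed array plus a stack pointer (illustrative)."""

    def __init__(self, depth):
        self.mem = [0] * depth   # the allocated array
        self.sp = 0              # stack pointer: index of the next free slot

    def push(self, value):
        """Store a value at the topmost free location and advance the pointer."""
        if self.sp == len(self.mem):
            raise OverflowError("stack full")
        self.mem[self.sp] = value
        self.sp += 1

    def pop(self):
        """Step the pointer back and return the topmost stored value."""
        if self.sp == 0:
            raise IndexError("stack empty")
        self.sp -= 1
        return self.mem[self.sp]
```

Note that a pop does not physically erase the array slot; the value is "deleted" only in the sense that the pointer no longer covers it and the next push overwrites it.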
Stack-based machines are increasingly becoming a favoured
choice, mostly due to their excellent mechanism for handling operations within
procedures or recursive invocations. Nested branch and goto functions can be
implemented very well with the use of a LIFO stack. This also eliminates the
need to specify the location of return addresses, which could be space consuming.
2.3.1 JVM as Stack-based Machine
Computation in the JVM centres on the stack to perform many
operations, especially arithmetic and returning from subroutines. In the
JVM there are two separate stacks: the operands stack and the return stack. The
latter is used strictly for return addresses, while the former is used for
other information or operands. As Java bytecode was designed to be
compact, many of the instructions have zero operands. These instructions
take values from the stacks. The stack will pop (read and delete) as many
operands as indicated by the opcode. The results are
also usually pushed (stored) back onto the stacks [4][5].
Assisting the stacks are the local variables, similar to working
registers in many register-based machines. However, the use of local variables
is limited to certain instructions, and a programmer can barely
manipulate this temporary storage. A number of instructions are dedicated
to handling information between local variables and the operands stack, but
the direct use of local variables in calculation is unclear.
Java-Based Microprocessor 3.Project Work
3. PROJECT WORK
A revised methodology presents several key changes in the project flow.
Due to unforeseen delays caused by new findings, which led to new obstacles,
and the switch to Xilinx ISE, the hardware implementation on an FPGA kit has
been deemed optional. In all, this project may end up as simulation-only if no
Xilinx FPGA is available at the project's disposal.
3.1 RESEARCH AND DESIGN APPROACH
Selecting and researching the Java ISA is not a direct prerequisite to
project design and development. Still, it provides key pointers on the
direction of the development in terms of the key elements that must
be implemented.
The Java ISA consists of 230 instructions, including three (3) reserved opcodes
and 25 _quick opcodes. Nevertheless, the current Sun JVM supports only 202
instructions (without the reserved and _quick opcodes), and many Java programs
have been written with this assumption. Thus, it is unnecessary to pursue the
project development by including these additional opcodes.
There are two major concerns in implementing the Java ISA: the instruction
set itself and the JVM stack machine (as explained in Chapter 2, Stack
Machine). Preliminarily, only the basic opcodes will be implemented,
including all stack-related, arithmetic, logic and return/jump operations, but
excluding the remainder such as long, float, double, array and conversion
operations. The instructions being implemented in the MJava project are shown in
List 1.
Pushing Constants onto the Stack
bipush sipush
Loading Local Variables onto the Stack
iload iload_<n>




















List 1: JVM instructions being implemented in MJava
3.2 DEVELOPMENT AND SIMULATION
Development of the project was approached systematically, module by module.
The design was subjected to a work breakdown structure (WBS) of a
fully integrated processor system. The necessary modules were identified and
approached individually: ALU, stacks, program counter and data path. With the
individual approach, each module could be subjected to several test
simulations. These modules were then integrated using the data path design,
done in behavioural style, and tested again as a whole unit. This ensures that
the integrity of the whole and the reliability of each module are proven.
Simulation of the processor can be done in one of two ways, or both
combined: a Verilog HDL model and/or block diagram schematics. While a
Verilog HDL model is a text-based approach, block diagram schematics are a
graphical approach that seems an appropriate and easier option for simple
and fundamental operations. However, when designing a far more complex
processor, it is best to model in Verilog. Simulation and synthesis
go through two procedures: functional simulation and timing simulation.
The former concerns only the fundamental functional operation, while the
latter takes into account an additional parameter, the processor clock.
3.2.1 Using Behavioural Verilog
It was decided that the design of the entire project would be done in
behavioural Verilog. Behavioural coding is similar to
programming in C and C++, allowing designers to define their circuits
based on how they would behave or function. This is in contrast to the RTL
coding style, which defines the components of circuits and their connections.
With the behavioural style, the code is more transparent, portable and
extensible, even for other people who decide to continue the project work.
3.2.2 Using Extensive Test bench
In this project, some setbacks resulted in the possibility of no hardware
implementation for verification. Thus, to verify that the design works, an
extensive test fixture must be applied. Simulations were run under strict
rules, exercising every possible corner case: reaching the limit
of what the modules can do and going beyond it.
3.3 HARDWARE VERIFICATION
When designing the microprocessor, the targeted device must be kept in
mind. Often, a circuit designed for a particular device is not synthesizable
on another device, even though the code is written in portable behavioural style
and the simulations show the expected execution.
It is highly preferable to verify a circuit design with a hardware
implementation, but circuit synthesis can be an issue. Early in the project,
it was decided that the design would be implemented on Altera's
FPGA development kit, but halfway through, it was switched to a Xilinx FPGA
due to limitations in the Quartus II compiler.
Java-Based Microprocessor 4.Results & Discussion
4. RESULTS & DISCUSSION
4.1 ARITHMETIC & LOGIC UNIT
The Arithmetic & Logic Unit (ALU) was designed as a 32-bit signed two's
complement arithmetic and logic evaluator. The inputs to the module consist
of two input arguments, which are to be evaluated, and an instruction selector.
The outputs from the module are the evaluation result, accompanied by three
status flags: the Z flag indicating a zero-valued result, the V flag indicating an
overflow and the N flag indicating the sign of the result. Table 4.1 shows
the relevant ports declared inside the ALU module.
The status flags were designed from scratch, although the Z and N
flags are very simple. The Z flag indicates a zero-valued result, obtained by
ANDing the inverted result bits (equivalently, NORing the result bits). The Z
flag is set to one (1) if the result is zero and reset to zero (0) if it is
non-zero. The N flag indicates the sign of the result, and thus takes only the
most significant bit (MSB) of the result as its argument. The N flag is set to
one (1) if the result is a negative number and reset to zero (0) if it is
positive. The V flag has a more complex design, as it has to indicate whether an
overflow has occurred while evaluating the input arguments. This can usually
occur in the following situations:
• Two values of the same sign are added.
• Two values of opposite signs are subtracted.
• Two values (of any sign) are multiplied.
The V flag was designed by applying a Karnaugh map to the above
situations. The V flag is set to one (1) if an overflow occurs and reset to zero
(0) otherwise.




    assign flag_v = (instr == 2'b01) ?
                      ((A[`op-1] ^ result[`op-1]) &
                       (B[`op-1] ^ result[`op-1])) :
                    ((instr == 2'b10) ?
                      ((A[`op-1] ^ result[`op-1]) &
Figure 4.1: Status flag defined in ALU module
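The behaviour of the three status flags can be checked with a small software model. The sketch below is not the MJava ALU: the function alu and the constants MASK and SIGN are hypothetical names, only addition and subtraction are modelled, and the overflow tests are the standard two's complement sign-bit comparisons (the addition case appears to correspond to the expression in Figure 4.1).

```python
MASK = 0xFFFFFFFF   # 32-bit word mask (assumed word size, as in the text)
SIGN = 0x80000000   # position of the most significant (sign) bit


def alu(op, a, b):
    """Return (result, z, n, v) for a 32-bit 'add' or 'sub'.

    Operands and result are unsigned 32-bit patterns that stand for
    signed two's complement values.
    """
    a &= MASK
    b &= MASK
    if op == "add":
        result = (a + b) & MASK
        # Overflow: both operands differ in sign from the result.
        v = ((a ^ result) & (b ^ result) & SIGN) != 0
    elif op == "sub":
        result = (a - b) & MASK
        # Overflow: operands differ in sign and the result's sign differs from a.
        v = ((a ^ b) & (a ^ result) & SIGN) != 0
    else:
        raise ValueError("unsupported operation: " + op)
    z = result == 0             # Z flag: every result bit is zero
    n = (result & SIGN) != 0    # N flag: copy of the sign bit
    return result, int(z), int(n), int(v)
```

For example, adding 1 to the largest positive value 0x7FFFFFFF yields 0x80000000 with N and V both set, whereas adding 5 and -5 yields zero with only Z set.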
Table 4.1: Ports in MJava ALU module
Ports    Type    Width  Description
A        input   32     First argument of the evaluation.
B        input   32     Second argument of the evaluation.
instr    input   8      Selects the operation to perform. Also acts as a
                        trigger to invoke operation selection.
Cout     output  1      The 33rd bit, reserved for future use.
result   output  32     The result of the ALU evaluation.
flag_z   output  1      Asserted when the result is zero.
flag_v   output  1      Asserted when an overflow occurs.
flag_n   output  1      Asserted when the result is a negative number.
Input instr is fetched directly from the opcode itself. It should
behave like a switch, where the module is activated when the input instr is
assigned a valid opcode from the bytecode stream. Operations are chosen
with a case statement, with the input instr as the case argument. A total of
27 instructions are available for execution, with a highly extensible data path.
The instructions chosen are fundamental and significant to ensure the reliability
of the processor. Table 4.2 shows the list of instructions available in the ALU
module.
Table 4.2: Instructions executed within ALU module
Instruction  Imp.  Description
nop                No operation.
iadd               Add two int operands. Two values popped from stack.
isub               Subtract two int operands. Two values popped from stack.
ineg               Negate an int operand. One value popped.
ishl               Arithmetic shift left. Two values popped.
ishr               Arithmetic shift right. Two values popped.
iand               Boolean AND two int.
ior                Boolean OR two int.
ixor               Boolean XOR two int.
• Note: Imp. = implementation.
At the moment, the implementation status shows that only a limited set of
instructions has been implemented. The ALU module is designed to take its
operation selection argument directly from the opcodes for high extensibility.
Any instruction that takes two values as arguments and produces one result can
be easily implemented inside the module. The full Verilog code and simulation
results for the ALU module can be found in Appendix B1.
4.2 OPERANDS AND RETURN STACKS
The operands and return stacks are instantiated from the same LIFO stack
module design. However, instead of having a single stack for operands and
return addresses, they are separated to increase integrity when performing nested
subroutines and to prevent mismatched fetches of operands for an operation. It
would add great complexity if the operands and return addresses were to share
the same stack, resulting in inefficient and larger cores.
The stack operates in two modes: (i) push an operand onto the topmost location
and (ii) pop operand(s) from the one or two topmost location(s). Any data pushed
onto or popped from the stack is 32 bits wide. Prior to a push operation, smaller
data types are sign-extended, while larger data types are broken into several
32-bit words. A pop operation outputs 32-bit wide data. It is up to the
central processing to combine or split the necessary operands. For the pop
operation, a single pop reads the top of the stack and writes it to output port 1.
A double pop reads the top two entries of the stack and writes them to output
port 1 and port 2.
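The two width adjustments made before a push can be sketched in software. This is an illustration under stated assumptions rather than the MJava implementation: the helper names sign_extend and split_long are hypothetical, and the order in which the two halves of a 64-bit value would be pushed is not stated in the text, so the (high, low) order used here is an assumption.

```python
def sign_extend(value, bits):
    """Sign-extend a `bits`-wide two's complement pattern to a 32-bit pattern."""
    value &= (1 << bits) - 1            # keep only the original width
    if value & (1 << (bits - 1)):       # sign bit set: value is negative
        value -= 1 << bits
    return value & 0xFFFFFFFF


def split_long(value):
    """Split a 64-bit pattern into two 32-bit words.

    Returns (high, low); this word order is an assumption of the sketch.
    """
    value &= 0xFFFFFFFFFFFFFFFF
    return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF
```

As an example, the byte 0xFF (that is, -1) becomes the word 0xFFFFFFFF, so its numeric value is preserved once it sits in a 32-bit stack location.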
The module design utilises a hardware memory array for the stack, declared
as type reg. It can hold up to eight (8) 32-bit words, filled
systematically from the bottom up. The stack pointer indicates where the input
data will be stored, starting at the bottommost location, increasing by one
location after each successful push and decreasing by one location after each
successful pop. The memory array also behaves like a circular buffer. The
pointer wraps to the topmost location whenever it goes below the bottommost
location, and wraps to the bottommost location whenever it goes above the
topmost location.
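The wrap-around behaviour of the stack pointer can be modelled in software as below. The sketch assumes the eight-entry depth and 32-bit width given in the text; the class name CircularStack, the method names, and the choice to let a ninth push silently overwrite the oldest entry are assumptions of this illustration, not details confirmed by the MJava code.

```python
class CircularStack:
    """Eight-entry, 32-bit stack whose pointer wraps at both ends (illustrative)."""

    DEPTH = 8

    def __init__(self):
        self.mem = [0] * self.DEPTH
        self.sp = 0  # location where the next pushed word is stored

    def push(self, word):
        """Store a 32-bit word, then move the pointer up, wrapping to the bottom."""
        self.mem[self.sp] = word & 0xFFFFFFFF
        self.sp = (self.sp + 1) % self.DEPTH

    def pop(self):
        """Move the pointer down, wrapping to the top, and return that word."""
        self.sp = (self.sp - 1) % self.DEPTH
        return self.mem[self.sp]

    def pop2(self):
        """Double pop: return (top, next-to-top), as for output ports 1 and 2."""
        first = self.pop()
        second = self.pop()
        return first, second
```

A consequence of this scheme is that pushing more than eight words does not raise an error in the model; the oldest entries are simply overwritten, which is why a separate indication of a successful push is useful to the data path.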
As in the ALU module, the input instr[1:0] acts as a trigger to execute the
selected operation, and it must be reset when not in use. The instruction value
is fetched during the decode phase in the data path. An output stStore
indicates whenever data has been successfully pushed, assisting the data path
in determining the appropriate next operation. Table 4.3 shows the relevant ports
declared inside the stack module.
The design approach keeps the stacks safe from data corruption due to
manual overrides at the input ports. The stacks remain inside the core without
direct connections and are accessed only via a double-door system, where the
instr values are not directly associated with any opcodes, unlike in the ALU.
Figure 4.2 shows how the memory array was declared. The full Verilog code and
simulation results can be found in Appendix B2.
Table 4.3: Ports in MJava stack module
Ports      Type    Width  Description
clk        input   1      Clock.
reset_n    input   1      Reset port.
data_in    input   32     Data input (for push) port.
read_n     input   1      Enable read (pop) port.
write_n    input   1      Enable write (push) port.
pop_2      input   1      Double pop indicator.
data_out1  output  32     Output port 1.
data_out2  output  32     Output port 2.
pushed     output  1      Successful push indicator.
popped     output  1      Successful pop indicator.
    module MStack(clk, data, instr, stStore, out);

      output stStore;
      output `Lop out;
      reg `Lop stackmem [`dep-1:0];
      reg [`dep-1:0] sptr;
      reg stStore;
      reg `Lop out;
Figure 4.2: MJava stack module declaration; reg type stackmem[7:0] is
the actual stack memory array.
4.3 PROGRAM COUNTER
The Program Counter (PC) is also a stack-based module, but it utilises
a FIFO-type stack instead. The purpose of the PC is mainly to provide storage
for the stream of instructions, like a long buffer. It thus allows the bytecode
stream to be kept close to the processor core. The implementation of a FIFO-type
PC also adds the extensibility to perform branch and jump instructions.
The design is fairly simple and common. It has five (5) input ports and
four (4) output ports. Table 4.4 shows the relevant ports declared inside the PC
module. The PC has a memory array pc_mem[] that stores all the instructions,
in bytes. The memory array has 16 locations, each a byte wide. The small size
was chosen as an experimental value; it is easily extended with a change to a
single line of code. Since the PC is a FIFO stack, it has two pointers: read and
write. These pointers indicate the read and write locations within the pc_mem[]
array. Whenever a buffer overflow or underflow occurs, a flag is asserted at the
output port (see Table 4.4). An internal counter is used to determine whether
an overflow or underflow has occurred.
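The two-pointer scheme with an internal counter can be sketched as a small software model. It assumes the 16-location, byte-wide array described above; the class name FifoBuffer, its method names, and the convention that a refused write returns False are hypothetical choices for this illustration, and the "half" indicator of Table 4.4 is left out because its exact condition is not spelled out in the text.

```python
class FifoBuffer:
    """Byte-wide FIFO with read/write pointers and an occupancy counter."""

    def __init__(self, depth=16):
        self.mem = [0] * depth
        self.depth = depth
        self.rd = 0      # read pointer: next location to read
        self.wr = 0      # write pointer: next location to write
        self.count = 0   # internal counter of bytes currently stored

    @property
    def full(self):
        return self.count == self.depth

    @property
    def empty(self):
        return self.count == 0

    def write(self, byte):
        """Store one byte; return False (storing nothing) if the buffer is full."""
        if self.full:
            return False
        self.mem[self.wr] = byte & 0xFF
        self.wr = (self.wr + 1) % self.depth
        self.count += 1
        return True

    def read(self):
        """Return the oldest byte, or None if the buffer is empty."""
        if self.empty:
            return None
        byte = self.mem[self.rd]
        self.rd = (self.rd + 1) % self.depth
        self.count -= 1
        return byte
```

Keeping a separate counter avoids the ambiguity of comparing the pointers alone, since the read and write pointers are equal both when the buffer is empty and when it is full.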
This module starts by writing instructions from an external programmer, whenever
the write port is asserted, buffering them into the memory array.
During this period, no operation is allowed in the data path and the read
operation remains de-asserted. As soon as the write port is de-asserted, it
indicates to the data path that the PC is ready for processor operations. The
succeeding operation (read from PC) is controlled by the data path, until
interrupted again whenever the write port is reasserted. The cycle then continues.
Table 4.4: Ports in MJava program counter module
Ports     Type    Width  Description
clk       input          Clock.
reset_n   input          Reset port.
data_in   input   32     Data input (for push) port.
read_n    input          Enable read (pop) port.
write_n   input   1      Enable write (push) port.
data_out  output  32     Output port.
full      output  1      PC overflow indicator.
empty     output  1      PC underflow indicator.
half      output  1      Indicates pointer at midway.
4.4 DATAPATH AND MODULES INTEGRATION
The Datapath module is a collection of wires and ports connecting the
necessary external modules to their respective operations. It is responsible
for the integration between the instantiated modules, providing the way for
the ALU, operand and return stacks, and program counter to function as a
single unit. The design technique employed in the project is simple, but its
complexity increases as the number of instructions increases. As with the
other modules, the datapath was developed using behavioural-style Verilog.
The Datapath module has three module instantiations: the ALU, the operand
stack and the program counter. The data path uses many always @ blocks, each
triggering an action only when certain inputs change value. The PC feeds the
bytecode stream to the data path one byte at a time whenever pc_read is
asserted. The first always block then decodes the fetched opcode and
translates it into the proper parameter settings. Following through the
sequence, the opcode parameters indicate which modules to assert first and
whether to use the stacks, local variables, etc. Operations in the data path
follow a sequential flow. The basic operation sequence is presented in
Figure 4.5.
Since microprocessor circuits are meant to execute concurrently, designing
in a sequential flow resulted in more complex code. Nevertheless, performance
was not taxed, since the complexity lies only in the code and not in the
circuitry. The simulation runs several instructions, testing the
functionality of each module. The instructions fetched are shown in
Figure 4.4, where immediate values were pushed several times onto the stack
before calling the addition, subtraction, negate and swap operations.
The Datapath has four (4) input ports and one (1) output port. Its most
significant input argument is byte_in[], used to transfer instructions from
the external programmer and buffer them inside the PC module. byte_in[] is
one byte wide, which corresponds to the byte-wide input, storage and output
of the PC. A master reset port is used to reset and reinitialise all inputs
and pointers. Table 4.5 shows the relevant ports declared inside the MJava
main data path module.
Decoding instructions required several always @ statements that are
triggered whenever the input arguments change value. As a result, many regs
and wires are declared along with the input and output ports to assist the
decoding operations. Figure 4.3 shows the MJava Datapath module declaration.
The full code of the MJava Datapath module can be found in Appendix A4.
Table 4.5: Ports in MJava Datapath module
Port      Direction  Width  Description
clk       input             Clock.
reset     input      32     Manual reset.
ctrlword  input             Fetch the bytecodes.
result    output            Output to external.
module MJava(clk, reset, write, byte_in, out_stream);
    input                     clk;
    input                     reset;
    input                     write;
    input  [`BYTE_WIDTH-1:0]  byte_in;
    output [`INT_WIDTH-1:0]   out_stream;
    wire                      clk;
    wire                      reset;
    wire                      write;
    wire   [`BYTE_WIDTH-1:0]  byte_in;
    reg    [`INT_WIDTH-1:0]   out_stream;
Figure 4.3: MJava datapath module declaration. Many types of regs and
wires were declared and used.
always @(posedge clk) begin
    write = 1;
    byte_in = 8'h10;            // push byte 1
    #`PER byte_in = 8'hAA;
    #`PER byte_in = 8'h10;      // push byte 2
    #`PER byte_in = 8'hBB;
    #`PER byte_in = 8'h11;      // push short
    #`PER byte_in = 8'hC0;
    #`PER byte_in = 8'hDD;
    #`PER byte_in = 8'h60;      // integer add
    #`PER byte_in = 8'h78;      // integer shl
    #`PER byte_in = 8'h80;      // integer or
    #`PER byte_in = 8'h3e;      // istore_3
    #`PER byte_in = 8'h1b;      // iload_1
    #`PER byte_in = 8'h04;      // iconst_1
    #`PER byte_in = 8'h00;
    #`PER byte_in = 8'h00;
    write = 0;
    #500 $stop;
end // `PER == 10
Figure 4.4: Lines of code fetched for testing datapath functionality.
Table 4.6: Instructions implemented in MJava processor.
Instruction     Description
nop             No operation.
iadd            Add two int operands. Two values popped from stack.
isub            Subtract two int operands. Two values popped from stack.
ineg            Negate an int operand. One value popped.
ishl            Arithmetic shift left. Two values popped.
ishr            Arithmetic shift right. Two values popped.
iand            Boolean AND two int.
ior             Boolean OR two int.
ixor            Boolean XOR two int.
bipush          An immediate byte is pushed onto the operand stack.
sipush          An immediate short is pushed onto the operand stack.
swap            The top two values in the operand stack are swapped and
                pushed back onto the stack.
istore_<n>      Store value from operand stack into local variable of
                corresponding <n>.
iload_<n>       Load value from local variable of corresponding <n> and
                push onto top of operand stack.
iconst_<n>      Push constant of corresponding <n> onto the operand stack.
if_icmp<cond>   Branch if int comparison succeeds. Two-byte jump address is
                embodied in the instruction stream. Two values popped from
                the stack, where value1 is top of stack and value2 is next
                top of stack.
if_icmpeq       succeeds if and only if value1 = value2
if_icmpne       succeeds if and only if value1 ≠ value2
if_icmplt       succeeds if and only if value1 < value2
if_icmple       succeeds if and only if value1 ≤ value2
if_icmpgt       succeeds if and only if value1 > value2
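
As an illustration of how a stack machine evaluates a subset of the instructions in Table 4.6, the following Python sketch interprets a short bytecode sequence. It is a software model only, not the MJava datapath: the function names are hypothetical, local variables and branches are left out, and the operand order follows the JVM specification (value2 on top of the stack, value1 beneath it), which is an assumption about the intended behaviour.

```python
def to_int32(x):
    """Wrap an integer to a signed 32-bit two's-complement value."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a signed number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


# Two-operand instructions: opcode -> function of (value1, value2).
BINARY_OPS = {
    0x60: lambda a, b: a + b,             # iadd
    0x64: lambda a, b: a - b,             # isub
    0x78: lambda a, b: a << (b & 0x1F),   # ishl
    0x7A: lambda a, b: a >> (b & 0x1F),   # ishr (arithmetic)
    0x7E: lambda a, b: a & b,             # iand
    0x80: lambda a, b: a | b,             # ior
    0x82: lambda a, b: a ^ b,             # ixor
}


def run(bytecode):
    """Execute the bytecode and return the final operand stack."""
    stack, pc = [], 0
    while pc < len(bytecode):
        op = bytecode[pc]
        pc += 1
        if op == 0x00:                    # nop
            continue
        if op == 0x10:                    # bipush: one immediate byte
            stack.append(sign_extend(bytecode[pc], 8))
            pc += 1
        elif op == 0x11:                  # sipush: two immediate bytes
            stack.append(sign_extend((bytecode[pc] << 8) | bytecode[pc + 1], 16))
            pc += 2
        elif op == 0x74:                  # ineg
            stack.append(to_int32(-stack.pop()))
        elif op == 0x5F:                  # swap
            top, below = stack.pop(), stack.pop()
            stack.extend([top, below])
        elif op in BINARY_OPS:
            value2, value1 = stack.pop(), stack.pop()
            stack.append(to_int32(BINARY_OPS[op](value1, value2)))
        else:
            raise ValueError("unsupported opcode 0x%02x" % op)
    return stack
```

For example, run([0x10, 0x05, 0x10, 0x03, 0x60]) pushes 5 and 3 and then adds them, leaving a single value on the stack.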


















Figure 4.5: Sequential flow of MJava datapath
5. CONCLUSION & RECOMMENDATION
The project title, Java-based Microprocessor, is a huge topic by itself.
However, with proper planning and a specific target, it did not prove to be
as overwhelming as some would assume. Throughout the project, several
constraints and obstacles were faced that in some ways changed its direction.
Nevertheless, the project managed to achieve its basic objective of
implementing the core of the JVM in hardware circuitry.
The ALU module was developed accordingly, achieving its target as the
computation module for arithmetic and logical operations. All necessary ALU
instructions have been implemented, although more complex synthesis is still
lacking.
The stack modules, namely the operand stack and the program counter, are
the most convincing fully synthesizable modules. Their strength lies in a
proper design taken from highly reliable sources, proven and reused many
times by others. The synthesised circuits are presented in Appendix C.
The Datapath module achieves its purpose, but a lack of understanding of
data path design led to lengthy HDL code. It meets the objectives of linking
the other modules, allowing them to work as a unit, and allowing additional
instructions to be added easily without tampering with the original design.
The decision to develop the data path using a comprehensible behavioural
coding style proved to be advantageous.
Performance may not be the strong side of this project, yet it is a pilot
project for other colleagues to pursue in the future. The implementation of
JVM instructions is limited to basic operations involving only integers,
shorts and bytes. Although the data path design was unique, there is room for




[1] Harlan McGhan and Mike O'Connor, picoJava: A Direct Execution
Engine for Java Bytecode, Sun Microsystems
[2] Dr. Andreas Steininger and Dr. Peter Puschner, Java Optimized
Processor, 2005
[3] Bernd Paysan, b16 - a Forth Processor in FPGA, 2003
[4] Tim Lindholm and Frank Yellin, The Java Virtual Machine
Specification, 2nd Edition, Addison-Wesley
[5] The Java Virtual Machine Specification, Sun Microsystems, 1998
[6] Philip Koopman Jr., Stack Computers: The New Wave, Mountain
View Press, 1989
[7] Carpinelli, Computer Systems Organization & Architecture
[8] Mark Gordon Arnold, Verilog Digital Computer Design: Algorithms
to Hardware, Prentice Hall PTR





Appendix A1: MJava ALU Verilog Code
Appendix A2: MJava Stack Verilog Code
Appendix A3: MJava Program Counter Verilog Code
Appendix A4: MJava Datapath Verilog Code
Appendix B1: MJava ALU Simulation Results
Appendix B2: MJava Stack Simulation Results
Appendix B3: MJava Program Counter Simulation Results
Appendix B4: MJava Datapath Simulation Results
Appendix C: Stacks Synthesized Circuits
Appendix D1: JVM Instructions Hexadecimal Values
Appendix D2: JVM Instructions and Operands Description





`timescale 1ns / 1ns
module MALU(A, B, instr, Cout, result, flag_z, flag_v, flag_n);
parameter lop=32, loc=8;
input `Lop A, B;
input `Loc instr;
output `Lop result;




wire flag_z, flag_n, flag_v;
always @(A or B or instr)
begin
    case(instr)
        8'h84: {Cout, result} = (A + B);       // increment
        8'h60: {Cout, result} = (A + B);       // addition
        8'h64: {Cout, result} = (A - B);       // subtraction
        8'h74: {Cout, result} = (~A + 1'b1);   // negation
        8'h78: result = (A << B);              // shift left
        8'h7a: result = (A >> B);              // shift right
        8'h7e: result = (A & B);               // boolean AND
        8'h80: result = (A | B);               // boolean OR
        8'h82: result = (A ^ B);               // boolean XOR
    endcase
end
assign flag_z = result ? 0 : 1;
assign flag_n = result[lop-1];
assign flag_v = (instr==2'b01)?
((A[lop-1] ^ result[lop-1]) & (B[lop-1] ^
((instr==2'b10)?
endmodule // alu





APPENDIX A2: MJAVA STACKS VERILOG CODE
D:\Programs\Xilinx\ISEworkingdir\MJava\MStack.v





`define INT_WIDTH 32
// Clock-to-output delay. Zero
// time delays can be confusing
// and sometimes cause problems.
// Depth of stack (number of bytes)
// Number of bits required to
// represent the FIFO size


















input read_n, write_n;
input pop_2;
// OUTPUTS


























// Look at the edges of reset_n
always @(reset_n) begin
    if (reset_n == 1'b1) begin
        // Reset the stack pointer
        #`DEL;
        assign st_pointer = `ST_DEPTH - 1'b1;
        assign popped = 0;









// Look at the rising edge of the clock
always @(posedge clock) begin
    // Popping data from stack
    if (read_n == 1'b1) begin
        // Output the data
        data_out1 = #`DEL st_mem[st_pointer];
        //st_mem[st_pointer] = 32'h00000000;
        // Decrement the stack pointer
        // If the pointer has gone beyond the bottom of stack,
        // bring it to the top of stack.
        if (st_pointer == 0)
            st_pointer = #`DEL `ST_BITS'b111;
        else
            st_pointer = #`DEL st_pointer - 1;
        if (pop_2 == 1'b1) begin
            data_out2 = #`DEL st_mem[st_pointer];
            //st_mem[st_pointer] = 32'h00000000;
            if (st_pointer == 0)
                st_pointer = #`DEL `ST_BITS'b111;
            else




    // Pushing data onto stack
    if (write_n == 1'b1) begin
        // Increment the stack pointer
        // If the pointer has gone beyond the top of stack,
        // bring it to the bottom of stack.
        if (st_pointer == `ST_DEPTH-1)
            st_pointer = #`DEL `ST_BITS'b0;
        else
            st_pointer = #`DEL st_pointer + 1;
        // Store the data







APPENDIX A3: MJAVA PROGRAM COUNTER VERILOG CODE
D:\Programs\Xilinx\ISEworkingdir\MJava\MPC.v











// Clock-to-output delay. Zero
// time delays can be confusing
// and sometimes cause problems.
// Depth of PC (number of bytes)
// Half depth of PC
// (this avoids rounding errors)
// Number of bits required to
// represent the PC size




















































reg [`BYTE_WIDTH-1:0] pc_mem[0:`PC_DEPTH-1];
// How many locations in the PC
// are occupied?




assign #`DEL full = (counter == `PC_DEPTH) ? 1'b1 : 1'b0;
assign #`DEL empty = (counter == 0) ? 1'b1 : 1'b0;
assign #`DEL half = (counter >= `PC_HALF) ? 1'b1 : 1'b0;
// Look at the edges of reset_n
always @(reset_n) begin
    if (reset_n == 1'b1) begin
        // Reset the PC pointer
        #`DEL;
        assign rd_pointer = `PC_BITS'b0;
        assign wr_pointer = `PC_BITS'b0;













// Look at the rising edge of the clock
always @(posedge clock) begin
    if (read_n == 1'b1) begin
        // Check for PC underflow
        if (counter == 0) begin
            $display("\nERROR at time %0t:", $time);
            $display("PC Underflow\n");
            $stop; // Use $stop for debugging
        end
        // If we are doing a simultaneous read and write,
        // there is no change to the counter
        if (write_n == 1'b0) begin
            // Decrement the PC counter
            counter <= #`DEL counter - 1;
        end
        // Output the data
        data_out <= #`DEL pc_mem[rd_pointer];
        // Increment the read pointer
        // Check if the read pointer has gone beyond the
        // depth of the PC. If so, set it back to the
        // beginning of the PC
        if (rd_pointer == `PC_DEPTH-1)
            rd_pointer <= #`DEL `PC_BITS'b0;
        else
            rd_pointer <= #`DEL rd_pointer + 1;
    end
    if (write_n == 1'b1) begin
        // Check for PC overflow
        if (counter >= `PC_DEPTH) begin
            $display("\nERROR at time %0t:", $time);
            $display("PC Overflow\n");
            // Use $stop for debugging
            $stop;
        end
        // If we are doing a simultaneous read and write,
        // there is no change to the counter
        if (read_n == 1'b0) begin
            // Increment the PC counter
            counter <= #`DEL counter + 1;
        end
        // Store the data
        pc_mem[wr_pointer] <= #`DEL data_in;
        // Increment the write pointer
        // Check if the write pointer has gone beyond the
        // depth of the PC. If so, set it back to the
        // beginning of the PC
        if (wr_pointer == `PC_DEPTH-1)
            wr_pointer <= #`DEL `PC_BITS'b0;
        else






APPENDIX A4: MJAVA DATAPATH VERILOG CODE
D:\Programs\Xilinx\ISEworkingdir\MJava\MJava.v











input [`BYTE_WIDTH-1:0] byte_in;
// OUTPUTS






reg [`INT_WIDTH-1:0] out_stream;




wire [`INT_WIDTH-1:0] st_out1;





reg [`INT_WIDTH-1:0] buffE;
reg [`BYTE_WIDTH-1:0] byte1;
reg [`BYTE_WIDTH-1:0] byte2;
reg [`BYTE_WIDTH-1:0] opcode;
reg [`BYTE_WIDTH-1:0] aluOper;
reg [1:0] counter_pc;
reg [1:0] counter_op;
reg counter_5f;










reg [`INT_WIDTH-1:0] local_var [0:4];
// Instantiating the necessary modules for the hardware
















































always @(posedge clk) begin




clk_count <= clk_count + 1;
if(clk_count == 0)
opcode <= 8'h00;











end // end of always@





st_write <= ~st_write; // enable write to stack
end
8'h11: begin











8'h10: begin // Case: bipush
op_count = 1;
counter_op = 0;





pc_read = ~pc_read; // enable read from pc
end
8'h60: begin // Case: iadd
pop_2 <= ~pop_2; // enable pop2
st_read <= ~st_read; // enable read from stack
end
8'h64: begin // Case: isub
pop_2 <= ~pop_2; // enable pop2
st_read <= ~st_read; // enable read from stack
end
8'h74: begin // Case: ineg
st_read <= ~st_read;
end
8'h78: begin // Case: ishl
pop_2 <= ~pop_2; // enable pop2
st_read <= ~st_read; // enable read from stack
end
8'h7a: begin // Case: ishr
pop_2 <= ~pop_2; // enable pop2
st_read <= ~st_read; // enable read from stack
end
8'h7e: begin // Case: iand
pop_2 <= ~pop_2; // enable pop2
st_read <= ~st_read; // enable read from stack
end
8'h80: begin // Case: ior
pop_2 <= ~pop_2; // enable pop2
st_read <= ~st_read; // enable read from stack
end
8'h82: begin // Case: ixor
pop_2 <= ~pop_2; // enable pop2
st_read <= ~st_read; // enable read from stack
end
8'h5f: begin // Case: swap
pop_2 <= ~pop_2; // enable pop2
st_read <= ~st_read; // enable read from stack
end
8'h3b: begin // Case: istore_0
st_read <= ~st_read; // enable read from stack
end
8'h3c: begin // Case: istore_1
st_read <= ~st_read; // enable read from stack
end
8'h3d: begin // Case: istore_2
st_read <= ~st_read; // enable read from stack
end
8'h3e: begin // Case: istore_3
st_read <= ~st_read; // enable read from stack
end
















8'h06: begin // Case: iconst_3
































counter_op = counter_op + 1;
if(opcode_n) begin
opcode = byte_out;
opcode_n = ~opcode_n; // opcode is assigned






if(counter_op == op_count) begin
pc_read = ~pc_read; // disable read from pc
// for operands retrieval

















st_write = ~st_write; // enable write to stack







8'h60: begin // iadd
pop_2 <= ~pop_2; // disable pop2


















































































st_read <= ~st_read;
// isub
// disable pop2
// disable read from stack
// ineg
// disable read from stack
// ishl
// disable pop2
// disable read from stack
// select 5 LSB
// ishr
// disable pop2
// disable read from stack
// select 5 LSB
// iand
// disable pop2
// disable read from stack
// ior
// disable pop2
// disable read from stack
// ixor
// disable pop2
// disable read from stack
// swap
// disable read from stack
// enable write to stack
// reset swap counter
// istore_0
// disable read from stack
// istore_1
// disable read from stack
// istore_2
// disable read from stack
// istore_3











































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































































APPENDIX D2: JVM INSTRUCTIONS AND OPERANDS DESCRIPTION
JAVA VIRTUAL MACHINE INSTRUCTION SET
mnemonic mnemonic















..., value1, value2 =>
..., value3
A longer description detailing constraints on operand stack contents
or constant pool entries, the operation performed, the type of
the results, etc.
If any linking exceptions may be thrown by the execution of this
instruction they are set off one to a line, in the order in which they
must be thrown.
If any runtime exceptions can be thrown by the execution of an
instruction they are set off one to a line, in the order in which they
must be thrown.
Other than the linking and runtime exceptions, if any, listed for an
instruction, that instruction must not throw any runtime exceptions
except for instances of VirtualMachineError or its subclasses.
Comments not strictly part of the specification of an instruction are
set aside as notes at the end of the description.
Figure 6.1 An example instruction page
Each cell in the instruction format diagram represents a single 8-bit byte. The
instruction's mnemonic is its name. Its opcode is its numeric representation and is
given in both decimal and hexadecimal forms. Only the numeric representation is
actually present in the Java Virtual Machine code in a class file.
Keep in mind that there are "operands" generated at compile time and embedded
within Java Virtual Machine instructions, as well as "operands" calculated at
run time and supplied on the operand stack. Although they are supplied from
several different areas, all these operands represent the same thing: values to be
operated upon by the Java Virtual Machine instruction being executed. By implicitly
taking many of its operands from its operand stack, rather than representing them
explicitly in its compiled code as additional operand bytes, register numbers, etc.,
the Java Virtual Machine's code stays compact.
Some instructions are presented as members of a family of related instructions
sharing a single description, format, and operand stack diagram. As such, a family
of instructions includes several opcodes and opcode mnemonics; only the family
mnemonic appears in the instruction format diagram, and a separate forms line
lists all member mnemonics and opcodes. For example, the forms line for the
lconst_<l> family of instructions, giving mnemonic and opcode information for the
two instructions in that family (lconst_0 and lconst_1), is
Forms lconst_0 = 9 (0x9),
lconst_1 = 10 (0xa)
In the description of the Java Virtual Machine instructions, the effect of an
instruction's execution on the operand stack (§3.6.2) of the current frame (§3.6) is
represented textually, with the stack growing from left to right and each word
(§3.4) represented separately. Thus,
Stack ..., value1, value2 =>
..., result
shows an operation that begins by having a one-word value2 on top of the operand
stack with a one-word value1 just beneath it. As a result of the execution of the
instruction, value1 and value2 are popped from the operand stack and replaced by a
one-word result, which has been calculated by the instruction. The remainder of the
operand stack, represented by an ellipsis (...), is unaffected by the instruction's
execution.
The types long and double take two words on the operand stack. In the operand
stack representation, each word is represented separately using a dot notation:
Stack ..., value1.word1, value1.word2, value2.word1, value2.word2 =>
..., result.word1, result.word2
The Java Virtual Machine specification does not mandate how the two words are
used to represent the 64-bit long or double value; it only requires that a particular
implementation be internally consistent.





Forms bipush = 16 (0x10)
Stack ... =>
..., value
Description The immediate byte is sign-extended to an int, and the resulting
value is pushed onto the operand stack.
dup dup
Operation Duplicate top operand stack word
Format dup
Forms dup = 89 (0x59)
Stack ..., word =>
..., word, word
Description The top word on the operand stack is duplicated and pushed onto
the operand stack.
The dup instruction must not be used unless word contains a 32-bit
data type.
Notes Except for restrictions preserving the integrity of 64-bit data types,
the dup instruction operates on an untyped word, ignoring the type
of the datum it contains.
dup2 dup2
Operation Duplicate top two operand stack words
Format dup2
Forms dup2 = 92 (0x5c)
Stack ..., word2, word1 =>
..., word2, word1, word2, word1
Description The top two words on the operand stack are duplicated and pushed
onto the operand stack, in the original order.
The dup2 instruction must not be used unless each of word1 and
word2 is a word that contains a 32-bit data type or both together are
the two words of a single 64-bit datum.
Notes Except for restrictions preserving the integrity of 64-bit data types,
the dup2 instruction operates on untyped words, ignoring the types
of the data they contain.






Forms goto = 167 (0xa7)
Stack No change
Description The unsigned bytes branchbyte1 and branchbyte2 are used to
construct a signed 16-bit branchoffset, where branchoffset is
(branchbyte1 << 8) | branchbyte2. Execution proceeds at that offset
from the address of the opcode of this goto instruction. The target
address must be that of an opcode of an instruction within the













Description Both value1 and value2 must be of type int. The values are popped
from the operand stack. The int result is value1 + value2. The
result is pushed onto the operand stack.
If an iadd overflows, then the result is the low-order bits of the true
mathematical result in a sufficiently wide two's-complement format.
If overflow occurs, then the sign of the result will not be the
same as the sign of the mathematical sum of the two values.
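
The overflow rule can be checked with a few lines of Python. This is an illustrative sketch only: the function name iadd and the constants INT_MAX and INT_MIN are chosen here for the example, and Python's unbounded integers are masked to 32 bits to imitate the low-order-bits behaviour described above.

```python
INT_MAX = 2**31 - 1
INT_MIN = -(2**31)


def iadd(value1, value2):
    """Add two ints and keep only the low-order 32 bits (two's complement)."""
    total = (value1 + value2) & 0xFFFFFFFF      # discard bits above bit 31
    # Reinterpret bit 31 as the sign bit.
    return total - 0x100000000 if total & 0x80000000 else total
```

With this model, adding 1 to INT_MAX wraps around to INT_MIN, so the sign of the result differs from the sign of the true mathematical sum.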
iand








Description Both value1 and value2 must be of type int. They are popped from
the operand stack. An int result is calculated by taking the bitwise









iconst_m1 = 2 (0x2)
iconst_0 = 3 (0x3)
iconst_1 = 4 (0x4)
iconst_2 = 5 (0x5)
iconst_3 = 6 (0x6)
iconst_4 = 7 (0x7)
iconst_5 = 8 (0x8)
... => ..., <i>
Description Push the int constant <i> (-1, 0, 1, 2, 3, 4 or 5) onto the operand
stack.
Notes Each of this family of instructions is equivalent to bipush <i> for
the respective value of <i>, except that the operand <i> is implicit.
if_icmp<cond>





















Description Both value1 and value2 must be of type int. They are both popped
from the operand stack and compared. All comparisons are signed.
The results of the comparison are as follows:
• eq succeeds if and only if value1 = value2
• ne succeeds if and only if value1 ≠ value2
• lt succeeds if and only if value1 < value2
• le succeeds if and only if value1 ≤ value2
• gt succeeds if and only if value1 > value2
• ge succeeds if and only if value1 ≥ value2
If the comparison succeeds, the unsigned branchbyte1 and
branchbyte2 are used to construct a signed 16-bit offset, where the
offset is calculated to be (branchbyte1 << 8) | branchbyte2. Execution
then proceeds at that offset from the address of the opcode of
this if_icmp<cond> instruction. The target address must be that of
an opcode of an instruction within the method that contains this
if_icmp<cond> instruction.
Otherwise, execution proceeds at the address of the instruction
following this if_icmp<cond> instruction.
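
The branch-target calculation can be sketched as follows. This Python fragment is an illustration under stated assumptions, not part of the specification or of the MJava design: the names branch_offset, if_icmp and CONDITIONS are hypothetical, and the fall-through address assumes the instruction occupies three bytes (the opcode plus the two branch bytes).

```python
import operator

# Comparison selected by the <cond> suffix of the mnemonic.
CONDITIONS = {
    "eq": operator.eq, "ne": operator.ne,
    "lt": operator.lt, "le": operator.le,
    "gt": operator.gt, "ge": operator.ge,
}


def branch_offset(branchbyte1, branchbyte2):
    """Combine two unsigned bytes into a signed 16-bit offset."""
    raw = ((branchbyte1 << 8) | branchbyte2) & 0xFFFF
    return raw - 0x10000 if raw & 0x8000 else raw


def if_icmp(cond, value1, value2, opcode_address, branchbyte1, branchbyte2):
    """Return the address of the next instruction to execute."""
    if CONDITIONS[cond](value1, value2):
        # Taken: the offset is relative to the address of this opcode.
        return opcode_address + branch_offset(branchbyte1, branchbyte2)
    # Not taken: continue after the opcode and its two branch bytes.
    return opcode_address + 3
```

Because the offset is signed, branch bytes such as 0xFF and 0xFB encode a backward jump of five bytes.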
iinc




Forms iinc = 132 (0x84)
Stack No change
Description The index is an unsigned byte that must be a valid index into the
local variables of the current frame (§3.6). The const is an immediate
signed byte. The local variable at index must contain an int. The
value const is first sign-extended to an int, then the local variable
at index is incremented by that amount.
Notes The iinc opcode can be used in conjunction with the wide instruction
to access a local variable using a two-byte unsigned index and




iload
Operation Load int from local variable
Format iload
index
Forms iload = 21 (0x15)
Stack ... =>
..., value
Description The index is an unsigned byte that must be a valid index into the
local variables of the current frame (§3.6). The local variable at
index must contain an int. The value of the local variable at index
is pushed onto the operand stack.
Notes The iload opcode can be used in conjunction with the wide instruction
to access a local variable using a two-byte unsigned index.
iload_<n>






iload_0 = 26 (0x1a)
iload_1 = 27 (0x1b)
iload_2 = 28 (0x1c)
iload_3 = 29 (0x1d)
... => ..., value
The <n> must be a valid index into the local variables of the current
frame (§3.6). The local variable at <n> must contain an int.
The value of the local variable at <n> is pushed onto the operand
stack.
Notes Each of the iload_<n> instructions is the same as iload with an
index of <n>, except that the operand <n> is implicit.
ineg









Description The value must be of type int. It is popped from the operand stack.
The int result is the arithmetic negation of value, -value. The
result is pushed onto the operand stack.
For int values, negation is the same as subtraction from zero.
Because the Java Virtual Machine uses two's-complement representation
for integers and the range of two's-complement values is
not symmetric, the negation of the maximum negative int results
in that same maximum negative number. Despite the fact that overflow
has occurred, no exception is thrown.
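
The asymmetric case can be reproduced with a short Python sketch. The function name ineg is used here only for illustration, and the 32-bit masking is an assumption made to imitate two's-complement int arithmetic with Python's unbounded integers.

```python
def ineg(value):
    """Negate a 32-bit int with two's-complement wraparound."""
    result = (-value) & 0xFFFFFFFF              # keep the low 32 bits
    # Reinterpret bit 31 as the sign bit.
    return result - 0x100000000 if result & 0x80000000 else result
```

Negating -2**31 in this model returns -2**31 again, which is the overflow case described above; every other value negates as expected.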












ior
Description Both value1 and value2 must both be of type int. They are popped
from the operand stack. An int result is calculated by taking the
bitwise inclusive OR of value1 and value2. The result is pushed
onto the operand stack.
ishl









Description Both value1 and value2 must be of type int. The values are popped
from the operand stack. An int result is calculated by shifting
value1 left by s bit positions, where s is the value of the low five
bits of value2. The result is pushed onto the operand stack.
Notes This is equivalent (even if overflow occurs) to multiplication by 2
to the power s. The shift distance actually used is always in the
range 0 to 31, inclusive, as if value2 were subjected to a bitwise












ishr
Description Both value1 and value2 must be of type int. The values are popped
from the operand stack. An int result is calculated by shifting
value1 right by s bit positions, with sign extension, where s is the
value of the low five bits of value2. The result is pushed onto the
operand stack.
Notes The resulting value is ⌊value1 / 2^s⌋, where s is value2 & 0x1f.
For nonnegative value1, this is equivalent to truncating int division
by 2 to the power s. The shift distance actually used is always
in the range 0 to 31, inclusive, as if value2 were subjected to a
bitwise logical AND with the mask value 0x1f.
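
A minimal Python sketch of this rule is shown below. The function name ishr is illustrative, and value1 is assumed to be already within the 32-bit int range. Python's >> operator on integers is an arithmetic (sign-extending) shift, so only the masking of the shift distance needs to be written out.

```python
def ishr(value1, value2):
    """Arithmetic shift right using only the low five bits of value2."""
    s = value2 & 0x1F          # shift distance is always in the range 0..31
    return value1 >> s         # equals floor(value1 / 2**s)
```

For negative operands the result rounds towards negative infinity, which is why it matches truncating division only for nonnegative value1.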
istore
Operation Store int into local variable
Format istore
index
Forms istore = 54 (0x36)
Stack ..., value => ...
Description The index is an unsigned byte that must be a valid index into the
local variables of the current frame (§3.6). The value on the top of
the operand stack must be of type int. It is popped from the operand
stack, and the value of the local variable at index is set to value.
Notes The istore opcode can be used in conjunction with the wide instruction
to access a local variable using a two-byte unsigned index.
istore_<n>
Operation Store int into local variable
Format istore_<n>
Forms istore_0 = 59 (0x3b)
istore_1 = 60 (0x3c)
istore_2 = 61 (0x3d)
istore_3 = 62 (0x3e)
Stack ..., value => ...
Description The <n> must be a valid index into the local variables of the current
frame (§3.6). The value on the top of the operand stack must
be of type int. It is popped from the operand stack, and the value
of the local variable at <n> is set to value.
Notes Each of the istore_<n> instructions is the same as istore with an
index of <n>, except that the operand <n> is implicit.
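The relationship between istore and the four istore_<n> forms can be sketched with a small frame model. Everything here is an illustrative assumption: the class name IstoreSketch, the array-based operand stack, and the helper istoreNOpcode are not taken from the specification or from the dissertation's HDL design. Only the opcode values come from the Forms entries above.

```java
// Hypothetical frame model: an int array of locals plus an array-based operand stack.
final class IstoreSketch {
    final int[] locals;
    private final int[] stack;
    private int sp = 0; // number of words currently on the operand stack

    IstoreSketch(int maxLocals, int maxStack) {
        locals = new int[maxLocals];
        stack = new int[maxStack];
    }

    void push(int value) { stack[sp++] = value; }

    int depth() { return sp; }

    // istore: pop the top int and store it in the local variable at index.
    void istore(int index) {
        if (index < 0 || index >= locals.length) {
            throw new IllegalArgumentException("invalid local variable index: " + index);
        }
        locals[index] = stack[--sp];
    }

    // istore_<n>: same effect as istore with an implicit index n in 0..3.
    void istoreN(int n) {
        if (n < 0 || n > 3) {
            throw new IllegalArgumentException("n must be 0..3");
        }
        istore(n);
    }

    // The four one-byte forms occupy consecutive opcodes starting at 0x3b.
    static int istoreNOpcode(int n) { return 0x3b + n; }

    public static void main(String[] args) {
        IstoreSketch frame = new IstoreSketch(4, 8);
        frame.push(42);
        frame.istoreN(2);
        System.out.println(frame.locals[2]);   // 42
        System.out.println(istoreNOpcode(3));  // 62
    }
}
```

Because the index is implicit, an istore_<n> instruction needs no operand byte, which is the practical difference from the two-byte istore form.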
isub









Description Both value1 and value2 must be of type int. The values are popped
from the operand stack. The int result is value1 - value2. The
result is pushed onto the operand stack.
For int subtraction, a - b produces the same result as a + (-b).
For int values, subtraction from zero is the same as negation.
Despite the fact that overflow or underflow may occur, in which
case the result may have a different sign than the true mathematical
result, execution of an isub instruction never throws a runtime
exception.
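A brief Java sketch of the wrap-around behaviour described above, using a hypothetical helper named IsubSketch.isub. It models only the arithmetic, under the assumption that Java int subtraction matches isub, and does not model the operand stack.

```java
// Hypothetical helper; Java int subtraction wraps in the same way as isub.
final class IsubSketch {
    static int isub(int value1, int value2) {
        return value1 - value2;
    }

    public static void main(String[] args) {
        // The true result, -2^31 - 1, does not fit in 32 bits, so the sign flips.
        System.out.println(isub(Integer.MIN_VALUE, 1) == Integer.MAX_VALUE); // true
        // a - b equals a + (-b), even when -b itself overflows.
        int a = 5;
        int b = Integer.MIN_VALUE;
        System.out.println(isub(a, b) == a + (-b));                          // true
    }
}
```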
ixor









Description Both value1 and value2 must be of type int. They are popped
from the operand stack. An int result is calculated by taking the
bitwise exclusive OR of value1 and value2. The result is pushed
onto the operand stack.






Forms jsr = 168 (0xa8)
Stack ... => ..., address
Description The address of the opcode of the instruction immediately following
this jsr instruction is pushed onto the operand stack as a value of
type returnAddress. The unsigned branchbyte1 and branchbyte2
are used to construct a signed 16-bit offset, where the offset is
(branchbyte1 << 8) | branchbyte2. Execution proceeds at that offset
from the address of this jsr instruction. The target address must be
that of an opcode of an instruction within the method that contains
this jsr instruction.
Notes The jsr instruction is used with the ret instruction in the implementation
of the finally clauses of the Java language (see Section
7.13, "Compiling finally"). Note that jsr pushes the address onto
the stack and ret gets it out of a local variable. This asymmetry is
intentional.
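The address arithmetic of jsr can be sketched as follows. The names JsrSketch, branchOffset, returnAddress and target are assumptions made for illustration. The sketch also assumes that the jsr instruction occupies three bytes (one opcode byte plus branchbyte1 and branchbyte2), so the instruction immediately following it starts at the jsr address plus 3.

```java
// Hypothetical helper; models only the address arithmetic of jsr.
final class JsrSketch {
    // Build the signed 16-bit offset from two unsigned bytes (each 0..255).
    static int branchOffset(int branchbyte1, int branchbyte2) {
        return (short) ((branchbyte1 << 8) | branchbyte2); // cast to short restores the sign
    }

    // Address pushed as returnAddress: the opcode after this 3-byte jsr (assumed length).
    static int returnAddress(int jsrPc) {
        return jsrPc + 3;
    }

    // Execution proceeds at the offset from the address of the jsr itself.
    static int target(int jsrPc, int branchbyte1, int branchbyte2) {
        return jsrPc + branchOffset(branchbyte1, branchbyte2);
    }

    public static void main(String[] args) {
        System.out.println(branchOffset(0x00, 0x10)); // 16
        System.out.println(branchOffset(0xff, 0xf6)); // -10
        System.out.println(target(100, 0xff, 0xf6));  // 90
        System.out.println(returnAddress(100));       // 103
    }
}
```

Note that the offset is relative to the jsr opcode, not to the following instruction, so a backward branch of -10 from address 100 lands on 90.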




Forms nop = 0 (0x0)
Stack No change
Description Do nothing.
pop
Operation Pop top operand stack word
Format pop
Forms pop = 87 (0x57)
Stack ..., word => ...
Description The top word is popped from the operand stack.
The pop instruction must not be used unless word is a word that
contains a 32-bit data type.
Notes Except for restrictions preserving the integrity of 64-bit data types,
the pop instruction operates on an untyped word, ignoring the type
of the datum it contains.
pop2
Operation Pop top two operand stack words
Format pop2
Forms pop2 = 88 (0x58)
Stack ..., word2, word1 => ...
Description The top two words are popped from the operand stack.
The pop2 instruction must not be used unless each of words word1
and word2 is a word that contains a 32-bit data type or together are
the two words of a single 64-bit datum.
Notes Except for restrictions preserving the integrity of 64-bit data types,
the pop2 instruction operates on raw words, ignoring the types of
the data they contain.
ret
Operation Return from subroutine
Format ret
index
Forms ret = 169 (0xa9)
Stack No change
Description The index is an unsigned byte between 0 and 255, inclusive. The
local variable at index in the current frame (§3.6) must contain a
value of type returnAddress. The contents of the local variable
are written into the Java Virtual Machine's pc register, and execution
continues there.
Notes The ret instruction is used with jsr or jsr_w instructions in the
implementation of the finally keyword of the Java language (see
Section 7.13, "Compiling finally"). Note that jsr pushes the
address onto the stack and ret gets it out of a local variable. This
asymmetry is intentional.
The ret instruction should not be confused with the return instruction.
A return instruction returns control from a Java method to its
invoker, without passing any value back to the invoker.
The ret opcode can be used in conjunction with the wide instruction
to access a local variable using a two-byte unsigned index.









Description The immediate unsigned byte1 and byte2 values are assembled into
an intermediate short where the value of the short is (byte1 << 8) |
byte2. The intermediate value is then sign-extended to an int, and
the resulting value is pushed onto the operand stack.
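The two steps in this description, assembling the intermediate short and then sign-extending it, can be sketched in Java. The class and method names (SipushSketch, sipush) are assumptions for illustration, and the method returns the value that would be pushed rather than modelling the operand stack.

```java
// Hypothetical helper; returns the int that would be pushed onto the operand stack.
final class SipushSketch {
    // byte1 and byte2 are the unsigned operand bytes, each in the range 0..255.
    static int sipush(int byte1, int byte2) {
        short intermediate = (short) ((byte1 << 8) | byte2); // 16-bit signed value
        return intermediate; // widening short to int performs the sign extension
    }

    public static void main(String[] args) {
        System.out.println(sipush(0x12, 0x34)); // 4660 (0x1234)
        System.out.println(sipush(0xff, 0xff)); // -1
        System.out.println(sipush(0x80, 0x00)); // -32768
    }
}
```

In hardware terms, the sign extension amounts to replicating bit 15 of the intermediate value into bits 16 to 31 of the 32-bit result.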
swap





Forms swap = 95 (0x5f)
Stack ..., word2, word1 => ..., word1, word2
Description The top two words on the operand stack are swapped.
The swap instruction must not be used unless each of word2 and
word1 is a word that contains a 32-bit data type.
Notes Except for restrictions preserving the integrity of 64-bit data types,
the swap instruction operates on untyped words, ignoring the types
of the data they contain.
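The stack effect of swap can be sketched with an array-based operand stack of untyped 32-bit words. The class SwapSketch and its methods are assumptions for illustration and are not the stack module of the dissertation's design. The sketch does not enforce the restriction on 64-bit data, which is a constraint on the bytecode rather than something the instruction checks.

```java
// Hypothetical operand stack of untyped 32-bit words, stored in an int array.
final class SwapSketch {
    private final int[] words;
    private int sp = 0; // number of words currently on the stack

    SwapSketch(int capacity) {
        words = new int[capacity];
    }

    void push(int word) { words[sp++] = word; }

    int pop() { return words[--sp]; }

    // swap: ..., word2, word1 => ..., word1, word2
    void swap() {
        if (sp < 2) {
            throw new IllegalStateException("swap needs two words on the stack");
        }
        int word1 = words[sp - 1];
        words[sp - 1] = words[sp - 2];
        words[sp - 2] = word1;
    }

    public static void main(String[] args) {
        SwapSketch stack = new SwapSketch(8);
        stack.push(1); // becomes word2
        stack.push(2); // becomes word1 (top of stack)
        stack.swap();
        System.out.println(stack.pop()); // 1
        System.out.println(stack.pop()); // 2
    }
}
```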
