Design and implementation of a suitable core for on-chip long-term verification by Viejo Cortés, Julián et al.
Design and implementation of a suitable core for
on-chip long-term veriﬁcation
J. Viejo, J. I. Villar, J. Juan, A. Millan, M. J. Bellido, and E. Ostua
Grupo ID2 (Investigacion y Desarrollo Digital)
Departamento de Tecnologia Electronica-Universidad de Sevilla
E. T. S. Ing. Informatica, Campus Universitario Reina Mercedes
41012 Sevilla (SPAIN)
Email: {julian,jose,jjchico}@dte.us.es, amillan@us.es, {bellido,ostua}@dte.us.es
Abstract—Traditional on-chip and off-chip logic analyzers
present important shortcomings when used for the long-term
veriﬁcation of industrial embedded systems, forcing the designer
to implement ad-hoc veriﬁcation solutions. This contribution
presents a suitable solution for long-term veriﬁcation of FPGA-
based designs consisting on a veriﬁcation core that uses the
Picoblaze microcontroller, dedicated logic and a serial port
communication in order to monitor the internal signals of the
system in a continuous way. The core design focuses on low
resource requirements and reusability and has been successfully
applied to the veriﬁcation of a real industrial synchronization
platform showing remarkable advantages over commercial on-
chip solutions like Xilinx’s ChipScope Pro.
I. INTRODUCTION
Over the past decade, FPGAs has become the major imple-
mentation technology for industrial embedded digital systems
due to its fast prototyping, short time-to-market and increasing
capabilities. The always reducing cost of FPGAs make them
suitable not only during the prototype phase but also for low
and medium volume production. The re-programmability of
FPGA chips is also very valuable during the design and pro-
duction phases of the system allowing for easier veriﬁcation,
debugging and support of the systems.
FPGA veriﬁcation can be done by simulation and hardware
execution [1]–[3]. Simulation is specially useful during the
ﬁrst stages of the design ﬂow to assure the correct operation
of the systems. But there are scenarios where simulation is not
a feasible approach, like the veriﬁcation of a whole complex
system, because the simulation time would be extremely huge
and/or computational resources may be exhausted. In this
cases, hardware execution is an interesting alternative derived
from the re-programmable nature of FPGAs that permits the
observation of the design under study in real time during
its operation. Such observations are taken in several ways
including external and on-chip logic analyzers. On-chip logic
analyzers like Chipscope [4] are able to acquire data at a
very high frequency and store the results in memory for
later processing, so data acquisition is limited by the storage
resources available in the chip [5], [6].
While the above mentioned limitation is not a serious
problem for the veriﬁcation of many types of systems, it
is important to note that on-chip logic analyzers share the
resources with the system under test and may consume an
important part of the area available, requiring signiﬁcantly
more resources during veriﬁcation than in the ﬁnal production
system.
A special case is the veriﬁcation of the correct long-term
operation of a system. This is specially useful to test the
robustness of the implementation in industrial aggressive envi-
ronments where systems are supposed to operate continuously
for months or years. In this case, the collection of data from
internal signals over a long period of time should be possible.
These data should be communicated to the outside of the
chip in order to avoid large internal storage resources (that
would be exhausted over time) and to allow for the continuous
monitoring of the system.
Neither on-chip or off-chip solutions ﬁt well to do this kind
of long-term veriﬁcation so designers typically have to develop
custom solutions (test logic and tools) for each design. A good
example is a network synchronization system remote terminal
units previously developed by the authors [7] where internal
data need be collected every few seconds for a period of days
or even months.
This contribution presents a more general solution for long-
term veriﬁcation of digital systems implemented on FPGA
that can be adapted to several problems to avoid the cost
of designing custom veriﬁcation logic. The proposed solution
takes the form of a test core based on the Picoblaze micro-
controller and associated tools that can greatly facilitates long
term veriﬁcation of complex systems with minimal resources
or external equipment requirements when compared to on-chip
or off-chip logic analyzers.
The paper is organized as follows: in section two an outline
of common veriﬁcation tools is presented from the point
of view of their applicability to scattered event acquisition
and analysis, section three describes the architecture of the
proposed long-term on-chip data acquisition core, in section
four the proposed solution and the commercial ChipScope Pro
test system are compared against a real application and, ﬁnally,
section ﬁve summarizes the most relevant conclusions derived
from this experience.
II. CURRENT SOLUTIONS FOR SYSTEM VERIFICATION
Currently there are three main types of solutions when
approaching digital system veriﬁcation: standalone logic ana-
lyzers, on-chip logic analyzers and custom cores for speciﬁc
purposes. In this section we will highlight the main features,
advantages and disadvantages of each one from the perspective
of their applicability to long-term veriﬁcation.
Standalone Logic Analyzers (SLAs) are very powerful tools
for debugging an already implemented design. This kind of
equipment is able to acquire data at a very high frequency
from any signal that can be accessed at the pins of the chip.
Moreover, they may have a large number of channels (100
or more) that makes them a very useful tool for debugging
high speed buses and signals between components. The main
disadvantage of SLAs is that they cannot reach signals inside
the design. To overcome this issue, designs are modiﬁed in
order to route the desired signals to external pins accessible
by the SLA thus modifying the characeristics and timing
parameters of the original design.
An evolution of SLAs are On-chip Logic Analyzers (OLAs)
like Xilinx’s ChipScope, that have become very popular in
the ﬁeld of programmable logic. This kind of analyzers are
hardware components that connect to the desired signals inside
the chip and communicate over a standard bus (usually RS232
or JTAG) with a computer that executes software for data
analysis. These components are an intrusive solution since the
veriﬁed design is different from the production design when
analysis components are removed.
These two types of veriﬁcation tools have a common
denominator: the limit of the capture size is given by the size
of the storage memory since they store a complete capture
frame before sending it to the processing unit.
Some tests require to capture some kind of events from
within the system continuously. To capture these events, de-
velopers usually create custom debugging cores (CDCs) for
every speciﬁc purpose when SLAs or OLAs do not fulﬁll the
veriﬁcation requirements [8], [9].
III. LOGICAL EVENT ANALYZER
A. Architecture of the Logical Event Analyzer designed
To overcome the cost of designing a CDC for every ap-
plication, we propose a general purpose device, the Logical
Event Analyzer (LEA), that can ﬁll the gap between SLA
and OLA and substitute CDC in several practical cases. As
it can be observed in Fig. 1, SLAs and OLAs are used to
verify high-speed systems where the number of samples is not
a critical aspect. However, LEA can be used for debugging
systems where it is necessary to capture a large number of
samples spaced in the time. The LEA also features a much
lower footprint than an OLA.
The architecture of the proposed analyzer is based on the
PicoBlaze microprocessor from Xilinx [10]. Fig. 2 shows
the block diagram of the designed analyzer. As it can be
observed from the diagram, a set of input ports of PicoBlaze
are reserved for trigger, clock and communication control
signals. The remaining ports are dedicated to capturing data
signals. PicoBlaze allows for the addressing of 256 8-bit ports.
Thus, using a single PicoBlaze module and dedicating N
ports to trigger, clock and control signals the analyzer can

 





	








	

Fig. 1. Field of applicability of several veriﬁcation solutions.
	
	 


	
	
		
	
	
	 
Fig. 2. Architecture of the designed analyzer.
capture (256−N)× 8 bit signals. If it is necessary to capture
more signals a simple solution is to add an external n bits
register to extend the port selection signal port id. By using
this alternative, the maximum number of signals that can be
sampled would be (256 − N) × 2n ∗ 8. The number of data
signals that can be acquired is also limited by the rate at which
captured data can be transmitted out of chip as we will discuss
later.
PicoBlaze uses one of their output ports to communicate
with an UART that is in charge of transmitting the captured
data via serial. The UART has a half full buffer signal which
indicates that the FIFO is half full, and its baud rate can be
set as needed. PicoBlaze will use the half full buffer signal
to control the data transmission to the UART. Assuming the
following UART conﬁguration: ﬁxed data format ”8N1” (8
data bits, 1 stop bit and no parity), communication through the
RX and TX signals (no other signals needed), and ﬂow control
disabled, the maximum bit rate (bps) that can be obtained is
calculated according to (1).
BitRate =
BaudRate ∗ 8
10
(1)
Finally, the program module corresponds to the program that
will be run by PicoBlaze. This program performs the following
tasks:
1) Trigger condition veriﬁcation. PicoBlaze reads the ports
assigned to the trigger signals and applies the conﬁgured
logical function. In the case this condition is veriﬁed,
data acquisition begins.
2) Data acquisition according to the clock signal. The clock
signal corresponds to an event of the designed system,
so that whenever this event occurs PicoBlaze starts to
read the data connected to the input ports.
3) Data processing and transmission. This processing con-
sists of calculating a checksum of the transmitted data
so that the receiver can verify the correct reception of
information. Data will be sent to the UART.
4) Communication control. Periodically, the microproces-
sor will check the status of half full buffer signal. If it
is active PicoBlaze will wait for the FIFO to becomes
half empty before sending more data.
IV. APPLICATION EXAMPLE AND RESULTS
In this section we describe the application of the LEA to
the on-chip veriﬁcation of a SNTP client and server fully
implemented in hardware. SNTP is a simpliﬁed version of
the more general Network Time Protocol (NTP) [11] that
is commonly used for synchronizing the clocks of computer
systems over data networks such as the Internet. The operation
of this protocol is to send periodic time requests to a server
synchronized with an accurate time source like a GPS receiver
at request intervals that can vary from a few seconds to several
minutes. When the server reply is received, the client uses a
set of timestamps to calculate the round trip time and the time
offset between the client’s and server’s clocks.
The client can then adjust its local clock based on these
calculations. In a typical scenario, the client will be accurately
synchronized to the server only after several request-response
cycles. Since the time between requests can vary from a few
seconds to several minutes the most important aspect in the
testing analysis of these systems is not the speed at which
samples are acquired but to capture a large number of system
events, covering a wide time interval.
On-chip veriﬁcation tools like ChipScope Pro feature high
frequency sampling which allow the testing of high-speed
buses and systems, but they face some limitations regarding
the maximum number of samples that can be obtained from
the system. This is mainly due to: 1) internal resources of the
FPGA are used to store the samples, and 2) some of these
resources, depending on the type of programmable device
used, are often limited. The number of LUTs, FFs and BRAMs
used by ChipScope depending on the number of signals and
the number of samples are shown in Figures 3, 4, and 5,
respectively. As it can be observed from Figures 3 and 4, LUTs
and FFs depend mainly on the number of signals. However,
Figure 5 shows that the main problem when performing
on-chip veriﬁcation using such tools is that the number of
BRAMs used is directly proportional the number of signals
and the number of samples. Furthermore, an additional BRAM
must be included for each added Integrated Logic Analyzer
(ILA) since each ILA only can capture a maximun of 256
signals. The total number of BRAMs is calculated according
to (2). Thus, considering that BRAMs are the most limited
resource and ﬁxing a number of signals to capture, this type
512
1024
2048
4096
8192
16384
32
64
128
256
512
0
500
1000
1500
2000
Number of signalsNumber of samples
N
u
m
b
e
r 
o
f 
L
U
T
s
Fig. 3. Number of LUTs dependency on number of signal and samples using
ChipScope.
512
1024
2048
4096
8192
16384
32
64
128
256
512
0
500
1000
1500
Number of signalsNumber of samples
N
u
m
b
e
r 
o
f 
F
F
s
Fig. 4. Number of FFs dependency on number of signal and samples using
ChipScope.
512
1024
2048
4096
8192
16384
32
64
128
256
512
0
200
400
600
Number of signalsNumber of samples
N
u
m
b
e
r 
o
f 
B
R
A
M
s
Fig. 5. Number of BRAMs dependency on number of signal and samples
using ChipScope.
of simulation is unfeasible if the goal is to capture a large
number of samples.
NumBRAMs =
∣
∣
∣
∣
NumSignals×NumSamples
BRAMsize(bits)
∣
∣
∣
∣
+
+NumILAs (2)
For the case under discussion, the SNTP client and
server have been implemented on a Spartan-3E FPGA device
(xc3s500e). These FPGAs have a total of 20 BRAMs and each
0 2000 4000 6000 8000 10000 12000 14000 16000
0
20
40
60
80
100
120
Number of samples
R
e
s
o
u
rc
e
s
 u
s
e
d
 (
%
)
LUTs
FFs
BRAMs
Fig. 6. Percentage of used resources dependency on the number of samples
for 256 signals using ChipScope.
0 100 200 300 400 500
0
10
20
30
40
50
60
70
80
90
100
Number of signals
R
e
s
o
u
rc
e
s
 u
s
e
d
 (
%
)
LUTs
FFs
Slices
BRAMs
Fig. 7. Percentage of used resources dependency on the number of signals
using LEA.
block contains 18, 432 bits of fast static RAM, with 16 Kbit
allocated for data storage. For each design a total of 256
signals have been sampled: timestamps (least signiﬁcant part),
time offset, round trip time and adjustment parameter of the
local clock. With this conﬁguration, the percentage of used
resources (LUTs, FFs and BRAMs) to verify the system in
terms of number of samples is shown in Figure 6. As shown
in that ﬁgure, the LUT and FF usage is not critical, since it
is a 9% and 8% respectively of total resources in the worst
case. However, there is an excessive use of BRAMs even for
a small number of samples. A maximum of 1024 samples can
be captured which is insufﬁcient for the type of system that
is intended to verify.
The developed LEA has the advantage that it does not store
data in BRAM but transmits them via serial. Therefore the
number of BRAMs used to verify the system does not depend
on the number of signals or the number of samples so the
system can be tested indeﬁnitely, only limited by external
resources. For the rest of the FPGA resources, they become
solely dependent on the number of signals (Fig. 7). For the
same scenario presented for ChipScope (sampling of 256
signals) the percentage used of LUTs, FFs, Slices and BRAMs
has been 3,79%, 1,68%, 5,35% and 5% respectively. It is worth
noting that only a BRAM is used (this memory stores the
program that will be run by PicoBlaze).
V. CONCLUSION
In this contribution, a veriﬁcation core based on Picoblaze
for long-term on-chip veriﬁcation is presented. The proposed
solution allow developers to avoid the implementation of
custom veriﬁcation cores in many cases, greatly improving
the design and veriﬁcation time.
The proposed solution has been compared to ChipScope
Pro on-chip logic analyzer in the veriﬁcation of a real syn-
chronization system. The results show that the ChipScope Pro
tool is not suitable to verify the system because this would
need excessive internal resources to store the captured data
even for a small number of samples to acquire. The proposed
core does not store data using internal resources but transmits
them via serial port, so the system can be veriﬁed indeﬁnitely,
only limited by external computer storage.
Future work involves the replacement of PicoBlaze with an
open source multivendor microcontroller, the development of
a high speed interface to wider the range of applications and
the study of more complex and ﬂexible triggering possibilities.
ACKNOWLEDGMENT
This work has been partially supported by the Ministry of
Education and Culture of the Spanish Government through the
TEC2007-61802/MIC (HIPER) project and the PROFIT-MITC
SEPIC TSI-020100-2008-258 project.
REFERENCES
[1] T. Wheeler, P. Graham, B. Nelson, and B. Hutchings, “Using Design-
Level Scan to Improve Design Observability and Controllability for
Functional Veriﬁcation of FPGAs,” in 2001 Proceedings International
Conference on Field-Programmable Logic and Applications (FPL),
Belfast, Northern Ireland (UK), Aug. 2001, pp. 483–492.
[2] P. S. Graham, “Logical Hardware Debuggers for FPGA-Based Systems,”
Bringham Young University, Department of Electrical and Computer
Engineering, PhD Dissertation, 2001.
[3] N. Ohba and K. Takano, “Hardware debugging method based on signal
transitions and transactions,” in IEEE Proc Asia and South Paciﬁc conf
on Design Automation, Yokohama (Japan), Jan. 2006, pp. 454–459.
[4] Xilinx, ChipScope Pro 11.1 Software and Cores User Guide. Xilinx,
Inc., Apr. 2009.
[5] K. Arshak, E. Jafer, and C. Ibala, “Testing FPGA based digital system
using XILINX ChipScopeTM logic analyzer,” in 29th International
Spring Seminar on Electronics Technology (ISSE), St. Marienthal (Ger-
many), May 2006, pp. 355–360.
[6] L. Ehrenpreis, P. Ellervee, and K. Tammemae, “Open Source On-Chip
Logic Analyzer for FPGAs,” in 2006 International Baltic Electronics
Conference, Tallinn (Estonia), Oct. 2006, pp. 1–4.
[7] J. Viejo, J. Juan, M. J. Bellido, E. Ostua, A. Millan, P. Ruiz-de Clavijo,
A. Mun˜oz, and D. Guerrero, “Design and implementation of a SNTP
client on FPGA,” in Proc. 2008 IEEE International Symposium on
Industrial Electronics (ISIE), Cambridge (United Kingdom), Jun. 2008,
pp. 1971–1975.
[8] T.-Y. Lee, Y.-H. Fan, S.-C. Yen, C.-C. Tsai, and R.-S. Hsiao, “An
Integrated Functional Veriﬁcation Tool for FPGA Systems,” in Interna-
tional Conference on Innovative Computing, Information and Control,
Kumamoto (Japan), Sep. 2007, p. 203.
[9] G. Knittel, S. Mayer, and C. Rothlaender, “Integrating Logic Analyzer
Functionality into VHDL Designs,” in 2008 International Conference on
Reconﬁgurable Computing and FPGAs, Cancun (Mexico), Dec. 2008,
pp. 127–132.
[10] K. Chapman, PicoBlaze 8-Bit Embedded Microcontroller User Guide
for Spartan-3, Virtex-II and Virtex-II PRO FPGAs. Xilinx, Inc., Nov.
2005.
[11] D. L. Mills, “Network Time Protocol (Version 3) Speciﬁcation,
Implementation and Analysis,” RFC 1305 (Draft Standard), Mar. 1992.
[Online]. Available: http://www.ietf.org/rfc/rfc1305.txt
