 FPGA implementation of eye tracker video processing  
Project Report  
Dan Pontillo, College of Imaging Science, MS student  Faculty Sponsor: Jeff Pelz  
 
Abstract  
Current wearable video-based eye tracking technology has strict 
limitations in size and processing speed. Hardware used to process video must be lightweight, 
compact and somewhat modular. The video should be processed as it is being collected in real 
time and accurately enough to generate useful, reliable data. The use of 
general-purpose logic processors makes the implementation of basic pupil detection and 
eye tracking algorithms simple and compact, but it introduces accuracy issues due to the 
necessary simplicity of the computational methods used to detect and track the pupil. Large, 
complicated hardware implementations are more accurate, but they are quickly outdated and 
unwieldy. Application-specific programmable logic devices solve these issues in 
part since they allow users to synthesize a fast hardware device that realizes complex and robust 
algorithms. The devices are also capable of being updated or redesigned quickly and easily, as 
often as needed. When applied to eye tracking, this could act to improve on the accuracy issues 
and maintain functionality within a small, self-contained physical device. A basic video 
processing device implementation was attempted within severe functional constraints for this 
project. An FPGA was purchased and several test architectures were implemented, but no 
functional firmware could be produced. The FPGA support software and service were misleading 
and unable to assist in the development process, so a complete product was not attained. While 
the project shows some latent promise as a first step toward a hardware implementation, a 
practically useful device has not yet been completed.  
Scientific justification  
Programmable logic devices offer an inexpensive, compact and highly customizable 
medium for applications that require unique or complex computational methods. Field 
Programmable Gate Arrays (FPGAs) have been widely accepted as effective and versatile 
devices that help to overcome the cost and time issues normally associated with application-
specific hardware implementations. The use of FPGAs in video processing has in fact become 
quite common in scientific imaging applications. Xilinx, a major manufacturer of FPGA 
hardware and firmware, markets an FPGA platform specifically created for video processing 
tasks. The ability to design logic specifically for a given computation such as signal processing 
can significantly increase speed by removing the computational overhead induced by translating 
high-level algorithms into terms that general arithmetic logic devices can handle.  
The video processing done in existing wearable eye trackers varies greatly in its 
complexity and speed. Pupil center corneal reflection (PCCR) techniques often utilize a simple 
centroid calculation to estimate the center of the pupil and relate it to the corneal reflection to 
determine an observer’s point-of-regard (POR). The centroid 
estimation can be calculated using basic statistical techniques. This simplicity allows it to be 
implemented on slower hardware, or even as a high-level software application that runs in real time on 
an off-the-shelf computer attached to the device. This approach introduces several 
problems that are seen in boundary conditions, such as extreme pupil positions, obstructions 
from eyelids, and poor lighting. More complicated geometrical techniques for pupil ellipse fitting 
offer a higher degree of accuracy and less error. They are, however, significantly more 
computationally challenging, and in practice these methods often utilize non-configurable 
proprietary hardware components for fast processing.  
In the RIT Multidisciplinary Vision Research Lab (MVRL), the Yarbus system uses the 
former high-level software technique, and the ISCAN system uses the latter, more 
accurate technique, implemented largely in hardware. While Yarbus is simple and modular in its 
approach, it has problems with track extrapolation and pupil determination, as described 
previously. ISCAN utilizes a set of application-specific video-processing 
hardware, and its technology is quickly becoming outdated. While it is effective, the complexity 
of its hardware setup is its drawback.  
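The centroid-based pupil estimate described above can be illustrated with a minimal sketch. It is not the Yarbus or ISCAN code; the function names, the dark-pupil assumption, and the fixed threshold are all hypothetical choices made for illustration. The second call shows how an eyelid covering the top row of the pupil shifts the centroid, which is the kind of boundary-condition error noted above.

```python
def pupil_centroid(image, threshold):
    """Return the (row, col) centroid of all pixels darker than threshold.

    Assumes a dark-pupil image given as a list of rows of 8-bit values.
    Returns None when no pixel falls below the threshold.
    """
    count = 0
    row_sum = 0
    col_sum = 0
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if value < threshold:
                count += 1
                row_sum += r
                col_sum += c
    if count == 0:
        return None
    return (row_sum / count, col_sum / count)


def pccr_vector(pupil_center, cr_center):
    """Difference vector between the pupil center and the corneal reflection."""
    return (pupil_center[0] - cr_center[0], pupil_center[1] - cr_center[1])


# Synthetic 5x5 frame: a dark 3x3 "pupil" on a bright background.
frame = [
    [200, 200, 200, 200, 200],
    [200,  20,  20,  20, 200],
    [200,  20,  20,  20, 200],
    [200,  20,  20,  20, 200],
    [200, 200, 200, 200, 200],
]

# Same frame with the top pupil row hidden, as by an eyelid.
occluded = [row[:] for row in frame]
occluded[1] = [200, 200, 200, 200, 200]

full_center = pupil_centroid(frame, 50)         # (2.0, 2.0)
occluded_center = pupil_centroid(occluded, 50)  # (2.5, 2.0)
```

Because the estimate is a running sum and a pixel count, it maps naturally onto simple accumulators, which is one reason it suits slower hardware.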
 
Implementation and Results 
The initial method attempted for fast prototyping was to build applications off of the sample 
applications that came with the FPGA kit. Starter applications included architecture for Sobel 
filters, gamma correction, and brightness/contrast adjustment. These basic functions would 
have been useful as a basis upon which to build the full-scale pupil and corneal reflection 
detection systems.  
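For reference, the kind of operation a Sobel starter application performs can be sketched in software. The following is not the kit's source code, which is not reproduced in this report; it uses the standard 3x3 Sobel kernels, and the |Gx| + |Gy| magnitude approximation with clipping to 8 bits is an assumption chosen because it avoids a square root.

```python
# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Approximate gradient magnitude |Gx| + |Gy| for interior pixels.

    image is a list of rows of 8-bit values. Border pixels are left at 0,
    and results are clipped to 255.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = 0
            gy = 0
            for i in range(3):
                for j in range(3):
                    pixel = image[r + i - 1][c + j - 1]
                    gx += GX[i][j] * pixel
                    gy += GY[i][j] * pixel
            out[r][c] = min(255, abs(gx) + abs(gy))
    return out


# A vertical edge: dark on the left, bright on the right.
edge = [[0, 0, 255, 255] for _ in range(4)]
edges_found = sobel_magnitude(edge)
```

An edge map of this kind would be a plausible first stage for locating the pupil boundary before any fitting step.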
Unfortunately, the source code included with the FPGA kit did not synthesize correctly for 
the hardware. None of the sample applications were successfully synthesized, and the only 
functional aspect of the kit was the pre-synthesized bit files that came with the sample 
software.  
The second method attempted for implementation was to write a simple image thresholding 
firmware from scratch. This simply involved routing the signal from one of the video inputs 
on the daughter card into the FPGA core, doing a simple multiplexer operation on the 
digitized image signal, then routing it out of the port on the daughter card.  Although the 
VHDL implementation was simple and synthesized successfully with minimal issues, the 
resulting firmware did not output any video signal. 
The main culprit for this failure was the lack of support software included with the video kit. The drivers and 
updates included did not support the very hardware that was purchased. Even after updating 
the device definitions and contacting the Xilinx support channels, the only way the software 
would recognize the FPGA was through manually editing the device file. This was the most 
likely reason for the entire setup being dysfunctional.  
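The thresholding operation itself is simple enough to model in a few lines. The sketch below is a software model of the idea only, not the VHDL that was written for the project; the function names and the output levels of 0 and 255 are hypothetical. The comparison result plays the role of the multiplexer select line, choosing between two constant output values for each pixel.

```python
def threshold_pixel(pixel, threshold, low=0, high=255):
    """Two-input multiplexer: the comparison result selects high or low."""
    return high if pixel >= threshold else low


def threshold_frame(frame, threshold):
    """Apply the per-pixel multiplexer to every pixel of a frame.

    frame is a list of rows of 8-bit values.
    """
    return [[threshold_pixel(p, threshold) for p in row] for row in frame]


binary = threshold_frame([[10, 128, 200], [127, 129, 0]], 128)
# binary == [[0, 255, 255], [0, 255, 0]]
```

Since each output pixel depends only on the current input pixel, the operation needs no frame buffer and can run on the video stream as it arrives.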
Conclusion 
While no practically useful implementation was achieved, the project was useful in 
that the requirements for such a system have been better defined and the platform now exists 
for continued development if more funding can be obtained. A newer version of the software 
and perhaps the replacement of the FPGA hardware itself could lead to a more successful 
outcome in the future. The major causes of the lack of practical output were Xilinx support's 
unwillingness to provide answers to the problems encountered with the basic included software 
and the lack of built-in device support within the video kit. Since no more direct 
avenues remain for resolving these problems, the project has been tabled until further resources can be procured. 
Budget  
XtremeDSP Video Starter Kit — Spartan-3A DSP Edition $1595  
One Spring quarter partial stipend tuition funding -$3905  