An Evaluation of the application of partial evaluation on color lookup table implementations by Hibbits, Jordan




This Thesis is brought to you for free and open access by the Thesis/Dissertation Collections at RIT Scholar Works. It has been accepted for inclusion
in Theses by an authorized administrator of RIT Scholar Works. For more information, please contact ritscholarworks@rit.edu.
Recommended Citation
Hibbits, Jordan, "An Evaluation of the application of partial evaluation on color lookup table implementations" (2012). Thesis.
Rochester Institute of Technology. Accessed from
An Evaluation of the Application of




A Thesis Submitted in Partial Fulfillment of the Requirements for the




Assistant Professor Dr. Dorin Patru
Department of Electrical and Microelectronic Engineering
Kate Gleason College of Engineering




Dr. Dorin Patru, Assistant Professor
Thesis Advisor, Department of Electrical and Microelectronic Engineering
Dr. Eli Saber, Professor
Committee Member, Department of Electrical and Microelectronic Engineering
Dr. Sohail Dianat, Professor
Committee Member, Department of Electrical and Microelectronic Engineering
Dr. Sohail Dianat, Department Head
Department Head, Electrical and Microelectronic Engineering
Thesis Release Permission Form
Rochester Institute of Technology
Kate Gleason College of Engineering
Title:
An Evaluation of the Application of Partial Evaluation on Color Lookup
Table Implementations
I, Jordan A. Hibbits, hereby grant permission to the Wallace Memorial





An Evaluation of the Application of Partial Evaluation on Color
Lookup Table Implementations
Jordan A. Hibbits
Supervising Professor: Dr. Dorin Patru
A number of SRAM-based field-programmable gate arrays (FPGAs) allow
for partial reconfiguration, allowing a part of the device to be reconfig-
ured while the rest of the device continues operating. Partial evaluation, or
instance-specific design, allows a design to be optimized to a specific set
of inputs. When combined with partial reconfiguration, the reconfigurable
module can be reinstantiated based on the inputs to be processed and im-
prove the performance of the design.
This thesis explores the effects, particularly the performance vs flexibil-
ity tradeoff, of using partial evaluation on the color look-up tables (CLUTs)
of a color-space conversion module implemented on an FPGA. This the-
sis examines the impact of implementing the CLUTs as distributed RAMs,




Abstract
1 Introduction
2 Background
3 Design of Experiment
3.1 Tested Implementation Variants
3.1.1 Initialized Block RAMs
3.1.2 Distributed RAMs
3.1.3 Distributed ROMs
3.2 Block ROMs
3.3 Test Bench Design
4 Results
4.1 Distributed RAM
4.2 Distributed ROM
4.3 Initialized Block RAM
4.4 Block ROM
5 Conclusions
Bibliography
A Makefile
B Hex2Coe
List of Tables
2.1 Device Summary
4.1 CSC engine implementation results (XC2VP30)
4.2 RAM resource requirements
4.3 Distributed ROM PRR synthesis results
4.4 Initialized BRAM results
4.5 Virtex 6 Block ROM Resource Utilization
4.6 Estimated partial reconfiguration timings (XC6VLX240)
List of Figures
2.1 Core of the CSC engine
3.1 CLUT interface changes
3.2 Tool flow for testing the CSC




Color space conversion modules are used in a variety of commercial appli-
cations, including printers, scanners, and digital cameras. These devices
typically implement color space conversion on application-specific inte-
grated circuits (ASICs), gaining performance compared to software at the
cost of flexibility. By utilizing partial reconfiguration (PR) that SRAM-
based FPGAs allow, it is possible to retain the flexibility of software while
achieving performance much nearer that of an ASIC implementation.
This thesis explores the effects of partial evaluation on an FPGA im-
plementation of a color space conversion (CSC) engine using color lookup
tables (CLUTs) as the element to test. The purpose of these tests was to
improve the performance of a partially reconfigurable CSC engine. In the
current PR CSC design, the partially reconfigurable region (PRR), or re-
configurable partition (RP) in the new PR design flow, is reconfigured,
followed by the configuration of the CLUTs. In this thesis we evaluated
a number of partial evaluation variations to the CLUT design, including
distributed RAMs, distributed ROMs, block RAMs with initial values, and
block ROMs.
Chapter 2 reviews the background of this project. Chapter 3 describes
the test cases studied in this thesis. Chapter 4 presents the results obtained.




Typically, FPGAs are programmed in their entirety, causing the system to
wait while reconfiguration occurs. With partial reconfiguration (PR), a re-
gion of the device is reconfigured while the rest of the system can continue
processing data. Manet et al. have reported on the advantages of dynamic
partial reconfiguration in software defined radios and professional electron-
ics [6]. Blodget et al. described a self reconfigurable platform (SRP), where
specific circuits on the FPGA are used to reconfigure other regions of the
device [1].
Partial evaluation, or instance-specific design, is an area of research in
reconfigurable computing in which multiple FPGA implementations of a
circuit are synthesized, each specialized to a specific instance of the prob-
lem. By specializing the circuit around the data to be processed, the hard-
ware typically becomes simpler, smaller, and faster. For example, instead of
implementing a filter with programmable coefficients, one can implement
multiple filters, each with a specific set of coefficients [4]. Previous uses of
partial evaluation on FPGAs have used dynamic run-time synthesis of the
circuit [7]. In the study of the CSC, the implementation of specific hardware
for each set of CLUTs is an example of partial evaluation.
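As a software analogy for the filter example above, the following minimal sketch specializes a generic FIR filter to one fixed coefficient set. It is an illustration of the idea only: the names `fir_generic` and `specialize_fir` are hypothetical, and nothing here is taken from the hardware studied in this thesis.

```python
def fir_generic(coeffs, samples):
    """General filter: the coefficients are ordinary runtime inputs."""
    n = len(coeffs)
    return [sum(coeffs[k] * samples[i - k] for k in range(n))
            for i in range(n - 1, len(samples))]


def specialize_fir(coeffs):
    """Partially evaluate the filter for one known coefficient set.

    Assumption for illustration: taps whose coefficient is zero are
    dropped at specialization time, so the returned function does less
    work than the generic one, much as specialized hardware needs
    fewer multipliers.
    """
    n = len(coeffs)
    taps = [(k, c) for k, c in enumerate(coeffs) if c != 0]

    def fir_fixed(samples):
        return [sum(c * samples[i - k] for k, c in taps)
                for i in range(n - 1, len(samples))]

    return fir_fixed
```

The specialized function produces the same outputs as the generic one for its coefficient set, but a new specialization is needed whenever the coefficients change, which mirrors the performance versus flexibility trade-off.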
The application focused on in this thesis is a color space conversion
(CSC) engine. A color space is a method of describing color in a standard
way [3]. Several standardized color spaces exist, for example RGB, CMYK,
CIE LAB, etc., as well as color spaces specific to individual devices such
as printers or scanners. A typical application of a CSC engine is to convert
from the color space of the device to a standardized color space, in a scan-
ner or camera, or from a standardized color space to a device-specific color
space, as in a printer. The conversion calculations are usually non-linear and
complex in multiple dimensions [5].
The method of conversion of interest uses color look-up tables to perform arbitrary conversions. This is typically done in an application-specific integrated circuit (ASIC); however, prior work [2, 8] has taken an ASIC implementation of a CSC engine provided by Hewlett Packard (HP) and implemented it on a Virtex-II Pro XC2VP30 FPGA. The ASIC implementation has two major conversion units, one for converting a 3D input color space (e.g., RGB) to another 3D or 4D color space, and one for converting a 4D input color space (e.g., CMYK) to another 3D or 4D color space, as seen in Figure 2.1. The ASIC also provides multiple write paths to the CLUTs. Due to resource limitations of the XC2VP30, and the availability of partial reconfiguration, the 3D and 4D modules were replaced with a single module which, through PR, can be configured for either 3D or 4D conversions. The FPGA implementation of [2] used block RAM resources to store the CLUTs.
Figure 2.1: Core of the CSC engine.
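To make the look-up table approach concrete, the sketch below converts one 3D pixel through a sparse CLUT using trilinear interpolation between the eight surrounding table nodes. This is a generic textbook scheme and an assumption on our part; the interpolation actually used by the HP engine is not described here, and the function names and the 17-node table size in the usage note are hypothetical.

```python
def make_identity_clut(nodes):
    """Build a nodes x nodes x nodes table whose entries equal their own
    8-bit coordinates (an identity conversion), stored as floats."""
    step = 255.0 / (nodes - 1)
    return [[[(r * step, g * step, b * step) for b in range(nodes)]
             for g in range(nodes)] for r in range(nodes)]


def convert_pixel(clut, pixel):
    """Convert one 8-bit-per-channel 3D pixel by trilinear interpolation."""
    nodes = len(clut)
    # Position of the pixel in table coordinates, per input channel
    pos = [c * (nodes - 1) / 255.0 for c in pixel]
    # Lower corner of the enclosing cell (clamped so corner + 1 is valid)
    base = [min(int(p), nodes - 2) for p in pos]
    frac = [p - b for p, b in zip(pos, base)]
    out = [0.0] * len(clut[0][0][0])
    for dr in (0, 1):
        for dg in (0, 1):
            for db in (0, 1):
                weight = ((frac[0] if dr else 1.0 - frac[0]) *
                          (frac[1] if dg else 1.0 - frac[1]) *
                          (frac[2] if db else 1.0 - frac[2]))
                node = clut[base[0] + dr][base[1] + dg][base[2] + db]
                for ch, value in enumerate(node):
                    out[ch] += weight * value
    return tuple(round(v) for v in out)
```

With an identity table, for example `make_identity_clut(17)`, every pixel maps to itself; a real conversion differs only in the values stored at the nodes, which is why the table contents are a natural target for specialization.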
SRAM-based FPGAs are the driving force of partial reconfiguration,
with the Virtex family of Xilinx FPGAs the center of much research. This is
due to the configuration architecture, which consists of three layers. The
routing layer programs the interconnects related to the global resources,
namely clocks, to ensure proper distribution in the FPGA. The user logic
layer forms the hardware components of the FPGA design. The configura-
tion layer configures the interconnects present to connect the user hardware
components. In addition, the internal configuration access port (ICAP)
present on these devices allows for user logic on the FPGA to access con-
figuration memory on the device, enabling self-reconfiguration. The ICAP
has an 8 bit interface on the Virtex-II Pro, with a 32 bit interface available
on Virtex 4, 5, and 6 devices, and a maximum frequency of 50 MHz to 100
MHz depending on the device.
This CSC application, due to its extensive use of CLUTs, is restricted by
the available memory resources on the target FPGA. Table 2.1 shows the
available memory resources of the various FPGAs used in the tests of this
thesis. Of particular interest is the relative increase in available BRAM re-
sources compared with the available distributed memory resources; between
the Virtex 5 and Virtex 6, the available BRAM resources doubled, while the
available LUT-based memory increased by more than a factor of 3.6.
Testing of the CSC was done across four Xilinx devices, one Spartan
and three Virtex devices. In particular, the Spartan 3E XC3S1600E, Virtex-
II Pro XC2VP30, Virtex 5 XC5VLX110T, and Virtex 6 XC6VLX240T were
used. The Virtex-II Pro was originally chosen as the target device when
the CSC was implemented on an FPGA, as the Xilinx University Program
boards use the Virtex-II Pro. The Spartan 3E was chosen to investigate the
implementation of the CSC engine on a low-cost FPGA. The Virtex 5 was
explored as a potential upgrade route for newer testing. The Virtex 6 was
chosen when finally upgrading, to migrate to the latest PR-enabled FPGA.
The Virtex-II Pro, now obsolete, was the first FPGA family to feature
the ICAP. This family of devices, in addition to the ICAP, featured sev-
eral multi-gigabit transceivers, hardware multipliers and block memory, and
larger devices in the family contained one or two IBM PowerPC 405 pro-
cessor cores. ICAP is limited to 50 MHz on the Virtex-II Pro devices.
The Spartan 3E, on the other hand, was the first low-cost FPGA family to feature the ICAP module; however, functionality of the ICAP was not directly supported by the Xilinx design tools. The design flows supported by the Xilinx tools (specifically PlanAhead) were oriented around module-based partial reconfiguration. Due to the design of the Virtex family of FPGAs, the rest of the reconfigurable fabric, including unused bits in a PR frame, is capable of continuing to operate glitch-free during partial reconfiguration. However, when undergoing PR on the Spartan 3E, unused bits in a frame are temporarily reset; a method of handling these glitches must be added to any PR design. Both Virtex-II Pro and Spartan 3E frame sizes are one column of CLBs or block memory.
In addition to increasing the available resources, the Virtex 5 and 6 also
featured enhancements to the ICAP module. Limited to 50 MHz and 8 bits
in the Virtex-II Pro and Spartan 3E, the ICAP in Virtex 5 and 6 can be
configured for data widths of 8, 16, or 32 bits. On Virtex 5 and 6 devices,
ICAP is limited to 100 MHz.
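The ICAP widths and clock limits quoted above bound how quickly a partial bitstream can be written. The sketch below computes that theoretical peak; it ignores any protocol or controller overhead, and the function names and the 500,000-byte bitstream in the usage note are hypothetical.

```python
def icap_bandwidth_mbytes(width_bits, clock_mhz):
    """Peak ICAP throughput in MB/s: one word of width_bits per clock."""
    return width_bits / 8 * clock_mhz


def reconfig_time_ms(bitstream_bytes, width_bits, clock_mhz):
    """Lower bound on the time to push a bitstream through the ICAP."""
    bytes_per_second = icap_bandwidth_mbytes(width_bits, clock_mhz) * 1e6
    return bitstream_bytes / bytes_per_second * 1e3
```

Under these assumptions an 8-bit ICAP at 50 MHz peaks at 50 MB/s and a 32-bit ICAP at 100 MHz at 400 MB/s, so a 500,000-byte partial bitstream would take at least 10 ms on the former.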
Table 2.1: Device Summary
                 BRAM (Kbits)           Distributed RAM (Kbits)   Multipliers/DSP Slices
Family           Min    Tested  Max     Min    Tested  Max        Min    Tested  Max
Spartan 3E       72     648     648     15     231     231        4      36      36
Virtex-II Pro    216    2448    7992    44     428     1378       12     136     444
Virtex 5         936    5328    18576   320    1120    2280       24     64      1056




We have studied the effects of partial evaluation on the color look-up table
values in the HP color space conversion modules on Xilinx FPGAs. Four
implementations of storing the CLUT data are investigated in this thesis.
1. Initial values to (block) RAMs




Our initial investigation into initial values to BRAMs is based on the idea
that since BRAMs are being used already, initializing the cells should give
us initial values without any additional overhead. With that premise, if there
is no additional overhead from initializing BRAMs, converting to Block
ROMs would allow for the removal of hardware associated with writing
to the CLUTs, reducing resource utilization and potentially increasing the
clock rate.
3.1 Tested Implementation Variants
Not every FPGA has an abundance of BRAMs, although, as seen in Table 2.1, the amount of available BRAM is increasing with newer generations of FPGAs. We therefore also looked into using the LUT resources of an FPGA to implement the CLUTs, known in Xilinx tools as distributed RAMs and distributed ROMs. Distributed memories could also allow for a combination of distributed and block memories based on the available resources of the specific FPGA.
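One way such a mixed design might be planned is sketched below: the largest memories are placed in block RAM first and the remainder spills into distributed memory. This greedy policy, the function name `place_memories`, and all sizes in the usage note are hypothetical; no such allocator was built or evaluated in this work.

```python
import math


def place_memories(sizes_kbits, bram_blocks, bram_kbits, dist_kbits):
    """Assign each named memory to block or distributed storage.

    sizes_kbits: mapping of memory name to size in Kbits
    bram_blocks: number of free block RAMs
    bram_kbits:  capacity of one block RAM in Kbits
    dist_kbits:  free distributed (LUT) memory in Kbits
    """
    placement = {}
    # Largest memories first, since they benefit most from block RAM
    for name, size in sorted(sizes_kbits.items(),
                             key=lambda kv: kv[1], reverse=True):
        blocks = math.ceil(size / bram_kbits)
        if blocks <= bram_blocks:
            bram_blocks -= blocks
            placement[name] = "block"
        elif size <= dist_kbits:
            dist_kbits -= size
            placement[name] = "distributed"
        else:
            raise ValueError(f"{name} does not fit on the device")
    return placement
```

For example, with three free 18 Kbit block RAMs and 8 Kbits of free distributed memory, memories of 36, 18, and 4 Kbits would be placed as block, block, and distributed respectively.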
3.1.1 Initialized Block RAMs
The simplest change tested was initializing the BRAMs to store initial CLUT data. This testing led to the creation of tools and scripts to aid in the implementation of the design. Without initial values, a single BRAM Xilinx CORE Generator module of each memory size was created to simplify the design. However, initialized BRAMs require individual BRAM modules, each specifying a unique coefficient (COE) file. A tool (hex2coe) was created to convert from the transaction-based CLUT data containing 32-bit values, used to fill the CLUTs one location at a time at runtime, to the COE format containing 40- or 48-bit values. This tool reads in the CLUT hex data, uses bit shifting to separate each channel, and then writes the channels, in hex, as a coefficient file. Scripts to generate the BRAM modules, as well as a makefile to create the entire design, were also created.
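The two steps described above, separating channels by bit shifting and emitting a COE file, can be sketched as follows. This is not the hex2coe tool itself: the packing of four 8-bit channels into one 32-bit word is an assumption made for illustration, the 40- and 48-bit output word layouts are not reproduced, and the function names are hypothetical. Only the `memory_initialization_radix` and `memory_initialization_vector` keywords come from the Xilinx COE file format.

```python
def split_channels(word, channels=4, bits=8):
    """Separate a packed word into channel values, most significant first.

    Assumption: channels are packed as equal-width adjacent bit fields.
    """
    mask = (1 << bits) - 1
    return [(word >> (bits * (channels - 1 - i))) & mask
            for i in range(channels)]


def to_coe(values, width_bits):
    """Format a list of integers as the text of a hexadecimal COE file."""
    digits = (width_bits + 3) // 4  # hex digits needed per entry
    body = ",\n".join(f"{v:0{digits}X}" for v in values)
    return ("memory_initialization_radix=16;\n"
            "memory_initialization_vector=\n"
            f"{body};\n")
```

A caller would apply `split_channels` to every input word, collect the values belonging to each memory, and write one `to_coe` result per BRAM module.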
3.1.2 Distributed RAMs
The largest obstacle encountered while designing the distributed RAM im-
plementation was the resource limitation. Initial testing on the XC2VP30
led to the discovery that, even considering just the resource requirements of
the individual 3D or 4D conversion module of the PRR, the XC2VP30 did
not have sufficient resources for accurate testing of such a design. This led
to an evaluation of other devices and device families for a comparison of the
resources available on the FPGAs.
3.1.3 Distributed ROMs
To test distributed ROMs, an individual distributed ROM module needed to be created for each CLUT and each test. As distributed ROMs store data in the LUT resources of the FPGA as an optimized function, the required resources depend on the test case; a set of identity CLUTs simplifies down to wires, while complex CSC conversions use resources comparable to those used by distributed RAMs, as little if any simplification can be done. With this design the write interface to the CLUTs in the PRR was removed.
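The reason an identity table collapses to wires can be seen by examining each output bit of the ROM as a function of the address bits. The sketch below is a simplified stand-in for what a synthesis tool does, under the assumption that only constants and direct pass-through bits are detected; the function names are hypothetical.

```python
def wire_source(table, out_bit, addr_bits):
    """Return the address bit that out_bit always equals, or None."""
    for in_bit in range(addr_bits):
        if all(((value >> out_bit) & 1) == ((addr >> in_bit) & 1)
               for addr, value in enumerate(table)):
            return in_bit
    return None


def count_logic_bits(table, addr_bits, data_bits):
    """Count output bits that are neither a constant nor a plain wire."""
    needed = 0
    for bit in range(data_bits):
        column = {(value >> bit) & 1 for value in table}
        if len(column) == 1:
            continue  # constant output: tied to 0 or 1, no LUT needed
        if wire_source(table, bit, addr_bits) is None:
            needed += 1  # real logic is required for this bit
    return needed
```

For an 8-bit identity table, `count_logic_bits(list(range(256)), 8, 8)` finds no bit that needs logic, whereas a table with little structure leaves most or all output bits requiring LUTs.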
3.2 Block ROMs
Similarly, block ROMs also used specific block ROM instantiations for each
CLUT. While initial testing of block ROMs was done on the Virtex-II Pro
XC2VP30, the block ROM design was fully tested in simulation and hard-
ware on the Virtex 6 XC6VLX240T. The block ROM design removes the
entire write mechanism to CLUTs in the CSC, including the pre- and post-
conversion modules (Figure 3.1). This allows for the removal of all hard-
ware involved in writing to the CLUTs, in particular the Autoload module,
responsible for loading the CLUTs from the pixel pipeline. While removing
all write access in the design is not entirely practical, as the pre- and post-
conversion CLUTs need to be writable, this implementation allows for the
greatest reduction in hardware.
3.3 Test Bench Design
During the course of this project we have revised the test bench several times. The initial test bench from [2], implemented on two Virtex-II Pro XC2VP30 Xilinx University Program development boards, proved limiting. The resources available on the XC2VP30 were insufficient for multiple tests. Additionally, the test bench takes on the order of 30 minutes to load test data, which led us to evaluate alternative test setups. After initial results pointed to the resource limitations of the XC2VP30, a VHDL test bench was used to verify functionality of a non-PR implementation. Initial synthesis and VHDL testing for distributed ROM and distributed RAM implementations were performed on the Virtex-II Pro XC2VP30, but not completed due to limitations in resources. The results on the Virtex-II Pro led to the testing of distributed ROM implementations of the PRR on the Virtex 5 XC5VLX110T and Spartan 3E XC3S1600E.
Figure 3.1: CLUT interface changes.
Initial testing of initialized block RAM and block ROM implementations was performed on the Virtex-II Pro XC2VP30 and completed on the
ML605 evaluation board, using a Virtex 6 XC6VLX240T. Initially, all block
ROM tests on the Virtex 6 were completed in non-PR VHDL simulation. A
PR-enabled version was implemented on the ML605 using a built-in self
test (BIST) to verify functionality. Figure 3.2 shows the tool and data flow
from configuration files, through design and building of the CSC, to testing
the CSC design. Starting from configuration and lookup table data provided
by HP, files containing the CLUT and register values for a conversion are
created using an HP-provided tool. For testing versions which use initial-
ized memories, the CLUT data is passed into hex2coe to obtain coefficient
files for the memories. The steps here vary depending on the type of testing
to be done. For hardware testing using partial reconfiguration, the partial
bitstreams for the 3D and 4D modules are converted to test vectors using
mrprdata [2]. For testing without initialized memories, the CLUT data is
converted to the test vector format and added to the test vector. Source TIFF
images are converted to text test vectors using a custom MATLAB script
tiff2txt, making up the pixel data of the test vector. For initial testing on
the XC2VP30, the test vectors are loaded onto a Compact Flash card to be













































Figure 3.2: Tool flow for testing the CSC.




As seen in Table 2.1, there is a wide range of available resources, even
within a given family of devices. In the driving example, the CSC, there is a
fairly large usage of memory, used for storing CLUTs. Given that SRAM-
based FPGAs store configuration data in SRAM memory cells, it’s possible
to use those same FPGA LUTs to serve as CLUTs in distributed memory,
either RAM or ROM. Table 4.1 shows the initial resource requirements of
the PR-enabled CSC on the XC2VP30. Of particular note, the partially
reconfigurable region (PRR) uses 26.34% of the LUT resources in the block
RAM implementation.
4.1 Distributed RAM
Table 4.1: CSC engine implementation results (XC2VP30).
Feature            PRR      Static Region and PRR   Available Resources
Slices             4,407    10,035                  13,696
BRAMs              60       92                      136
Slice Flip Flops   1,399    4,313                   27,392
4 input LUTs       7,214    16,732                  27,392
Max. clock rate    50 MHz
Implementing the distributed RAM design on the XC2VP30 led to a closer look at the memory requirements of the CSC, in Table 4.2. Focusing on just the PRR, 383.8 Kbits are needed just for memory in the 3D module, or 89.7% of the available distributed memory on the XC2VP30. As seen in Table 4.3, the 3D PRR uses 18% of the available LUT resources, combining for over 100% of the device for the PRR alone, preventing the design from synthesizing. While the 4D PRR requires fewer distributed RAM resources, the PRR size is determined by the largest amount of resources required between the modules that can occupy it.
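The arithmetic behind the "over 100%" observation can be checked directly from the figures quoted above. The sketch assumes that each 4-input LUT holds 16 bits when used as memory, which is consistent with the 428 Kbits of distributed RAM listed in Table 2.1 for the 27,392 LUTs of the XC2VP30; the function name is hypothetical.

```python
LUT_BITS = 16  # assumption: a 4-input LUT stores 2**4 bits as memory


def distributed_kbits(luts):
    """Distributed memory capacity if every LUT were used as RAM."""
    return luts * LUT_BITS / 1024


available = distributed_kbits(27392)   # XC2VP30 LUT count (Table 4.1)
memory_share = 383.8 / available       # 3D module CLUT storage
logic_share = 0.18                     # 3D PRR logic (Table 4.3)
total_share = memory_share + logic_share
```

Here `memory_share` evaluates to about 0.897 and `total_share` to about 1.08, so the 3D module alone would need more LUTs than the device provides.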







Distributed ROM allows the synthesis tools to analyze the stored data for places to optimize in synthesis; this can lead to smaller designs for any collection of CLUTs that have a pattern in them. Also, because the memory is a ROM, write hardware can be removed, allowing further hardware reductions. In a PR design, however, the distributed ROM will in practice see limited hardware reduction from such patterns, as the PRR needs to be large enough to encompass any set of memories (the CLUTs, in the CSC). As Table 4.3 shows, distributed ROM implementations, or more likely balanced hybrid implementations, can be useful in devices such as the Spartan 3E, where there is a low amount of available BRAM and a surplus of LUT resources compared to those needed by logic. However, the trend in newer families is to increase the available BRAM resources, albeit at a lower rate than logic resources.
4.3 Initialized Block RAM
Due to the structure of the Virtex-II Pro, where configuration is done on a frame-by-frame basis, BRAMs have their own frames separate from slice configuration frames. As a result, adding initial values to the BRAMs caused the size of the full and partial bitstreams to grow from the initial data. However, as Table 4.4 shows, the size of the bitstream increases by more than the BRAM size. This appears to be due to the additional BRAMs in the frames used by the PRR, though the discrepancy between the 3D and 4D changes in bitstream sizes is not explained.
Table 4.3: Distributed ROM PRR synthesis results
Synthesis

3D 18% 19% 44% 75
4D 26% 27% 25% 75
Dist ROM
3D 89% 97% 0% 66

3D 6% 6% 20% 95
4D 7% 7% 11% 78
Dist ROM
3D 34% 34% 0% 89

3D 16% 18% 176% 57
4D 24% 26% 94% 55
Dist ROM
3D 83% 90% 0% 48
4D 81% 87% 0% 43
Table 4.4: Initialized BRAM results
          Partial Bitstream Size (Bytes)         Change in bitstream   CLUT Size
Engine    No Initial Data   Initialized BRAMs    size (Bytes)          (Bytes)
3D        476116            628156               152040                49130
4D        473684            678828               205144                32805
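The gap between the bitstream growth and the CLUT data actually added can be quantified from the Table 4.4 values. The short sketch below does only that arithmetic; the function name `bitstream_growth` is hypothetical.

```python
def bitstream_growth(no_init_bytes, init_bytes, clut_bytes):
    """Return (growth in bytes, growth as a multiple of the CLUT size)."""
    delta = init_bytes - no_init_bytes
    return delta, delta / clut_bytes


growth_3d = bitstream_growth(476116, 628156, 49130)
growth_4d = bitstream_growth(473684, 678828, 32805)
```

The 3D bitstream grows by roughly 3.1 times the size of its CLUT data and the 4D bitstream by roughly 6.3 times, which illustrates the unexplained discrepancy between the two engines.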
4.4 Block ROM
The final implementation analyzed was using block ROMs in place of block
RAMs. While the block ROM implementation suffers from the same notice-
able increase in bitstream size as block RAMs, this implementation allows
for the removal of write hardware from the design, potentially a significant
improvement in both resource utilization and clock rate.
As Table 4.5 shows, there is a maximum reduction of 8.52% in the static
logic resource utilization, with only a maximum of 2.24% reduction in the
RP resources. The most significant change was the estimated clock rate,
which increases by 11.35%. At the same time, the partial bitstream size
increases by 13.75%. Table 4.6 shows, using the estimated clock rate and
bitstream sizes, the approximate reconfiguration time of the RP, as well as
the configuration time of the CLUTs for the BRAM implementation. The
clock rate was estimated based on Xilinx ISE post-place and route static tim-
ing analysis for the non-PR implementations, with the lowest between 3D
and 4D designs for each memory configuration being considered. Although
partial reconfiguration might affect the actual clock frequency slightly, it
should have a similar effect on both RAM and ROM implementations, and
was ignored for this analysis. While the partial reconfiguration time of the module increases by converting to the BROM implementation, during compact mode configuration of the 3D CLUTs the total configuration times differ by less than 0.1%, while under normal mode the BROM implementation proves to be faster. The BROM implementation, however, forces reconfiguration of the entire RP even if just the CLUTs change, making for a less flexible system.
Table 4.5: Virtex 6 Block ROM Resource Utilization
                                RP Usage                   Static Usage
Engine  Resource                RAM    ROM    % Delta      RAM    ROM    % Delta
3D      Registers               1010   1026   1.58%        3618   3621   0.08%
        LUTs                    3730   3679   -1.37%       7787   7630   -2.02%
        Slices                  1025   1002   -2.24%       2968   2830   -4.65%
        LUT-FF Pairs            3767   3727   -1.06%       8598   8422   -2.05%
        RAMB36s                 28     28     0%           32     32     0%
        RAMB18s                 4      4      0%           4      4      0%
        DSP48s                  16     16     0%           0      0      0%
        Bitstream Size (KB)     531    604    13.75%       9017   9017   0%
4D      Registers               1383   1383   0%           3618   3621   0.08%
        LUTs                    4617   4564   -1.15%       7787   7630   -2.02%
        Slices                  1233   1243   0.81%        2968   2715   -8.52%
        LUT-FF Pairs            4718   4646   -1.53%       8607   8351   -2.97%
        RAMB36s                 2      2      0%           32     32     0%
        RAMB18s                 30     30     0%           0      0      0%
        DSP48s                  24     24     0%           0      0      0%
        Bitstream Size (KB)     531    604    13.75%       9017   9017   0%
Estimated max. clock rate (MHz) 61.8   68.8   11.35%
Table 4.6: Estimated partial reconfiguration timings (XC6VLX240)
Implementation    Bitstream Size (Kbytes)   Speed (MHz)   Config Time (ms)a
BRAM              531                       61.8          8.80
CLUT Data (3D)    48                        61.8          0.20 (0.40)
CLUT Data (4D)    32                        61.8          0.13 (0.27)
BROM              604                       68.8          8.99
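The timings in Table 4.6 can be reproduced with a simple transfer-time model. The bytes-per-cycle figures below are assumptions inferred from the table values rather than stated in the text: one byte per clock for bitstream data, and four or two bytes per clock for CLUT data (taken here to correspond to the unparenthesized and parenthesized entries, read as compact and normal mode). The function name is hypothetical.

```python
def transfer_ms(kbytes, clock_mhz, bytes_per_cycle):
    """Time in ms to move kbytes of data at the given clock and width."""
    return kbytes * 1024 / (clock_mhz * 1e6 * bytes_per_cycle) * 1e3


bram_rp = transfer_ms(531, 61.8, 1)        # partial bitstream, BRAM design
brom_rp = transfer_ms(604, 68.8, 1)        # partial bitstream, BROM design
clut_3d_compact = transfer_ms(48, 61.8, 4)
clut_3d_normal = transfer_ms(48, 61.8, 2)

# Total time to bring up a 3D conversion in each implementation
bram_total_compact = bram_rp + clut_3d_compact
bram_total_normal = bram_rp + clut_3d_normal
```

Under this model the BRAM design needs about 9.00 ms in compact mode and about 9.20 ms in normal mode, against about 8.99 ms for the BROM design, matching the comparison drawn in the text.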




We have presented the evaluation of four variations of storage as they apply to a partially reconfigurable FPGA implementation of a color space conversion module. With the initial design, when the CSC PRR is reconfigured, the CLUT RAMs are loaded with the appropriate CLUT data. The four variations of CLUT storage evaluated make trade-offs between resource utilization and performance. For the size of storage needed for the CLUTs in this design, distributed memories used too many LUT resources; however, such an implementation could be used for smaller memories, or in a mixed design with block and distributed memories, if the available block memory resources of the desired device are too limiting. Block ROMs can provide the greatest reduction in resources and increase in performance; in this design, there are multiple write paths to the CLUTs, and the critical path of the original design is the write enable signal to one of the block RAM modules.
Partial evaluation could also be applied to other modules of the CSC; however, with the critical path being on a write enable signal to the CLUTs, the only likely gain would be in resource utilization. Future work to improve the performance of the CSC can look at overlapping reconfiguration of one region of the FPGA with the processing of data, so that the reconfiguration time overhead can be mitigated. Note that such a design would require a reevaluation of the design presented in [2], as that design utilized the pixel pipeline to feed bitstream data.
Bibliography
[1] Brandon Blodget, Philip James-Roxby, Eric Keller, Scott McMillan, and Prasanna Sundararajan. A self-reconfiguring platform. In Cheung and George Constantinides, editors, Field Programmable Logic and Application, volume 2778 of Lecture Notes in Computer Science, chapter 55, pages 565–574. Springer Berlin / Heidelberg, Berlin, Heidelberg, 2003.
[2] J. Galindo, E. Peskin, B. Larson, and G. Roylance. Leveraging Firmware in Multichip Systems to Maximize FPGA Resources: An Application of Self-Partial Reconfiguration. Reconfigurable Computing and FPGAs, 2008. ReConFig '08. International Conference on, pages 139–144, December 2008.
[3] P. Green and L.W. MacDonald. Colour engineering: achieving device independent colour. Wiley SID series in display technology. Wiley, 2002.
[4] Scott Hauck and André DeHon. Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation. Morgan Kaufmann, November 2007.
[5] James M. Kasson, Sigfredo I. Nin, Wil Plouffe, and James Lee Hafner. Performing color space conversions with three-dimensional linear interpolation. J. Electronic Imaging, pages 226–250, 1995.
[6] Philippe Manet, Daniel Maufroid, Leonardo Tosi, Gregory Gailliard, Olivier Mulertt, Marco Di Ciano, Jean-Didier Legat, Denis Aulagnier, Christian Gamrat, Raffaele Liberati, Vincenzo La Barba, Pol Cuvelier, Bertrand Rousseau, and Paul Gelineau. An Evaluation of Dynamic Partial Reconfiguration for Signal and Image Processing in Professional Electronics Applications. EURASIP Journal on Embedded Systems, 2008, November 2008.
[7] N. McKay, T. Melham, and Kong W. Susanto. Dynamic specialisation of XC6200 FPGAs by partial evaluation. pages 308–309, April 1998.
[8] Sreenivas Patil. Reconfigurable hardware for color space conversion.




# M a k e f i l e f o r CSC PR
# I n i t i a l i z e e n v i r o n m e n t
# Genera te ngc f i l e s f o r each module and copy t o d e s i r e d l o c a t i o n
# Conver t b i t s t r e a m s t o t e s t f i l e s
# T e s t f i l e s : CSC / CXF2FUSION / Merging r e g i o n s o f t e s t f i l e w i t h
# header , PRM b i t s t ream , CLUT data , image
RMDIR = $ (RM) −R
STATIC DIR = s y n t h / S t a t i c
MOD3D DIR = s y n t h / Mod 3d
MOD4D DIR = s y n t h / Mod 4d
TOP DIR = s y n t h / Top
STATIC PROJECTS = c s c a u t o l o a d / c s c a u t o l o a d . i s e c s c c o n t r o l / c s c c o n t r o l . i s e \
c s c i s o l a t i o n s t a g e / c s c i s o l a t i o n s t a g e 6 9 . i s e \
c s c i s o l a t i o n s t a g e / c s c i s o l a t i o n s t a g e 1 0 0 . i s e \
c s c k p l a n e m a g / c s c k p l a n e m a g . i s e c s c l u t 1 d / c s c l u t 1 d p r e . i s e \
c s c l u t 1 d / c s c l u t 1 d p o s t . i s e c s c p r e m a t c h / c s c p r e m a t c h . i s e \
c s c r e g / c s c r e g . i s e c s c w r a p p e r / c s c w r a p p e r . i s e i c a p e a i / i c a p e a i . i s e \
l u t c h k t o p / l u t c h k t o p . i s e \
p i p e l i n e h a n d s h a k e e n a b l e / p i p e l i n e h a n d s h a k e e n a b l e . i s e \
t e s t r i g 2 d u t i n t e r f a c e / t e s t r i g 2 d u t i n t e r f a c e . i s e
STATIC HDL = $ ( n o t d i r $ ( STATIC PROJECTS : . i s e = . v ) )
SOURCES 1D SRAM := $ ( w i l d c a r d $ ( STATIC DIR ) / sram∗w r a p p e r c o r e g e n 2 . v ) \
$ ( p a t s u b s t %.xco ,%. ngc , $ ( w i l d c a r d $ (MOD4D DIR) / s ram ∗ . xco ) )
SOURCES STATIC = $ ( a d d p r e f i x $ ( STATIC DIR ) / , $ ( STATIC PROJECTS ) $ ( STATIC HDL ) \
$ ( STATIC PROJECTS : . i s e = . x s t ) )
TARGETS STATIC = $ ( a d d p r e f i x $ ( STATIC DIR ) / , $ ( STATIC PROJECTS : . i s e = . ngc ) )
PR_3D_PROJECT = $(MOD3D_DIR)/Mod_3d.ise
PR_4D_PROJECT = $(MOD4D_DIR)/Mod_4d.ise
PR_PROJECTS = $(PR_3D_PROJECT) $(PR_4D_PROJECT)
SOURCES_PR_3D_SRAM := $(wildcard $(MOD3D_DIR)/sram_0*.xco) \
    $(wildcard $(MOD3D_DIR)/sram_1*.xco)
SOURCES_PR_4D_SRAM := $(wildcard $(MOD4D_DIR)/sram*.xco)
SOURCES_PR_3D_SRAM_COE := $(wildcard $(MOD3D_DIR)/*.coe)
SOURCES_PR_4D_SRAM_COE := $(wildcard $(MOD4D_DIR)/*.coe)
TARGETS_PR_3D_SRAM := $(SOURCES_PR_3D_SRAM:.xco=.edn)
TARGETS_PR_4D_SRAM := $(SOURCES_PR_4D_SRAM:.xco=.edn)
SOURCES_PR_3D = $(addprefix $(MOD3D_DIR)/, csc_3d_4d.v csc_3d.v csc_phase1_3d.v \
    csc_lut_wrappers_3d.v csc_phase2_3d.v csc_phase3_3d.v csc_phase3_3d_channel.v \
    csc_defs.vh rbist_defs.vh csc_3d_4d.lso) \
    $(wildcard $(MOD3D_DIR)/sram*wrapper_coregen2.v)
SOURCES_PR_4D = $(addprefix $(MOD4D_DIR)/, csc_3d_4d.v csc_4d.v csc_phase1_4d.v \
    csc_lut_wrappers_4d.v csc_phase2_4d.v csc_phase3_4d.v csc_phase3_4d_channel.v \
    csc_defs.vh rbist_defs.vh csc_3d_4d.lso) \
    $(wildcard $(MOD4D_DIR)/sram*wrapper_coregen2.v)
TARGETS_PR_3D = $(MOD3D_DIR)/csc_3d_4d.ngc
TARGETS_PR_4D = $(MOD4D_DIR)/csc_3d_4d.ngc
TARGETS_PR = $(TARGETS_PR_3D) $(TARGETS_PR_4D)
TOP_PROJECT = $(TOP_DIR)/Top.ise
SOURCES_TOP = $(TOP_DIR)/csc.v $(TOP_DIR)/csc.lso
TARGETS_TOP = $(TOP_DIR)/csc.ngc
TARGETS = $(TARGETS_PR) $(TARGETS_STATIC)
COPIES_3D = $(addprefix $(NETLIST_DIR)/Mod_3d/, $(notdir $(TARGETS_PR_3D))) \
    $(addprefix $(NETLIST_DIR)/Mod_3d/, $(notdir $(TARGETS_PR_3D_SRAM)))
COPIES_4D = $(addprefix $(NETLIST_DIR)/Mod_4d/, $(notdir $(TARGETS_PR_4D))) \
    $(addprefix $(NETLIST_DIR)/Mod_4d/, $(notdir $(TARGETS_PR_4D_SRAM)))
COPIES_STATIC = $(addprefix $(NETLIST_DIR)/Static/, $(notdir $(TARGETS_STATIC)))
COPIES_TOP = $(addprefix $(NETLIST_DIR)/Top/, $(notdir $(TARGETS_TOP)) \
    busmacro_xc2vp_l2r_async_narrow.nmc busmacro_xc2vp_r2l_async_narrow.nmc)
INTERMEDIATES = $(wildcard synth/*/sram*ch*_readme.txt) \
    $(wildcard synth/*/sram*ch*_flist.txt) $(wildcard synth/*/sram*ch*.asy) \
    $(wildcard synth/*/sram*ch*.sym) $(wildcard synth/*/sram*ch*.v*) \
    $(wildcard synth/*/*.edn) $(wildcard synth/*/*log) \
    $(wildcard synth/*/*.ngo) $(wildcard synth/*/*.stx) \
    $(wildcard synth/*/*.syr) $(wildcard synth/*/*.ngr) \
    $(wildcard synth/*/*.html) $(wildcard synth/Static/*/*.html)
CVSIGNORED := $(shell find . -name .cvsignore)
TEMPDIRS = $(wildcard $(STATIC_DIR)/*/xmsgs) $(wildcard $(STATIC_DIR)/*/xst) \
    $(wildcard synth/*/cg) $(wildcard synth/*/xmsgs) $(wildcard synth/*/templates) \
    $(wildcard synth/*/tmp) $(TOP_DIR)/xst
TOOLSDIR = tools
MRPRDATA = $(TOOLSDIR)/mrprdata/mrprdata
HEX2COE = $(TOOLSDIR)/hex2coe/hex2coe
PRDATA2PG = $(TOOLSDIR)/prdata2pg.pl
TOOLS = $(MRPRDATA) $(HEX2COE)
SCRIPTSDIR = scripts
SCRIPT_SETPR = $(SCRIPTSDIR)\\set_to_xilinx_pr.bat
EXPORT_3D_DIR = PA_Projects/csc_pr_v2p_3d_export
EXPORT_4D_DIR = PA_Projects/csc_pr_v2p_4d_export
NETLIST_DIR = netlists
all: tools coes synth netlists planahead hexfiles prdata testfiles
tools: $(TOOLS)
coes: $(HEX2COE) mkcoe
	./mkcoe
synth: static pr
static: $(TARGETS_STATIC)
pr: $(TARGETS_PR)
# xst places files in xst/projnav.tmp and won't run if it doesn't exist
# xst: -ise project file (ise/xise depending on version) for gui mode
#      -intstyle: output message format
#      -ifn: input file name (xst)
#      -ofn: output (log) file name
# 3D module
$(TARGETS_PR_3D): $(PR_3D_PROJECT) $(SOURCES_PR_3D) coregen3d
	mkdir -p $(@D)/xst/projnav.tmp
	cd $(@D) && xst -ise $(<F) -intstyle xflow -ifn csc_3d_4d.xst -ofn csc_3d_4d.syr
# 4D module
$(TARGETS_PR_4D): $(PR_4D_PROJECT) $(SOURCES_PR_4D) coregen4d
	mkdir -p $(@D)/xst/projnav.tmp
	cd $(@D) && xst -ise $(<F) -intstyle xflow -ifn csc_3d_4d.xst -ofn csc_3d_4d.syr
# Top module
$(TARGETS_TOP): $(TOP_PROJECT) $(SOURCES_TOP)
	mkdir -p $(@D)/xst/projnav.tmp
	cd $(@D) && xst -ise $(<F) -intstyle xflow -ifn csc.xst -ofn csc.syr
coregen3d: $(TARGETS_PR_3D_SRAM) $(SOURCES_PR_3D_SRAM_COE)
coregen4d: $(TARGETS_PR_4D_SRAM) $(SOURCES_PR_4D_SRAM_COE)
# srams
%.edn: %.xco
# Back up the SRAM .xco files, since coregen rewrites them with the same data
# after the .edn is created; this helps preserve the functionality of make.
# Run coregen in batch (command line) mode.
# Restore after .edn generation, preserving the original modified date.
	cd $(@D) && cp $(<F) $(<F).bak
	cd $(@D) && coregen -b $(<F)
	cd $(@D) && mv $(<F).bak $(<F)
# static netlists
%.ngc: %.ise %.lso %.xst
	mkdir -p $(@D)/xst/projnav.tmp
	cd $(@D) && xst -ise $(<F) -intstyle xflow -ifn $(*F).xst -ofn $(*F).syr
# The .lso file defines how to search libraries; if it is not present, then
# create it with the DEFAULT_SEARCH_ORDER keyword
%.lso:
	echo DEFAULT_SEARCH_ORDER > $@
# copies of netlists in the netlists/ directory
# this is where the PlanAhead project checks for netlists
netlists: $(COPIES_3D) $(COPIES_4D) $(COPIES_STATIC) $(COPIES_TOP)
$(NETLIST_DIR)/Static/%.ngc: $(STATIC_DIR)/%/%.ngc
	cp $< $@
$(NETLIST_DIR)/Mod_3d/%: $(MOD3D_DIR)/%
	cp $< $@
$(NETLIST_DIR)/Mod_4d/%: $(MOD4D_DIR)/%
	cp $< $@
$(NETLIST_DIR)/Static/csc_isolation_stage/%: $(STATIC_DIR)/csc_isolation_stage/%
	cp $< $@
$(NETLIST_DIR)/Static/csc_lut_1d/%: $(STATIC_DIR)/csc_lut_1d/%
	cp $< $@
$(NETLIST_DIR)/Top/%.ngc: $(TOP_DIR)/%.ngc
	cp $< $@
planahead: build3d build4d
# Generates the bitstreams for 3d module (full and partial)
build3d: $(EXPORT_3D_DIR)/merge/static_full.ncd \
    $(EXPORT_3D_DIR)/merge/ReconfigModules_CV_routed_partial.ncd
$(EXPORT_3D_DIR)/merge/static_full.ncd \
$(EXPORT_3D_DIR)/merge/ReconfigModules_CV_routed.ncd \
$(EXPORT_3D_DIR)/merge/reconfigmodules_cv_routed_partial.bit: \
    $(COPIES_3D) $(COPIES_STATIC) $(COPIES_TOP)
	if [ -e $(EXPORT_3D_DIR) ]; then $(RM) -R $(EXPORT_3D_DIR)_old; \
	    mv $(EXPORT_3D_DIR) $(EXPORT_3D_DIR)_old; fi
	$(RM) PA_projects/csc_pr_v2p_3d/csc_pr_v2p_3d.data/netlist/*.ngc
	if [ ! -e PA_projects/csc_pr_v2p_3d/csc_pr_v2p_3d.data/netlist/csc.ngc ]; then \
	    mkdir -p PA_projects/csc_pr_v2p_3d/csc_pr_v2p_3d.data/netlist/; \
	    cp PA_Projects/csc_pr_v2p_3d/csc_pr_v2p_3d.data/floorplan_1/csc.edf \
	        PA_projects/csc_pr_v2p_3d/csc_pr_v2p_3d.data/netlist/; \
	fi
	mkdir $(EXPORT_3D_DIR)
	cmd /c "$(SCRIPT_SETPR) && planAhead -source $(SCRIPTSDIR)/csc_3d_pr_PA.tcl"
	cmd /c "$(SCRIPT_SETPR) && cd $(EXPORT_3D_DIR) && initModular.bat"
	cmd /c "$(SCRIPT_SETPR) && cd $(EXPORT_3D_DIR)/static && staticLogicImpl.bat"
	cmd /c "$(SCRIPT_SETPR) && cd $(EXPORT_3D_DIR) && processPblocks.bat"
	cmd /c "$(SCRIPT_SETPR) && cd $(EXPORT_3D_DIR)/merge && assemblePCfg.bat"
# Generates the bitstreams for 4d module (full and partial)
build4d: $(EXPORT_4D_DIR)/merge/static_full.ncd \
    $(EXPORT_4D_DIR)/merge/ReconfigModules_CV_routed_partial.ncd
# Can't execute the first stages of the PA flow or you'll screw up the static region
# So copy the 3D directory, then copy the 4D netlists into the new directory
# Then rerun the last 2 steps with the 4D modules
$(EXPORT_4D_DIR)/merge/static_full.ncd \
$(EXPORT_4D_DIR)/merge/reconfigmodules_cv_routed_partial.ncd \
$(EXPORT_4D_DIR)/merge/reconfigmodules_cv_routed_partial.bit: \
    $(COPIES_4D) $(COPIES_STATIC) $(COPIES_TOP)
	if [ -e $(EXPORT_4D_DIR) ]; then $(RM) -R $(EXPORT_4D_DIR)_old; \
	    mv $(EXPORT_4D_DIR) $(EXPORT_4D_DIR)_old; fi
	cp -R $(EXPORT_3D_DIR)/ $(EXPORT_4D_DIR)/
	cp `ls -d $(NETLIST_DIR)/Mod_4d/* | grep -v CVS` \
	    $(EXPORT_4D_DIR)/ReconfigModules_CV/
	cmd /c "$(SCRIPT_SETPR) && cd $(EXPORT_4D_DIR) && processPblocks.bat"
	cmd /c "$(SCRIPT_SETPR) && cd $(EXPORT_4D_DIR)/merge && assemblePCfg.bat"
hexfiles: $(EXPORT_3D_DIR)/merge/csc_3d.hex $(EXPORT_4D_DIR)/merge/csc_4d.hex
$(EXPORT_3D_DIR)/merge/csc_3d.hex: \
    $(EXPORT_3D_DIR)/merge/reconfigmodules_cv_routed_partial.bit
	cd $(EXPORT_3D_DIR)/merge/ && promgen -w -p hex -u 0 \
	    reconfigmodules_cv_routed_partial.bit -o csc_3d.hex
$(EXPORT_4D_DIR)/merge/csc_4d.hex: \
    $(EXPORT_4D_DIR)/merge/reconfigmodules_cv_routed_partial.bit
	cd $(EXPORT_4D_DIR)/merge/ && promgen -w -p hex -u 0 \
	    reconfigmodules_cv_routed_partial.bit -o csc_4d.hex
# Convert partial bitstreams into test vectors using MRPRDATA
prdata: $(EXPORT_3D_DIR)/merge/csc_3d_pr.txt $(EXPORT_4D_DIR)/merge/csc_4d_pr.txt
$(EXPORT_3D_DIR)/merge/csc_3d_pr.txt: $(EXPORT_3D_DIR)/merge/csc_3d.hex
	$(MRPRDATA) $(EXPORT_3D_DIR)/merge/csc_3d.hex $(EXPORT_3D_DIR)/merge/csc_3d_pr.txt
$(EXPORT_4D_DIR)/merge/csc_4d_pr.txt: $(EXPORT_4D_DIR)/merge/csc_4d.hex
	$(MRPRDATA) $(EXPORT_4D_DIR)/merge/csc_4d.hex $(EXPORT_4D_DIR)/merge/csc_4d_pr.txt
testfiles: prplus3d.txt prplus4d.txt
# Combine premade header and image with partial bitstream vectors to make complete test vector
prplus3d.txt: testrig/testvectors/header3d.txt $(EXPORT_3D_DIR)/merge/csc_3d_pr.txt \
    testrig/testvectors/clut3d_pre1d_post1d.txt testrig/testvectors/img3d.txt
	@echo $?
	cat $^ > $@
	$(PRDATA2PG) prplus3d.txt pg3d.txt
# Combine premade header and image with partial bitstream vectors to make complete test vector
prplus4d.txt: testrig/testvectors/header4d.txt $(EXPORT_4D_DIR)/merge/csc_4d_pr.txt \
    testrig/testvectors/clut4d_pre1d_post1d.txt testrig/testvectors/img4d.txt
	cat $^ > $@
	$(PRDATA2PG) prplus4d.txt pg4d.txt
clean:
	$(RM) $(TARGETS) $(COPIES) $(INTERMEDIATES)
	$(RMDIR) $(TEMPDIRS)
# Verilog dependency files for each static project
$(TARGETS_STATIC): $(STATIC_DIR)/csc_defs.vh
$(STATIC_DIR)/csc_autoload/csc_autoload.ngc: $(STATIC_DIR)/csc_autoload.v
$(STATIC_DIR)/csc_control/csc_control.ngc: $(STATIC_DIR)/csc_control.v
$(STATIC_DIR)/csc_isolation_stage/csc_isolation_stage_69.ngc: \
    $(STATIC_DIR)/csc_isolation_stage_69.v
$(STATIC_DIR)/csc_isolation_stage/csc_isolation_stage_100.ngc: \
    $(STATIC_DIR)/csc_isolation_stage_100.v
$(STATIC_DIR)/csc_kplane_mag/csc_kplane_mag.ngc: $(STATIC_DIR)/csc_kplane_mag.v
$(STATIC_DIR)/csc_lut_1d/csc_lut_1d_post.ngc: $(STATIC_DIR)/csc_lut_1d_post.v \
    $(STATIC_DIR)/csc_phase1_1d.v $(STATIC_DIR)/csc_phase2_1d.v \
    $(STATIC_DIR)/csc_lut_wrappers_1d.v $(STATIC_DIR)/rbist_defs.vh
$(STATIC_DIR)/csc_lut_1d/csc_lut_1d_pre.ngc: $(STATIC_DIR)/csc_lut_1d_pre.v \
    $(STATIC_DIR)/csc_phase1_1d.v $(STATIC_DIR)/csc_phase2_1d.v \
    $(STATIC_DIR)/csc_lut_wrappers_1d.v $(STATIC_DIR)/rbist_defs.vh
$(STATIC_DIR)/csc_prematch/csc_prematch.ngc: $(STATIC_DIR)/csc_prematch.v
$(STATIC_DIR)/csc_reg/csc_reg.ngc: $(STATIC_DIR)/csc_reg.v
$(STATIC_DIR)/csc_wrapper/csc_wrapper.ngc: $(STATIC_DIR)/CSC_wrapper.vhd
$(STATIC_DIR)/icap_eai/icap_eai.ngc: $(STATIC_DIR)/icap_eai.v
$(STATIC_DIR)/lutchk_top/lutchk_top.ngc: $(STATIC_DIR)/lutchk_ctrl.v \
    $(STATIC_DIR)/lutchk_regs.v $(STATIC_DIR)/lutchk_timer.v \
    $(STATIC_DIR)/lutchk_top.v \
    $(STATIC_DIR)/csc_lutchk_crc16_parallel_n.v
$(STATIC_DIR)/pipeline_handshake_enable/pipeline_handshake_enable.ngc: \
    $(STATIC_DIR)/pipeline_handshake_enable.v
$(STATIC_DIR)/TestRig2DUTInterface/TestRig2DUTInterface.ngc: \
    $(STATIC_DIR)/TestRig2DUTInterface.v




// This utility converts hex files to the coe format
// Input: Hex file
// Output: 2 coe files (for upper, lower)
// Usage: hex2coe [-c|-m] [-1|-3|-4] [-N|-C] -i <input> [-o <output stem>]
//   -c: specifies the output should be coe format (default)
//   -m: specifies the output should be mif format (not yet implemented)
//   -1: specifies the file is for a 1D memory (48 bits, 12 bits per channel)
//   -3 -4: specifies the file is for a 3D/4D memory (40 bits, 10 bits per channel)
//   -N: specifies the input is in normal mode (12 bits/channel for 1D, 10 bits/channel for 3D/4D)
//   -C: specifies the input is in compact mode (8 bits/channel for 1D, 8 bits/channel for 3D/4D)
//   -f: outputs the full 40 bit coe file in addition to the divided channels (3D/4D only)
//   -h: display this help and exit
// If neither -N nor -C is given, will attempt to determine the format based on the contents.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define CH01 "ch01"
#define CH23 "ch23"
#define COEEXT ".coe"
#define MIFEXT ".mif"
#define EXTSIZE 4
#define COEHEADERCOMMENTCH01 "; Initialization file for ch01 of %s\n"
#define COEHEADERCOMMENTCH23 "; Initialization file for ch23 of %s\n"
#define COEHEADERCOMMENTFULL "; Initialization file for %s\n"
#define COERADIX "memory_initialization_radix=16;\n"
#define COEVECTOR "memory_initialization_vector=\n"
#define COEEND ";"
typedef enum OUTFORMAT {COE, MIF} e_outformat;
typedef enum MEMTYPE {D1, D3, D4} e_memtype;
typedef enum MODE {NORMAL, COMPACT} e_mode;
typedef struct MEM3D4D {
    unsigned int ch0 : 10;
    unsigned int ch1 : 10;
    unsigned int ch2 : 10;
    unsigned int ch3 : 10;
} s_mem3d4d;
typedef struct MEM1D {
    unsigned int ch0 : 12;
    unsigned int ch1 : 12;
    unsigned int ch2 : 12;
    unsigned int ch3 : 12;
} s_mem1d;
void usage();
unsigned int atoh(char);
char* dec2bin(int val, char* str, int len);
int main(int argc, char *argv[]) {
    char *infilename = NULL;
    char *outfilename_base = NULL;
    char *outfilename_ch01;
    char *outfilename_ch23;
    char *outfilename_full;
    FILE *infile;
    FILE *outfile_ch01;
    FILE *outfile_ch23;
    FILE *outfile_full;
    int mode_set = 0;        // was mode specified by options
    int mode_determined = 0; // if mode not specified, try to determine
    int type_set = 0;        // was type specified by options
    int oformat_set = 0;     // was output format specified by options
    int inname_set = 0;      // was input name provided
    int full_output = 0;     // specifies for 3d/4d if the full 40 bit coe should be written
    int i;
    e_outformat oformat = COE;
    e_memtype type = D3;
    e_mode mode = NORMAL;
    int outputname_provided = 0;
    s_mem3d4d mem3d4d_out;
    s_mem1d mem1d_out;
    char buffer[10];    // hex files have 8 characters per line (+1 for \n and +1 for \0)
    char mif_1d[13];    // 12 bits + \0
    char mif_3d4d[11];  // 10 bits + \0
    char* ext = COEEXT; // default
    mif_1d[12] = '\0';
    mif_3d4d[10] = '\0';
    if (argc == 1) { // no arguments
        usage();
        exit(1);
    }
    else
    {
        // parse command line
        for (i = 1; i < argc; i++)
        {
            if (argv[i][0] == '-') {
                switch (argv[i][1]) {
                case 'c':
                    if (oformat_set == 0) {
                        oformat = COE;
                        oformat_set = 1;
                    }
                    else {
                        if (oformat != COE)
                        {
                            fprintf(stderr, "-c option conflicts with previous -m option. Exiting.\n");
                            exit(1);
                        }
                    }
                    break;
                case 'm':
                    if (oformat_set == 0) {
                        oformat = MIF;
                        oformat_set = 1;
                    }
                    else {
                        if (oformat != MIF)
                        {
                            fprintf(stderr, "-m option conflicts with previous -c option. Exiting.\n");
                            exit(1);
                        }
                    }
                    break;
                case '1':
                    if (type_set == 0)
                    {
                        type = D1;
                        type_set = 1;
                    }
                    else {
                        if (type != D1)
                        {
                            fprintf(stderr, "-1 option conflicts with previous -3 or -4 option. Exiting.\n");
                            exit(1);
                        }
                    }
                    break;
                case '3':
                    if (type_set == 0)
                    {
                        type = D3;
                        type_set = 1;
                    }
                    else {
                        if (type != D3)
                        {
                            fprintf(stderr, "-3 option conflicts with previous -1 or -4 option. Exiting.\n");
                            exit(1);
                        }
                    }
                    break;
                case '4':
                    if (type_set == 0)
                    {
                        type = D4;
                        type_set = 1;
                    }
                    else {
                        if (type != D4)
                        {
                            fprintf(stderr, "-4 option conflicts with previous -1 or -3 option. Exiting.\n");
                            exit(1);
                        }
                    }
                    break;
                case 'N':
                    if (mode_set == 0)
                    {
                        mode = NORMAL;
                        mode_set = 1;
                    }
                    else {
                        if (mode != NORMAL)
                        {
                            fprintf(stderr, "-N option conflicts with previous -C option. Exiting.\n");
                            exit(1);
                        }
                    }
                    break;
                case 'C':
                    if (mode_set == 0)
                    {
                        mode = COMPACT;
                        mode_set = 1;
                    }
                    else {
                        if (mode != COMPACT)
                        {
                            fprintf(stderr, "-C option conflicts with previous -N option. Exiting.\n");
                            exit(1);
                        }
                    }
                    break;
                case 'i':
                    if (inname_set == 0)
                    {
                        infilename = argv[i+1];
                        i++;
                        inname_set = 1;
                    }
                    else {
                        fprintf(stderr, "Multiple sources provided. Exiting.\n");
                        exit(1);
                    }
                    break;
                case 'o':
                    if (outputname_provided == 0)
                    {
                        outfilename_base = argv[i+1];
                        i++;
                        outputname_provided = 1;
                    }
                    else {
                        fprintf(stderr, "Multiple targets provided. Exiting.\n");
                        exit(1);
                    }
                    break;
                case 'f':
                    full_output = 1;
                    break;
                case 'h':
                    usage();
                    exit(0);
                    break;
                default:
                    fprintf(stderr, "Invalid option %s\n", argv[i]);
                    usage();
                    exit(1);
                }
            }
            else
            {
                usage();
                exit(1);
            }
        }
    }
    printf("Done reading options...\n");
    // An input file name is required; without it there is nothing to convert
    if (inname_set == 0 || infilename == NULL) {
        fprintf(stderr, "No input file provided. Exiting.\n");
        usage();
        exit(1);
    }
    // Generate the output base name if none was provided
    if (outputname_provided == 0) {
        outfilename_base = (char *) malloc(strlen(infilename) + 1); // +1 for the terminating \0
        if (outfilename_base == NULL) {
            fprintf(stderr, "Could not allocate memory for output file name!");
            exit(1);
        }
        strcpy(outfilename_base, infilename);
        // this fails if the input file name is of the sort x.y.hex
        outfilename_base = strtok(outfilename_base, ".");
    }
    // DEBUG
    if (oformat_set == 0) {
        printf("No output format set. Using default\n");
    }
    if (type_set == 0) {
        printf("No memory type set. Using default\n");
    }
    if (mode_set == 0) {
        printf("No mode set. Will try to determine by contents\n");
    }
    if (outputname_provided == 0) {
        printf("No output name provided. Using default\n");
    }
    printf("\n");
    printf("Output Format: %s\n", (oformat == COE) ? "COE" : "MIF");
    printf("Memory type: %s\n", (type == D1) ? "1D" : ((type == D3) ? "3D" : "4D"));
    printf("Memory Mode: %s\n", (mode == NORMAL) ? "Normal" : "Compact");
    printf("Input file: %s\n", infilename);
    printf("Output file: %s\n", outfilename_base);
    printf("\n");
    // END DEBUG
    if (oformat == MIF)
        ext = MIFEXT;
    // open hex file
    if (infilename != NULL) {
        infile = fopen(infilename, "r");
        if (infile == NULL) {
            fprintf(stderr, "Can't open file %s for reading!\n", infilename);
            exit(1);
        }
    }
    // open output files (each name buffer needs +1 for the terminating \0)
    if (outfilename_base != NULL) {
        outfilename_ch01 = (char *) malloc(strlen(outfilename_base) + strlen(CH01) + EXTSIZE + 1);
        strcpy(outfilename_ch01, outfilename_base);
        strcat(outfilename_ch01, CH01);
        strcat(outfilename_ch01, ext);
        outfile_ch01 = fopen(outfilename_ch01, "w");
        if (outfile_ch01 == NULL) {
            fprintf(stderr, "Can't open file %s for writing!\n", outfilename_ch01);
            exit(1);
        }
        outfilename_ch23 = (char *) malloc(strlen(outfilename_base) + strlen(CH23) + EXTSIZE + 1);
        strcpy(outfilename_ch23, outfilename_base);
        strcat(outfilename_ch23, CH23);
        strcat(outfilename_ch23, ext);
        outfile_ch23 = fopen(outfilename_ch23, "w");
        if (outfile_ch23 == NULL) {
            fprintf(stderr, "Can't open file %s for writing!\n", outfilename_ch23);
            exit(1);
        }
        outfilename_full = (char *) malloc(strlen(outfilename_base) + EXTSIZE + 1);
        strcpy(outfilename_full, outfilename_base);
        strcat(outfilename_full, ext);
        outfile_full = fopen(outfilename_full, "w");
        if (outfile_full == NULL) {
            fprintf(stderr, "Can't open file %s for writing!\n", outfilename_full);
            exit(1);
        }
    }
    // write headers to COE files
    if (oformat == COE) {
        fprintf(outfile_ch01, COEHEADERCOMMENTCH01, infilename);
        fprintf(outfile_ch01, COERADIX);
        fprintf(outfile_ch01, COEVECTOR);
        fprintf(outfile_ch23, COEHEADERCOMMENTCH23, infilename);
        fprintf(outfile_ch23, COERADIX);
        fprintf(outfile_ch23, COEVECTOR);
        fprintf(outfile_full, COEHEADERCOMMENTFULL, infilename);
        fprintf(outfile_full, COERADIX);
        fprintf(outfile_full, COEVECTOR);
    }
    // buffer size is 10 for 8 characters + newline + null terminator (change if system changes)
    while (fgets(buffer, 10, infile) != NULL) { // Read a 32 bit line and divide into channels
        if (mode_set == 1 && mode == NORMAL) {
            // process the input file as if it's in Normal mode (i.e., 12/10 bits for 1/3/4D)
            printf("Normal mode specified\n");
            if (type == D1) { // 1d, 12 bits per channel
                // buffer: 0WWW0XXX\n\0
                printf("1-D Memory\n");
                printf("HEX: %s", buffer);
                mem1d_out.ch0 = atoh(buffer[1]);
                mem1d_out.ch0 = mem1d_out.ch0 << 4;
                mem1d_out.ch0 += atoh(buffer[2]);
                mem1d_out.ch0 = mem1d_out.ch0 << 4;
                mem1d_out.ch0 += atoh(buffer[3]);
                mem1d_out.ch1 = atoh(buffer[5]);
                mem1d_out.ch1 = mem1d_out.ch1 << 4;
                mem1d_out.ch1 += atoh(buffer[6]);
                mem1d_out.ch1 = mem1d_out.ch1 << 4;
                mem1d_out.ch1 += atoh(buffer[7]);
                if (fgets(buffer, 10, infile) != NULL) {
                    printf("HEX: %s", buffer);
                    mem1d_out.ch2 = atoh(buffer[1]);
                    mem1d_out.ch2 = mem1d_out.ch2 << 4;
                    mem1d_out.ch2 += atoh(buffer[2]);
                    mem1d_out.ch2 = mem1d_out.ch2 << 4;
                    mem1d_out.ch2 += atoh(buffer[3]);
                    mem1d_out.ch3 = atoh(buffer[5]);
                    mem1d_out.ch3 = mem1d_out.ch3 << 4;
                    mem1d_out.ch3 += atoh(buffer[6]);
                    mem1d_out.ch3 = mem1d_out.ch3 << 4;
                    mem1d_out.ch3 += atoh(buffer[7]);
                }
                printf("Channels: %x %x %x %x\n", mem1d_out.ch0,
                       mem1d_out.ch1, mem1d_out.ch2, mem1d_out.ch3);
                printf("Channels (merged): %x %x\n\n", (mem1d_out.ch0<<12)+mem1d_out.ch1,
                       (mem1d_out.ch2<<12)+mem1d_out.ch3);
                // write channels to files (COE or MIF)
                if (oformat == COE) {
                    fprintf(outfile_ch01, "%x\n", (mem1d_out.ch0<<12)+mem1d_out.ch1);
                    fprintf(outfile_ch23, "%x\n", (mem1d_out.ch2<<12)+mem1d_out.ch3);
                }
                else {
                    dec2bin(mem1d_out.ch0, mif_1d, 12);
                    fprintf(outfile_ch01, "%s", mif_1d);
                    dec2bin(mem1d_out.ch1, mif_1d, 12);
                    fprintf(outfile_ch01, "%s\n", mif_1d);
                    dec2bin(mem1d_out.ch2, mif_1d, 12);
                    fprintf(outfile_ch23, "%s", mif_1d);
                    dec2bin(mem1d_out.ch3, mif_1d, 12);
                    fprintf(outfile_ch23, "%s\n", mif_1d);
                }
            }
            else { // 3d or 4d, same format, 10 bits per channel
                // buffer: 0[00ww]WW0[00xx]XX
                printf("3D/4D Memory\n");
                printf("HEX: %s", buffer);
                mem3d4d_out.ch0 = atoh(buffer[1]);  // ch0=ww
                mem3d4d_out.ch0 = mem3d4d_out.ch0 << 4;
                mem3d4d_out.ch0 += atoh(buffer[2]); // ch0=wwW
                mem3d4d_out.ch0 = mem3d4d_out.ch0 << 4;
                mem3d4d_out.ch0 += atoh(buffer[3]); // ch0=wwWW
                mem3d4d_out.ch1 = atoh(buffer[5]);
                mem3d4d_out.ch1 = mem3d4d_out.ch1 << 4;
                mem3d4d_out.ch1 += atoh(buffer[6]);
                mem3d4d_out.ch1 = mem3d4d_out.ch1 << 4;
                mem3d4d_out.ch1 += atoh(buffer[7]);
                if (fgets(buffer, 10, infile) != NULL) {
                    printf("HEX: %s", buffer);
                    mem3d4d_out.ch2 = atoh(buffer[1]);
                    mem3d4d_out.ch2 = mem3d4d_out.ch2 << 4;
                    mem3d4d_out.ch2 += atoh(buffer[2]);
                    mem3d4d_out.ch2 = mem3d4d_out.ch2 << 4;
                    mem3d4d_out.ch2 += atoh(buffer[3]);
                    mem3d4d_out.ch3 = atoh(buffer[5]);
                    mem3d4d_out.ch3 = mem3d4d_out.ch3 << 4;
                    mem3d4d_out.ch3 += atoh(buffer[6]);
                    mem3d4d_out.ch3 = mem3d4d_out.ch3 << 4;
                    mem3d4d_out.ch3 += atoh(buffer[7]);
                }
                // debug
                printf("Channels: %x %x %x %x\n", mem3d4d_out.ch0,
                       mem3d4d_out.ch1, mem3d4d_out.ch2, mem3d4d_out.ch3);
                printf("Channels (merged): %x %x\n\n", (mem3d4d_out.ch0<<10)+mem3d4d_out.ch1,
                       (mem3d4d_out.ch2<<10)+mem3d4d_out.ch3);
                // write channels to files (COE or MIF)
                if (oformat == COE) {
                    fprintf(outfile_ch01, "%x\n", (mem3d4d_out.ch0<<10)+mem3d4d_out.ch1);
                    fprintf(outfile_ch23, "%x\n", (mem3d4d_out.ch2<<10)+mem3d4d_out.ch3);
                    // zero-pad each 20-bit half so the 40-bit word keeps its leading zeros
                    fprintf(outfile_full, "%05x%05x\n", (mem3d4d_out.ch0<<10)+mem3d4d_out.ch1,
                            (mem3d4d_out.ch2<<10)+mem3d4d_out.ch3);
                }
                else {
                    // MIF output (binary)
                    dec2bin(mem3d4d_out.ch0, mif_3d4d, 10);
                    fprintf(outfile_ch01, "%s", mif_3d4d);
                    dec2bin(mem3d4d_out.ch1, mif_3d4d, 10);
                    fprintf(outfile_ch01, "%s\n", mif_3d4d);
                    dec2bin(mem3d4d_out.ch2, mif_3d4d, 10);
                    fprintf(outfile_ch23, "%s", mif_3d4d);
                    dec2bin(mem3d4d_out.ch3, mif_3d4d, 10);
                    fprintf(outfile_ch23, "%s\n", mif_3d4d);
                }
            }
        }
        else if (mode_set == 1 && mode == COMPACT) { // Compact mode (8 bits per channel in source)
            printf("Compact mode specified\n");
            if (type == D1) { // 1d, 12 bits per channel
                // buffer: WWXXYYZZ\n\0
                printf("1D memory\n");
                printf("HEX: %s", buffer);
                mem1d_out.ch0 = atoh(buffer[0]);
                mem1d_out.ch0 = mem1d_out.ch0 << 4;
                mem1d_out.ch0 += atoh(buffer[1]);
                mem1d_out.ch0 = mem1d_out.ch0 << 4;
                mem1d_out.ch1 = atoh(buffer[2]);
                mem1d_out.ch1 = mem1d_out.ch1 << 4;
                mem1d_out.ch1 += atoh(buffer[3]);
                mem1d_out.ch1 = mem1d_out.ch1 << 4;
                mem1d_out.ch2 = atoh(buffer[4]);
                mem1d_out.ch2 = mem1d_out.ch2 << 4;
                mem1d_out.ch2 += atoh(buffer[5]);
                mem1d_out.ch2 = mem1d_out.ch2 << 4;
                mem1d_out.ch3 = atoh(buffer[6]);
                mem1d_out.ch3 = mem1d_out.ch3 << 4;
                mem1d_out.ch3 += atoh(buffer[7]);
                mem1d_out.ch3 = mem1d_out.ch3 << 4;
                printf("Channels: %x %x %x %x\n", mem1d_out.ch0,
                       mem1d_out.ch1, mem1d_out.ch2, mem1d_out.ch3);
                printf("Channels (merged): %x %x\n\n", (mem1d_out.ch0<<12)+mem1d_out.ch1,
                       (mem1d_out.ch2<<12)+mem1d_out.ch3);
                // write channels to files (COE or MIF)
                if (oformat == COE) {
                    fprintf(outfile_ch01, "%x\n", (mem1d_out.ch0<<12)+mem1d_out.ch1);
                    fprintf(outfile_ch23, "%x\n", (mem1d_out.ch2<<12)+mem1d_out.ch3);
                }
                else {
                    // MIF output (binary)
                    dec2bin(mem1d_out.ch0, mif_1d, 12);
                    fprintf(outfile_ch01, "%s", mif_1d);
                    dec2bin(mem1d_out.ch1, mif_1d, 12);
                    fprintf(outfile_ch01, "%s\n", mif_1d);
                    dec2bin(mem1d_out.ch2, mif_1d, 12);
                    fprintf(outfile_ch23, "%s", mif_1d);
                    dec2bin(mem1d_out.ch3, mif_1d, 12);
                    fprintf(outfile_ch23, "%s\n", mif_1d);
                }
            }
            else { // 3d or 4d, same format, 10 bits per channel
                // buffer: WWXXYYZZ\n\0
                printf("3D or 4D memory\n");
                printf("HEX: %s", buffer);
                mem3d4d_out.ch0 = atoh(buffer[0]); // ch0=ww
                mem3d4d_out.ch0 = mem3d4d_out.ch0 << 4;
                mem3d4d_out.ch0 += atoh(buffer[1]); // ch0=wwW
                mem3d4d_out.ch0 = mem3d4d_out.ch0 << 2;
                mem3d4d_out.ch1 = atoh(buffer[2]);
                mem3d4d_out.ch1 = mem3d4d_out.ch1 << 4;
                mem3d4d_out.ch1 += atoh(buffer[3]);
                mem3d4d_out.ch1 = mem3d4d_out.ch1 << 2;
                mem3d4d_out.ch2 = atoh(buffer[4]);
                mem3d4d_out.ch2 = mem3d4d_out.ch2 << 4;
                mem3d4d_out.ch2 += atoh(buffer[5]);
                mem3d4d_out.ch2 = mem3d4d_out.ch2 << 2;
                mem3d4d_out.ch3 = atoh(buffer[6]);
                mem3d4d_out.ch3 = mem3d4d_out.ch3 << 4;
                mem3d4d_out.ch3 += atoh(buffer[7]);
                mem3d4d_out.ch3 = mem3d4d_out.ch3 << 2;
                // debug
                printf("Channels: %x %x %x %x\n", mem3d4d_out.ch0,
                       mem3d4d_out.ch1, mem3d4d_out.ch2, mem3d4d_out.ch3);
                printf("Channels (merged): %x %x\n\n", (mem3d4d_out.ch0<<10)+mem3d4d_out.ch1,
                       (mem3d4d_out.ch2<<10)+mem3d4d_out.ch3);
                // add output code here
                if (oformat == COE) {
                    fprintf(outfile_ch01, "%x\n", (mem3d4d_out.ch0<<10)+mem3d4d_out.ch1);
                    fprintf(outfile_ch23, "%x\n", (mem3d4d_out.ch2<<10)+mem3d4d_out.ch3);
                    fprintf(outfile_full, "%x%x\n", (mem3d4d_out.ch0<<10)+mem3d4d_out.ch1,
                            (mem3d4d_out.ch2<<10)+mem3d4d_out.ch3);
                }
                else {
                    // add code for MIF output (binary)
                    dec2bin(mem3d4d_out.ch0, mif_3d4d, 10);
                    fprintf(outfile_ch01, "%s", mif_3d4d);
                    dec2bin(mem3d4d_out.ch1, mif_3d4d, 10);
                    fprintf(outfile_ch01, "%s\n", mif_3d4d);
                    dec2bin(mem3d4d_out.ch2, mif_3d4d, 10);
                    fprintf(outfile_ch23, "%s", mif_3d4d);
                    dec2bin(mem3d4d_out.ch3, mif_3d4d, 10);





    if (oformat == COE) {
        fprintf(outfile_ch01, COEEND);
        fprintf(outfile_ch23, COEEND);
    }
    if (infile != NULL)
        fclose(infile);
    if (outfile_ch01 != NULL)
        fclose(outfile_ch01);
    if (outfile_ch23 != NULL)
        fclose(outfile_ch23);
    return 0;
}
void usage() {
    printf("Usage: hex2coe [-c|-m] [-1|-3|-4] [-N|-C] -i <input> [-o <output stem>]\n");
    printf("-c: specifies the output should be coe format (default)\n");
    printf("-m: specifies the output should be mif format (not yet implemented)\n");
    printf("-1: specifies the file is for a 1D memory (48 bits, 12 bits per channel)\n");
    printf("-3 -4: specifies the file is for a 3D/4D memory (40 bits, 10 bits per channel)\n");
    printf("-N: specifies the input is in normal mode (12 bits/channel for 1D, 10 bits/channel for 3D/4D)\n");
    printf("-C: specifies the input is in compact mode (8 bits/channel for 1D, 8 bits/channel for 3D/4D)\n");
    printf("-f: Outputs the full 40 bit coe file in addition to the divided channels (3D/4D only)\n");
    printf("-h: display this help and exit\n");
    printf("If neither -N nor -C is given, will attempt to determine the format based on the contents.\n");
    return;
}
// Converts a single ASCII hex character (0-9, a-f, A-F) to its integer value.
// Any other character yields 0.
unsigned int atoh(char str)
{
    unsigned int Value = 0, Digit;
    char c;
    c = str;
    if (c >= '0' && c <= '9') {
        Digit = (unsigned int)(c - '0');
    }
    else if (c >= 'a' && c <= 'f') {
        Digit = (unsigned int)(c - 'a') + 10;
    }
    else if (c >= 'A' && c <= 'F') {
        Digit = (unsigned int)(c - 'A') + 10;
    }
    else {
        Digit = 0; // not a hex digit; keeps Digit from being read uninitialized
    }
    Value = (Value << 4) + Digit;
    return Value;
}
// Converts an integer to a binary string of len characters, most significant bit first.
// Does not write a terminating '\0'; the caller must ensure str[len] is '\0'
// before printing str with %s.
char* dec2bin(int val, char* str, int len)
{
    int i;
    for (i = len-1; i >= 0; i--) {
        str[i] = (val%2 == 0) ? '0' : '1';
        val = val >> 1;
    }
    return str;
}
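
The compact-mode branch of the listing above repeats the same three steps for every channel: read two hex digits, shift the 8-bit value left to fill a 12-bit (1D) or 10-bit (3D/4D) channel, and merge channel pairs into one memory word. A minimal sketch of those steps as reusable helpers follows. The names hex_digit, expand_compact, pack_pair and to_binary are hypothetical and do not appear in the thesis code; unlike atoh and dec2bin, this sketch reports non-hex input as an error and NUL-terminates its output string.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Expand a compact line "WWXXYYZZ" (8 bits per channel) into four channels,
 * shifting each byte left by `shift` bits (4 for 12-bit, 2 for 10-bit channels).
 * Returns 0 on success, -1 if the line is too short or holds a non-hex character. */
static int expand_compact(const char *line, unsigned shift, unsigned ch[4])
{
    int i;
    if (strlen(line) < 8) return -1;
    for (i = 0; i < 4; i++) {
        int hi = hex_digit(line[2 * i]);
        int lo = hex_digit(line[2 * i + 1]);
        if (hi < 0 || lo < 0) return -1;
        ch[i] = ((unsigned)(hi << 4 | lo)) << shift;
    }
    return 0;
}

/* Pack two channels of `bits` bits each into one word, first channel in the high bits. */
static unsigned long pack_pair(unsigned a, unsigned b, unsigned bits)
{
    return ((unsigned long)a << bits) + b;
}

/* Write the low `len` bits of val into out as '0'/'1', most significant bit first.
 * out must have room for len + 1 characters; the result is NUL-terminated. */
static char *to_binary(unsigned long val, char *out, int len)
{
    int i;
    assert(len > 0);
    for (i = len - 1; i >= 0; i--) {
        out[i] = (val & 1UL) ? '1' : '0';
        val >>= 1;
    }
    out[len] = '\0';
    return out;
}
```

With the input line "10FF8000" and a shift of 4, the four channels become 0x100, 0xFF0, 0x800 and 0x000, and pack_pair on the first two with 12 bits gives 0x100FF0. With a shift of 2 the first two channels become 0x040 and 0x3FC, which pack to 0x103FC with 10 bits.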
