International Journal of Computer and Communication
Technology
Volume 7

Issue 2

Article 10

April 2016

A Novel Decode-Aware Compression Technique for Improved
Compression and Decompression
J. Suresh Babu
Department Of Electronics & Comm. Engineering, Nimra College of Engineering & Technology,
Ibrahimpatnam, Vijayawada, sureshbabujakkula@gmail.com

K. Tirumala Rao
Department Of Electronics & Comm. Engineering, Nimra College of Engineering & Technology,
Ibrahimpatnam, Vijayawada, ktirumalarao@gmail.com

P. Srinivas
Department Of Electronics & Comm. Engineering, Nimra College of Engineering & Technology,
Ibrahimpatnam, Vijayawada, psrinivas@gmail.com

Follow this and additional works at: https://www.interscience.in/ijcct

Recommended Citation
Babu, J. Suresh; Rao, K. Tirumala; and Srinivas, P. (2016) "A Novel Decode-Aware Compression Technique
for Improved Compression and Decompression," International Journal of Computer and Communication
Technology: Vol. 7 : Iss. 2 , Article 10.
DOI: 10.47893/IJCCT.2016.1351
Available at: https://www.interscience.in/ijcct/vol7/iss2/10

This Article is brought to you for free and open access by the Interscience Journals at Interscience Research
Network. It has been accepted for inclusion in International Journal of Computer and Communication Technology
by an authorized editor of Interscience Research Network. For more information, please contact
sritampatnaik@gmail.com.

A Novel Decode-Aware Compression Technique for
Improved Compression and Decompression

J. Suresh Babu, K. Tirumala Rao & P. Srinivas
Department Of Electronics & Comm. Engineering,
Nimra College of Engineering & Technology, Ibrahimpatnam, Vijayawada
E-mail : sureshbabujakkula@gmail.com

Abstract – With compressed bit streams, more configuration information can be stored using the same memory. The access delay is
also reduced, because fewer bits need to be transferred through the memory interface. To measure the efficiency of bit stream
compression, compression ratio (CR) is widely used as a metric. It is a major challenge to develop an efficient compression
technique that can significantly reduce the bit stream size without sacrificing the decompression performance. Our approach
combines the advantages of previous compression techniques with good compression ratio and those with fast decompression. This
paper makes three important contributions. First, it performs smart placement of compressed bit streams to enable fast
decompression of variable-length coding. Next, it selects bitmask-based compression parameters suitable for bit stream compression.
Finally, it efficiently combines run length encoding and bitmask-based compression to obtain better compression and faster
decompression.
Keywords - Field-programmable gate array (FPGA), bit stream compression, Bitmask-based compression, decompression
hardware.

I. INTRODUCTION

Since the configuration information for an FPGA has to be stored in internal or external memory as
bit streams, the limited memory size and access bandwidth become the key factors in determining the
different functionalities that a system can be configured with and how quickly the configuration can
be performed. While it is quite costly to employ memory with more capacity and access bandwidth,
bit stream compression alleviates the memory constraint by reducing the size of the bit streams.

With compressed bit streams, more configuration information can be stored using the same memory.
There are two major challenges in bit stream compression: 1) how to compress the bit stream as much
as possible and 2) how to efficiently decompress the bit stream without affecting the
reconfiguration time. We can classify the existing bit stream compression techniques into two
categories. The techniques in the first category have good compression ratio due to complex and
variable-length coding (VLC) [1]–[3]. However, they also need expensive decompression hardware,
which may not be acceptable for practical implementation. The techniques in the other category
accelerate decompression using simple or fixed-length coding (FLC) [4] and therefore have very
efficient decompression hardware. The only concern is that their compression ratios are usually
compromised. Among the various compression techniques that have been proposed in recent years,
bitmask-based compression [5] seems attractive for bit stream compression because of its good
compression ratio and relatively simple decompression scheme. However, the original algorithm was
proposed for instruction compression and is not suitable for FPGA bit stream compression. Moreover,
the use of variable-length coding is challenging for the design of decompression hardware because
it requires expensive buffering circuitry, as described in Section III. Hence, it is a major
challenge to develop an efficient compression technique that can significantly reduce the bit
stream size without sacrificing the decompression performance. Our approach combines the advantages
of previous compression techniques with good compression ratio and those with fast decompression.

Fig. 1 Decompression module

International Journal of Computer and Communication Technology (IJCCT), ISSN: 2231-0371, Vol-7, Iss-2

II. RELATED WORK

In [1], the difference between consecutive frames (difference vector) is encoded using either
Huffman-based run length encoding or LZSS-based compression. Such sophisticated encoding schemes
can produce excellent compression. However, the decompression overhead, which is a major bottleneck
in reconfigurable systems, is not addressed in [1]. In contrast, many bit stream compression
techniques only access the configuration memory linearly during decompression, and therefore can be
applied to virtually all FPGAs. The basic idea behind most of these techniques is to divide the
entire bit stream into many small words, then compress them with common algorithms such as Huffman
coding [7], arithmetic coding [8], or dictionary-based compression. Among them, LZSS-based
algorithms have received special interest because the compressed stream can be decoded efficiently
without complex hardware. For instance, Xilinx [9] introduced a bit stream compression algorithm
based on LZ77, which is integrated in the System ACE controller. Huebner et al. [10] proposed an
LZSS-based technique for the Xilinx Virtex XCV2000E FPGA. The decompression engine is designed
carefully to achieve fast decompression. Stefan et al. [11] observed that simpler algorithms like
LZSS successfully maintain decompression overhead in an acceptable range but compromise on
compression efficiency. On the other hand, compression techniques using complex algorithms can
achieve significant compression but incur considerable hardware overhead during decompression.
Unfortunately, the authors did not model the buffering circuitry of the decompression engine in
their work. Hence the hardware overhead presented for some variable-length coding techniques may be
inaccurate.

To increase the decompression throughput of complex compression algorithms, parallel decompression
can be used. Nikara et al. [12] improved the throughput by employing speculative parallel decoders.
Qin et al. [13] introduced a placement technique for compressed bit streams to enable parallel
decompression. However, since the structure of each decoder and the buffering circuitry are not
changed, the area overhead is also multiplied. Most importantly, this approach does not reduce the
speed overhead introduced by the buffering circuitry for VLC bit streams. In contrast, our proposed
approach significantly improves the maximum operating frequency by effectively addressing the
buffering circuitry problem.

III. BIT STREAM COMPRESSION ALGORITHMS

Fig. 2 shows our decode-aware bit stream compression framework. On the compression side, the FPGA
configuration bit stream is analyzed for selection of profitable dictionary entries and bitmask
patterns. The compressed bit stream is then generated using bitmask-based compression and run
length encoding (RLE). Next, our decode-aware placement algorithm is employed to place the
compressed bit stream in the memory for efficient decompression. During run-time, the compressed
bit stream is transmitted from the memory to the decompression engine, and the original
configuration bit stream is produced by decompression.

Fig. 2 Bit stream compression

Algorithm 1 outlines four important steps in our decode-aware compression framework (shown in
Fig. 2): 1) bitmask selection; 2) dictionary selection; 3) integrated RLE compression; and
4) decode-aware placement. The input bit stream is first divided into a sequence of fixed-length
symbols. Then the bitmask patterns and dictionary entries used for bitmask-based compression are
selected as described below. Next, the symbol sequence is compressed using bitmask-based
compression and RLE; the bitmask-based compression uses the same algorithm as in [5]. Finally, we
place the compressed bit stream into a decode-friendly layout within the memory using the placement
algorithm.
Since memory and communication buses are designed in multiples of bytes (8 bits), storing
dictionaries or transmitting data in sizes other than a multiple of a byte is not efficient. Thus,
we restrict the symbol length to be a multiple of eight in our current implementation. Since the
dictionary for bit stream compression is small compared to the size of the bit stream itself, we
use d = 2^i to fully utilize the bits for dictionary indexing, where i is the number of indexing
bits.

Algorithm 1
   Input: bit stream
   Divide the input bit stream into symbol sequence SL
   Perform bitmask pattern selection
   Perform dictionary selection
   Compress each SL symbol into a CL symbol using bitmasking
   Perform decode-aware placement of CL
   Output: compressed bit stream placed in memory

Fig. 3 Decompression mechanism

1. Bitmask Selection:

Our bitmask-based compression is similar to [5], where three types of encoding formats are used.
Fig. 3 shows the formats in these cases: no compression, compression using dictionary, and
compression using bitmask. The selection of bitmasks plays an important role in bitmask-based
compression. Generally, there are two types of bitmask patterns. One is the "fixed" bitmask, which
can only be applied at fixed positions in a symbol. The other is the "sliding" bitmask, which can
be applied at any position. For example, a 2-bit fixed bitmask ("2f" bitmask) is restricted to even
locations, but a 2-bit sliding bitmask ("2s" bitmask) can be used anywhere. Clearly, fixed bitmasks
require fewer bits to encode their location, but they can only match bit changes at fixed
positions. On the other hand, sliding bitmasks are more flexible, but consume more bits to encode.
In other words, only a few bitmask patterns or their combinations are profitable for compression.
Similar to [5], in our study of bit stream compression, we only use profitable bitmask patterns
(1s, 2s, 2f, 3s, 3f, 4s, 4f).

2. Dictionary Selection:

Algorithm 2 shows our dictionary selection algorithm. Compared to the dictionary selection approach
proposed in [5] for instruction compression, we made an important optimization at Step 5). In the
original algorithm [5], any node adjacent to the most profitable node is removed if its profit is
less than a certain threshold. This mechanism is designed to reduce the dictionary size. However,
if the threshold is not chosen properly, some high-frequency symbols may be incorrectly removed.
Since the dictionary size in bit stream compression is usually negligible compared with the size of
the bit stream, it is not beneficial to reduce the dictionary size by sacrificing the compression
ratio.

Therefore, our algorithm uses new heuristics in Step 5), which carefully remove edges instead of
nodes. Experimental results in Section V-A show that our approach is more suitable for bit stream
compression, because we ensure better dictionary coverage.
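The three encoding cases described under Bitmask Selection (no compression, dictionary match, and dictionary match corrected by a bitmask) can be illustrated with a minimal Python sketch. It assumes a single k-bit mask per symbol and that a fixed mask may start only at multiples of k; the function names, the tuple layout, and the example dictionary are hypothetical and do not reproduce the bit-level format of [5].

```python
def candidate_positions(width, mask_bits, sliding):
    """Bit offsets (counted from the LSB) where a mask may be applied."""
    step = 1 if sliding else mask_bits  # fixed masks: aligned offsets only
    return range(0, width - mask_bits + 1, step)


def encode_symbol(symbol, dictionary, width=16, mask_bits=2, sliding=True):
    """Return one of three illustrative code tuples for a symbol."""
    # Case 1: exact dictionary match.
    if symbol in dictionary:
        return ("dict", dictionary.index(symbol))
    # Case 2: a dictionary entry differs only inside one mask window.
    for index, entry in enumerate(dictionary):
        diff = symbol ^ entry
        for pos in candidate_positions(width, mask_bits, sliding):
            window = ((1 << mask_bits) - 1) << pos
            if diff & ~window == 0:
                return ("mask", index, pos, diff >> pos)
    # Case 3: no match, so the symbol is stored uncompressed.
    return ("raw", symbol)


def decode_symbol(code, dictionary):
    """Invert encode_symbol."""
    if code[0] == "dict":
        return dictionary[code[1]]
    if code[0] == "mask":
        _, index, pos, value = code
        return dictionary[index] ^ (value << pos)
    return code[1]
```

With the dictionary `[0x0000, 0xFFFF]`, the symbol `0x0006` is covered by a sliding 2-bit mask at offset 1 but not by any aligned 2-bit mask, which shows why sliding masks match more symbols at the price of a longer position field.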

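The cost side of the fixed-versus-sliding trade-off, and the choice of a dictionary size d = 2^i, comes down to counting bits. The sketch below tallies those counts under the assumption that a sliding k-bit mask may start at any offset where it fits and a fixed k-bit mask only at multiples of k; the helper names are hypothetical and format tag bits are deliberately left out.

```python
from math import ceil, log2


def position_bits(width, mask_bits, sliding):
    """Bits needed to encode where a mask is applied within a symbol."""
    if sliding:
        positions = width - mask_bits + 1  # any offset where the mask fits
    else:
        positions = width // mask_bits     # aligned offsets only
    return ceil(log2(positions)) if positions > 1 else 0


def index_bits(dictionary_size):
    """Bits needed to index a dictionary; none are wasted when size is 2**i."""
    return ceil(log2(dictionary_size)) if dictionary_size > 1 else 0


def mask_cost(width, mask_bits, sliding):
    """Position field plus the mask value itself."""
    return position_bits(width, mask_bits, sliding) + mask_bits
```

For a 32-bit symbol, `position_bits(32, 2, True)` is 5 while `position_bits(32, 2, False)` is 4, so each sliding 2-bit mask costs one bit more than its fixed counterpart. A 512-entry dictionary needs `index_bits(512) == 9` bits and a 16-entry dictionary needs 4.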

3. Performance Estimation

We used Xilinx Virtex-II family IP core benchmarks to analyze the results in this article. The same
results are found applicable to other families and vendors too. In our experiments, the Pan et al.
[1] benchmarks are compressed with 32-bit symbols, a 512-entry dictionary, and two sliding bitmasks
(2-bit and 3-bit) for storing bitmask differences. The Koch et al. [4] benchmarks are compressed
using 16-bit symbols, with a 16-entry dictionary and a 2-bit sliding bitmask.

Compression Efficiency:

We first compare our improved bitmask-based compression technique with the original approach
proposed in [5]. To avoid the bias caused by parameter selection, we use the same bitmask
parameters for both of them. Three different compression techniques are compared for compression
efficiency: 1) bitmask-based compression (BMC) [5]; 2) BMC with our dictionary selection technique
(pBMC); and 3) BMC with our dictionary selection technique and run length encoding (pBMC+RLE).
Fig. 4 shows the compression results on the Pan et al. [1] and Koch et al. [4] benchmarks.

Fig. 5: RLE-based compression

It can be seen that our dictionary selection algorithm outperforms the original technique.
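Compression efficiency in these comparisons is expressed through the compression ratio. The text does not spell out the formula, so the sketch below assumes the convention common in bitmask-based compression work, CR = compressed size / original size, where a smaller value is better. The sizes are made-up numbers for illustration only.

```python
def compression_ratio(compressed_bits, original_bits):
    """CR under the assumed convention: compressed size over original size."""
    if original_bits <= 0:
        raise ValueError("original size must be positive")
    return compressed_bits / original_bits


# Hypothetical sizes, not measurements from the paper.
original = 1_000_000
baseline = compression_ratio(800_000, original)
improved = compression_ratio(760_000, original)
# Relative improvement of the second technique over the first (about 5%).
relative_gain = (baseline - improved) / baseline
```

Under this convention, an "improvement" in compression ratio means the ratio gets smaller.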

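Run length encoding, the extra ingredient in the pBMC+RLE configuration, collapses runs of identical consecutive symbols. The sketch below is a generic symbol-level RLE, not the integrated bit-level format used by the authors; the (symbol, run length) pair representation and the `max_run` cap are assumptions made for illustration.

```python
def rle_encode(symbols, max_run=255):
    """Collapse consecutive repeats into (symbol, run_length) pairs."""
    runs = []
    for s in symbols:
        if runs and runs[-1][0] == s and runs[-1][1] < max_run:
            runs[-1][1] += 1        # extend the current run
        else:
            runs.append([s, 1])     # start a new run
    return [tuple(r) for r in runs]


def rle_decode(runs):
    """Expand (symbol, run_length) pairs back into the symbol sequence."""
    out = []
    for s, n in runs:
        out.extend([s] * n)
    return out
```

Configuration bit streams often contain long stretches of repeated symbols, so a run such as four consecutive zero symbols shrinks to the single pair `(0, 4)`.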

Fig. 4 Simulation results

The dictionary generated by our algorithm improves the compression ratio by 4% to 5%. Since in our
approach we do not have to find the threshold value manually for each bit stream, our algorithm
adaptively finds the most suitable dictionary entries for each bit stream. On the other hand, our
method has the same performance.

The experimental results also illustrate the improvement of compression ratio due to the run length
encoding used in our technique.

Decompression Efficiency:

We measured the decompression efficiency using the time required to reconfigure a compressed bit
stream, the resource usage, and the maximum operating frequency of the decompression engine. The
reconfiguration time is calculated as the product of the number of cycles required to decode the
compressed bit stream and the operating clock period. We have synthesized decompression units for
variable-length bitmask-based compression, difference vector-based compression (DV RLE RB), LZSS
(8-bit symbols), and our proposed approach on a Xilinx Virtex-II family XC2V40 device in the FG356
package using ISE 9.2.04i to measure the decompression efficiency.

IV. CONCLUSIONS

The existing compression algorithms either provide good compression with slow decompression or fast
decompression at the cost of compression efficiency. In this paper, we proposed a decoding-aware
compression technique that tries to obtain both the best possible compression and fast
decompression performance. The proposed compression technique analyzes the effect of parameters on
compression ratio and chooses the optimal ones automatically. We also exploit run length encoding
of consecutive repetitive patterns, efficiently combined with bitmask-based compression, to further
improve both compression ratio and decompression efficiency. We proposed a smart placement
algorithm which enables the compressed variable-length coding bit stream to be stored in memory in
a decode-friendly layout. Our experimental results demonstrated that our technique improves the
compression ratio by 10% to 15% while the decompression engine is capable of operating at 200 MHz
in Virtex-II FPGAs. The configuration time is reduced by 15% to 20% compared to the best known
decompression accelerator [4].
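The reconfiguration-time metric used in the decompression-efficiency measurements reduces to the number of decode cycles divided by the clock frequency. A minimal sketch follows, in which the cycle counts are hypothetical and only the 200 MHz operating frequency is taken from the reported results.

```python
def reconfiguration_time(cycles, frequency_hz):
    """Seconds needed to decode a bit stream: cycles times clock period."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return cycles / frequency_hz


# Hypothetical cycle counts; 200 MHz is the reported operating frequency.
t_proposed = reconfiguration_time(1_600_000, 200e6)
t_reference = reconfiguration_time(2_000_000, 200e6)
# Fraction of configuration time saved relative to the reference (about 0.2).
saving = 1 - t_proposed / t_reference
```

The same function also shows why a higher maximum operating frequency matters: for a fixed cycle count, doubling `frequency_hz` halves the reconfiguration time.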

ACKNOWLEDGEMENTS

The authors would like to thank the anonymous
reviewers for their comments which were very helpful
in improving the quality and presentation of this paper.
REFERENCES:
[1] J. H. Pan, T. Mitra, and W. F. Wong, “Configuration bitstream compression for dynamically
    reconfigurable FPGAs,” in Proc. Int. Conf. Comput.-Aided Des., 2004, pp. 766–773.

[2] D. Koch, C. Beckhoff, and J. Teich, “Bitstream decompression for high speed FPGA configuration
    from slow memories,” in Proc. Int. Conf. Field-Program. Technol., 2007, pp. 161–168.

[3] L. Feinstein, D. Schnackenberg, R. Balupari, and D. Kindred, “Statistical approaches to DDoS
    attack detection and response,” in DISCEX, 2003.

[4] L. Spitzner, Honeypots: Tracking Attackers. Addison-Wesley, 2002.

[5] C. Morrow, “BlackHole Route Server and Tracking Traffic on an IP Network,”
    http://www.secsup.org/Tracking.

[6] SNORT: Open-Source Network IDS/IPS. http://www.snort.org.

[7] A. V. Aho and M. J. Corasick, “Efficient string matching: an aid to bibliographic search,”
    Commun. ACM, vol. 18, no. 6, pp. 333–340, 1975.

[8] D. E. Knuth, J. H. Morris Jr., and V. R. Pratt, “Fast pattern matching in strings,” SIAM J.
    Comput., vol. 6, no. 2, pp. 323–350, June 1977.

[9] R. S. Boyer and J. S. Moore, “A fast string matching algorithm,” Commun. ACM, vol. 20, no. 10,
    pp. 762–772, October 1977.




