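HEVC Encoder Algorithm for VLSI Based on Image Gradient Direction and Relative Strength, by Chen Gaoxing
Graduate School of Information, Production and Systems, Waseda University
Summary of Doctoral Dissertation
Dissertation Title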
Image Texture Direction and
Relative Strength based
HEVC Encoding Algorithm for
VLSI Implementation
Applicant
Gaoxing CHEN
Major in Information, Production and Systems Engineering
Image Information Systems Research
December 2016
Video-related products such as TVs and cameras are widely used in daily
life. In recent years, to process high-resolution (such as 4K and 8K) and
high-quality video, High Efficiency Video Coding (HEVC) was standardized in
2013. Compared with the former standard, H.264/AVC, HEVC achieves twice the
compression performance, while its computational complexity has increased
more than threefold. To realize real-time processing for high-resolution,
high-quality video products, VLSI implementation is necessary. Achieving an
HEVC VLSI implementation requires many steps. Since the current HEVC is a
software-oriented standard, a VLSI-friendly algorithm for HEVC optimization
is of great importance. To realize a VLSI-friendly algorithm, three main
prerequisites should be taken into consideration. Specifically, to reduce
size and power in hardware, the algorithm should not only reduce complexity
but also shorten the longest path, which strongly affects the processing
time. Besides, due to the memory limitation in hardware implementation, a
frame should be divided into blocks that are transmitted to the encoder one
by one, rather than being processed frame by frame.
VLSI implementation of video compression technology has a long history. The
HEVC structure is based on that of its predecessor, H.264/AVC. To achieve
much higher compression performance, HEVC adopted many new features. Firstly,
in H.264 the processing unit is fixed to 16×16, but in HEVC it is expanded to
a 64×64 Coding Tree Unit (CTU) so that it can be applied to high-resolution
video. A flexible quad-tree partition structure ranging from 64×64 to 4×4 is
adopted in HEVC. In addition, intra prediction, which is based on spatial
correlation, employs 35 modes for each prediction unit (PU), compared with
4 modes for 16×16 blocks and 9 modes for 4×4 blocks in H.264. For inter
prediction, which is based on temporal correlation, HEVC employs two
additional processing modes. Secondly, on the basis of the deblocking filter
(DF) in H.264, HEVC adopts a new tool named sample adaptive offset (SAO),
applied after DF in the in-loop filter, to reduce the ringing artifacts
within a CTU. The data dependency between DF and SAO increases the difficulty
of CTU-level processing.
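
The quad-tree partition described above can be illustrated with a minimal Python sketch. It is not taken from the dissertation: the function name `split_ctu` and the `should_split` callback are hypothetical, and the minimum size of 4 simply mirrors the 64×64 to 4×4 range stated in the text. A real encoder would base the split decision on rate-distortion cost.

```python
def split_ctu(x, y, size, should_split, min_size=4):
    """Return the leaf blocks (x, y, size) of a quad-tree rooted at (x, y).

    should_split is a hypothetical callback standing in for the encoder's
    split decision; min_size=4 mirrors the 64x64..4x4 range in the text.
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        # Visit the four quadrants in raster order.
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(
                    split_ctu(x + dx, y + dy, half, should_split, min_size)
                )
        return leaves
    return [(x, y, size)]


# Illustrative rule only: keep splitting the block at the top-left corner.
leaves = split_ctu(0, 0, 64, lambda x, y, size: x == 0 and y == 0)
print(sorted({size for _, _, size in leaves}, reverse=True))
```

Whatever split rule is used, the leaves always tile the full 64×64 CTU, which is why the depth decision in the following chapters amounts to choosing which levels of this tree to evaluate.
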
To solve the above problems and realize VLSI-friendly algorithms for intra
prediction, inter prediction and SAO, image texture direction and relative
strength based methods are proposed in this dissertation. Image texture
direction reduces the complexity of mode selection in intra prediction and
depth decision in inter prediction by utilizing video features including
edge detection and motion vector. Relative strength uses features of HEVC
processing modules, including direction strength, quantization strength and
boundary strength. Direction strength and quantization strength aim to
reduce the complexity of depth decision in intra and inter prediction.
Boundary strength aims at solving the data dependency in the in-loop filter.
The dissertation is organized as follows.
In Chapter 1, the background of this dissertation is described, including an
introduction to HEVC, the necessity of VLSI-friendly algorithms in HEVC and
the concept of the proposal. Furthermore, the target and organization of this
dissertation are shown.
In Chapter 2, mode and depth decision for intra prediction based on edge
detection, cost (quality after prediction) and direction strength is
proposed. In the original HEVC, the best mode is decided by a two-step cost
calculation: Rough Mode Decision (RMD) and Rate Distortion Optimization
(RDO). Conventional works utilize a single feature, such as edge or a
simplified cost value, to reduce the processing time in RMD. However, the
candidates in RDO, which has much larger complexity, remain. In the proposed
mode decision, the mode candidates in RDO are reduced by combining the edge
detection and RMD results. Furthermore, mode refinement is proposed in the
RDO process to limit the quality loss in mode decision. In depth decision, a
balanced search tree structure with two nodes is proposed to shorten the
longest path. Each node skips at least one depth by direction strength
judgment, which is derived from the edge detection result in mode decision.
Therefore, the longest path in depth decision is shortened. Experimental
results show that the proposal achieves 42.8% time saving (TS) in the HEVC
test model (HM) 8.0. Compared with Jiang's work [CECNet, 2012], it reduces
the encoding time by about 28.6% more. Besides, the quality loss of the
proposed algorithm is only a 0.6% BDBR increase, while Jiang's work has a
0.8% BDBR increase. In addition, the longest path is shortened from five
steps to three steps.
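
The two ideas of this chapter can be sketched in Python as follows. This is only an illustration under stated assumptions, not the dissertation's actual rules: the summary does not give the candidate selection rule, the threshold, or which direction strength maps to which node. The `window` and `threshold` parameters, both function names, and the choice to send low direction strength to the large-block node are all hypothetical. Only the mode numbering (0 = Planar, 1 = DC, 2 to 34 angular), the five block sizes, and the three-step search ranges follow from HEVC and the text.

```python
PLANAR, DC = 0, 1                 # HEVC non-angular intra modes; 2..34 are angular
INTRA_SIZES = (64, 32, 16, 8, 4)  # five candidate block sizes (five steps)


def reduce_rdo_candidates(rmd_ranked, edge_mode, window=2):
    """Keep RMD candidates that agree with the edge-derived mode.

    rmd_ranked: modes ordered best-first by rough cost.
    edge_mode:  angular mode suggested by edge detection.
    window:     hypothetical tolerance around edge_mode (an assumption).
    """
    kept = [m for m in rmd_ranked
            if m in (PLANAR, DC) or abs(m - edge_mode) <= window]
    # Assumption: never drop the best RMD candidate, to limit quality loss.
    if rmd_ranked and rmd_ranked[0] not in kept:
        kept.insert(0, rmd_ranked[0])
    return kept


def intra_depth_range(direction_strength, threshold=0.5):
    """Pick one of two three-step search ranges (the two tree nodes).

    Assumption: low direction strength favors large blocks and high
    direction strength favors small blocks; the threshold is hypothetical.
    """
    if direction_strength < threshold:
        return INTRA_SIZES[:3]    # 64, 32, 16
    return INTRA_SIZES[2:]        # 16, 8, 4


print(reduce_rdo_candidates([26, 10, 0, 25, 1, 27, 2, 18], edge_mode=26))
print(intra_depth_range(0.2), intra_depth_range(0.9))
```

Either node evaluates three block sizes instead of five, which matches the stated reduction of the longest path from five steps to three.
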
In Chapter 3, depth decision for inter prediction based on multiple features,
such as motion vector and quantization strength, is proposed. Conventional
works rely on a single feature, such as frame-texture-based depth decision,
and do not consider either the difference between blocks within one frame or
the motion feature. Therefore, the best depth cannot be obtained. The
proposed method reduces the processing time by selecting the maximum depth of
reference blocks from consecutive frames, which are obtained by similar
content, motion vector and encoding periodicity, denoted as quantization
strength. A balanced search tree structure with two branches based on these
three features is proposed. Each branch represents a depth search range. The
first branch skips 64×64, and the second one skips 8×8. Experimental results
show that the proposed algorithm achieves 33.7% TS in HM 14.0. Compared with
Li's work [JCTVC-F092, 2011], it achieves about 12.5% more TS. Besides, the
quality loss of the proposed algorithm is only a 0.8% BDBR increase, while
Li's work has a 1.6% BDBR increase. Furthermore, regarding longest-path
shortening, the proposal can skip at least one depth decision processing
step. Thus, the longest path is shortened from four steps to three steps.
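
The two-branch depth search range can be sketched as below. The branch contents (one skipping 64×64, the other skipping 8×8) and the use of the maximum reference-block depth come from the text. The function name `inter_depth_range` and the `split_depth` threshold are assumptions made for illustration, and the reference depths are taken as given rather than derived from similar content, motion vector and quantization strength.

```python
CU_SIZES = (64, 32, 16, 8)  # inter CU sizes for depths 0..3 (four steps)


def inter_depth_range(ref_depths, split_depth=2):
    """Choose a three-step depth search range from reference-block depths.

    ref_depths:  depths (0..3) of the reference blocks found in
                 consecutive frames; how they are found is not modeled.
    split_depth: hypothetical threshold between the two branches.
    """
    max_depth = max(ref_depths)
    if max_depth >= split_depth:
        return CU_SIZES[1:]   # first branch: skip 64x64
    return CU_SIZES[:-1]      # second branch: skip 8x8


print(inter_depth_range([0, 1, 3]))  # deep references
print(inter_depth_range([0, 1]))     # shallow references
```

Both branches evaluate three of the four sizes, consistent with the longest path being shortened from four steps to three.
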
In Chapter 4, boundary strength prediction based pixel selection for SAO
category determination is proposed. In a CTU-level VLSI implementation, SAO
cannot use the CTU boundary pixels, since these pixels have not yet been
processed by DF. Therefore, the best SAO cannot be obtained. Praveen's work
[VCIP, 2013] focused on involving the CTU boundary in SAO. However, the video
quality improvement of that work is limited, since it simply enlarged the
SAO processing area without prediction. In the proposal, CTU boundary pixels
are selected by boundary strength prediction. Boundary strength indicates
the possibility of deblocking. Specifically, when the value change of a
boundary pixel is small, the pixel has a high possibility of being
deblocked, and it can be used in SAO. When the value change is large, the
pixel is deblocked within a pre-defined variation range. A pixel whose SAO
category is not affected by DF can be used in SAO. Experimental results show
that the proposed algorithm achieves 0.13% BDBR saving in HM 11.0, while
Praveen's work [VCIP, 2013] saves only 0.04% BDBR. Besides, the proposal
obtains better video quality than Praveen's work in background noise
reduction and block artifact smoothing.
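
The idea that a pixel is usable when its SAO category is not affected by DF can be sketched as follows. The `eo_category` function follows the standard HEVC edge offset classification of a sample against its two neighbors along the chosen direction. The stability check is an assumption made for illustration: it varies only the center pixel over a hypothetical `max_change` range, ignores clipping to the valid sample range, and does not model the dissertation's boundary strength prediction.

```python
def eo_category(a, c, b):
    """HEVC SAO edge offset category of sample c with neighbors a and b."""
    if c < a and c < b:
        return 1  # local minimum
    if (c < a and c == b) or (c == a and c < b):
        return 2  # concave corner
    if (c > a and c == b) or (c == a and c > b):
        return 3  # convex corner
    if c > a and c > b:
        return 4  # local maximum
    return 0      # none of the above: no offset applied


def category_is_stable(a, c, b, max_change):
    """True if the category of c is the same for every value DF might give it.

    max_change is a hypothetical bound on how far deblocking can move c;
    only the center sample is varied (a simplifying assumption).
    """
    base = eo_category(a, c, b)
    return all(eo_category(a, c + d, b) == base
               for d in range(-max_change, max_change + 1))


print(category_is_stable(100, 50, 100, max_change=4))  # deep minimum
print(category_is_stable(100, 98, 100, max_change=4))  # shallow minimum
```

A deep local minimum stays in category 1 whatever DF does within the bound, so it could be used before DF finishes, whereas a shallow one may change category and would be excluded under this sketch.
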
In Chapter 5, the overall dissertation is summarized and future work is
described. Based on the new features of intra/inter prediction and SAO in
HEVC, three VLSI-friendly algorithms are presented in this dissertation.
Compared with conventional works, about 15% to 28% time saving is achieved
in the original HEVC with negligible quality loss through the combination of
image texture direction (edge detection and motion vector) and relative
strength (direction, quantization and boundary strength). Besides, the
proposed methods can shorten the longest path and solve the boundary data
dependency. These results contribute to the realization of an HEVC encoder
VLSI, which is the key to realizing next-generation high-resolution video
systems, and contribute greatly to the reduction of cost and power in
practical use. Moreover, the hardware-oriented approach based on image
texture direction and relative strength contributes greatly to academic
research.
