Polymorphic ASIC: For Video Decoding by Adarsha Rao, S J
Abstract
Video applications have become ubiquitous owing to an explosion in the number
of devices with video capture and display capabilities. Traditionally, video applications are
implemented on a variety of devices, with each device targeting a specific application. However,
advances in technology have created a need to support multiple applications on a single
device such as a smartphone or tablet. Such convergence of applications necessitates support for
interoperability among various applications, scalable performance to meet the requirements of
different applications, and a high degree of reconfigurability to accommodate rapid evolution
in application features. In addition, many video applications have very stringent low-power
requirements.
Conventional custom hardware implementations of video applications deliver high
performance at low power consumption, while recent MPSoC implementations enable a high
degree of interoperability and help support application evolution. In this thesis, we
combine the best features of the custom hardware and MPSoC approaches to design a
Polymorphic ASIC. A Polymorphic ASIC is an integrated circuit designed to meet the requirements
of several applications belonging to a particular domain. It consists of
a fabric of computation, storage and communication resources from which applications are
composed dynamically. Although video applications differ widely in their internal
details of operation, at the heart of almost every video application is a video codec (encoder and
decoder). The requirements of scalability, high performance and low power consumption are
very stringent for video decoding. Therefore, this thesis focuses mainly on the architectural
design of a Polymorphic ASIC for video decoding.
We present a unified software and hardware architecture (USHA) for the Polymorphic ASIC.
USHA is a tiled architecture that uses loosely coupled processor and hardware tiles, which are
software programmable and hardware reconfigurable, respectively. The distinctive feature of the
Polymorphic ASIC is the static partitioning of the application and the dynamic mapping of
application processes onto the computational tiles. Depending on the application scenario, a
process may be mapped onto one of the hardware or processor tiles. The Polymorphic ASIC
incorporates a network-on-chip (NoC) to achieve flexible communication across different tiles.
Formulating a programming framework for the Polymorphic ASIC requires an implementation
model that captures the structure of video decoder applications as well as the properties
of the Polymorphic ASIC architecture. We derive an implementation model based on a
combination of parametric polyhedral process networks, stream-based functions and windowed
dataflow models of computation. The implementation model leads to a process-network-oriented
compilation flow that achieves realization-agnostic application partitioning and enables
seamless migration across uniprocessor, multiprocessor, semi-hardware and full-hardware
configurations of a video decoder. The thesis also presents an application QoS-aware scheduler
that selects the decoder configuration that best meets the application performance requirements,
thereby enabling dynamic performance scaling.
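As a rough illustration of the configuration selection described above, the following minimal sketch picks a decoder configuration for a requested frame rate. It is not the scheduler presented in the thesis: the `DecoderConfig` type, the `select_config` function, the selection policy (lowest power among configurations that meet the frame rate) and all throughput and power figures are hypothetical placeholders chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class DecoderConfig:
    """Hypothetical description of one decoder configuration."""
    name: str
    max_fps: float    # assumed sustained frame rate (placeholder value)
    power_mw: float   # assumed average power draw (placeholder value)


# Placeholder figures for illustration only; they are NOT measurements
# reported in the thesis.
CONFIGS = (
    DecoderConfig("uniprocessor", 5.0, 100.0),
    DecoderConfig("multiprocessor", 15.0, 250.0),
    DecoderConfig("semi-hardware", 30.0, 180.0),
    DecoderConfig("full-hardware", 60.0, 220.0),
)


def select_config(required_fps: float,
                  configs: Sequence[DecoderConfig] = CONFIGS) -> DecoderConfig:
    """Pick a configuration for the requested frame rate.

    Assumed policy: among configurations that can sustain the requested
    frame rate, choose the one with the lowest power; if none can,
    fall back to the fastest configuration available.
    """
    feasible = [c for c in configs if c.max_fps >= required_fps]
    if feasible:
        return min(feasible, key=lambda c: c.power_mw)
    return max(configs, key=lambda c: c.max_fps)


if __name__ == "__main__":
    for fps in (4.0, 10.0, 45.0, 120.0):
        print(fps, "->", select_config(fps).name)
```

Under these assumed numbers, a low frame-rate request maps to the uniprocessor configuration, while a request that no configuration can satisfy falls back to the fastest one. A real scheduler would presumably rely on measured per-configuration performance and on the tiles currently available.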
The memory hierarchy of the Polymorphic ASIC makes use of an application-specific cache.
Through a combined analysis of miss rate and external memory bandwidth, we show that the
degradation in decoder performance due to memory stall cycles depends on the properties of
the video being decoded as well as the behavior of the external memory interface. Based on
this observation, we present the design of a reconfigurable 2-D cache architecture that can
adjust its parameters in accordance with the characteristics of the video stream being decoded.
We validate the Polymorphic ASIC using a proof-of-concept implementation on an FPGA.
The performance of an H.264 decoder on the Polymorphic ASIC is evaluated for uniprocessor,
multiprocessor, hardware-accelerated and full-hardware configurations. The scaling in performance
delivered by these configurations shows that the Polymorphic ASIC enables the application to
achieve superlinear speedups [1]. The experimental results show that different implementations
of an H.264 video decoder on the Polymorphic ASIC can deliver performance comparable
to a wide spectrum of devices, ranging from embedded processors like the ARM9 to MPSoCs like
IBM Cell. We also present the energy consumption of various video decoder configurations
on the Polymorphic ASIC and an application-to-configuration mapping aimed at minimizing the
overall energy consumption of the Polymorphic ASIC.
