Distributed video coding for wireless video sensor networks: a review of the state-of-the-art architectures
Distributed video coding (DVC) is a relatively new video coding architecture that originates from two fundamental theorems, the Slepian–Wolf and Wyner–Ziv theorems. Recent research developments have made DVC attractive for applications in the emerging domain of wireless video sensor networks (WVSNs). This paper reviews state-of-the-art DVC architectures, focusing on their opportunities and gaps in addressing the operational requirements and application needs of WVSNs.
Side information exploitation, quality control and low complexity implementation for distributed video coding
Distributed video coding (DVC) is a new video coding methodology that shifts the highly complex motion search components from the encoder to the decoder. Such a video coder has a great advantage in encoding speed while still achieving rate-distortion (RD) performance similar to conventional coding solutions. Applications include wireless video sensor networks, mobile video cameras and wireless video surveillance. Although much progress has been made in DVC over the past ten years, there is still a gap in RD performance between conventional video coding solutions and DVC, and the latest developments remain far from standardization and practical use. Key problems remain in areas such as accurate and efficient side information generation and refinement, quality control between Wyner-Ziv frames and key frames, correlation noise modelling and decoder complexity.
In this context, this thesis proposes solutions that improve state-of-the-art side information refinement schemes, enable consistent quality control over decoded frames during the coding process and implement a highly efficient DVC codec.
This thesis investigates the impact of reference frames on side information (SI) generation and reveals that reference frames have the potential to be better side information than the extensively used interpolated frames. Based on this investigation, we also propose a motion range prediction (MRP) method to exploit reference frames and precisely guide the statistical motion learning process. Extensive simulation results show that choosing reference frames as SI performs competitively with, and sometimes better than, interpolated frames. Furthermore, the proposed MRP method significantly reduces decoding complexity without degrading RD performance.
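For readers unfamiliar with how side information is produced, the interpolated-frame baseline discussed above can be pictured as block-based motion-compensated interpolation between two key frames. The following is only a minimal illustrative sketch (block size, search range and the test signal are all hypothetical), not the thesis's actual SI generator:

```python
import numpy as np

def mc_interpolate_si(prev, nxt, block=8, search=4):
    """Toy side-information generator for a Wyner-Ziv frame midway
    between two key frames: block-match nxt against prev, halve the
    motion, and average the two motion-compensated predictions."""
    h, w = prev.shape
    si = np.zeros((h, w))
    for by in range(0, h, block):
        for bx in range(0, w, block):
            ref = nxt[by:by + block, bx:bx + block]
            best, mv = np.inf, (0, 0)
            for dy in range(-search, search + 1):   # exhaustive search
                for dx in range(-search, search + 1):
                    y, x = by + dy, bx + dx
                    if 0 <= y <= h - block and 0 <= x <= w - block:
                        sad = np.abs(prev[y:y + block, x:x + block] - ref).sum()
                        if sad < best:
                            best, mv = sad, (dy, dx)
            dy, dx = mv
            py = min(max(by + dy // 2, 0), h - block)  # half motion in prev
            px = min(max(bx + dx // 2, 0), w - block)
            ny = min(max(by - dy // 2, 0), h - block)  # half motion in nxt
            nx = min(max(bx - dx // 2, 0), w - block)
            si[by:by + block, bx:bx + block] = 0.5 * (
                prev[py:py + block, px:px + block]
                + nxt[ny:ny + block, nx:nx + block])
    return si

# Pure 2-pixel translation between key frames: the interpolated SI should
# match the (hypothetical) middle frame away from the wrap-around border.
prev = np.add.outer(np.arange(8.0), 8.0 * np.arange(24))
nxt = np.roll(prev, 2, axis=1)
si = mc_interpolate_si(prev, nxt)
```

For real sequences the linear-motion assumption fails around occlusions, which is exactly where reference frames or refinement schemes can give better SI.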
To minimize block artifacts and achieve consistent improvement in both the subjective and objective quality of side information, we propose a novel side information synthesis framework operating at pixel granularity. We synthesize the SI at pixel level to minimize block artifacts and adaptively adjust the correlation noise model to the new SI. Furthermore, we have fully implemented a state-of-the-art DVC decoder with the proposed framework, using serial and parallel processing technologies to identify bottlenecks and areas in which to further reduce decoding complexity, another major challenge for future practical DVC deployments. The performance is evaluated on the latest transform-domain DVC codec and compared with different standard codecs. Extensive experimental results show substantial and consistent rate-distortion gains over standard video codecs and significant speedup over the serial implementation.
In order to bring state-of-the-art DVC one step closer to practical use, we address the problem of distortion variation introduced by typical rate control algorithms, especially in a variable bit rate environment. Simulation results show that the proposed quality control algorithm can meet a user-defined target distortion with rather small variation for slow-motion sequences, and performs similarly to fixed quantization for fast-motion sequences at the cost of some RD performance.
Finally, we propose the first implementation of a distributed video encoder on a Texas Instruments TMS320DM6437 digital signal processor. The Wyner-Ziv (WZ) encoder is efficiently implemented using rate-adaptive low-density parity-check accumulate (LDPCA) codes, exploiting hardware features and optimization techniques to improve overall performance. Implementation results show that the WZ encoder encodes a QCIF frame in 134M instruction cycles on a TMS320DM6437 DSP running at 700 MHz, making it 29 times faster than the non-optimized encoder implementation. We also implemented a highly efficient DVC decoder using both serial and parallel technology on a PC-HPC (high performance cluster) architecture, where the encoder runs on a general-purpose PC and the decoder on a multicore HPC. The experimental results show that the parallelized decoder achieves about 10 times speedup over the serial implementation under various bit rates and GOP sizes, together with significant RD gains over the state-of-the-art DISCOVER codec.
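The rate adaptivity of LDPCA codes comes from accumulating syndromes, so the encoder can transmit increasingly dense subsets of the accumulated values until the decoder succeeds. A toy sketch of just the accumulation structure (the parity-check matrix and bit-plane below are hypothetical, and real LDPCA matrices are large and sparse):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy parity-check matrix H (m x n) and source bit-plane x.
m, n = 6, 12
H = rng.integers(0, 2, size=(m, n))
x = rng.integers(0, 2, size=n)

s = H @ x % 2                          # per-row syndromes
a = np.bitwise_xor.accumulate(s)       # accumulated syndromes (LDPCA buffer)

# Sending only every second accumulated syndrome yields half-rate checks:
# a[3] ^ a[1] equals the syndrome of the merged row (H[2] + H[3]) mod 2,
# so fewer transmitted values still give valid (weaker) parity constraints.
merged_row = (H[2] + H[3]) % 2
half_rate_syndrome = a[3] ^ a[1]
```

This is why a single encoding pass can serve many rates: the decoder simply differences the accumulated syndromes it receives to form check equations at the granularity transmitted.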
Mesh-based video coding for low bit-rate communications
In this paper, a new method for low bit-rate content-adaptive mesh-based video coding is proposed. Intra-frame coding in this method employs feature map extraction for node distribution at specific threshold levels, placing initial nodes densely in regions that contain high-frequency features and, conversely, sparsely in smooth regions. Insignificant nodes are largely removed by a subsequent node elimination scheme. The Hilbert scan is then applied before quantization and entropy coding to reduce the amount of transmitted information. For moving images, both the node position and colour parameters of only a subset of nodes may change from frame to frame, so it suffices to transmit only these changed parameters. The proposed method is well suited to video coding at very low bit rates, as processing results demonstrate that it provides good subjective and objective image quality at a lower number of required bits.
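The Hilbert scan referred to above orders 2-D samples along a space-filling curve so that consecutive samples stay spatially adjacent, which benefits the subsequent entropy coding. A standard index-to-coordinate conversion (a generic textbook routine, not the authors' implementation) looks like this:

```python
def hilbert_d2xy(n, d):
    """Map distance d along the Hilbert curve to (x, y) on an n x n grid,
    where n is a power of two. Classic bit-twiddling conversion."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                      # rotate the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

# Scanning an 8x8 block: every cell is visited once, each step moves
# to a 4-neighbour, which is what keeps neighbouring values adjacent.
pts = [hilbert_d2xy(8, d) for d in range(64)]
```

Quantized node parameters laid out in this order tend to form long runs, improving run-length and entropy coding efficiency compared with a raster scan.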
Motion Scalability for Video Coding with Flexible Spatio-Temporal Decompositions
The research presented in this thesis aims to extend the scalability range of the
wavelet-based video coding systems in order to achieve fully scalable coding with a
wide range of available decoding points. Since the temporal redundancy regularly
comprises the main portion of the global video sequence redundancy, the techniques
that can be generally termed motion decorrelation techniques have a central role in
the overall compression performance. For this reason the scalable motion modelling
and coding are of utmost importance, and specifically, in this thesis possible
solutions are identified and analysed.
The main contributions of the presented research are grouped into two
interrelated and complementary topics. Firstly, a flexible motion model with a rate-optimised
estimation technique is introduced. The proposed motion model is based
on tree structures and allows high adaptability needed for layered motion coding. The
flexible structure for motion compensation allows for optimisation at different stages
of the adaptive spatio-temporal decomposition, which is crucial for scalable coding
that targets decoding on different resolutions. By utilising an adaptive choice of
wavelet filterbank, the model enables high compression based on efficient mode
selection. Secondly, solutions for scalable motion modelling and coding are
developed. These solutions are based on precision limiting of motion vectors and
creation of a layered motion structure that describes hierarchically coded motion.
The solution based on precision limiting relies on layered bit-plane coding of motion
vector values. The second solution builds on recently established techniques that
impose scalability on a motion structure. The new approach is based on two major
improvements: the evaluation of distortion in temporal subbands and motion search
in temporal subbands that finds the optimal motion vectors for layered motion
structure.
Exhaustive tests of rate-distortion performance in demanding scalable video
coding scenarios show the benefits of applying both the developed flexible motion
model and the proposed solutions for scalable motion coding.
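The precision-limiting solution above can be pictured as bit-plane coding of motion-vector values: the encoder emits most-significant planes first, and a decoder working at a lower rate or resolution simply stops after fewer planes. A minimal sketch (the motion-vector field and plane count are hypothetical):

```python
import numpy as np

def mv_bitplanes(mv, planes=4):
    """Split non-negative motion-vector magnitudes into bit-planes,
    most-significant plane first."""
    return [(mv >> p) & 1 for p in range(planes - 1, -1, -1)]

def mv_reconstruct(received, planes=4):
    """Rebuild from the planes received so far; missing LSBs read as 0."""
    mv = np.zeros_like(received[0])
    for i, plane in enumerate(received):
        mv |= plane << (planes - 1 - i)
    return mv

mv = np.array([5, 12, 3, 9])            # hypothetical 4-bit |MV| field
planes = mv_bitplanes(mv)
coarse = mv_reconstruct(planes[:2])     # decoder stopped after 2 planes
full = mv_reconstruct(planes)           # all planes: lossless
```

Each additional plane halves the worst-case motion-vector error, which is what lets a layered bitstream trade motion precision against rate.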
Design and demonstration of digital pre-distortion using software defined radio
Abstract. High data rates for a large number of users set tight requirements on signal quality, measured in terms of error vector magnitude (EVM). In radio transmitters, nonlinear distortion, dominated by power amplifiers (PAs), often limits the achievable EVM. However, linearity can be improved by linearization techniques. Digital pre-distortion (DPD) is a widely used linearization technique that achieves effective distortion reduction over a wide bandwidth. In DPD, the nonlinearity of the transmitter is pre-compensated in the digital domain to achieve a linear output. Moreover, DPD enables PAs to operate in the power-efficient region with decent linearity.
As we move towards millimetre-wave frequencies to enable wideband communications, the design of the DPD algorithm must be optimized in terms of performance and power consumption. Moreover, the continuous development of wireless infrastructure motivates research on programmable and reconfigurable platforms to decrease demonstration cost and time. This thesis presents how software defined radio (SDR) platforms can be used to demonstrate DPD.
The Universal Software Radio Peripheral (USRP) X300 is a commercial SDR platform. The chosen model, the X300, has two independent channels equipped with individual transceiver cards. SIMULINK is used to communicate with the device, and the two channels of the X300 are used simultaneously as transmitter and receiver in full-duplex mode, so a single USRP device acts as an operational transmitter and feedback receiver at the same time. The implemented USRP design consists of a SIMULINK-based transceiver design and a lookup-table-based DPD whose coefficients are calculated offline in MATLAB. An external PA, a ZFL-2000+, together with a directional coupler and attenuator, is connected between the TX/RX port and the RX2 port to measure the nonlinearity. The transceiver nonlinearity is measured with and without the external PA. The experimental results show decent linearization performance using the USRP platform, although the results vary widely with the USRP transceiver parameterization and the PA operating point. A 16-QAM test signal with 500 kHz bandwidth is fed to the USRP transmit chain. As an example, the DPD algorithm improves the EVM from 7.6% to 2.1% and reduces the ACPR by around 10 dB with the 16-QAM input signal, with approximately +2.2 dBm input power applied to the external PA.
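The lookup-table DPD idea can be sketched end to end in a few lines: tabulate the PA's AM/AM curve, invert it, and apply the inverse gain before the PA. The PA model, table size and test signal below are all hypothetical stand-ins, not the thesis's measured ZFL-2000+ behaviour:

```python
import numpy as np

def pa(x, a3=0.2):
    """Hypothetical memoryless PA model: mild odd-order compression."""
    return x * (1.0 - a3 * np.abs(x) ** 2)

# LUT construction: tabulate the AM/AM curve on its monotone region and
# invert it, so pa(predistort(x)) is (nearly) linear.
r_in = np.linspace(0.0, 0.8, 256)
r_out = pa(r_in)                         # monotone for this model

def predistort(x):
    r = np.abs(x)
    r_pd = np.interp(r, r_out, r_in)     # inverse AM/AM lookup
    gain = np.where(r > 1e-12, r_pd / r, 1.0)
    return x * gain

def evm(y, ref):
    return np.sqrt(np.mean(np.abs(y - ref) ** 2) / np.mean(np.abs(ref) ** 2))

# Hypothetical 16-QAM-like test signal, scaled into the invertible region.
rng = np.random.default_rng(1)
sym = ((rng.integers(0, 4, 200) * 2 - 3)
       + 1j * (rng.integers(0, 4, 200) * 2 - 3))
x = 0.15 * sym
evm_raw = evm(pa(x), x)                  # PA alone
evm_dpd = evm(pa(predistort(x)), x)      # LUT DPD then PA
```

A hardware implementation additionally needs the feedback receiver to estimate the AM/AM (and AM/PM) curves of the real PA, which is the role the second USRP channel plays above.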
Wave-equation based seismic multiple attenuation
Reflection seismology is widely used to map the subsurface geological structure of
the Earth. Seismic multiples contaminate seismic data and must therefore be
removed. For seismic multiple attenuation, wave-equation based methods have
proved effective in most cases; they involve two aspects: multiple prediction
and multiple subtraction. The targets of the two aspects are to develop and
apply a fully data-driven algorithm for multiple prediction, and a robust
technique for multiple subtraction. Building on schemes developed by others
towards these targets, this thesis addresses and tackles the problems of
wave-equation based seismic multiple attenuation with several approaches.
First, the issue of multiple attenuation in land seismic data is discussed. The
Multiple Prediction through Inversion (MPTI) method is extended to the
post-stack domain and the CMP domain to handle land data with low S/N ratio,
irregular geometry and missing traces. A running smooth filter and an adaptive
threshold K-NN (nearest neighbours) filter are proposed to help employ MPTI on
land data in the shot domain.
Secondly, the result of multiple attenuation depends largely on the
effectiveness of the adaptive subtraction. The expanded multi-channel matching
(EMCM) filter has proved effective. In this thesis, several strategies are
discussed to improve the result of EMCM. Among them, modelling and subtracting
the multiples according to their orders proves practical in enhancing the
effect of EMCM, and a masking filter is adopted to preserve the energy of
primaries. Moreover, an iterative application of EMCM is proposed to give an
optimised result.
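Adaptive subtraction of the kind EMCM performs can be pictured, in its simplest single-channel form, as a least-squares matching filter that shapes the predicted multiples before subtracting them. The sketch below is only that single-channel toy with hypothetical synthetic traces, not the multi-channel EMCM itself:

```python
import numpy as np

def match_and_subtract(data, pred, flen=5):
    """Least-squares matching filter: find f minimising ||data - pred*f||
    (causal convolution), then subtract the shaped multiple prediction."""
    n = len(data)
    M = np.zeros((n, flen))
    for j in range(flen):                # column j: pred delayed by j samples
        M[j:, j] = pred[:n - j]
    f, *_ = np.linalg.lstsq(M, data, rcond=None)
    return data - M @ f

# Synthetic trace: a lone primary spike plus multiples that are the
# predicted trace filtered by an unknown short wavelet (all hypothetical).
rng = np.random.default_rng(2)
pred = rng.standard_normal(200)
mult = np.convolve(pred, [0.7, -0.3, 0.1])[:200]
primary = np.zeros(200)
primary[50] = 1.0
data = primary + mult
residual = match_and_subtract(data, pred)
```

The danger this illustrates is also the motivation for masking filters and order-by-order subtraction: the filter is free to fit (and damage) primary energy whenever it overlaps the predicted multiples.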
Thirdly, given the limitations of current 3D seismic acquisition geometries,
sampling in the crossline direction is sparse, which seriously affects the
application of 3D multiple attenuation. To tackle this problem, a new approach
is proposed that applies a trajectory-stacking Radon transform along with the
energy spectrum. It can replace the time-consuming time-domain sparse
inversion with similar effectiveness and much higher efficiency.
Parallel computing is discussed in the thesis to enhance the efficiency of
these strategies. The Message-Passing Interface (MPI) environment is
implemented in most of the algorithms mentioned above and greatly improves
their efficiency.
High-Level Synthesis Based VLSI Architectures for Video Coding
High Efficiency Video Coding (HEVC) is the state-of-the-art video coding standard. Emerging applications like free-viewpoint video, 360-degree video, augmented reality and 3D movies require standardized extensions of HEVC. These include HEVC Scalable Video Coding (SHVC), HEVC Multiview Video Coding (MV-HEVC), MV-HEVC plus depth (3D-HEVC) and HEVC Screen Content Coding. 3D-HEVC is used for applications such as view synthesis and free-viewpoint video; the coding and transmission of depth maps in 3D-HEVC supports virtual view synthesis by algorithms such as Depth Image Based Rendering (DIBR). As a first step, we profiled the 3D-HEVC standard and identified its computationally intensive parts for efficient hardware implementation. One computationally intensive part of 3D-HEVC, HEVC and H.264/AVC is the interpolation filtering used for Fractional Motion Estimation (FME). The hardware implementation of the interpolation filtering is carried out using High-Level Synthesis (HLS) tools; the Xilinx Vivado Design Suite is used for the HLS implementation of the HEVC and H.264/AVC interpolation filters. As the complexity of digital systems has greatly increased, High-Level Synthesis offers great benefits: late architectural or functional changes without time-consuming rewriting of RTL code, algorithms that can be tested and evaluated early in the design cycle, and the development of accurate models against which the final hardware can be verified.
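For reference, HEVC's luma half-sample interpolation uses a symmetric 8-tap filter whose taps sum to 64. The sketch below shows plain horizontal half-pel filtering in software to make the arithmetic concrete; it is not the thesis's HLS code, and the edge-padding policy here is a simplification:

```python
import numpy as np

# HEVC luma half-sample filter taps (they sum to 64).
HEVC_HALF = np.array([-1, 4, -11, 40, 40, -11, 4, -1])

def halfpel_row(row):
    """Horizontal half-sample interpolation of one luma row with edge
    padding; output i lies between row[i] and row[i + 1]."""
    padded = np.pad(row.astype(np.int64), (3, 4), mode='edge')
    out = np.correlate(padded, HEVC_HALF, mode='valid')
    return (out + 32) >> 6               # round, then divide by 64

# On a linear ramp the half-pel values fall midway between neighbours
# (and round up); on a flat region the filter is transparent.
row = np.arange(20, dtype=np.int64)
hp = halfpel_row(row)
```

The all-integer multiply-accumulate structure, with shifts instead of division, is exactly what makes this filter attractive for a pipelined hardware datapath.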
Decoding-complexity-aware HEVC encoding using a complexity–rate–distortion model
The energy consumption of Consumer Electronic (CE) devices during media playback is inexorably linked to the computational complexity of decoding compressed video. Reducing a CE device's energy consumption is therefore becoming ever more challenging with increasing video resolutions and the growing complexity of video coding algorithms. To this end, this paper proposes a framework that alters the video bit stream to reduce decoding complexity while limiting the impact on coding efficiency. In this context, the paper (i) first analyses the trade-off between decoding complexity, video quality and bit rate with respect to a reference decoder implementation on a General Purpose Processor (GPP) architecture. Thereafter, (ii) a novel generic decoding-complexity-aware video coding algorithm is proposed to generate decoding complexity-rate-distortion optimized High Efficiency Video Coding (HEVC) bit streams.
The experimental results reveal that the bit streams generated by the proposed algorithm achieve 29.43% and 13.22% decoding complexity reductions for a similar video quality with minimal coding efficiency impact compared to state-of-the-art approaches when applied to the HM16.0 and openHEVC decoder implementations, respectively. In addition, analysis of the energy consumption behaviour for the same scenarios reveals up to 20% energy consumption reductions while achieving a video quality similar to that of HM 16.0 encoded HEVC bit streams.
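A decoding-complexity-aware encoder of this kind can be understood as extending the usual rate-distortion Lagrangian with a complexity term, choosing the mode m that minimises J(m) = D(m) + λ·R(m) + γ·C(m). A toy mode decision with entirely hypothetical per-mode numbers:

```python
# Hypothetical per-mode (distortion, rate, decoding complexity) figures.
modes = {
    'intra':      (2.0, 100.0, 30.0),
    'inter_full': (1.0,  60.0, 90.0),   # best RD but costly to decode
    'inter_fast': (1.3,  65.0, 40.0),   # slightly worse RD, cheap decode
}

def best_mode(lmbda, gamma):
    """Minimise J(m) = D(m) + lambda * R(m) + gamma * C(m)."""
    return min(modes, key=lambda m: (modes[m][0]
                                     + lmbda * modes[m][1]
                                     + gamma * modes[m][2]))

rd_choice = best_mode(lmbda=0.05, gamma=0.0)    # plain RD optimisation
crd_choice = best_mode(lmbda=0.05, gamma=0.02)  # complexity-aware
```

With γ = 0 the encoder falls back to conventional RD optimisation; raising γ steers decisions towards modes the target decoder can process cheaply, which is the trade-off the paper's C-R-D model quantifies.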
A new Edge Detector Based on Parametric Surface Model: Regression Surface Descriptor
In this paper we present a new methodology for edge detection in digital
images. The first originality of the proposed method is to consider image
content as a parametric surface. Then, an original parametric local model of
this surface representing image content is proposed. The few parameters
involved in the proposed model are shown to be very sensitive to
discontinuities in the surface, which correspond to edges in the image content. This
naturally leads to the design of an efficient edge detector. Moreover, a
thorough analysis of the proposed model also allows us to explain how these
parameters can be used to obtain edge descriptors such as orientations and
curvatures.
In practice, the proposed methodology offers two main advantages. First, it
has high customization possibilities in order to be adjusted to a wide range of
different problems, from coarse to fine scale edge detection. Second, it is
very robust to blurring and additive noise. Numerical results are
presented to emphasize these properties and to confirm the efficiency of the
proposed method through a comparative study with other edge detectors.
Comment: 21 pages, 13 figures and 2 tables