4,657 research outputs found
An adaptive partial distortion search for block motion estimation
2002-2003 > Academic research: refereed > Refereed conference paperVersion of RecordPublishe
Fast motion estimation algorithm in H.264 standard
In H.264/AVC standard, the block motion estimation pattern is used to estimate the motion which is a very time consuming part. Although many fast algorithms have been proposed to reduce the huge calculation, the motion estimation time still cannot achieve the critical real time application. So to develop an algorithm which will be fast and having low complexity became a challenge in this standard.For this reasons, a lot of block motion estimation algorithms have been proposed. Typically the block motion estimation part is categorized into two parts. (1) Single pixel motion estimation (2) Fractional pixel motion estimation. In single pixel motion estimation one kind of fast motion algorithm uses fixed pattern like Three Step search, 2-D Logarithmic Search. Four Step search,Diamond Search, Hexagon Based Search. These algorithms are able to reduce the search point and get good coding quality. But the coding quality decreases when the fixed pattern does not fit the real life video sequence. In this thesis we tried to reduce the time complexity and number of search point by using an early termination method which is called adaptive threshold selection. We have used this method in three step search (TSS) and four step search and compared the performance with already existing block matching algorithm.This thesis work proposes fast sub-pixel motion estimation techniques having lower computational complexity. The proposed methods are based on mathematical models of the motion compensated prediction errors in compressing moving pictures. Unlike conventional hierarchical motion estimation techniques, the proposed methods avoid sub-pixel interpolation and subsequent secondary search after the integer-precision motion estimation, resulting in reduced computational time. In order to decide the coefficients of the models, the motion-compensated prediction errors of the neighboring pixels around the integer-pixel motion vector are utilized
Adaptive parallel video-coding algorithm
Parallel encoding of video inevitably frame rate gives varying rate performance due to dynamically changing video content and motion field since the encoding process of each macro-block, especially motion estimation, is data dependent. A multiprocessor schedule optimized for a particular frame with certain macro-block encoding time may not be optimized towards another frame with different encoding time, which causes performance degradation to the parallelization. To tackle this problem, we propose a method based on a batch of near-optimal schedules generated at compile-time and a run-time mechanism to select the schedule giving the shortest predicted critical path length. This method has the advantage of being near-optimal using compile-time schedules while involving only run-time selection rather than re-scheduling. Implementation on the IBM SP2 multiprocessor system using 24 processors gives an average speedup of about 13.5 (frame rate of 38.5 frames per second) for a CIF sequence consisting of segments of 6 different scenes. This is equivalent to an average improvement of about 16.9% over the single schedule scheme with schedule adapted to each of the scenes. Using an open test sequence consisting of 8 video segments, the average improvement achieved is 13.2%, i.e. an average speedup of 13.3 (35.6 frames per second).published_or_final_versio
Fast block-based image restoration employing the improved best neighborhood matching approach
2005-2006 > Academic research: refereed > Publication in refereed journalVersion of RecordPublishe
Deformable Object Tracking with Gated Fusion
The tracking-by-detection framework receives growing attentions through the
integration with the Convolutional Neural Networks (CNNs). Existing
tracking-by-detection based methods, however, fail to track objects with severe
appearance variations. This is because the traditional convolutional operation
is performed on fixed grids, and thus may not be able to find the correct
response while the object is changing pose or under varying environmental
conditions. In this paper, we propose a deformable convolution layer to enrich
the target appearance representations in the tracking-by-detection framework.
We aim to capture the target appearance variations via deformable convolution,
which adaptively enhances its original features. In addition, we also propose a
gated fusion scheme to control how the variations captured by the deformable
convolution affect the original appearance. The enriched feature representation
through deformable convolution facilitates the discrimination of the CNN
classifier on the target object and background. Extensive experiments on the
standard benchmarks show that the proposed tracker performs favorably against
state-of-the-art methods
Recommended from our members
Research and developments of Dirac video codec
This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University.In digital video compression, apart from storage, successful transmission of the compressed video
data over the bandwidth limited erroneous channels is another important issue. To enable a video
codec for broadcasting application, it is required to implement the corresponding coding tools (e.g.
error-resilient coding, rate control etc.). They are normally non-normative parts of a video codec and
hence their specifications are not defined in the standard. In Dirac as well, the original codec is
optimized for storage purpose only and so, several non-normative part of the encoding tools are still
required in order to be able to use in other types of application.
Being the "Research and Developments of the Dirac Video Codec" as the research title, phase I of
the project is mainly focused on the error-resilient transmission over a noisy channel. The error-resilient
coding method used here is a simple and low complex coding scheme which provides the
error-resilient transmission of the compressed video bitstream of Dirac video encoder over the packet
erasure wired network. The scheme combines source and channel coding approach where error-resilient
source coding is achieved by data partitioning in the wavelet transformed domain and
channel coding is achieved through the application of either Rate-Compatible Punctured
Convolutional (RCPC) Code or Turbo Code (TC) using un-equal error protection between header plus
MV and data. The scheme is designed mainly for the packet-erasure channel, i.e. targeted for the
Internet broadcasting application.
But, for a bandwidth limited channel, it is still required to limit the amount of bits generated from
the encoder depending on the available bandwidth in addition to the error-resilient coding. So, in the
2nd phase of the project, a rate control algorithm is presented. The algorithm is based upon the Quality
Factor (QF) optimization method where QF of the encoded video is adaptively changing in order to
achieve average bitrate which is constant over each Group of Picture (GOP). A relation between the
bitrate, R and the QF, which is called Rate-QF (R-QF) model is derived in order to estimate the
optimum QF of the current encoding frame for a given target bitrate, R.
In some applications like video conferencing, real-time encoding and decoding with minimum
delay is crucial, but, the ability to do real-time encoding/decoding is largely determined by the
complexity of the encoder/decoder. As we all know that motion estimation process inside the encoder
is the most time consuming stage. So, reducing the complexity of the motion estimation stage will
certainly give one step closer to the real-time application. So, as a partial contribution toward realtime
application, in the final phase of the research, a fast Motion Estimation (ME) strategy is designed
and implemented. It is the combination of modified adaptive search plus semi-hierarchical way of
motion estimation. The same strategy was implemented in both Dirac and H.264 in order to
investigate its performance on different codecs. Together with this fast ME strategy, a method which
is called partial cost function calculation in order to further reduce down the computational load of the
cost function calculation was presented. The calculation is based upon the pre-defined set of patterns
which were chosen in such a way that they have as much maximum coverage as possible over the
whole block.
In summary, this research work has contributed to the error-resilient transmission of compressed
bitstreams of Dirac video encoder over a bandwidth limited error prone channel. In addition to this,
the final phase of the research has partially contributed toward the real-time application of the Dirac
video codec by implementing a fast motion estimation strategy together with partial cost function
calculation idea.BBC R&D and Brunel University
Human detection in surveillance videos and its applications - a review
Detecting human beings accurately in a visual surveillance system is crucial for diverse application areas including abnormal event detection, human gait characterization, congestion analysis, person identification, gender classification and fall detection for elderly people. The first step of the detection process is to detect an object which is in motion. Object detection could be performed using background subtraction, optical flow and spatio-temporal filtering techniques. Once detected, a moving object could be classified as a human being using shape-based, texture-based or motion-based features. A comprehensive review with comparisons on available techniques for detecting human beings in surveillance videos is presented in this paper. The characteristics of few benchmark datasets as well as the future research directions on human detection have also been discussed
Implementing video compression algorithms on reconfigurable devices
The increasing density offered by Field Programmable Gate Arrays(FPGA), coupled with their short design cycle, has made them a popular choice for implementing a wide range of algorithms and complete systems. In this thesis the implementation of video compression algorithms on FPGAs is studied. Two areas are specifically focused on; the integration of a video encoder into a complete
system and the power consumption of FPGA based video encoders.
Two FPGA based video compression systems are described, one which targets surveillance applications and one which targets video conferencing applications. The FPGA video surveillance system makes use of a novel memory format to
improve the efficiency with which input video sequences can be loaded over the system bus.
The power consumption of a FPGA video encoder is analyzed. The results indicating that the motion estimation encoder stage requires the most power consumption. An algorithm, which reuses the intra prediction results generated during the encoding process, is then proposed to reduce the power consumed on an FPGA video encoder’s external memory bus. Finally, the power reduction algorithm is implemented within an FPGA video encoder. Results are given showing that, in addition to reducing power on the external memory bus, the algorithm also reduces power in the motion estimation stage of a FPGA based video encoder
- …