Space-variant picture coding
PhD
Space-variant picture coding techniques exploit the strong spatial non-uniformity of
the human visual system in order to increase coding efficiency in terms of perceived quality
per bit. This thesis extends space-variant coding research in two directions. The first of
these directions is in foveated coding. Past foveated coding research has been dominated
by the single-viewer, gaze-contingent scenario. However, for research into the multi-viewer
and probability-based scenarios, this thesis presents a missing piece: an algorithm for computing
an additive multi-viewer sensitivity function based on an established eye resolution
model, and, from this, a blur map that is optimal in the sense of discarding frequencies in
least-noticeable-first order. Furthermore, for the application of a blur map, a novel algorithm
is presented for the efficient computation of high-accuracy smoothly space-variant
Gaussian blurring, using a specialised filter bank which approximates perfect space-variant
Gaussian blurring to arbitrarily high accuracy and at greatly reduced cost compared to
the brute force approach of employing a separate low-pass filter at each image location.
The second direction is that of artificially increasing the depth-of-field of an image, an
idea borrowed from photography with the advantage of allowing an image to be reduced
in bitrate while retaining or increasing overall aesthetic quality. Two synthetic
depth-of-field algorithms are presented herein, with the desirable properties of aiming to mimic
occlusion effects as occur in natural blurring, and of handling any number of blurring
and occlusion levels with the same level of computational complexity. The merits of this
coding approach have been investigated by subjective experiments to compare it with
single-viewer foveated image coding. The results found depth-based preblurring to be,
in general, significantly preferable to the same level of foveation blurring.
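
The filter-bank idea described in the abstract above can be illustrated with a minimal sketch. This is not the thesis's specialised filter bank: it is a generic approximation, assumed here for illustration, that blurs a 1-D signal at a few fixed Gaussian widths and then blends the two nearest outputs at each sample according to a blur map. The function names and the parameters `sigma_map` and `bank_sigmas` are hypothetical.

```python
import math

def gaussian_kernel(sigma):
    """Return a normalised 1-D Gaussian kernel; sigma <= 0 gives the identity."""
    if sigma <= 0:
        return [1.0]
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_uniform(signal, sigma):
    """Blur with one fixed sigma, replicating edge samples at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index to the signal
            acc += w * signal[j]
        out.append(acc)
    return out

def blur_space_variant(signal, sigma_map, bank_sigmas):
    """Approximate per-sample blurring by blending the two nearest bank outputs.

    Assumes bank_sigmas is sorted ascending and has at least two entries.
    """
    bank = [blur_uniform(signal, s) for s in bank_sigmas]  # one pass per level
    out = []
    for i, sigma in enumerate(sigma_map):
        sigma = min(max(sigma, bank_sigmas[0]), bank_sigmas[-1])
        hi = 1
        while hi < len(bank_sigmas) - 1 and bank_sigmas[hi] < sigma:
            hi += 1
        lo = hi - 1
        t = (sigma - bank_sigmas[lo]) / (bank_sigmas[hi] - bank_sigmas[lo])
        out.append((1 - t) * bank[lo][i] + t * bank[hi][i])
    return out
```

The cost is one uniform blur per bank level plus a per-sample blend, rather than a separate low-pass filter at every location. Linear blending between levels is only a rough approximation of a true Gaussian at the intermediate width; the accuracy claims in the abstract refer to the thesis's own filter bank, not to this sketch.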
Attentive monitoring of multiple video streams driven by a Bayesian foraging strategy
In this paper we shall consider the problem of deploying attention to subsets
of the video streams for collating the most relevant data and information of
interest related to a given task. We formalize this monitoring problem as a
foraging problem. We propose a probabilistic framework to model the observer's
attentive behavior as the behavior of a forager. The forager, moment to moment,
focuses its attention on the most informative stream/camera, detects
interesting objects or activities, or switches to a more profitable stream. The
approach proposed here is suitable to be exploited for multi-stream video
summarization. Meanwhile, it can serve as a preliminary step for more
sophisticated video surveillance, e.g. activity and behavior analysis.
Experimental results achieved on the UCR Videoweb Activities Dataset, a
publicly available dataset, are presented to illustrate the utility of the
proposed technique.
Comment: Accepted to IEEE Transactions on Image Processing
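
The stay-or-switch behaviour of the forager described above can be sketched with a toy rule. This is not the paper's Bayesian model: it is a simple marginal-value-style heuristic, assumed here for illustration, in which the per-stream gain rates are taken as given rather than inferred. The function name `choose_stream` and the parameters `gain_rates` and `switch_cost` are hypothetical.

```python
def choose_stream(current, gain_rates, switch_cost=0.0):
    """Return the index of the stream to attend to next.

    gain_rates[i] is an assumed estimate of the information gained per unit
    time from stream i. The forager stays while the current stream is at
    least as good as the mean over all streams; otherwise it moves to the
    best stream, provided the gain still exceeds the current one after
    subtracting the cost of switching.
    """
    mean_rate = sum(gain_rates) / len(gain_rates)
    if gain_rates[current] >= mean_rate:
        return current  # current stream is still profitable enough
    best = max(range(len(gain_rates)), key=lambda i: gain_rates[i])
    if gain_rates[best] - switch_cost > gain_rates[current]:
        return best  # a more profitable stream justifies the switch
    return current

# Example: attending stream 1, whose rate has fallen below the mean.
next_stream = choose_stream(1, [0.9, 0.2, 0.1])
```

In a probabilistic setting such as the one the abstract describes, the rates would presumably be replaced by posterior estimates updated from what is observed in each stream.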
Content-prioritised video coding for British Sign Language communication.
Video communication of British Sign Language (BSL) is important for remote interpersonal communication and for the equal provision of services for deaf people. However, the use of video telephony and video conferencing applications for BSL communication is limited by inadequate video quality. BSL is a highly structured, linguistically complete, natural language system that expresses vocabulary and grammar visually and spatially using a complex combination of facial expressions (such as eyebrow movements, eye blinks and mouth/lip shapes), hand gestures, body movements and finger-spelling that change in space and time. Accurate natural BSL communication places specific demands on visual media applications which must compress video image data for efficient transmission. Current video compression schemes apply methods to reduce statistical redundancy and perceptual irrelevance in video image data based on a general model of Human Visual System (HVS) sensitivities. This thesis presents novel video image coding methods developed to achieve the conflicting requirements for high image quality and efficient coding. Novel methods of prioritising visually important video image content for optimised video coding are developed to exploit the HVS spatial and temporal response mechanisms of BSL users (determined by Eye Movement Tracking) and the characteristics of BSL video image content. The methods implement an accurate model of HVS foveation, applied in the spatial and temporal domains, at the pre-processing stage of a current standard-based system (H.264). Comparison of the performance of the developed and standard coding systems, using methods of video quality evaluation developed for this thesis, demonstrates improved perceived quality at low bit rates. BSL users, broadcasters and service providers benefit from the perception of high quality video over a range of available transmission bandwidths. 
The research community benefits from a new approach to video coding optimisation and a better understanding of the communication needs of deaf people.