1,148 research outputs found
A reduced-reference perceptual image and video quality metric based on edge preservation
In image and video compression and transmission, it is important to rely on an objective image/video quality metric that accurately represents the subjective quality of processed images and video sequences. In some scenarios, it is also important to evaluate the quality of the received video sequence with minimal reference to the transmitted one. For instance, for quality improvement of video transmission through closed-loop optimisation, the video quality measure can be evaluated at the receiver and provided as feedback information to the system controller. The original image/video sequence, prior to compression and transmission, is not usually available at the receiver side, so the receiver must rely on an objective video quality metric that needs no reference, or only minimal reference, to the original video sequence. The observation that the human eye is very sensitive to edge and contour information of an image underpins the proposal of our reduced-reference (RR) quality metric, which compares edge information between the distorted and the original image. Results highlight that the metric correlates well with subjective observations, also in comparison with commonly used full-reference metrics and with a state-of-the-art RR metric. © 2012 Martini et al.
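The idea of comparing edge information between the original and the distorted image can be illustrated with a small sketch. This is not the metric published by the authors: the Sobel operator, the threshold value, and the function names `sobel_edges` and `edge_preservation` are assumptions chosen for illustration. The sender would transmit only the reference edge set as side information, and the receiver would compute the fraction of those edges still present in the received image.

```python
def sobel_edges(img, threshold=100.0):
    """Return the set of (row, col) positions whose Sobel gradient
    magnitude exceeds `threshold`. `img` is a list of rows of grey levels."""
    height, width = len(img), len(img[0])
    edges = set()
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = (img[r - 1][c + 1] + 2 * img[r][c + 1] + img[r + 1][c + 1]
                  - img[r - 1][c - 1] - 2 * img[r][c - 1] - img[r + 1][c - 1])
            gy = (img[r + 1][c - 1] + 2 * img[r + 1][c] + img[r + 1][c + 1]
                  - img[r - 1][c - 1] - 2 * img[r - 1][c] - img[r - 1][c + 1])
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges.add((r, c))
    return edges


def edge_preservation(reference_edges, distorted_img, threshold=100.0):
    """Fraction of reference edge pixels that are still edges in the
    distorted image (1.0 = all preserved, 0.0 = none preserved)."""
    if not reference_edges:
        return 1.0  # nothing to preserve (illustrative convention)
    distorted_edges = sobel_edges(distorted_img, threshold)
    return len(reference_edges & distorted_edges) / len(reference_edges)


# Illustrative data: a 6x6 image with a vertical step edge, and a flat image.
step = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
flat = [[128] * 6 for _ in range(6)]
reference_edges = sobel_edges(step)  # reduced-reference side information
print(edge_preservation(reference_edges, step))
print(edge_preservation(reference_edges, flat))
```

Only the edge set, not the full image, has to reach the receiver, which is what makes a scheme of this kind reduced-reference rather than full-reference.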
Underwater image restoration: super-resolution and deblurring via sparse representation and denoising by means of marine snow removal
Underwater imaging has been widely used as a tool in many fields; however, a major issue is the quality of the resulting images/videos. Due to the light's interaction with water and its constituents, the acquired underwater images/videos often suffer from a significant amount of scatter (blur, haze) and noise. In the light of these issues, this thesis considers problems of low-resolution, blurred and noisy underwater images and proposes several approaches to improve the quality of such images/video frames.
Quantitative and qualitative experiments validate the success of the proposed algorithms.
3D multiple description coding for error resilience over wireless networks
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Mobile communications have attracted growing interest from both customers and service providers over the last one to two decades. Visual information is used in many application domains such as remote health care, video on demand, broadcasting, video surveillance etc. In order to enhance the visual effects of digital video content, depth perception needs to be provided together with the actual visual content. 3D video has earned significant interest from the research community in recent years, due to the tremendous impact it leaves on viewers and its enhancement of the user's quality of experience (QoE). In the near future, 3D video is likely to be used in most video applications, as it offers a greater sense of immersion and perceptual experience. When 3D video is compressed and transmitted over error-prone channels, the associated packet loss leads to visual quality degradation. When a picture is lost or corrupted so severely that the concealment result is not acceptable, the receiver typically pauses video playback and waits for the next INTRA picture to resume decoding. Error propagation caused by employing predictive coding may degrade the video quality severely. There are several ways to mitigate the effects of such transmission errors. One widely used technique in international video coding standards is error resilience.
The motivation behind this research work is that existing schemes for 2D colour video compression such as MPEG, JPEG and H.263 cannot be applied to 3D video content. 3D video signals contain depth as well as colour information and are bandwidth demanding, as they require the transmission of multiple high-bandwidth 3D video streams. On the other hand, the capacity of wireless channels is limited and wireless links are prone to various types of errors caused by noise, interference, fading, handoff, error bursts and network congestion. Given the maximum bit-rate budget available to represent the 3D scene, the bit rate should be allocated between texture and depth information so that rendering distortion/losses are minimised. To mitigate the effect of these errors on the perceptual 3D video quality, error-resilient video coding needs to be investigated further to offer better quality of experience (QoE) to end users.
This research work aims at enhancing the error resilience capability of compressed 3D video, when transmitted over mobile channels, using Multiple Description Coding (MDC) in order to improve the user's quality of experience (QoE).
Furthermore, this thesis examines the sensitivity of the human visual system (HVS) when employed to view 3D video scenes. The approach used in this study is subjective testing, in which people's perception of 3D video under error-free and error-prone conditions is rated through a carefully designed bespoke questionnaire. Petroleum Technology Development Fund (PTDF)
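The general principle of Multiple Description Coding can be sketched as follows. The abstract does not say which MDC scheme the thesis uses, so the temporal odd/even frame split below is an assumption chosen because it is one of the simplest variants; the names `split_descriptions` and `reconstruct` are hypothetical. Each description is sent independently, and if one is lost, the receiver conceals the missing frames by repeating the nearest frame from the surviving description.

```python
def split_descriptions(frames):
    """Split a frame sequence into two descriptions: even- and odd-indexed frames."""
    return frames[0::2], frames[1::2]


def reconstruct(even, odd, total):
    """Rebuild `total` frames from the two descriptions.

    Pass None for a description that was lost in transmission. Missing frames
    are concealed by repeating the nearest frame of the surviving description.
    """
    output = []
    for i in range(total):
        primary, fallback = (even, odd) if i % 2 == 0 else (odd, even)
        if primary is not None:
            output.append(primary[i // 2])
        elif fallback:
            nearest = min(i // 2, len(fallback) - 1)
            output.append(fallback[nearest])
        else:
            output.append(None)  # both descriptions lost: nothing to show
    return output


# Illustrative data: frame labels stand in for decoded pictures.
frames = ["f0", "f1", "f2", "f3", "f4"]
even, odd = split_descriptions(frames)
print(reconstruct(even, odd, len(frames)))   # both descriptions received
print(reconstruct(even, None, len(frames)))  # odd description lost
```

With both descriptions the sequence is recovered exactly; with one description the video plays at half the frame rate instead of stalling, which is the graceful degradation MDC aims for.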
A reduced reference video quality assessment method for provision as a service over SDN/NFV-enabled networks
139 p. The proliferation of multimedia applications and services has generated a noteworthy upsurge in network traffic regarding video content and has created the need for trustworthy service quality assessment methods. Currently, Network Function Virtualization (NFV), Software Defined Networking (SDN) and 5G mobile networks equipped with small cells hold a predominant position among the technological trends in telecommunication networks. Additionally, Video Quality Assessment (VQA) methods are a very useful tool for both content providers and network operators to understand how users perceive quality, and thus to study the feasibility of potential services and adapt the available network resources to satisfy user requirements.
Pareto Optimized Large Mask Approach for Efficient and Background Humanoid Shape Removal
The purpose of automated video object removal is not only to detect and remove the object of interest automatically, but also to utilize background context to inpaint the foreground area. Video inpainting requires filling spatiotemporal gaps in a video with convincing material, necessitating both temporal and spatial consistency; the inpainted part must seamlessly integrate into the background in a variety of scenes, and it must maintain a consistent appearance in subsequent frames even if its surroundings change noticeably. We introduce a deep learning-based methodology for removing unwanted human-like shapes in videos. The method uses Pareto-optimized Generative Adversarial Network (GAN) technology, which is a novel contribution. The system automatically selects the Region of Interest (ROI) for each humanoid shape and uses a skeleton detection module to determine which humanoid shape to retain. The semantic masks of human-like shapes are created using a semantic-aware occlusion-robust model that has four primary components: feature extraction, and local, global, and semantic branches. The global branch encodes occlusion-aware information to make the extracted features resistant to occlusion, while the local branch retrieves fine-grained local characteristics. A modified big mask inpainting approach is employed to eliminate a person from the image, leveraging Fast Fourier convolutions and utilizing polygonal chains and rectangles with unpredictable aspect ratios. The inpainter network takes the input image and the mask to create an output image excluding the background humanoid shapes. The generator uses an encoder-decoder structure with skip connections to recover spatial information, and dilated convolution and squeeze-and-excitation blocks to make the regions behind the humanoid shapes consistent with their surroundings.
The discriminator avoids dissimilar structure at the patch scale, and the refiner network catches features around the boundaries of each background humanoid shape. The efficiency was assessed using the Learned Perceptual Image Patch Similarity (LPIPS), Frechet Inception Distance (FID), and Structural Similarity Index Measure (SSIM) metrics and showed promising results in the fully automated background person removal task. The method is evaluated on two video object segmentation datasets (DAVIS, with LPIPS of 0.02, FID of 5.01 and SSIM of 0.79, and YouTube-VOS, with 0.03, 6.22 and 0.78 respectively) as well as a database of 66 distinct video sequences of people behind a desk in an office environment (0.02, 4.01, and 0.78 respectively).
3DVQM: 3D video quality monitor
This dissertation presents a research study and software implementation of an objective quality monitor for 3D video streams transmitted over networks with non-guaranteed packet delivery due to errors, congestion, excessive delay, etc. A review of Video Quality Assessment (VQA) models available in the literature is first presented, addressing 2D and 3D video quality models that were selected as relevant for this research work.
A packet-layer VQA model is proposed based on header information from three different packet-layer levels: Network Abstraction Layer (NAL), Packetised Elementary Streams (PES) and MPEG-2 Transport Stream (TS). Transmission errors leading to undecodable TS packets are assumed to result in a whole frame loss. The proposed method estimates the size of the lost frames, which is used as a model parameter to predict their objective quality, measured as the Structural Similarity Index Metric (SSIM).
In order to materialise the proposed VQA model, a software application was developed that allows the quality of a corrupted 3D video stream to be monitored. To make the monitoring process as user friendly as possible, a Graphical User Interface (GUI) was developed. With this feature the user can interact with the application by controlling the input parameters and customizing the results on the output display.
The results show that the SSIM of isolated missing stereoscopic frames in 3D coded video can be predicted with a Root Mean Square Error (RMSE) of about 0.1 and a Pearson correlation coefficient of 0.8, taking the SSIM of uncorrupted frames as reference. It is concluded that the proposed model is capable of estimating the SSIM quite accurately using only the estimated sizes of single lost frames.
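A packet-layer monitor of this kind works from header fields alone. As a minimal sketch of that first step, the code below reads the fixed 4-byte MPEG-2 TS packet header and counts continuity-counter gaps per PID as a sign of lost packets. It is not the dissertation's software: the function names, the `make_packet` test helper and the simplified gap rule (duplicate packets and the discontinuity indicator are ignored) are assumptions made for illustration.

```python
TS_PACKET_SIZE = 188  # bytes in one MPEG-2 Transport Stream packet
SYNC_BYTE = 0x47      # first byte of every TS packet


def parse_ts_header(packet):
    """Decode the fixed 4-byte header of one TS packet into a dict."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid 188-byte TS packet")
    return {
        "transport_error": bool(packet[1] & 0x80),
        "payload_unit_start": bool(packet[1] & 0x40),
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        "adaptation_field_control": (packet[3] >> 4) & 0x03,
        "continuity_counter": packet[3] & 0x0F,
    }


def count_continuity_gaps(packets):
    """Count continuity-counter jumps per PID (simplified loss indicator).

    Only packets that carry a payload are checked, since the 4-bit counter
    is not incremented for packets without payload.
    """
    last_counter = {}
    gaps = 0
    for packet in packets:
        header = parse_ts_header(packet)
        if not header["adaptation_field_control"] & 0x01:
            continue  # no payload in this packet
        pid, counter = header["pid"], header["continuity_counter"]
        if pid in last_counter and counter != (last_counter[pid] + 1) % 16:
            gaps += 1
        last_counter[pid] = counter
    return gaps


def make_packet(pid, counter, unit_start=False):
    """Hypothetical helper: build a payload-only TS packet with zeroed payload."""
    header = bytes([
        SYNC_BYTE,
        (0x40 if unit_start else 0x00) | ((pid >> 8) & 0x1F),
        pid & 0xFF,
        0x10 | (counter & 0x0F),  # adaptation_field_control = 01 (payload only)
    ])
    return header + bytes(TS_PACKET_SIZE - len(header))


# Illustrative stream on PID 0x100 in which the packet with counter 3 is missing.
stream = [make_packet(0x100, c) for c in (0, 1, 2, 4)]
print(parse_ts_header(stream[0])["pid"], count_continuity_gaps(stream))
```

A real monitor would go on to map lost TS packets to PES packets and NAL units in order to estimate the size of the lost frame, as the abstract describes; that mapping is outside the scope of this sketch.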
OFDM techniques for multimedia data transmission
Orthogonal Frequency Division Multiplexing (OFDM) is an efficient parallel data transmission scheme that has relatively recently become popular in both wired and wireless communication systems for the transmission of multimedia data. OFDM can be found at the core of well-known systems such as digital television/radio broadcasting, ADSL internet and wireless LANs. Research into the OFDM field continually looks at different techniques to make this type of transmission more efficient. More recent works in this area have considered the benefits of using wavelet transforms in place of the Fourier transforms traditionally used in OFDM systems, and other works have looked at data compression as a method of increasing throughput in these types of transmission systems. The work presented in this thesis considers the transmission of image and video data in traditional OFDM transmission and discusses the strengths and weaknesses of this method. This thesis also proposes a new type of OFDM system that combines transmission and data compression into one block. By merging these two processes into one, the complexity of the system is reduced, therefore promising to increase system efficiency. The results presented in this thesis show that the novel compressive OFDM method performs well in channels with a low signal-to-noise ratio. Comparisons with traditional OFDM with lossy compression show a large improvement in the quality of the data received with the new system when used in these noisy channel environments. The results also show that the new method is particularly effective for image and video data, since the highly correlated nature of images is ideal for effective transmission using the new technique. The new transmission technique proposed in this thesis also gives good results when considering computation time.
When compared to MATLAB simulations of a traditional DFT-based OFDM system with a separate compression block, the proposed transmission method was able to reduce the computation time by between a half and three-quarters. This decrease in computational complexity also contributes to transmission efficiency when considering the new method.
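For context, the traditional DFT-based OFDM baseline mentioned above can be sketched in a few lines. This illustrates only the conventional scheme, not the compressive OFDM system proposed in the thesis; the function names, the naive O(N^2) DFT, the QPSK symbols and the ideal noise-free channel are all assumptions chosen to keep the example self-contained.

```python
import cmath


def dft(values, inverse=False):
    """Naive discrete Fourier transform (the inverse is scaled by 1/N)."""
    n = len(values)
    sign = 1 if inverse else -1
    result = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                    for m, v in enumerate(values))
        result.append(total / n if inverse else total)
    return result


def ofdm_modulate(symbols, cp_len):
    """Map one symbol per subcarrier to time samples and prepend a cyclic prefix."""
    time_samples = dft(symbols, inverse=True)
    prefix = time_samples[len(time_samples) - cp_len:]
    return prefix + time_samples


def ofdm_demodulate(samples, cp_len):
    """Drop the cyclic prefix and recover the per-subcarrier symbols."""
    return dft(samples[cp_len:])


# Illustrative data: eight QPSK symbols sent over an ideal (noise-free) channel.
qpsk = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j, 1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
transmitted = ofdm_modulate(qpsk, cp_len=2)
received = ofdm_demodulate(transmitted, cp_len=2)
max_error = max(abs(a - b) for a, b in zip(qpsk, received))
print(len(transmitted), max_error < 1e-9)
```

In this baseline, any compression of the image or video data happens in a separate block before the symbols are mapped to subcarriers; the thesis proposes merging those two stages into one.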