320 research outputs found

    Super-resolution:A comprehensive survey

    Get PDF

    Confidence-based adaptive frame rate up-conversion

    Get PDF

    AUTOMATED ESTIMATION, REDUCTION, AND QUALITY ASSESSMENT OF VIDEO NOISE FROM DIFFERENT SOURCES

    Get PDF
    Estimating and removing noise from video signals is important to increase either the visual quality of video signals or the performance of video processing algorithms such as compression or segmentation where noise estimation or reduction is a pre-processing step. To estimate and remove noise, effective methods use both spatial and temporal information to increase the reliability of signal extraction from noise. The objective of this thesis is to introduce a video system having three novel techniques to estimate and reduce video noise from different sources, both effectively and efficiently and assess video quality without considering a reference non-noisy video. The first (intensity-variances based homogeneity classification) technique estimates visual noise of different types in images and video signals. The noise can be white Gaussian noise, mixed Poissonian- Gaussian (signal-dependent white) noise, or processed (frequency-dependent) noise. The method is based on the classification of intensity-variances of signal patches in order to find homogeneous regions that best represent the noise signal in the input signal. The method assumes that noise is signal-independent in each intensity class. To find homogeneous regions, the method works on the downsampled input image and divides it into patches. Each patch is assigned to an intensity class, whereas outlier patches are rejected. Then the most homogeneous cluster is selected and its noise variance is considered as the peak of noise variance. To account for processed noise, we estimate the degree of spatial correlation. To account for temporal noise variations a stabilization process is proposed. We show that the proposed method competes related state-of-the-art in noise estimation. The second technique provides solutions to remove real-world camera noise such as signal-independent, signal-dependent noise, and frequency-dependent noise. Firstly, we propose a noise equalization method in intensity and frequency domain which enables a white Gaussian noise filter to handle real noise. Our experiments confirm the quality improvement under real noise while white Gaussian noise filter is used with our equalization method. Secondly, we propose a band-limited time-space video denoiser which reduces video noise of different types. This denoiser consists of: 1) intensity-domain noise equalization to account for signal dependency, 2) band-limited anti-blocking time-domain filtering of current frame using motion-compensated previous and subsequent frames, 3) spatial filtering combined with noise frequency equalizer to remove residual noise left from temporal filtering, and 4) intensity de-equalization to invert the first step. To decrease the chance of motion blur, temporal weights are calculated using two levels of error estimation; coarse (blocklevel) and fine (pixel-level). We correct the erroneous motion vectors by creating a homography from reliable motion vectors. To eliminate blockiness in block-based temporal filter, we propose three ideas: interpolation of block-level error, a band-limited filtering by subtracting the back-signal beforehand, and two-band motion compensation. The proposed time-space filter is parallelizable to be significantly accelerated by GPU. We show that the proposed method competes related state-ofthe- art in video denoising. The third (sparsity and dominant orientation quality index) technique is a new method to assess the quality of the denoised video frames without a reference (clean frames). In many image and video applications, a quantitative measure of image content, noise, and blur is required to facilitate quality assessment, when the ground-truth is not available. We propose a fast method to find the dominant orientation of image patches, which is used to decompose them into singular values. Combining singular values with the sparsity of the patch in the transform domain, we measure the possible image content and noise of the patches and of the whole image. To measure the effect of noise accurately, our method takes both low and high textured patches into account. Before analyzing the patches, we apply a shrinkage in the transform domain to increase the contrast of genuine image structure. We show that the proposed method is useful to select parameters of denoising algorithms automatically in different noise scenarios such as white Gaussian and real noise. Our objective and subjective results confirm the correspondence between the measured quality and the ground-truth and proposed method rivals related state-of-the-art approaches

    Video content analysis for intelligent forensics

    Get PDF
    The networks of surveillance cameras installed in public places and private territories continuously record video data with the aim of detecting and preventing unlawful activities. This enhances the importance of video content analysis applications, either for real time (i.e. analytic) or post-event (i.e. forensic) analysis. In this thesis, the primary focus is on four key aspects of video content analysis, namely; 1. Moving object detection and recognition, 2. Correction of colours in the video frames and recognition of colours of moving objects, 3. Make and model recognition of vehicles and identification of their type, 4. Detection and recognition of text information in outdoor scenes. To address the first issue, a framework is presented in the first part of the thesis that efficiently detects and recognizes moving objects in videos. The framework targets the problem of object detection in the presence of complex background. The object detection part of the framework relies on background modelling technique and a novel post processing step where the contours of the foreground regions (i.e. moving object) are refined by the classification of edge segments as belonging either to the background or to the foreground region. Further, a novel feature descriptor is devised for the classification of moving objects into humans, vehicles and background. The proposed feature descriptor captures the texture information present in the silhouette of foreground objects. To address the second issue, a framework for the correction and recognition of true colours of objects in videos is presented with novel noise reduction, colour enhancement and colour recognition stages. The colour recognition stage makes use of temporal information to reliably recognize the true colours of moving objects in multiple frames. The proposed framework is specifically designed to perform robustly on videos that have poor quality because of surrounding illumination, camera sensor imperfection and artefacts due to high compression. In the third part of the thesis, a framework for vehicle make and model recognition and type identification is presented. As a part of this work, a novel feature representation technique for distinctive representation of vehicle images has emerged. The feature representation technique uses dense feature description and mid-level feature encoding scheme to capture the texture in the frontal view of the vehicles. The proposed method is insensitive to minor in-plane rotation and skew within the image. The capability of the proposed framework can be enhanced to any number of vehicle classes without re-training. Another important contribution of this work is the publication of a comprehensive up to date dataset of vehicle images to support future research in this domain. The problem of text detection and recognition in images is addressed in the last part of the thesis. A novel technique is proposed that exploits the colour information in the image for the identification of text regions. Apart from detection, the colour information is also used to segment characters from the words. The recognition of identified characters is performed using shape features and supervised learning. Finally, a lexicon based alignment procedure is adopted to finalize the recognition of strings present in word images. Extensive experiments have been conducted on benchmark datasets to analyse the performance of proposed algorithms. The results show that the proposed moving object detection and recognition technique superseded well-know baseline techniques. The proposed framework for the correction and recognition of object colours in video frames achieved all the aforementioned goals. The performance analysis of the vehicle make and model recognition framework on multiple datasets has shown the strength and reliability of the technique when used within various scenarios. Finally, the experimental results for the text detection and recognition framework on benchmark datasets have revealed the potential of the proposed scheme for accurate detection and recognition of text in the wild

    Advancements and Breakthroughs in Ultrasound Imaging

    Get PDF
    Ultrasonic imaging is a powerful diagnostic tool available to medical practitioners, engineers and researchers today. Due to the relative safety, and the non-invasive nature, ultrasonic imaging has become one of the most rapidly advancing technologies. These rapid advances are directly related to the parallel advancements in electronics, computing, and transducer technology together with sophisticated signal processing techniques. This book focuses on state of the art developments in ultrasonic imaging applications and underlying technologies presented by leading practitioners and researchers from many parts of the world

    Super resolution of B-mode ultrasound images with deep learning

    Get PDF
    Ultrasound offers a safe, non-invasive, and inexpensive way of imaging. However, due to its natural intrinsic imaging characteristics, it produces poor quality images with low resolution (LR) compared to other medical imaging modalities. Various image enhancement techniques have been extensively studied to overcome these shortcomings. Super-resolution (SR) is one of these methods, which endeavor to obtain high resolution (HR) images from LR images while enlarging them. Numerous studies have already utilized different SR techniques in various stages of ultrasound imaging (USI). Unlike other studies, which aimed at obtaining SR in the pre-processing phase or early stages of the post-processing phase of USI, we achieved SR on B-mode ultrasound images, which is the last stage of USI. We constructed a deep convolutional neural network (CNN) and trained it with a very large dataset of B-mode ultrasound images for the scale factors 2, 3, 4, and 8. We evaluated the performance of our proposed model quantitatively with eight image quality measures. The quantitative results revealed that our algorithm is much more successful than other methods at each magnification factor. Furthermore, we also verified that there is a statistically significant difference between our approach and others. Besides, qualitative analysis of the reconstructed images also confirms that it produces much better quality HR images than other methods in terms of the human visual system.Consejo Nacional de Investigaciones Cienta­ficas y Taccnicas 119E015 CONICE

    Super resolution and dynamic range enhancement of image sequences

    Get PDF
    Camera producers try to increase the spatial resolution of a camera by reducing size of sites on sensor array. However, shot noise causes the signal to noise ratio drop as sensor sites get smaller. This fact motivates resolution enhancement to be performed through software. Super resolution (SR) image reconstruction aims to combine degraded images of a scene in order to form an image which has higher resolution than all observations. There is a demand for high resolution images in biomedical imaging, surveillance, aerial/satellite imaging and high-definition TV (HDTV) technology. Although extensive research has been conducted in SR, attention has not been given to increase the resolution of images under illumination changes. In this study, a unique framework is proposed to increase the spatial resolution and dynamic range of a video sequence using Bayesian and Projection onto Convex Sets (POCS) methods. Incorporating camera response function estimation into image reconstruction allows dynamic range enhancement along with spatial resolution improvement. Photometrically varying input images complicate process of projecting observations onto common grid by violating brightness constancy. A contrast invariant feature transform is proposed in this thesis to register input images with high illumination variation. Proposed algorithm increases the repeatability rate of detected features among frames of a video. Repeatability rate is increased by computing the autocorrelation matrix using the gradients of contrast stretched input images. Presented contrast invariant feature detection improves repeatability rate of Harris corner detector around %25 on average. Joint multi-frame demosaicking and resolution enhancement is also investigated in this thesis. Color constancy constraint set is devised and incorporated into POCS framework for increasing resolution of color-filter array sampled images. Proposed method provides fewer demosaicking artifacts compared to existing POCS method and a higher visual quality in final image

    Artificial Intelligence in the Creative Industries: A Review

    Full text link
    This paper reviews the current state of the art in Artificial Intelligence (AI) technologies and applications in the context of the creative industries. A brief background of AI, and specifically Machine Learning (ML) algorithms, is provided including Convolutional Neural Network (CNNs), Generative Adversarial Networks (GANs), Recurrent Neural Networks (RNNs) and Deep Reinforcement Learning (DRL). We categorise creative applications into five groups related to how AI technologies are used: i) content creation, ii) information analysis, iii) content enhancement and post production workflows, iv) information extraction and enhancement, and v) data compression. We critically examine the successes and limitations of this rapidly advancing technology in each of these areas. We further differentiate between the use of AI as a creative tool and its potential as a creator in its own right. We foresee that, in the near future, machine learning-based AI will be adopted widely as a tool or collaborative assistant for creativity. In contrast, we observe that the successes of machine learning in domains with fewer constraints, where AI is the `creator', remain modest. The potential of AI (or its developers) to win awards for its original creations in competition with human creatives is also limited, based on contemporary technologies. We therefore conclude that, in the context of creative industries, maximum benefit from AI will be derived where its focus is human centric -- where it is designed to augment, rather than replace, human creativity

    High Dynamic Range Visual Content Compression

    Get PDF
    This thesis addresses the research questions of High Dynamic Range (HDR) visual contents compression. The HDR representations are intended to represent the actual physical value of the light rather than exposed value. The current HDR compression schemes are the extension of legacy Low Dynamic Range (LDR) compressions, by using Tone-Mapping Operators (TMO) to reduce the dynamic range of the HDR contents. However, introducing TMO increases the overall computational complexity, and it causes the temporal artifacts. Furthermore, these compression schemes fail to compress non-salient region differently than the salient region, when Human Visual System (HVS) perceives them differently. The main contribution of this thesis is to propose a novel Mapping-free visual saliency-guided HDR content compression scheme. Firstly, the relationship of Discrete Wavelet Transform (DWT) lifting steps and TMO are explored. A novel approach to compress HDR image by Joint Photographic Experts Group (JPEG) 2000 codec while backward compatible to LDR is proposed. This approach exploits the reversibility of tone mapping and scalability of DWT. Secondly, the importance of the TMO in the HDR compression is evaluated in this thesis. A mapping-free post HDR image compression based on JPEG and JPEG2000 standard codecs for current HDR image formats is proposed. This approach exploits the structure of HDR formats. It has an equivalent compression performance and the lowest computational complexity compared to the existing HDR lossy compressions (50% lower than the state-of-the-art). Finally, the shortcomings of the current HDR visual saliency models, and HDR visual saliency-guided compression are explored in this thesis. A spatial saliency model for HDR visual content outperform others by 10% for spatial visual prediction task with 70% lower computational complexity is proposed. Furthermore, the experiment suggested more than 90% temporal saliency is predicted by the proposed spatial model. Moreover, the proposed saliency model can be used to guide the HDR compression by applying different quantization factor according to the intensity of predicted saliency map
    corecore