    Steered mixture-of-experts for light field images and video: representation and coding

    Research in light field (LF) processing has grown considerably over the last decade. This growth is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, such 2-D regular grids are ill suited for high-dimensional data such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays arriving at a given region from any angle. The global model thus consists of a set of kernels that together define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application to 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparably to the state of the art at low-to-mid bitrates with respect to the subjective visual quality of 4-D LF images. For 5-D LF video, we observe superior decorrelation and coding performance, with coding gains of a factor of 4 in bitrate for the same quality. At least equally important, our method inherently provides functionality for LF rendering that is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution.
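
    The kernel representation lends itself to a compact sketch. The toy below assumes already-fitted parameters and simplifies the experts to constants (the paper's experts are steered and affine); it evaluates the continuous SMoE approximation at arbitrary coordinates via Gaussian soft gating, and all names are illustrative, not the authors' API.

```python
# Toy SMoE reconstruction with Gaussian soft gating; constant experts are a
# simplification of the paper's steered affine experts, and all parameter
# names are illustrative.
import numpy as np

def smoe_reconstruct(x, mu, sigma_inv, m):
    """Evaluate the continuous SMoE approximation at coordinates x.

    x         : (N, D) query coordinates (D = 2 for images, 4 for LF images, ...)
    mu        : (K, D) kernel centers
    sigma_inv : (K, D, D) inverse covariance of each steered kernel
    m         : (K,)   constant expert value per kernel
    """
    diff = x[:, None, :] - mu[None, :, :]                  # (N, K, D)
    maha = np.einsum('nkd,kde,nke->nk', diff, sigma_inv, diff)
    w = np.exp(-0.5 * maha)                                # unnormalized gates
    w /= w.sum(axis=1, keepdims=True)                      # soft gating
    return w @ m                                           # gated mixture

# Three kernels on a 2-D image domain, evaluated on a 4x4 grid.
mu = np.array([[0.2, 0.2], [0.8, 0.3], [0.5, 0.8]])
sigma_inv = np.stack([np.eye(2) * 50.0] * 3)
m = np.array([0.1, 0.9, 0.5])
g = np.stack(np.meshgrid(np.linspace(0, 1, 4),
                         np.linspace(0, 1, 4)), -1).reshape(-1, 2)
print(smoe_reconstruct(g, mu, sigma_inv, m).reshape(4, 4))
```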

    CG-DIQA: No-reference Document Image Quality Assessment Based on Character Gradient

    Document image quality assessment (DIQA) is an important and challenging problem in real applications. In order to predict the quality scores of document images, this paper proposes a novel no-reference DIQA method based on character gradient, where OCR accuracy is used as the ground-truth quality metric. The character gradient is computed on character patches detected with a method based on maximally stable extremal regions (MSER). Character patches are essential to character recognition and are therefore well suited for estimating document image quality. Experiments on a benchmark dataset show that the proposed method outperforms state-of-the-art methods in estimating the quality score of document images.
    Comment: To be published in Proc. of ICPR 201
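
    A minimal sketch of the character-gradient idea: detect candidate character patches with OpenCV's MSER and average Sobel gradient magnitude over them as a quality feature. The patch filtering and the mapping of this feature to an OCR-accuracy-correlated score follow the paper only loosely, and the file path is a placeholder.

```python
# Sketch: average gradient magnitude over MSER-detected character patches as
# a no-reference quality feature; thresholds and patch filtering from the
# paper are omitted, and 'document.png' is a placeholder path.
import cv2
import numpy as np

def character_gradient_score(gray):
    mser = cv2.MSER_create()
    _, boxes = mser.detectRegions(gray)            # candidate character patches
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    mag = cv2.magnitude(gx, gy)                    # gradient magnitude map
    means = [mag[y:y + h, x:x + w].mean() for (x, y, w, h) in boxes]
    return float(np.mean(means)) if means else 0.0

gray = cv2.imread('document.png', cv2.IMREAD_GRAYSCALE)
print(character_gradient_score(gray))
```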

    Modelling the spatial distribution of DEM Error

    Assessment of a DEM’s quality is usually undertaken by deriving a measure of DEM accuracy: how close the DEM’s elevation values are to the true elevation. Measures such as root mean squared error (RMSE) and the standard deviation of the error are frequently used. These measures summarise the elevation errors in a DEM as a single value. A more detailed description of DEM accuracy would allow a better understanding of DEM quality and of the uncertainty associated with using DEMs in analytical applications. The research presented addresses the limitations of using a single RMSE value to represent the uncertainty associated with a DEM by developing a new technique for creating a spatially distributed model of DEM quality: an accuracy surface. The technique is based on the hypothesis that the distribution and scale of elevation error within a DEM are at least partly related to the morphometric characteristics of the terrain. The technique involves generating a set of terrain parameters to characterise terrain morphometry and developing regression models to define the relationship between DEM error and morphometric character. The regression models form the basis for creating standard deviation surfaces that represent DEM accuracy. The hypothesis is shown to hold, and reliable accuracy surfaces are successfully created. These accuracy surfaces provide more detailed information about DEM accuracy than a single global estimate of RMSE.
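
    The regression step can be sketched briefly. The toy below assumes a simplified parameter set (slope and a crude curvature) and an ordinary least-squares model rather than the thesis's exact formulation; it regresses error magnitude at reference points on morphometric parameters and predicts a spatially distributed error estimate for every cell.

```python
# Sketch: regress DEM error magnitude at reference points on morphometric
# parameters, then predict an error (accuracy) surface for every cell. The
# parameter set and model form are simplified relative to the thesis.
import numpy as np
from sklearn.linear_model import LinearRegression

def morphometric_params(dem, cellsize=1.0):
    gy, gx = np.gradient(dem, cellsize)
    slope = np.hypot(gx, gy)                            # slope magnitude
    curv = (np.gradient(gx, cellsize, axis=1) +
            np.gradient(gy, cellsize, axis=0))          # crude curvature
    return np.stack([slope.ravel(), curv.ravel()], axis=1)

def accuracy_surface(dem, check_idx, check_err):
    """check_idx: flat indices of reference points; check_err: observed errors."""
    X = morphometric_params(dem)
    model = LinearRegression().fit(X[check_idx], np.abs(check_err))
    return model.predict(X).reshape(dem.shape)          # error estimate per cell

dem = np.random.rand(50, 50).cumsum(axis=0)             # toy terrain
idx = np.random.choice(dem.size, 100, replace=False)
err = np.random.randn(100) * (1 + morphometric_params(dem)[idx, 0])
print(accuracy_surface(dem, idx, err).shape)            # (50, 50)
```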

    Subjective and objective quality assessment of ancient degraded documents

    Archiving, restoration and analysis of damaged manuscripts have increased greatly in recent decades. Usually, these documents are physically degraded because of aging and improper handling. They also cannot be processed manually, because a massive volume of these documents exists in libraries and archives around the world. Therefore, automatic methodologies are needed to preserve and process their content. These documents are usually processed through their images. Degraded document image processing is a difficult task, mainly because of the existing physical degradations. While it can be very difficult to accurately locate and remove such distortions, analyzing the severity and type(s) of these distortions is feasible. This analysis provides useful information on the type and severity of degradations, with a number of applications. The main contributions of this thesis are to propose models for objectively assessing the physical condition of document images and to classify their degradations. In this thesis, three datasets of degraded document images along with subjective ratings for each image are developed. In addition, three no-reference document image quality assessment (NR-DIQA) metrics are proposed for historical and medieval document images. It should be mentioned that degraded medieval document images are a subset of historical document images and may contain both graphical and textual content. Finally, we propose a degradation classification model in order to identify common distortion types in old document images. Essentially, existing no-reference image quality assessment (NR-IQA) metrics are not designed to assess physical document distortions.
    In the first contribution, we propose the first dataset of degraded document images along with human opinion scores for each document image. This dataset is introduced to evaluate the quality of historical document images. We also propose an objective NR-DIQA metric based on the statistics of the mean subtracted contrast normalized (MSCN) coefficients computed from segmented layers of each document image. The segmentation into four layers of foreground and background is based on an analysis of log-Gabor filters, under the assumption that the sensitivity of the human visual system (HVS) differs between text and non-text locations. Experimental results show that the proposed metric has comparable or better performance than the state-of-the-art metrics, while having moderate complexity.
    Degradation identification and quality assessment can complement each other to provide information on both the type and the severity of degradations in document images. Therefore, in the second contribution, we introduce a multi-distortion historical document image database that can be used for research on quality assessment of degraded documents as well as on degradation classification. The developed dataset contains historical document images classified into four categories based on their distortion types, namely paper translucency, stain, readers’ annotations, and worn holes. An efficient NR-DIQA metric is then proposed based on three sets of spatial and frequency image features extracted from two layers of text and non-text. In addition, these features are used to estimate the probability of the four aforementioned physical distortions for the first time in the literature. Both the proposed quality assessment and the degradation classification models deliver very promising performance.
    Finally, in the third contribution, we develop a dataset and a quality assessment metric for degraded medieval document (DMD) images. This type of degraded image contains both textual and pictorial information. The introduced DMD dataset is the first dataset in its category that also provides human ratings. We also propose a new no-reference metric to evaluate the quality of the DMD images in the developed dataset. The proposed metric is based on the extraction of several statistical features from three layers of text, non-text, and graphics. The segmentation is based on color saliency, with the assumption that pictorial parts are colorful, and follows the HVS by assigning different weights to each layer. The experimental results validate the effectiveness of the proposed NR-DIQA strategy for DMD images.
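
    The MSCN coefficients underlying the first metric are a standard construction, sketched below with assumed window and stabilization parameters; the log-Gabor layer segmentation that precedes them in the thesis is omitted for brevity.

```python
# Sketch of MSCN (mean subtracted contrast normalized) coefficients; window
# width and the stabilizing constant C are assumed values, and the log-Gabor
# layer segmentation is left out.
import numpy as np
from scipy.ndimage import gaussian_filter

def mscn(image, sigma=7 / 6, C=1.0):
    image = image.astype(np.float64)
    mu = gaussian_filter(image, sigma)                  # local mean
    var = gaussian_filter(image * image, sigma) - mu * mu
    std = np.sqrt(np.maximum(var, 0))                   # local contrast
    return (image - mu) / (std + C)

# Quality features would then be statistics of the MSCN map, computed
# separately on the segmented text and non-text layers.
img = np.random.rand(64, 64) * 255
coeffs = mscn(img)
print(coeffs.mean(), coeffs.std())
```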

    MR image reconstruction using deep density priors

    Algorithms for Magnetic Resonance (MR) image reconstruction from undersampled measurements exploit prior information to compensate for missing k-space data. Deep learning (DL) provides a powerful framework for extracting such information from existing image datasets through learning and then using it for reconstruction. Leveraging this, recent methods employed DL to learn mappings from undersampled to fully sampled images using paired datasets of undersampled and corresponding fully sampled images, integrating prior knowledge implicitly. In this article, we propose an alternative approach that learns the probability distribution of fully sampled MR images using unsupervised DL, specifically Variational Autoencoders (VAE), and uses this as an explicit prior term in reconstruction, completely decoupling the encoding operation from the prior. The resulting reconstruction algorithm enjoys a powerful image prior that compensates for missing k-space data without requiring paired datasets for training, and it is not prone to the associated sensitivities, such as deviations between the undersampling patterns or coil settings used at training and test time. We evaluated the proposed method with T1-weighted images from a publicly available dataset, multi-coil complex images acquired from healthy volunteers (N=8), and images with white matter lesions. The proposed algorithm, using the VAE prior, produced visually high-quality reconstructions and achieved low RMSE values, outperforming most of the alternative methods on the same dataset. On multi-coil complex data, the algorithm yielded accurate magnitude and phase reconstruction results. In the experiments on images with white matter lesions, the method faithfully reconstructed the lesions.
    Keywords: Reconstruction, MRI, prior probability, machine learning, deep learning, unsupervised learning, density estimation
    Comment: Published in IEEE TMI. Main text and supplementary material, 19 pages total
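
    A rough sketch of reconstruction with an explicit learned prior: gradient steps on the k-space data-consistency term alternate with steps along the prior's log-density gradient. The `vae_logprior_grad` placeholder stands in for backpropagation through the trained VAE and is an assumption, not the paper's routine; the single-coil Fourier encoding and step sizes are likewise simplified.

```python
# Sketch: alternate data-consistency gradient steps with prior steps. The
# prior gradient below is a placeholder for backpropagating through the
# trained VAE; the encoding is a simplified single-coil undersampled FFT.
import numpy as np

def vae_logprior_grad(x):
    # Placeholder: the actual method differentiates the VAE's evidence
    # lower bound with respect to the image.
    return -0.01 * x

def reconstruct(y, mask, n_iter=100, lr=0.5, lam=1.0):
    x = np.fft.ifft2(y)                                 # zero-filled start
    for _ in range(n_iter):
        resid = mask * np.fft.fft2(x) - y               # k-space residual
        dc_grad = np.fft.ifft2(mask * resid)            # A^H (A x - y)
        x = x - lr * (dc_grad - lam * vae_logprior_grad(x.real))
    return x

mask = np.random.rand(64, 64) < 0.3                     # undersampling pattern
truth = np.random.rand(64, 64)
y = mask * np.fft.fft2(truth)
print(np.abs(reconstruct(y, mask)).mean())
```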