6 research outputs found

    Efficient dense blur map estimation for automatic 2D-to-3D conversion

    Focus is an important depth cue for 2D-to-3D conversion of low depth-of-field images and video. However, focus can only be reliably estimated at edges. Therefore, Bae et al. [1] first proposed an optimization-based approach that propagates focus to non-edge image portions for single-image focus editing. While their approach produces accurate dense blur maps, the computational complexity and memory requirements of solving the resulting sparse linear system with standard multigrid or (multilevel) preconditioning techniques are infeasible within the stringent requirements of the consumer electronics and broadcast industry. In this paper we propose a fast, efficient, low-latency, line-scanning-based focus propagation that removes the need for complex multigrid or (multilevel) preconditioning techniques. In addition, we propose facial blur compensation to correct for false shading edges that cause incorrect blur estimates in people's faces. In general, shading leads to incorrect focus estimates, which may produce unnatural 3D and visual discomfort. Since visual attention is drawn mostly to faces, our solution addresses the most distracting errors. A subjective assessment by paired comparison on a set of challenging low-depth-of-field images shows that the proposed approach achieves 3D image quality equal to that of optimization-based approaches, and that facial blur compensation results in a significant improvement.
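    The abstract does not spell out the propagation scheme, so the sketch below only illustrates the general idea of edge-aware, line-scanning propagation of sparse blur estimates: a confidence-weighted forward/backward sweep along each row and column in which confidence decays across strong intensity transitions. The function names (`propagate_1d`, `dense_blur_map`), the exponential edge-stopping weight, and the row/column averaging are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

def propagate_1d(values, mask, guide, sigma=0.1):
    """Propagate sparse blur values along one scanline (illustrative sketch).

    values: blur estimates, valid only where mask is True (edge pixels)
    mask:   boolean reliability mask
    guide:  luminance of the same scanline; large steps attenuate propagation
    """
    n = len(values)
    est = np.zeros(n)
    wgt = np.zeros(n)
    for order in (range(n), range(n - 1, -1, -1)):       # forward, then backward
        carry_v, carry_w, prev_g = 0.0, 0.0, None
        for i in order:
            if prev_g is not None:
                # confidence decays across intensity edges (assumed weighting)
                carry_w *= np.exp(-abs(float(guide[i]) - prev_g) / sigma)
            if mask[i]:
                carry_v, carry_w = float(values[i]), 1.0  # reset at reliable pixels
            est[i] += carry_w * carry_v
            wgt[i] += carry_w
            prev_g = float(guide[i])
    return np.where(wgt > 0, est / np.maximum(wgt, 1e-8), 0.0)

def dense_blur_map(sparse_blur, mask, luma, sigma=0.1):
    """Sweep rows and columns independently and average the two passes."""
    h, w = sparse_blur.shape
    rows = np.stack([propagate_1d(sparse_blur[y], mask[y], luma[y], sigma)
                     for y in range(h)])
    cols = np.stack([propagate_1d(sparse_blur[:, x], mask[:, x], luma[:, x], sigma)
                     for x in range(w)], axis=1)
    return 0.5 * (rows + cols)
```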

    Focal Blur Region Segmentation Based on Local Image Statistics

    Focal blur, which arises from the relationship between the subject and the focal distance, is a typical phenomenon in image capture, and techniques for analyzing blur information in images are an important topic in computer vision. Focal blur provides useful information about a scene, such as its relative depth and the region the photographer attended to. Focal blur region segmentation is a technique for analyzing and exploiting this information, and it contributes to improving the performance of a wide range of applications. To improve the accuracy of focal blur region segmentation, this thesis proposes (1) blur feature estimation that is robust to the factors that hinder such estimation, (2) blur region segmentation from a single image, and (3) blur region segmentation using two images. Furthermore, to properly evaluate the effectiveness of binary region segmentation, including focal blur region segmentation, evaluation measures for blur segmentation accuracy are examined in the contexts of clustering and classification. As a blur feature that is robust to the hindering factors, the thesis proposes ANGHS (Amplitude-Normalized Gradient Histogram Span), which normalizes the luminance gradients of a local region by its luminance amplitude and then evaluates the heaviness of the tail of the gradient histogram; the proposed feature is robust both to pixel sets with little luminance variation within a local region and to differences in luminance amplitude. For single-image segmentation, noting that the discriminative power of the blur feature map strongly affects accuracy, a blur feature map estimation method with high discriminative power is proposed, combining (i) sparse blur feature map estimation on grid partitions of multiple sizes and (ii) blur feature map estimation by EAI (Edge-Aware Interpolation). For the segmentation itself, a two-stage approach is proposed: the blur feature map is first thresholded with Otsu's method, and the initial result is then refined with graph cuts that combine the initial segmentation, color features, and blur features, improving accuracy through global segmentation based on non-parametric estimation followed by region refinement based on energy minimization. For segmentation using two images, the key observation is that a blur-difference feature computed from a pair of differently blurred images admits a theoretical threshold that separates subject from background: the blur-difference feature map estimated from the two focal-blur images is segmented at this theoretical threshold, and the initial result is then corrected with graph cuts that combine color features with the blur feature map computed from the image focused on the subject, further improving accuracy. For accuracy evaluation, noting that binary segmentation can be viewed as either a clustering or a classification problem, the optimal evaluation measure in each context is examined. Requirements for evaluating focal blur segmentation accuracy are defined for both contexts, and F1 score, Intersection over Union, Accuracy, Matthews Correlation Coefficient, and Informedness are compared against them; the absolute value of Informedness and Informedness itself are shown to be the optimal measures for the clustering and classification contexts, respectively. In addition, a statistical summary based on the best and mean accuracy over trials of multiple segmentation parameters is proposed so that algorithms can be compared from multiple viewpoints. The experiments evaluate the discriminative power of the blur feature map, the accuracy of single-image segmentation, and the accuracy of two-image segmentation. First, compared with blur feature maps produced by five conventional methods, the proposed method achieves a best classification segmentation score of 0.780, an improvement of between 0.092 and 0.366 points over the conventional methods; with Otsu thresholding, its classification score is 0.697, an improvement of between 0.201 and 0.400 points. Next, single-image segmentation accuracy is compared: the proposed segmentation method improves classification accuracy for every blur feature map, including those of conventional methods, demonstrating high generality, and achieves a classification score of 0.722, an improvement of between 0.158 and 0.373 points over the conventional methods. Finally, two-image segmentation is compared against single-image segmentation: it achieves a score of 0.988 on simple subjects, 0.095 points higher than single-image segmentation, and 0.827 on complex flower images, 0.058 points higher. Moreover, two-image segmentation improves accuracy even on images where single-image segmentation performs poorly, demonstrating the effectiveness of the proposed approach.
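    The evaluation conclusion above is easy to make concrete: bookmaker Informedness is sensitivity + specificity − 1, and the clustering-context variant simply takes its absolute value because cluster labels carry no fixed correspondence to foreground and background. The following is a minimal sketch of those two scores for binary masks; the function names and NumPy usage are illustrative, not the thesis's evaluation code.

```python
import numpy as np

def informedness(pred, truth):
    """Bookmaker informedness = TPR - FPR = sensitivity + specificity - 1."""
    pred = np.asarray(pred, dtype=bool).ravel()
    truth = np.asarray(truth, dtype=bool).ravel()
    tp = np.sum(pred & truth)
    fn = np.sum(~pred & truth)
    fp = np.sum(pred & ~truth)
    tn = np.sum(~pred & ~truth)
    tpr = tp / max(tp + fn, 1)   # sensitivity
    fpr = fp / max(fp + tn, 1)   # 1 - specificity
    return tpr - fpr

def clustering_informedness(pred, truth):
    """Clustering context: labels may be swapped, so use the absolute value."""
    return abs(informedness(pred, truth))
```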

    Foundations, Inference, and Deconvolution in Image Restoration

    Image restoration is a critical preprocessing step in computer vision, producing images with reduced noise, blur, and pixel defects. This enables precise higher-level reasoning about the scene content in later stages of the vision pipeline (e.g., object segmentation, detection, recognition, and tracking). Restoration techniques have found extensive use in a broad range of applications across industry, medicine, astronomy, biology, and photography. The recovery of high-grade results requires models of the image degradation process, giving rise to a class of often heavily underconstrained inverse problems. A further challenge specific to blur removal is noise amplification, which may cause strong distortion in the form of ringing artifacts. This dissertation presents new insights and problem-solving procedures for three areas of image restoration, namely (1) model foundations, (2) Bayesian inference for high-order Markov random fields (MRFs), and (3) blind image deblurring (deconvolution). As basic research on model foundations, we contribute to reconciling the perceived differences between probabilistic MRFs on the one hand and deterministic variational models on the other. To do so, we restrict the variational functional to locally supported finite elements (FE) and integrate over the domain. This yields a sum of terms depending locally on FE basis coefficients, and by identifying the latter with pixels, the terms resolve to MRF potential functions. In contrast with the previous literature, we place special emphasis on the robust regularizers commonly used in contemporary computer vision. Moreover, we draw samples from the derived models to further demonstrate the probabilistic connection. Another focal issue is a class of high-order Field of Experts MRFs which are learned generatively from natural image data and yield the best quantitative results under Bayesian estimation. This involves minimizing an integral expression, which in general has no closed-form solution. However, the MRF class under study has Gaussian mixture potentials, permitting expansion by indicator variables as a technical measure. As an approximate inference method, we study Gibbs sampling in the context of non-blind deblurring and obtain excellent results, yet at the cost of high computing effort. In response, we turn to the mean field algorithm and show that it scales quadratically in the clique size for a standard restoration setting with a linear degradation model. An empirical study of mean field over several restoration scenarios confirms advantageous properties with regard to both image quality and computational runtime. The dissertation further examines blind deconvolution, beginning with localized blur from fast-moving objects in the scene or from camera defocus. Forgoing dedicated hardware or user labels, we rely only on the image as input and introduce a latent variable model to explain the non-uniform blur. The inference procedure estimates freely varying kernels, and we demonstrate its generality through extensive experiments. We further present a discriminative method for blind removal of camera shake. In particular, we interleave discriminative non-blind deconvolution steps with kernel estimation and leverage the error-cancellation effects of the Regression Tree Field model to attain a deblurring process with tightly linked sequential stages.
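    The dissertation's MRF inference is not reproduced here, but the noise-amplification problem it mentions is easy to illustrate with a textbook non-blind baseline. The sketch below is a standard Wiener deconvolution, assuming a known PSF and a circular convolution model; the noise-to-signal term `nsr` is the regularization that keeps the inverse filter from blowing up where the kernel response is weak, which is exactly the source of ringing artifacts. It is a generic baseline, not one of the methods developed in the dissertation.

```python
import numpy as np

def wiener_deconvolve(blurred, psf, nsr=1e-2):
    """Non-blind deconvolution with a Wiener filter in the Fourier domain.

    blurred: observed grayscale image (2D float array)
    psf:     known blur kernel, assumed much smaller than the image
    nsr:     noise-to-signal ratio; with nsr -> 0 the filter approaches the
             naive inverse and amplifies noise into ringing.
    """
    # Embed the PSF in an image-sized array with its centre at the origin,
    # so the circular convolution model lines up with the observation.
    pad = np.zeros(blurred.shape, dtype=float)
    kh, kw = psf.shape
    pad[:kh, :kw] = psf
    pad = np.roll(pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))

    H = np.fft.fft2(pad)                        # kernel transfer function
    G = np.fft.fft2(blurred)
    W = np.conj(H) / (np.abs(H) ** 2 + nsr)     # regularized inverse filter
    return np.real(np.fft.ifft2(W * G))
```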

    Image Enhancement via Deep Spatial and Temporal Networks

    Image enhancement is a classic problem in computer vision and has been studied for decades. It includes various subtasks such as super-resolution, image deblurring, rain removal, and denoising. Among these tasks, image deblurring and rain removal have become increasingly active, as they play an important role in areas such as autonomous driving, video surveillance, and mobile applications. The two tasks are also connected: blur and rain often degrade images simultaneously, and the performance of their removal relies on spatial and temporal learning. To help generate sharp images and videos, this thesis proposes efficient algorithms based on deep neural networks for image deblurring and rain removal. In the first part of this thesis, we study the problem of image deblurring and propose four deep learning based methods. First, for single-image deblurring, a new framework is presented which first learns how to transfer sharp images to realistic blurry images via a learning-to-blur Generative Adversarial Network (GAN) module, and then trains a learning-to-deblur GAN module to generate sharp images from blurry versions. In contrast to prior work which solely focuses on learning to deblur, the proposed method learns to realistically synthesize blurring effects using unpaired sharp and blurry images. Second, for video deblurring, spatio-temporal learning and adversarial training are used to recover sharp and realistic video frames from blurry inputs. 3D convolutional kernels on top of deep residual neural networks are employed to capture better spatio-temporal features, and the network is trained with both a content loss and an adversarial loss to drive it toward realistic frames. Third, the problem of extracting a sharp image sequence from a single motion-blurred image is tackled with a detail-aware network, a cascaded generator that handles ambiguity, subtle motion, and loss of detail. Finally, this thesis proposes a level-attention deblurring network and constructs a new large-scale dataset containing images blurred by various factors, which is used to evaluate current deep deblurring methods together with the proposed one. In the second part of this thesis, we study the problem of image deraining and propose three deep learning based methods. First, for single-image deraining, the joint removal of raindrops and rain streaks is tackled. In contrast to most prior work, which focuses on either raindrops or rain streaks, a dual attention-in-attention model is presented that removes both simultaneously. Second, for video deraining, a novel end-to-end framework is proposed that obtains spatial representations and temporal correlations with ResNet-based and LSTM-based architectures, respectively. The proposed method generates multiple derained frames at a time and outperforms the state-of-the-art methods in terms of quality and speed. Finally, for stereo image deraining, a deep stereo semantic-aware deraining network is proposed for the first time in computer vision. Unlike previous methods, which only learn from a pixel-level loss function or monocular information, the proposed network advances image deraining by leveraging semantic information and the visual deviation between the two views.
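    As a concrete illustration of the spatio-temporal building block described for video deblurring, below is a minimal 3D-convolutional residual block in PyTorch. The channel count, block depth, and absence of normalization layers are assumptions for illustration only, not the thesis's actual architecture.

```python
import torch
import torch.nn as nn

class ResBlock3D(nn.Module):
    """Residual block with 3D convolutions over (frames, height, width)."""
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x):            # x: (batch, channels, frames, H, W)
        return x + self.body(x)      # identity skip keeps gradients flowing

if __name__ == "__main__":
    clip = torch.randn(1, 64, 5, 64, 64)     # 5 frames of 64-channel features
    print(ResBlock3D(64)(clip).shape)        # torch.Size([1, 64, 5, 64, 64])
```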
