240 research outputs found

    Zero-Shot Defocus Deblurring Based on Dual-Pixel Images

    Master's thesis, Seoul National University Graduate School, Interdisciplinary Program in Artificial Intelligence, College of Engineering, August 2022. Advisor: Bohyung Han.

    Defocus deblurring in dual-pixel (DP) images is a challenging problem due to diverse camera optics and scene structures. Most existing algorithms rely on supervised learning approaches trained on the Canon DSLR dataset and often generalize poorly to out-of-distribution images, including those captured by smartphones. We propose a novel zero-shot defocus deblurring algorithm that requires only a pair of DP images, without any training data or a pre-calibrated ground-truth blur kernel. Specifically, our approach first initializes a sharp latent map using a parametric blur kernel with a symmetry constraint. It then uses a convolutional neural network (CNN) to estimate the defocus map that best describes the observed DP image. Finally, it employs a generative model to learn scene-specific non-uniform blur kernels to compute the final enhanced images. We demonstrate that the proposed unsupervised technique outperforms supervised counterparts when training and testing are run on different datasets, and we show that our model achieves competitive accuracy when tested on in-distribution data.

    Contents: 1. Introduction (1.1 Background, 1.2 Overview, 1.3 Contribution); 2. Related Works (2.1 Defocus Deblurring, 2.2 Defocus Map, 2.3 Multiplane Image Representation, 2.4 DP Blur Kernel); 3. Proposed Methods (3.1 Latent Map Initialization, 3.2 Defocus Map Estimation, 3.3 Learning Blur Kernels, 3.4 Implementation Details); 4. Experiments (4.1 Dataset, 4.2 Quantitative Results, 4.3 Qualitative Results); 5. Conclusions (5.1 Summary, 5.2 Discussion)
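The symmetry constraint on the parametric blur kernel can be illustrated with a minimal NumPy sketch, assuming a disc-shaped PSF whose left and right halves form the dual-pixel kernel pair; the function names and the disc model are illustrative, not the thesis's exact formulation:

```python
import numpy as np

def disc_kernel(radius, size):
    """Parametric disc PSF of the given radius on a size x size grid."""
    ax = np.arange(size) - size // 2
    yy, xx = np.meshgrid(ax, ax, indexing="ij")
    k = (xx**2 + yy**2 <= radius**2).astype(float)
    return k / k.sum()

def dp_kernel_pair(radius, size):
    """Left/right dual-pixel kernels as the two halves of a disc PSF.

    Symmetry constraint: the right kernel is the horizontal mirror of
    the left one, so only one half needs to be estimated.
    """
    k = disc_kernel(radius, size)
    left = k.copy()
    left[:, size // 2 + 1:] = 0.0   # keep only the left half (and center column)
    left /= left.sum()
    right = left[:, ::-1]           # mirrored right-half kernel
    return left, right
```

Because the two kernels are mirror images, estimating the single radius parameter fixes both views' blur models at once.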

    LDP: Language-driven Dual-Pixel Image Defocus Deblurring Network

    Recovering sharp images from dual-pixel (DP) pairs with disparity-dependent blur is a challenging task. Existing blur-map-based deblurring methods have demonstrated promising results. In this paper we propose, to the best of our knowledge, the first framework to introduce the contrastive language-image pre-training (CLIP) model to achieve accurate blur map estimation from DP pairs in an unsupervised manner. To this end, we first carefully design text prompts that enable CLIP to understand blur-related geometric prior knowledge from the DP pair. Then, we propose a format for feeding the stereo DP pair to CLIP without any fine-tuning, even though CLIP is pre-trained on monocular images. Given the estimated blur map, we introduce a blur-prior attention block, a blur-weighting loss, and a blur-aware loss to recover the all-in-focus image. Our method achieves state-of-the-art performance in extensive experiments.
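The idea of a blur-weighting loss, weighting the reconstruction error by the estimated blur map so that heavily blurred regions dominate the objective, might be sketched as follows; the exact loss in the paper may differ, and the mean-normalization here is an assumption:

```python
import numpy as np

def blur_weighted_loss(pred, target, blur_map, eps=1e-6):
    """Per-pixel L1 loss weighted by the estimated blur map, so that
    strongly blurred regions contribute more to the deblurring objective."""
    w = blur_map / (blur_map.mean() + eps)   # normalize so weights average ~1
    return float(np.mean(w * np.abs(pred - target)))
```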

    Learning Lens Blur Fields

    Optical blur is an inherent property of any lens system and is challenging to model in modern cameras because of their complex optical elements. To tackle this challenge, we introduce a high-dimensional neural representation of blur, the lens blur field, and a practical method for acquiring it. The lens blur field is a multilayer perceptron (MLP) designed to (1) accurately capture variations of the lens's 2D point spread function over image-plane location, focus setting, and, optionally, depth, and (2) represent these variations parametrically as a single, sensor-specific function. The representation models the combined effects of defocus, diffraction, and aberration, and accounts for sensor features such as pixel color filters and pixel-specific micro-lenses. To learn the real-world blur field of a given device, we formulate a generalized non-blind deconvolution problem that directly optimizes the MLP weights using a small set of focal stacks as the only input. We also provide a first-of-its-kind dataset of 5D blur fields for smartphone cameras, camera bodies equipped with a variety of lenses, etc. Lastly, we show that acquired 5D blur fields are expressive and accurate enough to reveal, for the first time, differences in the optical behavior of smartphone devices of the same make and model.
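The core idea, an MLP that maps image-plane location, focus setting, and depth to a normalized PSF, can be sketched in miniature; the architecture, input encoding, and sizes here are illustrative, not the paper's actual network:

```python
import numpy as np

rng = np.random.default_rng(0)

class BlurFieldMLP:
    """Toy MLP mapping (x, y, focus, depth) -> a k x k PSF that sums to 1."""

    def __init__(self, k=5, hidden=32):
        self.k = k
        self.W1 = rng.normal(0, 0.5, (4, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0, 0.5, (hidden, k * k))
        self.b2 = np.zeros(k * k)

    def psf(self, x, y, focus, depth):
        h = np.tanh(np.array([x, y, focus, depth]) @ self.W1 + self.b1)
        logits = h @ self.W2 + self.b2
        e = np.exp(logits - logits.max())   # softmax keeps the PSF non-negative
        return (e / e.sum()).reshape(self.k, self.k)
```

In the paper's setting, the MLP weights would be fitted by deconvolution against focal stacks; here they are random, only showing the representation's shape.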

    Learning Depth from Focus in the Wild

    For better photography, most recent commercial cameras, including smartphones, have either adopted large-aperture lenses to collect more light or used a burst mode to take multiple images within a short time. These features lead us to examine depth from focus/defocus. In this work, we present a convolutional neural network-based depth estimation method for single focal stacks. Our method differs from relevant state-of-the-art works in three ways. First, it infers depth maps in an end-to-end manner while also handling image alignment. Second, we propose a sharp region detection module to reduce blur ambiguities in subtle focus changes and weakly textured regions. Third, we design an effective downsampling module to ease the flow of focal information during feature extraction. In addition, to improve the generalization of the proposed network, we develop a simulator that realistically reproduces the characteristics of commercial cameras, such as changes in field of view, focal length, and principal point. By effectively incorporating these three features, our network achieves the top rank in the DDFF 12-Scene benchmark on most metrics. We also demonstrate the effectiveness of the proposed method on various quantitative evaluations and on real-world images taken with various off-the-shelf cameras, compared with state-of-the-art methods. Our source code is publicly available at https://github.com/wcy199705/DfFintheWild
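Classic depth from focus, which this line of work builds on, picks the focal slice with the highest per-pixel sharpness. A minimal NumPy sketch using a Laplacian focus measure (not the paper's learned network) shows the principle:

```python
import numpy as np

def laplacian(img):
    """Discrete Laplacian as a simple per-pixel sharpness response."""
    p = np.pad(img, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4 * img

def depth_from_focus(stack):
    """Index of the sharpest slice per pixel in an (n, h, w) focal stack."""
    sharpness = np.abs(np.stack([laplacian(s) for s in stack]))
    return sharpness.argmax(axis=0)
```

The slice index per pixel acts as a coarse depth map; the learned methods above replace the hand-crafted focus measure and the argmax with trainable modules.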

    Learnable Blur Kernel for Single-Image Defocus Deblurring in the Wild

    Recent research has shown that the dual-pixel sensor has enabled great progress in defocus map estimation and image defocus deblurring. However, extracting dual-pixel views in real time is troublesome and complicates algorithm deployment. Moreover, the deblurred images generated by defocus deblurring networks often lack high-frequency details, which is unsatisfactory for human perception. To overcome this issue, we propose a novel defocus deblurring method that uses the guidance of the defocus map to perform image deblurring. The proposed method consists of a learnable blur kernel that estimates the defocus map in an unsupervised manner and, for the first time, a single-image defocus deblurring generative adversarial network (DefocusGAN). The proposed network can learn to deblur different regions and recover realistic details, and we propose a defocus adversarial loss to guide the training process. Competitive experimental results confirm that with a learnable blur kernel, the generated defocus map achieves results comparable to supervised methods. In the single-image defocus deblurring task, the proposed method achieves state-of-the-art results, with especially significant improvements in perceptual quality: PSNR reaches 25.56 dB and LPIPS reaches 0.111. (9 pages, 7 figures)
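Unsupervised defocus estimation via a parametric blur kernel can be illustrated by searching for the disc radius whose re-blur best explains an observed patch; this is a toy sketch of the principle, while the paper's learnable kernel is more general:

```python
import numpy as np

def disc(radius, size=7):
    """Disc PSF of the given radius, normalized to sum to 1."""
    ax = np.arange(size) - size // 2
    yy, xx = np.meshgrid(ax, ax, indexing="ij")
    k = (xx**2 + yy**2 <= radius**2).astype(float)
    return k / k.sum()

def conv2(img, k):
    """2D convolution with edge padding (kernel is symmetric here)."""
    kh, kw = k.shape
    p = np.pad(img, ((kh // 2,), (kw // 2,)), mode="edge")
    out = np.zeros_like(img, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * p[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def estimate_defocus_radius(sharp, blurred, radii=(0, 1, 2, 3)):
    """Pick the disc radius whose re-blur best explains the observation."""
    errs = [np.mean((conv2(sharp, disc(r)) - blurred) ** 2) for r in radii]
    return radii[int(np.argmin(errs))]
```

Repeating this per patch yields a defocus map without any ground-truth supervision, which is the spirit of the learnable-kernel component above.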

    Edge adaptive filtering of depth maps for mobile devices

    Abstract. Mobile phone cameras have an almost unlimited depth of field, and therefore the images captured with them have wide areas in focus. When the depth of field is digitally manipulated through image processing, accurate perception of depth in the captured scene is important. Capturing depth data requires advanced imaging methods. When a stereo lens system is used, depth information is calculated from the disparities between the stereo frames. The resulting depth map is often noisy or lacks information for some pixels, so it has to be filtered before it is used for emphasizing depth, and edges must be taken into account in this process to create natural-looking shallow depth-of-field images. In this study five filtering methods are compared with each other. The main focus is the Fast Bilateral Solver, because of its novelty and high reported quality. Mobile imaging requires fast filtering in uncontrolled environments, so optimizing the processing time of the filters is essential. In the evaluations the depth maps are filtered, and the quality and speed are determined for every method. The results show that the Fast Bilateral Solver filters depth maps well and handles noisy depth maps better than the other evaluated methods. However, for mobile imaging it is slow and needs further optimization.
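Edge-adaptive depth filtering in general can be sketched as a joint (cross) bilateral filter, where the range weights come from a guide image so that smoothing stops at its edges; this is the classic joint bilateral filter, not the Fast Bilateral Solver itself:

```python
import numpy as np

def joint_bilateral_depth(depth, guide, radius=2, sigma_s=2.0, sigma_r=0.1):
    """Edge-aware smoothing of a noisy depth map: spatial Gaussian weights
    multiplied by range weights from the guide image, so averaging does
    not cross guide-image edges."""
    h, w = depth.shape
    out = np.zeros((h, w))
    wsum = np.zeros((h, w))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ys = np.clip(np.arange(h) + dy, 0, h - 1)   # replicate borders
            xs = np.clip(np.arange(w) + dx, 0, w - 1)
            shifted_d = depth[np.ix_(ys, xs)]
            shifted_g = guide[np.ix_(ys, xs)]
            ws = np.exp(-(dx * dx + dy * dy) / (2 * sigma_s**2))
            wr = np.exp(-((guide - shifted_g) ** 2) / (2 * sigma_r**2))
            weight = ws * wr
            out += weight * shifted_d
            wsum += weight
    return out / wsum
```

With a strong guide edge, cross-edge range weights vanish and the depth discontinuity is preserved, which is the behavior the thesis requires of all compared filters.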