
    Convolutional Deblurring for Natural Imaging

    In this paper, we propose a novel design for image deblurring in the form of one-shot convolution filtering that can be directly convolved with naturally blurred images for restoration. Optical blurring is a common drawback in many imaging applications that suffer from optical imperfections. Despite numerous deconvolution methods that blindly estimate blurring in either inclusive or exclusive forms, these methods remain practically challenging due to their high computational cost and low image reconstruction quality. High accuracy and high speed are both prerequisites for high-throughput imaging platforms in digital archiving, where deblurring is required after image acquisition and before images are stored, previewed, or processed for high-level interpretation. On-the-fly correction of such images is therefore important to avoid time delays, mitigate computational expense, and increase perceived image quality. We bridge this gap by synthesizing a deconvolution kernel as a linear combination of Finite Impulse Response (FIR) even-derivative filters that can be directly convolved with blurry input images to boost the frequency fall-off of the Point Spread Function (PSF) associated with the optical blur. We employ a Gaussian low-pass filter to decouple the image denoising problem from image edge deblurring. Furthermore, we propose a blind approach to estimating the PSF statistics for two blur models, Gaussian and Laplacian, that are common in many imaging pipelines. Thorough experiments are designed to test and validate the efficiency of the proposed method using 2054 naturally blurred images across six imaging applications and seven state-of-the-art deconvolution methods. (Comment: 15 pages; for publication in IEEE Transactions on Image Processing.)
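
    As a rough illustration of the one-shot idea, the sketch below builds a deconvolution kernel as an identity tap minus weighted even-derivative FIR stencils and applies it separably to a grayscale image. It is a minimal sketch assuming a Gaussian-like blur; the weights alpha and beta are hypothetical stand-ins for the values the paper derives from the estimated PSF statistics.

        # Minimal sketch (assumed illustration, not the authors' code):
        # one-shot deblurring with a linear combination of even-derivative
        # FIR filters, applied separably along rows and columns.
        import numpy as np
        from scipy.ndimage import convolve

        d2 = np.array([1.0, -2.0, 1.0])              # 2nd-derivative stencil
        d4 = np.array([1.0, -4.0, 6.0, -4.0, 1.0])   # 4th-derivative stencil

        def even_derivative_kernel(alpha, beta):
            # delta - alpha*d2 + beta*d4: boosts the mid and high
            # frequencies attenuated by the PSF's frequency fall-off.
            k = np.zeros(5)
            k[2] = 1.0                               # identity (delta) tap
            k[1:4] -= alpha * d2
            k += beta * d4
            return k

        def deblur(image, alpha=0.8, beta=0.2):      # hypothetical weights
            """image: 2-D grayscale array in [0, 1]."""
            k = even_derivative_kernel(alpha, beta)
            out = convolve(image, k[None, :], mode="nearest")   # rows
            out = convolve(out, k[:, None], mode="nearest")     # columns
            return np.clip(out, 0.0, 1.0)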

    Scene-Dependency of Spatial Image Quality Metrics

    This thesis is concerned with the measurement of spatial imaging performance and the modelling of spatial image quality in digital capturing systems. Spatial imaging performance and image quality relate to the objective and subjective reproduction of luminance contrast signals by the system, respectively; they are critical to overall perceived image quality. The Modulation Transfer Function (MTF) and Noise Power Spectrum (NPS) describe the signal (contrast) transfer and noise characteristics of a system, respectively, with respect to spatial frequency. Both are, strictly speaking, only applicable to linear systems, since they are founded upon linear system theory. Many contemporary capture systems use adaptive image signal processing, such as denoising and sharpening, to optimise output image quality. These non-linear processes change their behaviour according to characteristics of the input signal (i.e. the scene being captured), rendering system performance “scene-dependent” and difficult to measure accurately. The MTF and NPS are traditionally measured from test charts containing suitable predefined signals (e.g. edges, sinusoidal exposures, noise, or uniform luminance patches). These signals trigger adaptive processes at uncharacteristic levels, since they are unrepresentative of natural scene content. Thus, for systems using adaptive processes, the resultant MTFs and NPSs are not representative of performance “in the field” (i.e. when capturing real scenes). Spatial image quality metrics for capturing systems aim to predict the relationship between MTF and NPS measurements and subjective ratings of image quality. They cascade both measures with contrast sensitivity functions that describe human visual sensitivity with respect to spatial frequency. The most recent metrics designed for adaptive systems use MTFs measured from the dead leaves test chart, which is more representative of natural scene content than the abovementioned test charts. This marks a step toward modelling image quality with respect to real scene signals. This thesis presents novel scene-and-process-dependent MTFs (SPD-MTF) and NPSs (SPD-NPS). They are measured from imaged pictorial scene (or dead leaves target) signals to account for system scene-dependency. Further, a number of spatial image quality metrics are revised to account for capture system and visual scene-dependency: their MTF and NPS parameters are replaced with SPD-MTFs and SPD-NPSs, and their standard visual functions are replaced with contextual contrast detection (cCSF) or contrast discrimination (cVPF) functions. In addition, two novel spatial image quality metrics are presented, the log Noise Equivalent Quanta (NEQ) and the Visual log NEQ, that implement SPD-MTFs and SPD-NPSs. The metrics, SPD-MTFs, and SPD-NPSs were validated by analysing measurements from simulated image capture pipelines that applied either linear or adaptive image signal processing. The SPD-NPS measures displayed little evidence of measurement error, and the metrics performed most accurately when they used SPD-NPSs measured from images of scenes. The benefit of deriving SPD-MTFs from images of scenes was, however, traded off against measurement bias; most metrics performed most accurately with SPD-MTFs derived from dead leaves signals. Implementing the cCSF or cVPF did not increase metric accuracy. The log NEQ and Visual log NEQ metrics proposed in this thesis were highly competitive, outperforming metrics of the same genre. They were also more consistent than the IEEE P1858 Camera Phone Image Quality (CPIQ) metric when their input parameters were modified. The advantages and limitations of all performance measures and metrics are discussed, as well as their practical implementation and relevant applications.
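
    For context, the noise equivalent quanta underlying the log NEQ metrics is conventionally built from the same two measures the thesis generalises. This is the standard textbook form, not a formula quoted from the thesis:

        \mathrm{NEQ}(\nu) = \frac{S^{2}\,\mathrm{MTF}^{2}(\nu)}{\mathrm{NPS}(\nu)}

    where \nu is spatial frequency and S is the large-area signal level; the scene-dependent variants substitute SPD-MTF and SPD-NPS for the MTF and NPS.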

    Bridging the Gap Between Imaging Performance and Image Quality Measures

    Imaging system performance measures and Image Quality Metrics (IQM) are reviewed from a systems engineering perspective, focusing on the spatial quality of still image capture systems. We classify IQMs broadly as: Computational IQMs (CP-IQM), Multivariate Formalism IQMs (MF-IQM), Image Fidelity Metrics (IF-IQM), and Signal Transfer Visual IQMs (STV-IQM). Comparison of the genres finds STV-IQMs well suited to capture system quality evaluation: they incorporate performance measures relevant to optical systems design, such as the Modulation Transfer Function (MTF) and Noise Power Spectrum (NPS), and their bottom-up, modular approach enables system components to be optimised separately. We suggest that correlation between STV-IQMs and observer quality scores is limited by three factors: current MTF and NPS measures do not characterise the scene-dependent performance introduced by imaging system non-linearities; the contrast sensitivity models employed do not account for contextual masking effects; and cognitive factors are not considered. We hypothesise that implementing scene-and-process-dependent MTF (SPD-MTF) and NPS (SPD-NPS) measures should mitigate errors originating from scene-dependent system performance. Further, we propose implementing contextual contrast detection and discrimination models to better represent low-level visual performance in image quality analysis. Finally, we discuss image quality optimisation functions that may potentially close the gap between contrast detection/discrimination and quality.
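
    To make the STV-IQM cascade concrete, here is a minimal sketch under stated assumptions: a classic Mannos-Sakrison-style CSF and toy MTF/NPS curves stand in for measured data, and the metric is a simple CSF-weighted signal-to-noise integral rather than any specific published IQM.

        # Sketch of the STV-IQM idea: cascade measured MTF and NPS with a
        # contrast sensitivity function and integrate over spatial frequency.
        import numpy as np

        def csf(f):
            """Mannos-Sakrison luminance CSF (f in cycles/degree)."""
            return 2.6 * (0.0192 + 0.114 * f) * np.exp(-(0.114 * f) ** 1.1)

        def stv_quality(f, mtf, nps, signal_power=1.0):
            """CSF-weighted signal-to-noise integral; higher is better."""
            snr = signal_power * mtf ** 2 / np.maximum(nps, 1e-12)
            return np.trapz(csf(f) ** 2 * snr, f)

        f = np.linspace(0.1, 30, 300)       # spatial frequency axis
        mtf = np.exp(-(f / 12.0) ** 2)      # toy Gaussian-like system MTF
        nps = np.full_like(f, 1e-4)         # toy flat noise power spectrum
        print(stv_quality(f, mtf, nps))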

    Evaluation and Understandability of Face Image Quality Assessment

    Face image quality assessment (FIQA) has been an area of interest to researchers as a way to improve face recognition accuracy. By filtering out low-quality images, we can reduce various difficulties faced in unconstrained face recognition, such as failure in face or facial landmark detection or a low presence of useful facial information. In the last decade or so, researchers have proposed different methods to assess face image quality, ranging from fusions of quality measures to learning-based methods. Different approaches have their own strengths and weaknesses, but it is hard to perform a comparative assessment of these methods without a database containing a wide variety of face image quality and a suitable training protocol that can efficiently utilize such a large-scale dataset. In this thesis, we focus on developing an evaluation platform built on a large-scale face database with wide-ranging face image quality, and we try to deconstruct the reasons behind the scores predicted by learning-based face image quality assessment methods. The contributions of this thesis are two-fold. Firstly, (i) a carefully crafted large-scale database dedicated entirely to face image quality assessment is proposed; (ii) a learning-to-rank-based large-scale training protocol is developed; and (iii) a comprehensive study of 15 face image quality assessment methods using 12 different feature types and relative-ranking-based label generation schemes is performed. Evaluation results show various insights about the assessment methods, which indicate the significance of the proposed database and training protocol. Secondly, although researchers have tried various learning-based approaches to assess face image quality in the last few years, most of these methods offer either a quality bin or a score summary as a measure of the biometric quality of the face image, and, to the best of our knowledge, there has so far been no investigation of the explainable reasons behind the predicted scores. In this thesis, we propose a method to provide a clear and concise understanding of the predicted quality score of a learning-based face image quality assessment method. It is believed that this approach can be integrated into the FBI’s understandable template and can help improve the image acquisition process by providing information on which quality factors need to be addressed.
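
    As a sketch of what a learning-to-rank protocol of this kind can look like (an assumed illustration; the thesis's actual features, architecture, and labels differ), a scorer can be trained on pairs in which one face is labelled higher quality than the other:

        # Pairwise learning-to-rank for face quality scores (illustrative).
        import torch
        import torch.nn as nn

        scorer = nn.Sequential(               # toy quality regressor over a
            nn.Linear(512, 128), nn.ReLU(),   # hypothetical 512-D face feature
            nn.Linear(128, 1),
        )
        loss_fn = nn.MarginRankingLoss(margin=0.1)
        opt = torch.optim.Adam(scorer.parameters(), lr=1e-4)

        def train_step(feat_better, feat_worse):
            """feat_*: (batch, 512) features of higher/lower-quality faces."""
            s_hi = scorer(feat_better).squeeze(1)
            s_lo = scorer(feat_worse).squeeze(1)
            target = torch.ones_like(s_hi)    # enforce s_hi > s_lo + margin
            loss = loss_fn(s_hi, s_lo, target)
            opt.zero_grad()
            loss.backward()
            opt.step()
            return loss.item()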

    New Datasets, Models, and Optimization

    Doctoral dissertation -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, College of Engineering, August 2021. 손현태.
    Obtaining a high-quality clean image is the ultimate goal of photography. In practice, daily photographs are often taken in dynamic environments with moving objects as well as shaken cameras. The relative motion between the camera and the objects during the exposure causes motion blur in images and videos, degrading the visual quality. The blur strength and the shape of the motion trajectory vary by image and by pixel in dynamic environments. This locally-varying property makes the removal of motion blur in images and videos severely ill-posed. Rather than designing analytic solutions with physical modeling, machine learning-based approaches can serve as a practical solution for such a highly ill-posed problem; deep learning, in particular, has become the recent standard in the computer vision literature. This dissertation introduces deep learning-based solutions for image and video deblurring, tackling practical issues in various aspects. First, a new way of constructing datasets for the dynamic scene deblurring task is proposed. It is nontrivial to simultaneously obtain a pair of blurry and sharp images that are temporally aligned, and the lack of data prevents supervised learning techniques from being developed, as well as the evaluation of deblurring algorithms. By mimicking the camera imaging pipeline with high-speed videos, realistic blurry images can be synthesized. In contrast to previous blur synthesis methods, the proposed approach reflects the natural, complex local blur caused by multiple moving objects, varying depth, and occlusion at motion boundaries. Second, based on the proposed datasets, a novel neural network architecture for the single-image deblurring task is presented.
Adopting the coarse-to-fine approach that is widely used in energy optimization-based methods for image deblurring, a multi-scale neural network architecture is derived. Compared with single-scale models of similar complexity, the multi-scale model exhibits higher restoration accuracy. Third, a light-weight recurrent neural network architecture for video deblurring is proposed. To obtain a high-quality video from deblurring, it is important to exploit the intrinsic information within the target frame as well as the temporal relation between neighboring frames. Benefiting from both, the proposed intra-frame iterative scheme applied to RNNs achieves accuracy improvements without increasing the number of model parameters. Lastly, a novel loss function is proposed to better optimize the deblurring models. Estimating dynamic blur for a clean, sharp image without motion information is another ill-posed problem. While the goal of deblurring is to remove motion blur completely, conventional loss functions fail to train neural networks to fulfill this goal, leaving traces of blur in the deblurred images. The proposed reblurring loss functions are designed to better eliminate motion blur and produce sharper images. Furthermore, a self-supervised learning process facilitates adaptation of the deblurring model at test time. With the proposed datasets, model architectures, and loss functions, deep learning-based single-image and video deblurring methods are presented, and extensive experimental results demonstrate state-of-the-art performance both quantitatively and qualitatively.
    Contents:
    1 Introduction
    2 Generating Datasets for Dynamic Scene Deblurring: 2.1 Introduction; 2.2 GOPRO dataset; 2.3 REDS dataset; 2.4 Conclusion
    3 Deep Multi-Scale Convolutional Neural Networks for Single Image Deblurring: 3.1 Introduction (3.1.1 Related Works; 3.1.2 Kernel-Free Learning for Dynamic Scene Deblurring); 3.2 Proposed Method (3.2.1 Model Architecture; 3.2.2 Training); 3.3 Experiments (3.3.1 Comparison on GOPRO Dataset; 3.3.2 Comparison on Kohler Dataset; 3.3.3 Comparison on Lai et al. [54] dataset; 3.3.4 Comparison on Real Dynamic Scenes; 3.3.5 Effect of Adversarial Loss); 3.4 Conclusion
    4 Intra-Frame Iterative RNNs for Video Deblurring: 4.1 Introduction; 4.2 Related Works; 4.3 Proposed Method (4.3.1 Recurrent Video Deblurring Networks; 4.3.2 Intra-Frame Iteration Model; 4.3.3 Regularization by Stochastic Training); 4.4 Experiments (4.4.1 Datasets; 4.4.2 Implementation Details; 4.4.3 Comparisons on GOPRO [72] Dataset; 4.4.4 Comparisons on [97] Dataset and Real Videos); 4.5 Conclusion
    5 Learning Loss Functions for Image Deblurring: 5.1 Introduction; 5.2 Related Works; 5.3 Proposed Method (5.3.1 Clean Images are Hard to Reblur; 5.3.2 Supervision from Reblurring Loss; 5.3.3 Test-time Adaptation by Self-Supervision); 5.4 Experiments (5.4.1 Effect of Reblurring Loss; 5.4.2 Effect of Sharpness Preservation Loss; 5.4.3 Comparison with Other Perceptual Losses; 5.4.4 Effect of Test-time Adaptation; 5.4.5 Comparison with State-of-The-Art Methods; 5.4.6 Real World Image Deblurring; 5.4.7 Combining Reblurring Loss with Other Perceptual Losses; 5.4.8 Perception vs. Distortion Trade-Off; 5.4.9 Visual Comparison of Loss Function; 5.4.10 Implementation Details; 5.4.11 Determining Reblurring Module Size); 5.5 Conclusion
    6 Conclusion
    Abstract in Korean; Acknowledgements
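
    The dataset-construction step admits a compact illustration. Below is a minimal sketch under the assumption, stated in the abstract, that blur is synthesized by mimicking the camera pipeline over high-speed video frames; the gamma value and frame count are illustrative, not the dissertation's exact settings.

        # Sketch of blur synthesis from high-speed video (assumed
        # illustration of the described approach, not the thesis code):
        # average temporally adjacent sharp frames in approximately linear
        # intensity to mimic the exposure integration causing motion blur.
        import numpy as np

        GAMMA = 2.2  # assumed camera response approximation

        def synthesize_blur(frames):
            """frames: list of sharp sRGB-like frames in [0, 1], (H, W, 3)."""
            linear = [f ** GAMMA for f in frames]    # to signal space
            blurred = np.mean(linear, axis=0)        # integrate "exposure"
            return blurred ** (1.0 / GAMMA)          # back to display space

        # Usage: a blurry/sharp training pair from, e.g., 7 consecutive
        # frames, with the center frame as the sharp ground truth.
        frames = [np.random.rand(64, 64, 3) for _ in range(7)]
        blurry, sharp = synthesize_blur(frames), frames[len(frames) // 2]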

    Instruments for Evaluating the Image Quality Experience

    This dissertation describes the instruments available for image quality evaluation, develops new methods for subjective image quality evaluation, and provides image and video databases for the assessment and development of image quality assessment (IQA) algorithms. The contributions of the thesis are based on six original publications. The first publication introduced the VQone toolbox for subjective image quality evaluation. It created a platform for free-form experimentation with standardized image quality methods and was the foundation for later studies. The second publication addressed the dilemma of reference in subjective experiments by proposing a new method for image quality evaluation: the absolute category rating with dynamic reference (ACR-DR). The third publication presented a database (CID2013) in which 480 images were evaluated by 188 observers using the ACR-DR method proposed in the prior publication. Providing databases of image files along with their quality ratings is essential in the field of IQA algorithm development. The fourth publication introduced a video database (CVD2014) based on 210 observers rating 234 video clips. The temporal aspect of the stimuli creates peculiar artifacts and degradations, as well as challenges to experimental design and to video quality assessment (VQA) algorithms. When the CID2013 and CVD2014 databases were published, most state-of-the-art I/VQA algorithms had been trained on and tested against databases created by degrading an original image or video with a single distortion at a time. The novel aspect of CID2013 and CVD2014 was that they consist of multiple concurrent distortions. To facilitate communication and understanding among professionals in various fields of image quality, as well as among non-professionals, an attribute lexicon of image quality, the image quality wheel, was presented in the fifth publication of this thesis. Reference wheels and terminology lexicons have a long tradition in sensory evaluation contexts, such as taste experience studies, where they are used to facilitate communication among interested stakeholders; such an approach, however, has not been common in visual experience domains, especially in studies on image quality. The sixth publication examined how the free descriptions given by the observers influenced their ratings of the images. Understanding how various elements, such as perceived sharpness and naturalness, affect subjective image quality can help explain the decision-making processes behind image quality evaluation. Knowing the impact of each preferential attribute can then be used for I/VQA algorithm development; certain I/VQA algorithms already incorporate low-level human visual system (HVS) models.
    The dissertation examines and develops new methods for image quality evaluation and provides image and video databases for testing and developing image quality assessment (IQA) algorithms. What is experienced as beautiful and pleasant is a psychologically interesting question, and the work is also relevant to industry in improving camera image quality. The dissertation comprises six publications that examine the topic from different perspectives. Publication I developed an application, freely available to researchers, for collecting people's ratings of displayed images; it made it possible to test standardized image quality evaluation methods and to build new methods on top of them, laying the foundation for the later studies. Publication II developed a new image quality evaluation method that uses a serial presentation of images to give observers an impression of the quality variation before the actual rating; this was found to reduce the variance of the results and to distinguish smaller image quality differences. Publication III describes a database containing the quality ratings of 480 images given by 188 observers, together with the associated image files. Such databases are a valuable tool in developing algorithms for automatic image quality assessment: they are needed, for example, as training material for AI-based algorithms and for comparing the performance of different algorithms, since the better an algorithm's predictions correlate with human quality ratings, the better its performance can be said to be. Publication IV presents a database containing the quality ratings of 234 video clips given by 210 observers, together with the associated video files. Because of the temporal dimension, the artifacts in video stimuli differ from those in still images, which poses its own challenges for video quality assessment (VQA) algorithms. The stimuli in earlier databases were created, for example, by gradually blurring a single image, so they contain only one-dimensional distortions; the databases presented here differ from earlier ones in containing multiple concurrent distortions, whose interaction can be significant for image quality. Publication V presents the image quality wheel, a lexicon of image quality concepts compiled by analyzing 39,415 verbal descriptions of image quality produced by 146 participants. Such lexicons have a long tradition in sensory evaluation research, but they had not previously been developed for image quality. Publication VI examined how the concepts given by the assessors influence image quality evaluation; for example, the rated sharpness or naturalness of images helps in understanding the decision-making processes underlying quality evaluation, and this knowledge can be used, for example, in the development of image and video quality assessment (I/VQA) algorithms.
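
    The benchmarking logic described above (the better an algorithm's predictions correlate with human ratings, the better it performs) is easy to make concrete. A minimal sketch, with a hypothetical toy predictor standing in for a real IQA algorithm and random numbers standing in for CID2013-style mean opinion scores:

        # Benchmark an IQA algorithm against a subjective database by
        # correlating its predictions with mean opinion scores (MOS).
        import numpy as np
        from scipy.stats import pearsonr, spearmanr

        def benchmark(predicted_scores, mos):
            """Higher correlation -> closer match to human ratings."""
            plcc, _ = pearsonr(predicted_scores, mos)    # linear agreement
            srocc, _ = spearmanr(predicted_scores, mos)  # rank agreement
            return plcc, srocc

        rng = np.random.default_rng(0)
        mos = rng.uniform(0, 100, 480)          # 480 images, as in CID2013
        pred = mos + rng.normal(0, 10, 480)     # a noisy toy predictor
        print(benchmark(pred, mos))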