34 research outputs found

    Understanding perceived quality through visual representations

    The formatting of images can be considered an optimization problem whose cost function is a quality assessment algorithm. There is a trade-off between bit budget per pixel and quality; to maximize quality while minimizing the bit budget, we need to measure perceived quality. In this thesis, we focus on understanding perceived quality through visual representations that are based on visual system characteristics and color perception mechanisms. Specifically, we use the contrast sensitivity mechanisms of retinal ganglion cells and the suppression mechanisms of cortical neurons. We utilize color difference equations and color name distances to mimic pixel-wise color perception, and a bio-inspired model to formulate center-surround effects. Based on these formulations, we introduce two novel image quality estimators, PerSIM and CSV, and a new image-quality-assistance method, BLeSS. We combine our findings from the visual system and color perception with data-driven methods to generate visual representations and measure their quality. The majority of existing data-driven methods require subjective scores or degraded images; in contrast, we follow an unsupervised approach that utilizes only generic images. We introduce a novel unsupervised image quality estimator, UNIQUE, and extend it with multiple models and layers to obtain MS-UNIQUE and DMS-UNIQUE. In addition to introducing quality estimators, we analyze the role of spatial pooling and boosting in image quality assessment. Ph.D. dissertation.
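The color-difference component of such estimators can be illustrated with a minimal sketch (this is not the actual CSV or PerSIM implementation): a pixel-wise CIE76 color difference over a pair of CIELAB images, mean-pooled into a single fidelity score. The function names are invented for illustration, and the inputs are assumed to be already in CIELAB.

```python
import numpy as np

def delta_e76(lab_ref, lab_dist):
    """Pixel-wise CIE76 color difference between two Lab images of shape (H, W, 3)."""
    return np.sqrt(np.sum((lab_ref - lab_dist) ** 2, axis=-1))

def pooled_quality(lab_ref, lab_dist):
    """Mean-pooled color-difference score; lower means higher fidelity."""
    return float(delta_e76(lab_ref, lab_dist).mean())

# toy check: identical images have zero pooled difference
ref = np.zeros((4, 4, 3))
assert pooled_quality(ref, ref) == 0.0
```

Mean pooling is only the simplest choice; the thesis also analyzes alternative spatial pooling and boosting strategies over such difference maps.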

    Nanophotonic devices for high-resolution CMOS image sensors

    Doctoral dissertation -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, August 2021. Byoungho Lee. An image sensor is a device that converts electromagnetic waves scattered by objects or the environment into electric signals. Recently, the mobile device and autonomous vehicle industries have come to require multiple image sensors with different purposes on a single device. In particular, image sensors with more than 100 million pixels are being developed as displays advance to high resolutions of 8K and beyond. However, because of the limited space in mobile devices, a high-resolution image sensor requires shrinking the pixels that constitute it, which introduces factors that degrade image quality, such as reduced light efficiency, reduced quantum efficiency, and color crosstalk. A metasurface is a device that modulates electromagnetic waves through an array of antennas smaller than the wavelength. It has been proposed as a replacement for the color filters, lenses, and photodiodes that constitute the optical system of an image sensor. However, the performance of metasurfaces matched to miniaturized pixel sizes has been limited by their operating principle, which requires an array of several nano-antennas. In this dissertation, I present metasurface optical devices that can improve the image quality of existing image sensors composed of micropixels. First, an absorption-type color filter that suppresses reflection is discussed. The reflection that inevitably occurs in conventional metasurface color filters causes flare in the captured image. Using a particle swarm optimization method, I design a color filter that transmits only a specific band and absorbs the rest within the absorption resonance band of a hyperbolic metamaterial antenna. In particular, I present a Bayer-pattern color filter with a pixel size of 255 nm.
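The particle swarm optimization step can be sketched generically. The electromagnetic filter simulation is not reproducible here, so a stand-in quadratic cost takes its place, and all parameter values below are illustrative rather than those used in the dissertation.

```python
import numpy as np

def pso(cost, dim, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimizer; `cost` maps a parameter vector to a scalar."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_particles, dim))   # particle positions
    v = np.zeros_like(x)                          # particle velocities
    pbest = x.copy()                              # personal best positions
    pcost = np.array([cost(p) for p in x])
    g = pbest[pcost.argmin()].copy()              # global best position
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        c = np.array([cost(p) for p in x])
        better = c < pcost
        pbest[better], pcost[better] = x[better], c[better]
        g = pbest[pcost.argmin()].copy()
    return g, float(pcost.min())

# stand-in cost: squared distance to a target parameter vector (all ones)
best, best_cost = pso(lambda p: np.sum((p - 1.0) ** 2), dim=3)
```

In the dissertation the cost would instead be evaluated by a full-wave simulation of the filter's transmission and absorption spectra.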
Second, I introduce a color-splitting metasurface that increases the light efficiency of the image sensor. Since a photodiode converts any light with energy above the band gap into an electric signal, absorption-type color filters are used for color separation in image sensors. This means that the total light efficiency of the sensor is limited to about 33% by the blue, green, and red filters that constitute one pixel. Accordingly, a freeform metasurface device is designed that exceeds this conventional efficiency limit by routing the light incident on each sub-pixel in a different direction according to its color. Finally, an optical confinement device capable of increasing the signal-to-noise ratio (SNR) under low illuminance in the near-infrared is presented. Through a funnel-shaped plasmonic aperture, light is focused into a volume much smaller than the wavelength. The focused electric and magnetic fields interact with the spatially distributed semiconductor, achieving a Purcell effect enhanced by the metasurface. This dissertation is expected to overcome the limitations of conventional nanophotonic devices for image sensors and to serve as a cornerstone for the development of micropixel and nanopixel image sensors. Furthermore, it is expected to contribute to a new image sensor platform in which the optical system of the image sensor is replaced by metasurfaces.
Table of contents: Chapter 1, Introduction (1.1 Overview of CMOS image sensors; 1.2 Toward high-resolution miniaturized pixels; 1.3 Nanophotonic elements for high-resolution cameras; 1.4 Dissertation overview). Chapter 2, Light interaction with subwavelength antennas (2.1 Plasmonic antennas; 2.2 Dielectric metasurfaces; 2.3 Hyperbolic metamaterials). Chapter 3, Absorptive metasurface color filter based on hyperbolic metamaterial for noise reduction (3.1 Introduction; 3.2 Principle of hyperbolic metamaterial absorbers; 3.3 Absorptive color filter design based on the particle swarm optimization method; 3.4 Numerical analysis of optimized metasurface color filters, including single-filter optimization and angle tolerance; 3.5 Sub-micron metasurface color filter array; 3.6 Conclusion). Chapter 4, High-efficiency full-color pixel array based on freeform nanostructures for high-resolution image sensors (4.1 Introduction; 4.2 Optimization of the metasurface full-color splitter; 4.3 Implementation of color splitters; 4.4 Image quality evaluation; 4.5 Discussion of off-axis color splitters; 4.6 Conclusion). Chapter 5, Plasmonic metasurface cavity for simultaneous enhancement of optical electric and magnetic fields (5.1 Introduction; 5.2 Working principle and numerical results, covering the funnel-shaped metasurface cavity; 5.3 Experimental results; 5.4 Purcell effect; 5.5 Conclusion). Chapter 6, Conclusion. Appendix (A.1 Colorimetry; A.2 The CIEDE2000 color difference; B. Related work). Bibliography.
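The Purcell enhancement mentioned above has a standard ideal-cavity form, F = (3 / 4π²) (λ/n)³ (Q/V); a small helper makes the scaling explicit. This is only the textbook expression — the metasurface cavity itself requires full-field simulation, and the values below are illustrative.

```python
import math

def purcell_factor(q, mode_volume, wavelength, n):
    """Ideal Purcell factor F = (3 / (4*pi^2)) * (wavelength/n)^3 * Q / mode_volume.

    Emission-rate enhancement grows with quality factor Q and shrinks with
    mode volume, which is why sub-wavelength confinement is so effective.
    """
    return (3.0 / (4.0 * math.pi ** 2)) * (wavelength / n) ** 3 * q / mode_volume

# illustrative numbers: Q = 100 and a mode volume of one cubic wavelength
f = purcell_factor(q=100, mode_volume=1.0, wavelength=1.0, n=1.0)
```

Shrinking the mode volume by a factor of ten raises F tenfold at fixed Q, which is the motivation for the funnel-shaped aperture's deep sub-wavelength focusing.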

    Video foreground localization: from traditional methods to deep learning

    These days, detection of Visual Attention Regions (VAR), such as moving objects, has become an integral part of many computer vision applications, viz. pattern recognition, object detection and classification, video surveillance, autonomous driving, human-machine interaction (HMI), and so forth. Moving-object identification using bounding boxes has matured to the level of localizing objects along their rigid borders, a process called foreground localization (FGL). Over the decades, many image segmentation methodologies have been studied, devised, and extended to suit video FGL. Despite that, video foreground (FG) segmentation remains an intriguing yet appealing task because of its ill-posed nature and myriad of applications. Maintaining spatial and temporal coherence, particularly at object boundaries, remains challenging and computationally burdensome. It gets even harder when the background is dynamic, with swaying tree branches or shimmering water, illumination variations, and shadows cast by the moving objects, or when the video sequences have jittery frames caused by vibrating or unstable camera mounts on a surveillance post or moving robot. At the same time, in the analysis of traffic flow or human activity, the performance of an intelligent system depends substantially on the robustness of its VAR (i.e., FG) localization. The natural question then arises: what is the best way to deal with these challenges? Thus, the goal of this thesis is to investigate plausible real-time, performant implementations, from traditional approaches to modern-day deep learning (DL) models, for FGL applicable to many video content-aware applications (VCAA). It focuses mainly on improving existing methodologies by harnessing multimodal spatial and temporal cues for delineated FGL.
The first part of the dissertation is dedicated to enhancing conventional sample-based and Gaussian mixture model (GMM)-based video FGL using a probability mass function (PMF), temporal median filtering, the fusion of CIEDE2000 color similarity, color distortion, and illumination measures, and an appropriate adaptive threshold to extract the FG pixels. Subjective and objective evaluations demonstrate the improvements over a number of similar conventional methods. The second part of the thesis focuses on exploiting and improving deep convolutional neural networks (DCNN) for the problem mentioned earlier. Consequently, three models akin to encoder-decoder (EnDec) networks are implemented with various innovative strategies to improve the quality of FG segmentation. The strategies include, but are not limited to, double-encoding slow-decoding feature learning, multi-view receptive-field feature fusion, and the incorporation of spatiotemporal cues through long short-term memory (LSTM) units in both the subsampling and upsampling subnetworks. Experimental studies are carried out thoroughly on all conditions, from baselines to challenging video sequences, to prove the effectiveness of the proposed DCNNs. The analysis demonstrates the architectural efficiency of the proposed networks over other methods, while quantitative and qualitative experiments show their competitive performance compared to the state of the art.
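The conventional track above can be illustrated with a heavily simplified stand-in for the GMM pipeline: a single-Gaussian per-pixel background model with a variance-scaled (adaptive) threshold. All constants are chosen for illustration, and the real method additionally fuses CIEDE2000 color cues and median filtering.

```python
import numpy as np

def update_background(mean, var, frame, alpha=0.05, k=2.5):
    """Single-Gaussian per-pixel background model with an adaptive threshold.

    A pixel is foreground when its squared deviation exceeds k^2 times the
    modeled variance; background statistics adapt only where no object is seen.
    Returns the foreground mask and the updated mean and variance.
    """
    diff = frame - mean
    fg = diff ** 2 > k ** 2 * var            # pixels far from the background model
    lr = np.where(fg, 0.0, alpha)            # freeze learning on foreground pixels
    mean = mean + lr * diff
    var = np.maximum((1 - lr) * var + lr * diff ** 2, 1e-4)
    return fg, mean, var

# toy sequence: static gray background with one bright 2x2 "object"
mean = np.full((8, 8), 10.0)
var = np.full((8, 8), 1.0)
frame = mean.copy()
frame[2:4, 2:4] = 200.0
fg, mean, var = update_background(mean, var, frame)
```

A GMM replaces the single Gaussian with a per-pixel mixture so that dynamic backgrounds (swaying branches, water) can be represented by several modes.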

    Image segmentation and pigment mapping of cultural heritage based on spectral imaging

    The goal of the work reported in this dissertation is to develop methods for image segmentation and pigment mapping of paintings based on spectral imaging. To reach this goal it is necessary to achieve sufficient spectral and colorimetric accuracy in both the spectral imaging system and the pigment mapping. The output is a series of spatial distributions of pigments (pigment maps) composing a painting. With these pigment maps, the change in the color appearance of the painting can be simulated when the optical properties of one or more pigments are altered. The maps will also be beneficial for enriching historical knowledge of the painting and for aiding conservators in determining the best course for retouching damaged areas when metamerism is a factor. First, a new spectral reconstruction algorithm was developed based on Wyszecki's hypothesis and the matrix R theory developed by Cohen and Kappauf. The method achieved both high spectral and high colorimetric accuracy for a given combination of illuminant and observer. It was successfully tested with a practical spectral imaging system, developed in the Munsell Color Science Laboratory, that couples a traditional color-filter-array camera with two optimized filters. The spectral imaging system was used to image test paintings, and the method was used to retrieve spectral reflectance factors for these paintings. Next, pigment mapping methods were introduced, based on Kubelka-Munk (K-M) turbid media theory, which predicts the spectral reflectance factor of a specimen from the optical properties of its constituent pigments. The K-M theory has achieved practical success for opaque materials through reduced mathematical complexity and the elimination of thickness control.
The use of the general K-M theory for translucent samples was studied extensively, including the determination of the optical properties of pigments as functions of film thickness and the prediction of the spectral reflectance factor of a specimen by selecting the right pigment combination. An investigation was then carried out to evaluate the impact of the opacity and layer configuration of a specimen on pigment mapping. Conclusions were drawn from comparisons of pigment-mapping prediction accuracy between opaque and translucent assumptions, and between single-layer and bi-layer assumptions. Finally, spectral imaging and pigment mapping were applied to three paintings. Large images were first partitioned into several small images, and each small image was segmented into clusters by either an unsupervised or a supervised classification method. For each cluster, pigment mapping was performed pixel-wise with a limited number of pigments, or on a limited number of pixels and then extended to the others based on a similarity calculation. For the masterpiece The Starry Night, these pigment maps can provide historical knowledge about the painting, aid conservators in inpainting damaged areas, and digitally rejuvenate the original color appearance of the painting (e.g., as it looked before the lead white noticeably darkened).
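The opaque-material form of K-M theory mentioned here reduces to K/S = (1 − R)² / (2R), and the K/S values of a mixture add in proportion to pigment concentration. A sketch under that single-constant, opaque assumption — the pigment reflectances below are invented, and real mapping works per wavelength over a full spectrum:

```python
import numpy as np

def ks(r):
    """Kubelka-Munk K/S ratio of an opaque film from its reflectance r."""
    return (1 - r) ** 2 / (2 * r)

def reflectance(ks_val):
    """Invert K/S back to reflectance: R = 1 + K/S - sqrt((K/S)^2 + 2*K/S)."""
    return 1 + ks_val - np.sqrt(ks_val ** 2 + 2 * ks_val)

def mix(concs, refls):
    """Reflectance of an opaque mixture: concentration-weighted K/S values add."""
    ks_mix = sum(c * ks(np.asarray(r)) for c, r in zip(concs, refls))
    return reflectance(ks_mix)

# two hypothetical pigments at one wavelength, mixed 50/50
r_mix = mix([0.5, 0.5], [0.8, 0.2])
```

Note that the mixture reflectance is pulled strongly toward the darker pigment, since K/S (not R) is the additive quantity — a familiar property of subtractive pigment mixing.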

    Two calibration models for compensation of the individual elements properties of self-emitting displays

    In this paper, we examine the applicability limits of different methods for compensating the individual properties of self-emitting displays with significant non-uniformity of chromaticity and maximum brightness. The aim of the compensation is to minimize perceived image non-uniformity. Compensation of displayed-image non-uniformity is based on minimizing the perceived distance between the target (ideally displayed) image and the simulated image shown by the calibrated screen. The S-CIELAB model of human visual system properties is used to estimate the perceived distance between two images. In this work, we compare the efficiency of the channel-wise and linear (channel-mixing) compensation models depending on the models of variation in the characteristics of display elements (subpixels). It was found that even for a display with uniform chromatic subpixel characteristics, the linear model with channel mixing is superior in terms of compensation accuracy. This work was supported by the Russian Science Foundation (Project No. 20-61-47089).
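The difference between the two compensation models can be sketched as a least-squares fit: independent per-channel gains versus a full 3×3 matrix that mixes channels. The crosstalk values below are made up for illustration, and the actual work minimizes an S-CIELAB perceptual distance rather than this plain RGB error:

```python
import numpy as np

def fit_channelwise(measured, target):
    """Per-channel gains: each channel scaled independently, no mixing."""
    return (measured * target).sum(0) / (measured ** 2).sum(0)

def fit_linear(measured, target):
    """Full 3x3 compensation matrix M with channel mixing: target ~ measured @ M."""
    m, *_ = np.linalg.lstsq(measured, target, rcond=None)
    return m

rng = np.random.default_rng(1)
target = rng.random((100, 3))                    # intended RGB stimuli
# hypothetical display whose subpixels leak into neighboring channels
crosstalk = np.array([[0.90, 0.10, 0.00],
                      [0.05, 0.90, 0.05],
                      [0.00, 0.10, 0.90]])
measured = target @ crosstalk                    # what the panel actually emits
gains = fit_channelwise(measured, target)
m = fit_linear(measured, target)
err_cw = np.abs(measured * gains - target).mean()
err_lin = np.abs(measured @ m - target).mean()
```

Because the simulated defect mixes channels, only the linear model can invert it exactly; with purely diagonal (gain-only) defects the two models would coincide.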

    Real-time multispectral fluorescence and reflectance imaging for intraoperative applications

    Fluorescence-guided surgery supports doctors by making unrecognizable anatomical or pathological structures recognizable. For instance, cancer cells can be targeted with one fluorescent dye while muscular tissue, nerves, or blood vessels are targeted by other dyes, allowing distinctions beyond conventional color vision. Consequently, intraoperative imaging devices should combine multispectral fluorescence with conventional reflectance color imaging over the entire visible and near-infrared spectral range at video rate, which remains a challenge. In this work, the requirements for such a fluorescence imaging device are analyzed in detail. A concept based on temporal and spectral multiplexing is developed, and a prototype system is built. Experiments and numerical simulations show that the prototype fulfills the design requirements and suggest future improvements. The multispectral fluorescence image stream is processed with linear unmixing to present fluorescent-dye images to the surgeon. However, artifacts in the unmixed images may go unnoticed by the surgeon, so a tool is developed in this work to indicate unmixing inconsistencies on a per-pixel and per-frame basis. In-silico optimization and a critical review suggest future improvements and provide insight for clinical translation.
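Linear unmixing, and the per-pixel residual that can serve as an inconsistency indicator, can be sketched as follows. The dye spectra are invented for illustration; the real system operates on a multispectral video stream rather than a single vector:

```python
import numpy as np

def unmix(spectra, measurement):
    """Linear unmixing: solve measurement ~ spectra @ abundances by least squares.

    Returns the dye abundances and the residual norm; a large residual flags a
    pixel whose measurement is inconsistent with the linear mixing model.
    """
    a, *_ = np.linalg.lstsq(spectra, measurement, rcond=None)
    residual = float(np.linalg.norm(spectra @ a - measurement))
    return a, residual

# two hypothetical dye emission spectra sampled at 5 spectral bands
dye_a = np.array([1.0, 0.8, 0.4, 0.1, 0.0])
dye_b = np.array([0.0, 0.1, 0.5, 0.9, 1.0])
spectra = np.stack([dye_a, dye_b], axis=1)     # bands x dyes
mixed = 0.7 * dye_a + 0.3 * dye_b              # noiseless 70/30 mixture
abundances, residual = unmix(spectra, mixed)
```

In practice the residual would be thresholded per pixel and per frame, which is the spirit of the inconsistency-indication tool described above.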

    Visual Human-Computer Interaction


    Color in scientific visualization: Perception and image-based data display

    Visualization is the transformation of information into a visual display that enhances users' understanding and interpretation of the data. This thesis project investigated the use of color and human vision modeling for the visualization of image-based scientific data. Two preliminary psychophysical experiments were first conducted on uniform color patches to analyze the perception and understanding of different color attributes, providing psychophysical evidence and guidance for the choice of color space and attributes for color encoding. Perceptual color scales were then designed for univariate and bivariate image data display, and their effectiveness was evaluated through three psychophysical experiments. Some general guidelines were derived for effective color scale design. Extending to high-dimensional data, two visualization techniques were developed for hyperspectral imagery. The first approach takes advantage of the underlying relationships between PCA/ICA of hyperspectral images and the human opponent color model, and maps the first three PCs or ICs to several opponent color spaces, including CIELAB, HSV, YCbCr, and YUV. The gray-world assumption was adopted to automatically set the mapping origins. The rendered images are well color balanced and can offer a first-look capability or initial classification for a wide variety of spectral scenes. The second approach combines a true-color image and a PCA image based on a biologically inspired visual attention model that simulates the center-surround structure of visual receptive fields as the difference between fine and coarse scales. The model was extended to take human contrast sensitivity into account and to include high-level information, such as second-order statistical structure in the form of a local variance map, in addition to low-level features such as color, luminance, and orientation.
It generates a topographic saliency map for both the true-color image and the PCA image; a difference map is then derived and used as a mask to select interesting locations where the PCA image has more salient features than are available in the visible bands. The resulting representations preserve a consistent natural appearance of the scene, while the selected attentional locations may be analyzed by more advanced algorithms.
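The PCA-to-color mapping in the first approach can be sketched as a plain first-three-components false-color rendering; the opponent-space mapping and gray-world origin setting described above are omitted, and the cube below is random data standing in for a hyperspectral scene:

```python
import numpy as np

def pca_false_color(cube):
    """Project an (H, W, B) hyperspectral cube onto its first three principal
    components and rescale each component to [0, 1] for display as a
    three-channel false-color image."""
    h, w, b = cube.shape
    x = cube.reshape(-1, b)
    x = x - x.mean(0)                              # center each band
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    pcs = (x @ vt[:3].T).reshape(h, w, 3)          # first three PC images
    lo = pcs.min((0, 1))
    hi = pcs.max((0, 1))
    return (pcs - lo) / np.where(hi - lo > 0, hi - lo, 1.0)

rng = np.random.default_rng(0)
cube = rng.random((16, 16, 10))                    # stand-in 10-band scene
img = pca_false_color(cube)
```

Mapping the three components into an opponent space such as CIELAB or YCbCr, as the thesis does, then assigns the highest-variance component to the lightness-like axis.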