The Effect of Exposure on MaxRGB Color Constancy
The performance of the MaxRGB illumination-estimation method for color constancy and automatic white balancing has been reported in the literature as being mediocre at best; however, MaxRGB has usually been tested on images of only 8 bits per channel. The question arises as to whether the method itself is inadequate, or whether it has simply been tested on data of inadequate dynamic range. To address this question, a database of sets of exposure-bracketed images was created. The image sets include exposures ranging from very underexposed to slightly overexposed. The color of the scene illumination was determined by taking an extra image of the scene containing 4 Gretag Macbeth mini Colorcheckers placed at an angle to one another. MaxRGB was then run on the images of increasing exposure. The results clearly show that its performance drops dramatically when the 14-bit exposure range of the Nikon D700 camera is exceeded, which results in clipping of high values. For those images exposed such that no clipping occurs, the median error in MaxRGB's estimate of the color of the scene illumination is found to be relatively small.
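MaxRGB takes the maximum response in each colour channel as the estimate of the illuminant, which is why clipped highlights corrupt it. A minimal sketch of the estimator with a clipping guard is given below; the bit-depth and clipping-margin parameters are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def maxrgb_estimate(img, bit_depth=14, clip_margin=0.98):
    """MaxRGB illuminant estimate with a simple clipping guard (sketch).

    `img` is an HxWx3 linear raw image. The guard rejects images where
    any channel maximum approaches the sensor white level, since the
    abstract reports that clipping ruins the estimate; the 0.98 margin
    is an assumption for illustration.
    """
    img = np.asarray(img, dtype=np.float64)
    white_level = (2 ** bit_depth) - 1
    per_channel_max = img.reshape(-1, 3).max(axis=0)  # (R_max, G_max, B_max)
    if np.any(per_channel_max >= clip_margin * white_level):
        raise ValueError("Channel near saturation: MaxRGB estimate unreliable")
    return per_channel_max / per_channel_max.sum()  # rgb chromaticity, sums to 1
```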
Rank-Based Illumination Estimation
A new two-stage illumination-estimation method based on the concept of rank is presented. The method first estimates the illuminant locally in subwindows using a ranking of the digital counts in each color channel, and then combines the local subwindow estimates, again based on a ranking of the local estimates. The proposed method unifies the MaxRGB and Grayworld methods. Despite its simplicity, the performance of the method is found to be competitive with other state-of-the-art methods for estimating the chromaticity of the overall scene illumination.
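A minimal sketch of the two-stage rank idea follows; the window size and the local/global quantiles are illustrative assumptions, since the abstract does not specify them. A local rank of 1.0 behaves like a per-window MaxRGB, while lower ranks move toward Grayworld-style averaging.

```python
import numpy as np

def rank_based_estimate(img, win=64, local_rank=0.95, global_rank=0.5):
    """Two-stage rank-based illuminant estimate (sketch, assumed parameters).

    Stage 1: in each win x win subwindow, take the value at quantile
    `local_rank` of the digital counts in every colour channel.
    Stage 2: combine the local estimates by taking quantile
    `global_rank` per channel across all subwindows.
    """
    img = np.asarray(img, dtype=np.float64)
    h, w, _ = img.shape
    local_estimates = []
    for y in range(0, h - win + 1, win):
        for x in range(0, w - win + 1, win):
            patch = img[y:y + win, x:x + win].reshape(-1, 3)
            local_estimates.append(np.quantile(patch, local_rank, axis=0))
    est = np.quantile(np.array(local_estimates), global_rank, axis=0)
    return est / est.sum()  # rgb chromaticity
```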
The Rehabilitation of MaxRGB
The poor performance of the MaxRGB illumination-estimation method is often used in the literature as a foil when promoting some new illumination-estimation method. However, the results presented here show that in fact MaxRGB works surprisingly well when tested on a new dataset of 105 high dynamic range images, and also better than previously reported when some simple pre-processing is applied to the images of the standard 321-image set [1]. The HDR images in the dataset for color constancy research were constructed in the standard way from multiple exposures of the same scene. The color of the scene illumination was determined by photographing an extra HDR image of the scene with 4 Gretag Macbeth mini Colorcheckers placed in it at 45 degrees relative to one another. With pre-processing, MaxRGB's performance is statistically equivalent to that of Color by Correlation [2] and statistically superior to that of the Greyedge [3] algorithm on the 321 set (null hypothesis rejected at the 5% significance level). It also performs as well as Greyedge on the HDR set. These results demonstrate that MaxRGB is far more effective than it has been reputed to be, so long as it is applied to image data that encodes the full dynamic range of the original scene.
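The abstract does not spell out the pre-processing step, so the sketch below uses a per-channel median filter as an illustrative stand-in before applying MaxRGB; the kernel size is an assumption, not the paper's setting.

```python
import numpy as np
from scipy.ndimage import median_filter

def preprocessed_maxrgb(img, kernel=5):
    """MaxRGB after a simple smoothing step (sketch).

    The median filter is an assumed example of "simple pre-processing";
    it suppresses isolated bright (possibly noisy or clipped) pixels so
    the per-channel maxima better reflect the scene illuminant.
    """
    img = np.asarray(img, dtype=np.float64)
    smoothed = np.stack(
        [median_filter(img[..., c], size=kernel) for c in range(3)], axis=-1
    )
    est = smoothed.reshape(-1, 3).max(axis=0)
    return est / est.sum()
```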
Algorithms for the enhancement of dynamic range and colour constancy of digital images & video
One of the main objectives in digital imaging is to mimic the capabilities of the human eye and, perhaps, go beyond them in certain aspects. However, the human visual system is so versatile, so complex, and only partially understood that no imaging technology to date has been able to accurately reproduce its capabilities. Matching these extraordinary capabilities of the human eye has therefore become a crucial challenge in digital imaging, since digital photography, video recording, and computer vision applications continue to demand more realistic and accurate image reproduction and analysis capabilities.
Over the decades, researchers have tried to solve the colour constancy problem and to extend the dynamic range of digital imaging devices by proposing a number of algorithms and instrumentation approaches. Nevertheless, no unique solution has been identified; this is partially due to the wide range of computer vision applications that require colour constancy and high dynamic range imaging, and to the complexity of the mechanisms by which the human visual system achieves its effective colour constancy and dynamic range capabilities.
The aim of the research presented in this thesis is to enhance overall image quality within the image signal processor of digital cameras by achieving colour constancy and extending dynamic range capabilities. This is achieved by developing a set of advanced image-processing algorithms that are robust to a number of practical challenges and feasible to implement within an image signal processor used in consumer-electronics imaging devices.
The experiments conducted in this research show that the proposed algorithms surpass state-of-the-art methods in the fields of dynamic range and colour constancy. Moreover, this set of image-processing algorithms shows that, if used within an image signal processor, digital camera devices can mimic the human visual system's dynamic range and colour constancy capabilities, which is the ultimate goal of any state-of-the-art technique or commercial imaging device.
Highlights Analysis System (HAnS) for low dynamic range to high dynamic range conversion of cinematic low dynamic range content
We propose a novel and efficient algorithm for the detection of specular reflections and light sources (highlights) in cinematic content. Detecting highlights is important for reconstructing them properly when converting low dynamic range (LDR) content to high dynamic range (HDR) content. Highlights are often difficult to distinguish from bright diffuse surfaces, because their brightness is reduced in conventional LDR content production. Moreover, cinematic LDR content is subject to the artistic use of effects that change the apparent brightness of certain image regions (e.g. limited depth of field, grading, complex multi-lighting setups, etc.). To ensure the robustness of highlight detection to these effects, the proposed algorithm goes beyond considering only absolute brightness and uses five different features: the size of the highlight relative to the size of the surrounding image structures, the relative contrast in the surrounding of the highlight, and its absolute brightness expressed through the luminance (luma feature), through the saturation in the color space (maxRGB feature), and through the saturation in white (minRGB feature). We evaluate the algorithm on two different image datasets. The first is a publicly available LDR image dataset without cinematic content, which allows comparison to the broader state of the art. Additionally, for the evaluation on cinematic content, we create an image dataset consisting of manually annotated cinematic frames and real-world images. To demonstrate the proposed highlight-detection algorithm in a complete LDR-to-HDR conversion pipeline, we additionally propose a simple inverse-tone-mapping algorithm. The experimental analysis shows that the proposed approach outperforms conventional highlight-detection algorithms on both image datasets, achieves high-quality reconstruction of the HDR content, and is suited for use in LDR-to-HDR conversion.
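The three brightness-related features named above can be read as per-pixel maps; a minimal sketch is given below. The Rec. 709 luma weights and the example thresholds are assumptions, and the two spatial features (relative size and relative contrast) are not reproduced here.

```python
import numpy as np

def brightness_feature_maps(img):
    """Per-pixel brightness cues for highlight candidates (sketch).

    `img` is an HxWx3 RGB image scaled to [0, 1]. Returns the luma,
    maxRGB and minRGB maps; a pixel that is bright in all three is a
    plausible specular highlight or light source.
    """
    img = np.asarray(img, dtype=np.float64)
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    luma = 0.2126 * r + 0.7152 * g + 0.0722 * b  # Rec. 709 weights (assumed)
    max_rgb = img.max(axis=-1)  # bright in at least one channel
    min_rgb = img.min(axis=-1)  # bright in every channel ("saturation in white")
    return luma, max_rgb, min_rgb

# Illustrative thresholds only, not the algorithm's decision rule:
# luma, max_rgb, min_rgb = brightness_feature_maps(frame)
# candidates = (luma > 0.9) & (min_rgb > 0.8)
```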
Screening for Neonatal Jaundice by Smartphone Sclera Imaging
Jaundice is observed in over 60% of neonates and must be carefully monitored. If severe cases go unnoticed, death or permanent disability can result. Neonatal jaundice causes 100,000 deaths yearly, with low-income countries in Africa and South Asia particularly affected. There is an unmet need for an accessible and objective screening method. This thesis proposes a smartphone camera-based method for screening based on quantification of yellow discolouration in the sclera.
The primary aim is to develop and test an app to screen for neonatal jaundice that requires only the smartphone itself. To this end, a novel ambient subtraction method is proposed and validated, with less dependence on external hardware or colour cards than previous app-based methods. Another aim is to investigate the benefits of screening via the sclera. An existing dataset of newborn sclera images (n=87) is used to show that sclera chromaticity can predict jaundice severity.
The neoSCB app is developed to predict total serum bilirubin (TSB) from ambient-subtracted sclera chromaticity via a flash/no-flash image pair. A study is conducted in Accra, Ghana to evaluate the app. With 847 capture sessions, this is the largest study on image-based jaundice detection to date. A model trained on sclera chromaticity is found to be more accurate than one based on skin. The model is validated on an independent dataset collected at UCLH (n=38).
The neoSCB app has a sensitivity of 100% and a specificity of 76% in identifying neonates with TSB ≥ 250 μmol/L (n=179). This is equivalent to the TcB (JM-105) data collected concurrently, and as good as the best-performing app in the literature (BiliCam). Following a one-time calibration, neoSCB works without specialist equipment, which could help widen access to effective jaundice screening.
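A minimal sketch of the ambient-subtraction idea for a flash/no-flash pair is shown below; the function name, the sclera-mask input, and the assumption of identical exposure settings for the two frames are illustrative, not the app's actual implementation.

```python
import numpy as np

def ambient_subtracted_chromaticity(flash_img, noflash_img, sclera_mask):
    """Sclera chromaticity from a flash/no-flash pair (sketch).

    Both frames are linear RGB arrays captured with the same exposure
    settings (an assumption of this sketch). Subtracting the no-flash
    frame removes the ambient contribution, so the residual is lit only
    by the known flash and its chromaticity reflects sclera colour.
    """
    flash = np.asarray(flash_img, dtype=np.float64)
    ambient = np.asarray(noflash_img, dtype=np.float64)
    residual = np.clip(flash - ambient, 0.0, None)  # flash-only component
    mean_rgb = residual[sclera_mask].mean(axis=0)   # average over sclera pixels
    return mean_rgb / mean_rgb.sum()                # rgb chromaticity
```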
Multimodal Indexing of Presentation Videos
This thesis presents four novel methods to help users efficiently and effectively retrieve information from unstructured and unsourced multimedia sources, in particular the increasing amount and variety of presentation videos such as those in e-learning, conference recordings, corporate talks, and student presentations. We demonstrate a system to summarize, index and cross-reference such videos, and measure the quality of the produced indexes as perceived by the end users. We introduce four major semantic indexing cues: text, speaker faces, graphics, and mosaics, going beyond standard tag-based searches and simple video playback. This work aims at recognizing visual content "in the wild", where the system cannot rely on any additional information besides the video itself.
For text, within a scene-text detection and recognition framework, we present a novel locally optimal adaptive binarization algorithm, implemented with integral histograms. It determines an optimal threshold that maximizes the between-class variance within a subwindow, with computational complexity independent of the size of the window itself. We obtain character recognition rates of 74%, as validated against ground truth of 8 presentation videos spanning over 1 hour and 45 minutes, which almost doubles the baseline performance of an open source OCR engine.
For speaker faces, we detect, track, match, and finally select a humanly preferred face icon per speaker, based on three quality measures: resolution, amount of skin, and pose. We register an 87% accordance (51 out of 58 speakers) between the face indexes automatically generated from three unstructured presentation videos of approximately 45 minutes each and human preferences recorded through Mechanical Turk experiments.
For diagrams, we locate graphics inside frames showing a projected slide, cluster them according to an on-line algorithm based on a combination of visual and temporal information, and select and color-correct their representatives to match human preferences recorded through Mechanical Turk experiments. We register 71% accuracy (57 out of 81 unique diagrams properly identified, selected and color-corrected) on three hours of videos containing five different presentations.
For mosaics, we combine two existing suturing measures to extend video images into an in-the-world coordinate system. A set of frames to be registered into a mosaic is sampled according to the PTZ camera movement, which is computed through least-squares estimation starting from the luminance constancy assumption. A local-feature-based stitching algorithm is then applied to estimate the homography among a set of video frames, and median blending is used to render pixels in overlapping regions of the mosaic.
For two of these indexes, namely faces and diagrams, we present two novel MTurk-derived user data collections to determine viewer preferences, and show that our methods match them in selection. The net result of this thesis is a system that allows users to search, inside a video collection as well as within a single video clip, for a segment of a presentation by professor X on topic Y, containing graph Z.
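The text-binarization step described above can be sketched as the combination of an integral histogram, which yields the histogram of any subwindow in time independent of its size, with the Otsu criterion, i.e. the threshold maximizing between-class variance. The function names and the 256-bin assumption below are illustrative, not the thesis implementation.

```python
import numpy as np

def integral_histogram(gray, bins=256):
    """cum[y, x, b] = number of pixels with value b in gray[:y+1, :x+1].

    `gray` is a uint8 image; memory-heavy (H x W x bins) but simple.
    """
    onehot = np.eye(bins, dtype=np.int32)[np.asarray(gray)]
    return onehot.cumsum(axis=0).cumsum(axis=1)

def window_histogram(cum, y0, x0, y1, x1):
    """Histogram of gray[y0:y1, x0:x1] in O(bins), independent of window size."""
    hist = cum[y1 - 1, x1 - 1].copy()
    if y0 > 0:
        hist -= cum[y0 - 1, x1 - 1]
    if x0 > 0:
        hist -= cum[y1 - 1, x0 - 1]
    if y0 > 0 and x0 > 0:
        hist += cum[y0 - 1, x0 - 1]
    return hist

def otsu_threshold(hist):
    """Threshold maximizing between-class variance for a 1-D histogram."""
    levels = np.arange(hist.size)
    weighted = (hist * levels).astype(np.float64)
    w0 = hist.cumsum().astype(np.float64)
    w1 = hist.sum() - w0
    mu0 = weighted.cumsum() / np.maximum(w0, 1.0)
    mu1 = (weighted.sum() - weighted.cumsum()) / np.maximum(w1, 1.0)
    between = w0 * w1 * (mu0 - mu1) ** 2
    return int(between.argmax())
```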
INTEL-TAU: A Color Constancy Dataset
In this paper, we describe a new large dataset for illumination estimation. This dataset, called INTEL-TAU, contains 7022 images in total, which makes it the largest available high-resolution dataset for illumination estimation research. The variety of scenes, captured using three different camera models, namely Canon 5DSR, Nikon D810, and Sony IMX135, makes the dataset appropriate for evaluating the camera and scene invariance of different illumination estimation techniques. Privacy masking is applied to sensitive information, e.g. faces, so the dataset complies with the General Data Protection Regulation (GDPR). Furthermore, the effect of color shading in mobile images can be evaluated with the INTEL-TAU dataset, as both corrected and uncorrected versions of the raw data are provided. Finally, this paper benchmarks several color constancy approaches on the proposed dataset.
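Benchmarking on such a dataset is conventionally reported with the recovery angular error between the estimated and ground-truth illuminant vectors; a minimal sketch follows (the variable names are illustrative).

```python
import numpy as np

def angular_error_deg(estimate, ground_truth):
    """Recovery angular error (degrees) between two illuminant RGB vectors."""
    e = np.asarray(estimate, dtype=np.float64)
    g = np.asarray(ground_truth, dtype=np.float64)
    cos = np.dot(e, g) / (np.linalg.norm(e) * np.linalg.norm(g))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# Typical reporting over a dataset: mean and median of the per-image errors.
# errors = [angular_error_deg(est, gt) for est, gt in zip(estimates, ground_truths)]
# print(np.mean(errors), np.median(errors))
```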