Algorithms for the enhancement of dynamic range and colour constancy of digital images & video
One of the main objectives in digital imaging is to mimic the capabilities of the human eye, and perhaps go beyond them in certain aspects. However, the human visual system is so versatile, complex, and only partially understood that no imaging technology to date has been able to reproduce its capabilities accurately. The gap between the extraordinary capabilities of the human eye and those of current sensors has become a crucial shortcoming in digital imaging, since digital photography, video recording, and computer vision applications continue to demand more realistic and accurate image reproduction and analysis.
For decades, researchers have tried to solve the colour constancy problem and to extend the dynamic range of digital imaging devices by proposing numerous algorithms and instrumentation approaches. Nevertheless, no single solution has emerged; this is due partly to the wide range of computer vision applications that require colour constancy and high dynamic range imaging, and partly to the complexity of the mechanisms by which the human visual system achieves its colour constancy and dynamic range capabilities.
The aim of the research presented in this thesis is to enhance overall image quality within the image signal processor of digital cameras by achieving colour constancy and extending dynamic range capabilities. This is achieved by developing a set of advanced image-processing algorithms that are robust to a number of practical challenges and feasible to implement within an image signal processor used in consumer electronics imaging devices.
The experiments conducted in this research show that the proposed algorithms outperform state-of-the-art methods in the fields of dynamic range and colour constancy. Moreover, this unique set of image processing algorithms shows that, when used within an image signal processor, it enables digital camera devices to mimic the human visual system's dynamic range and colour constancy capabilities; the ultimate goal of any state-of-the-art technique or commercial imaging device.
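The kind of baseline such algorithms are typically compared against can be illustrated with the classic gray-world heuristic, one of the simplest colour constancy methods (a generic sketch, not the thesis's proposed algorithms; the function name and parameters are illustrative):

```python
import numpy as np

def gray_world(image):
    """Gray-world white balance: scale each channel so its mean matches
    the global mean, assuming the scene averages to neutral gray."""
    image = image.astype(np.float64)
    channel_means = image.reshape(-1, 3).mean(axis=0)  # per-channel mean
    gains = channel_means.mean() / channel_means       # gain per channel
    return np.clip(image * gains, 0.0, 255.0)
```

Applied to an image with a colour cast, the correction equalises the channel means, which is exactly the assumption that breaks down in the scenes motivating more advanced colour constancy research.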
Colour object search
The visual search process is required when locating an object in some region of space. To perform this search two capabilities must be available: the ability to recognise the object when it comes into view, and a way of selecting these views. Visual search is often complicated by object occlusion and by the low spatial resolution at which the object appears. Although the human visual system performs this task effortlessly, its mechanisms are not properly understood. Object colour and geometry, however, do play an important role. This thesis develops an object search methodology which assumes that a computer vision system captures both wide-angle and zoomed images of the scene containing the object. Since most previous research has focused on object recognition using geometry, this system is purely colour-based. It is not expected that object colour will always give a definitive solution; however, database pruning will often occur, leading to reduced search times.
The thesis argues that because colour is salient and more resilient than geometry to decreases in spatial resolution, it is more appropriate for visual search when the object occupies only a small region of an image with a large field of view. It also demonstrates that colour can be used to recognise objects when they occupy most of the field of view, as well as to discriminate between database models with similar colour proportions but different region topologies. These conclusions are supported by the results produced by three algorithms: two that perform colour object search and one that performs colour object recognition.
The first object search algorithm uses image locations containing salient object colours as a method of selecting views. Each of these views is ranked to indicate which most likely contains the object. The second object search algorithm identifies image regions with colour and topology similar to the object's, producing its results in best-first order. The object recognition algorithm uses an invariant based on region area to identify three corresponding model and image regions. A transformation is then calculated to bring the model and object into the same viewpoint, where region matches are based on position and colour.
Each of these methods produced good results in complex indoor scenes with fluorescent and/or tungsten filament lighting, and the search speeds were impressive.
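A purely colour-based matcher of the kind this search relies on can be sketched with quantised colour histograms compared by Swain & Ballard-style histogram intersection (a generic illustration, not the thesis's actual algorithms; the function names and bin count are assumptions):

```python
import numpy as np

def colour_histogram(pixels, bins=8):
    """Quantise an (N, 3) array of RGB pixels into a normalised
    3-D colour histogram."""
    hist, _ = np.histogramdd(pixels, bins=(bins,) * 3,
                             range=[(0, 256)] * 3)
    return hist / hist.sum()

def histogram_intersection(h1, h2):
    """Sum of element-wise minima: 1.0 for identical distributions,
    0.0 for fully disjoint colour content."""
    return np.minimum(h1, h2).sum()
```

Because the histogram discards geometry entirely, it degrades gracefully at low spatial resolution, which is the property the thesis exploits for wide-angle views.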
Particle Filters for Colour-Based Face Tracking Under Varying Illumination
Automatic human face tracking is the basis of robotic and active vision systems used for facial feature analysis, automatic surveillance, video conferencing, intelligent transportation, human-computer interaction and many other applications. Superior human face tracking will allow future safety surveillance systems that monitor drowsy drivers, or patients and elderly people at risk of seizure or sudden falls, to perform with a lower risk of failure in unexpected situations. This area has been actively researched in the current literature in an attempt to make automatic face trackers more stable in challenging real-world environments. To detect faces in video sequences, features like colour, texture, intensity, shape or motion are used. Among these features, colour has been the most popular because of its insensitivity to orientation and size changes and its fast processability. The challenge for colour-based face trackers, however, has been the instability of the tracker when colours change due to drastic variation in environmental illumination. Probabilistic tracking and the employment of particle filters as powerful Bayesian stochastic estimators, on the other hand, are increasing in the visual tracking field thanks to their ability to handle multi-modal distributions in cluttered scenes. Traditional particle filters use the transition prior as the importance sampling function, but this can result in poor posterior sampling. The objective of this research is to investigate and propose a stable face tracker capable of dealing with challenges such as rapid and random head motion, scale changes as people move closer to or further from the camera, the motion of multiple people with close skin tones in the vicinity of the model person, the presence of clutter, and occlusion of the face. The main focus has been on investigating an efficient method to address the sensitivity of colour-based trackers to gradual or drastic illumination variations.
The particle filter is used to overcome the instability of face trackers due to nonlinear and random head motions. To increase the traditional particle filter's sampling efficiency, an improved version of the particle filter is introduced that considers the latest measurements. This improved particle filter employs a new colour-based bottom-up approach that leads particles to generate an effective proposal distribution. The colour-based bottom-up approach is a classification technique for fast skin colour segmentation. This method is independent of the distribution's shape and does not require excessive memory storage or exhaustive prior training. Finally, to make the colour-based face tracker adaptable to illumination changes, an original likelihood model is proposed based on spatial rank information, which considers both the illumination-invariant colour ordering of a face's pixels in an image or video frame and the spatial interaction between them. The original contribution of this work lies in the unique mixture of existing and proposed components to improve colour-based recognition and tracking of faces in complex scenes, especially where drastic illumination changes occur. Experimental results for the final version of the proposed face tracker, which combines the methods developed, are provided in the last chapter of this manuscript.
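The "traditional" bootstrap scheme that this work builds on, with transition-prior sampling, likelihood weighting and resampling, can be sketched in one dimension (a toy illustration, not the face tracker itself; the Gaussian motion and measurement models are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, measurement,
                         motion_std=1.0, meas_std=2.0):
    """One bootstrap particle-filter step over a 1-D state."""
    # 1. Propagate with a random-walk motion model (the transition prior).
    particles = particles + rng.normal(0.0, motion_std, size=particles.shape)
    # 2. Reweight by a Gaussian measurement likelihood.
    weights = weights * np.exp(-0.5 * ((measurement - particles) / meas_std) ** 2)
    weights /= weights.sum()
    # 3. Resample to concentrate particles where the posterior mass is.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))
```

Because step 1 ignores the current measurement, particles can be proposed far from the true posterior; that is the poor-sampling weakness the measurement-aware proposal described above addresses.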
Factor Graphs for Computer Vision and Image Processing
Factor graphs have been used extensively in the decoding of error-correcting codes such as turbo codes, and in signal processing. However, while computer vision and pattern recognition are awash with graphical model usage, it is somewhat surprising that factor graphs remain under-researched in these communities. This is surprising because factor graphs naturally generalise both Markov random fields and Bayesian networks. Moreover, they are useful in modelling relationships between variables that are not necessarily probabilistic and allow for efficient marginalisation via a sum-product of probabilities. In this thesis, we present and illustrate the utility of factor graphs in the vision community through some of the field's popular problems, with a particular focus on maximum a posteriori (MAP) inference in graphical structures with layers. To this end, we break down complex problems into factored representations and more computationally realisable constructions. Firstly, we present a sum-product framework that uses the explicit factorisation in local subgraphs from the partitioned factor graph of a layered structure to perform inference. This provides an efficient method since exact inference is attainable in the resulting local subtrees. Secondly, we extend this framework to the entire graphical structure without partitioning, and discuss preliminary ways to combine outputs from a multilevel construction. Lastly, we further our endeavour to combine evidence from different methods through a simplicial spanning tree reparameterisation of the factor graph, in a way that ensures consistency, to produce an ensembled and improved result. Throughout the thesis, the underlying feature we make use of is the enforcement of adjacency constraints using Delaunay triangulations computed by adding points dynamically, or using a convex hull algorithm. The adjacency relationships from Delaunay triangulations help the factor graph approaches in this thesis to be both efficient and competitive for computer vision tasks, because of the low treewidth they provide in local subgraphs and the reparameterised interpretation of the graph they form through the spanning tree of simplexes. While exact inference is known to be intractable for junction trees obtained from the loopy graphs in computer vision, in this thesis we are able to effect exact inference on our spanning tree of simplexes. More importantly, the approaches presented here are not restricted to the computer vision and image processing fields, but are extendable to more general applications that involve distributed computations.
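The core sum-product operation on a tree-structured factor graph can be made concrete with a two-variable example, where message passing recovers the exact marginal (a minimal generic illustration of the algorithm, not the thesis's layered framework; the factor tables are arbitrary):

```python
import numpy as np

# A minimal tree factor graph:  f1 -- x1 -- f12 -- x2
f1 = np.array([0.7, 0.3])         # unary factor f1(x1)
f12 = np.array([[0.9, 0.1],       # pairwise factor f12(x1, x2)
                [0.2, 0.8]])

# Sum-product messages flow from the leaf towards x2.
msg_f1_to_x1 = f1                 # a unary factor just sends its table
msg_x1_to_f12 = msg_f1_to_x1      # a variable forwards the product of its inputs
msg_f12_to_x2 = (f12 * msg_x1_to_f12[:, None]).sum(axis=0)  # marginalise x1

marginal_x2 = msg_f12_to_x2 / msg_f12_to_x2.sum()
```

On a tree every message is computed once, so the marginal is exact; the thesis's contribution is arranging vision problems so that this exactness is available in local subtrees despite the loopy global structure.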
A context-sensitive meta-classifier for color-naming
Thesis (S.M.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 93-97). Humans are sensitive to situational and semantic context when applying labels to colors. This is especially challenging for algorithms which attempt to replicate human categorization for communicative tasks. Additionally, mismatched color models between dialog partners can lead to a back-and-forth negotiation of terms to find common ground. This thesis presents a color-classification algorithm that takes advantage of a dialog-like interaction model to provide fast adaptation for a specific exchange. The model learned in each exchange is then integrated into the system as a whole. The algorithm is an incremental meta-learner, leveraging a generic online learner and adding context sensitivity. A human study is presented, assessing the extent of semantic contextual effects on color naming. An evaluation of the algorithm based on the corpus gathered in this experiment is then tendered. By Rony Daniel Kubat, S.M.
Colour Communication Within Different Languages
For computational methods aiming to reproduce colour names that are meaningful to speakers of different languages, the mapping between perceptual and linguistic aspects of colour is a problem of central information processing. This thesis advances the field of computational colour communication within different languages in five main directions. First, we show that web-based experimental methodologies offer considerable advantages in obtaining a large number of colour naming responses in British and American English, Greek, Russian, Thai and Turkish. We continue with the application of machine learning methods to discover criteria in linguistic, behavioural and geometric features of colour names that distinguish classes of colours. We show that primary colour terms do not form a coherent class, whilst achromatic and basic classes do. We then propose and evaluate a computational model trained by human responses in the online experiment to automate the assignment of colour names in different languages across the full three-dimensional colour gamut. Fourth, we determine for the first time the location of colour names within a physiologically-based cone excitation space through an unconstrained colour naming experiment using a calibrated monitor under controlled viewing conditions. We show a good correspondence between online and offline datasets; and confirm the validity of both experimental methodologies for estimating colour naming functions in laboratory and real-world monitor settings. Finally, we present a novel information theoretic measure, called dispensability, for colour categories that predicts a gradual scale of basicness across languages from both web- and laboratory- based unconstrained colour naming datasets. As a result, this thesis contributes experimental and computational methodologies towards the development of multilingual colour communication schemes
Analytical methods for the study of color in digital images
The qualitative description of the colours that compose a digital image is a very simple task for the human visual system. For a computer, this task involves a great number of questions and a great amount of data, making it an operation of considerable complexity. In this thesis we develop an automatic method for constructing the colour palette of a digital image, attempting to answer the various questions that arise when working with colours in the computational world. The development of this method entails obtaining an automatic histogram-segmentation algorithm, which is constructed in detail in the thesis, and several applications of it are given. Finally, we also explain the operation of CProcess, a user-friendly piece of software developed for the easy understanding of colour.
A STUDY OF ILLUMINANT ESTIMATION AND GROUND TRUTH COLORS FOR COLOR CONSTANCY
Ph.D., Doctor of Philosophy
Computational mechanisms for colour and lightness constancy
Attributes of colour images have been found which allow colour and lightness constancy to be computed without prior knowledge of the illumination, even in complex scenes with three-dimensional objects and multiple light sources of different colours. The ratio of surface reflectance colour can be immediately determined between any two image points, however distant. It is possible to determine the number of spectrally independent light sources, and to isolate the effect of each. Reflectance edges across which the illumination remains constant can be correctly identified. In a scene illuminated by multiple distant point sources of distinguishable colours, the spatial angle between the sources and their brightness ratios can be computed from the image alone. If there are three or more sources then reflectance constancy is immediately possible without the use of additional knowledge. The results are an extension of Edwin Land's Retinex algorithm. They account for previously unexplained data such as Gilchrist's veiling luminances and his single-colour rooms. The validity of the algorithms has been demonstrated by implementing them in a series of computer programs. The computational methods do not follow the edge- or region-finding paradigms of previous vision mechanisms. Although the new reflectance constancy cues occur in all normal scenes, it is likely that human vision makes use of only some of them. In a colour image, all the pixels of a single surface colour lie in a single structure in flux space. The dimension of the structure equals the number of illumination colours. The reflectance ratio between two regions is determined by the transformation between their structures. Parallel tracing of edge pairs in their respective structures identifies an edge of constant illumination, and gives the lightness ratio of each such edge. Enhanced noise reduction techniques for colour pictures follow from the natural constraints on the flux structures.
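The first of these cues, the illumination-independent reflectance ratio, can be sketched under the simplest single-source assumption, where sensor flux factorises as reflectance times illumination and the illumination cancels in the ratio between two points lit the same way (a toy rendering model, not the thesis's multi-source algorithms; all numeric values are invented for illustration):

```python
import numpy as np

# Per-channel rendering model: flux = reflectance * illumination.
illumination = np.array([1.8, 1.0, 0.6])   # unknown to the observer
reflectance_a = np.array([0.8, 0.4, 0.2])  # surface at point a
reflectance_b = np.array([0.4, 0.4, 0.6])  # surface at point b

flux_a = reflectance_a * illumination      # what the camera measures at a
flux_b = reflectance_b * illumination      # what the camera measures at b

# The illumination cancels: the flux ratio equals the reflectance ratio,
# so it can be read off the image without knowing the light source.
ratio = flux_a / flux_b
```

The multi-source case described in the abstract generalises this cancellation to structures in flux space whose dimension equals the number of illumination colours.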
Configural and perceptual factors influencing the perception of color transparency
The mechanisms by which the brain represents colors are largely unknown. In addition, the large number of color phenomena in the natural world has made understanding color rather difficult. Color transparency perception, which is studied in this thesis, is precisely one of these interesting phenomena: when a surface is seen both in plain view and through a transparent overlay, the visual system still identifies it as a single surface. Processes of the visual system have widely inspired researchers in many domains such as neuroscience, psychology, and computer vision. The progress of digital imaging technologies requires research engineers to deal with issues that demand knowledge of human visual processing. To humans, an image is not a random collection of pixels, but a meaningful arrangement of regions and objects. One can thus be inspired by the human visual system to investigate color representation and its applicability to digital image processing. Finding a model of perception remains a challenging matter for researchers across multiple disciplines. This thesis discusses the problem of defining an accurate model of transparency perception. Despite the large number of studies on this topic, the underlying mechanisms are still not well understood. Investigating perceptual transparency is challenging due to its interactions with different visual phenomena, but the most intensively studied conditions for perceptual transparency are those involving achromatic luminance and chromatic constraints. The models proposed under these conditions differ in many aspects, but a broad distinction can be drawn between models of additive and subtractive transparency. The General Convergence Model (GCM) combines both additive and subtractive color mixtures in showing that systematic chromatic changes in a linear color space, such as translation and convergence (or a combination of both), lead to perceptual transparency.
However, while this model seems to be a necessary condition, it is not a sufficient one for transparency perception. A first motivation of this thesis was to evaluate and define situations more general than the GCM. Several chromatic changes, consistent or not with the GCM, were generated. Additional parameters, such as configural complexity, luminance level, magnitude of the chromatic change and shift direction, were tested. The main results showed that observers' responses are influenced by each of the above-cited parameters. Convergences appear significantly more transparent when motion is added for bipartite configurations, or when they are generated in a checkerboard configuration. Translations are influenced by both configuration and motion. Shears are described as opaque, except when short vector lengths are combined with motion: the overlay then tends to be transparent. Divergences are strongly affected by motion and vector lengths, and rotations by a combination of checkerboard configuration with luminance level and vector length. These results question the generality of the GCM. We also investigated the effects of shadows on the perception of a transparent filter. An attempt to extend these models to handle transparency perception in complex scenes, involving surfaces varying in shape and depth and changes in illumination and shadow, is described. A lightness-matching task was performed to evaluate how much constancy is shown by the subject across six experimental conditions, in which shadow position, shadow blur, and shadow and filter blending values were varied. The results showed that lightness constancy is very high even when surfaces were seen under both filter and shadow. A systematic deviation from perfect constancy, in a manner consistent with a perceived additive shift, was also observed. Because the GCM includes additive mixture and is related to color and lightness constancy, these results are promising and may ultimately be explained by this model.
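The two chromatic changes at the heart of the GCM, translation and convergence, can be written down directly as operations in a linear color space (a minimal sketch of the transformations the model names; the function names and the simple affine form are illustrative assumptions):

```python
import numpy as np

def apply_convergence(colours, target, alpha):
    """GCM-style convergence: every colour moves a fraction (1 - alpha)
    of the way towards a common target colour, shrinking all contrasts
    by the same factor alpha."""
    return alpha * colours + (1.0 - alpha) * target

def apply_translation(colours, shift):
    """GCM-style translation: a common additive shift applied to all
    colours, leaving contrasts between them unchanged."""
    return colours + shift
```

Under a convergence, the difference between any two colours shrinks by exactly the factor alpha, which is the systematic chromatic change the GCM associates with a transparent overlay; a translation preserves those differences while shifting all colours equally.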