Ridge Regression Approach to Color Constancy
This thesis presents work on color constancy and its application in computer vision. Color constancy is the phenomenon of representing (visualizing) the reflectance properties of a scene independently of the illumination spectrum. The motivation behind this work is twofold. The primary motivation is to seek ‘consistency and stability’ in color reproduction and algorithm performance, respectively: color is an important feature in many computer vision applications, so consistent color features are essential for application success. The second motivation is to reduce ‘computational complexity’ without sacrificing the primary motivation. This work presents a machine learning approach to color constancy, in which an empirical model is developed from training data. Neural networks and support vector machines are two prominent nonlinear learning techniques. Support vector machine based color constancy is more stable than neural network based color constancy, but the support vector machine is a time-consuming method. An alternative to the support vector machine is ‘Ridge regression’, a simple, fast, and analytically solvable linear modeling technique. It learns the dependency between surface reflectance and illumination from a training sample. Ridge regression thus answers the twofold motivation behind this work: it is stable and computationally simple. The proposed algorithms, ‘Support vector machine’ and ‘Ridge regression’, involve a three-step process. First, an input matrix constructed from the preprocessed training data set is used to obtain a trained model. Second, test images are presented to the trained model to obtain chromaticity estimates of the illuminants present in the test images. Finally, a linear diagonal transformation is applied to obtain the color-corrected image.
The results show the effectiveness of the proposed algorithms on both calibrated and uncalibrated data sets in comparison to the methods discussed in the literature review. Finally, the thesis concludes with a discussion and summary of the comparison between the proposed approaches and other algorithms.
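The three-step process described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the feature layout, the function names (`fit_ridge`, `estimate_illuminant`, `diagonal_correct`), the regularisation weight `lam`, and the synthetic training data are all assumptions made for the example.

```python
import numpy as np

def fit_ridge(X, Y, lam=1e-2):
    """Step 1: closed-form ridge solution W = (X^T X + lam*I)^-1 X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def estimate_illuminant(W, features):
    """Step 2: predict (r, g) chromaticity and expand to a unit-sum RGB triple."""
    r, g = features @ W
    return np.array([r, g, 1.0 - r - g])

def diagonal_correct(image, illuminant):
    """Step 3: diagonal transform; per-channel gains map the illuminant to grey."""
    gains = illuminant.mean() / illuminant
    return image * gains

# Synthetic stand-in for preprocessed training data (hypothetical features).
rng = np.random.default_rng(0)
X_train = rng.random((50, 8))            # 50 images, 8 features each
W_true = 0.05 * rng.random((8, 2))       # hidden linear dependency
Y_train = X_train @ W_true               # (r, g) illuminant chromaticities

W = fit_ridge(X_train, Y_train, lam=1e-8)
illum = estimate_illuminant(W, X_train[0])
image = np.ones((4, 4, 3)) * illum       # flat scene tinted by the illuminant
corrected = diagonal_correct(image, illum)
```

Because the ridge solution is a single linear solve, training cost is fixed by the feature dimension, which is consistent with the abstract's point about computational simplicity.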
Spectral methods for multimodal data analysis
Spectral methods have proven themselves as an important and versatile tool in a wide range of problems in the fields of computer graphics, machine learning, pattern recognition, and computer vision, where many important problems boil down to constructing a Laplacian operator and finding a few of its eigenvalues and eigenfunctions. Classical examples include the computation of diffusion distances on manifolds in computer graphics, Laplacian eigenmaps, and spectral clustering in machine learning. In many cases, one has to deal with multiple data spaces simultaneously. For example, clustering multimedia data in machine learning applications involves various modalities or "views" (e.g., text and images), and finding correspondence between shapes in computer graphics problems is an operation performed between two or more modalities. In this thesis, we develop a generalization of spectral methods to deal with multiple data spaces and apply them to problems from the domains of computer graphics, machine learning, and image processing. Our main construction is based on simultaneous diagonalization of Laplacian operators. We present an efficient numerical technique for computing joint approximate eigenvectors of two or more Laplacians in challenging noisy scenarios, which also appears to be the first general non-smooth manifold optimization method. Finally, we use the relation between joint approximate diagonalizability and approximate commutativity of operators to define a structural similarity measure for images. We use this measure to perform structure-preserving color manipulations of a given image.
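The link between joint diagonalizability and commutativity mentioned above rests on a standard fact: two symmetric matrices share a full set of eigenvectors exactly when they commute. The sketch below only illustrates that fact on small graph Laplacians; it is not the thesis's optimization method, and the helper names and toy affinity matrices are assumptions.

```python
import numpy as np

def graph_laplacian(W):
    """Unnormalized Laplacian L = D - W for a symmetric affinity matrix W."""
    return np.diag(W.sum(axis=1)) - W

def commutator_norm(A, B):
    """Frobenius norm of AB - BA; it is zero exactly when A and B commute."""
    return np.linalg.norm(A @ B - B @ A, ord="fro")

# Two "views" with the same cluster structure {0,1}, {2,3}, different weights.
W_a = np.array([[0, 1, 0, 0],
                [1, 0, 0, 0],
                [0, 0, 0, 1],
                [0, 0, 1, 0]], dtype=float)
W_b = 3.0 * W_a

# A third view with a different structure: the path graph 0-1-2-3.
W_c = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)

L_a, L_b, L_c = (graph_laplacian(W) for W in (W_a, W_b, W_c))
same_structure = commutator_norm(L_a, L_b)   # structures agree
diff_structure = commutator_norm(L_a, L_c)   # structures differ
```

A small commutator norm therefore serves as a proxy for "approximately jointly diagonalizable", which is the intuition behind using it as a structural similarity score.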
Adaptive Methods for Color Vision Impaired Users
Color plays a key role in the understanding of information in computer environments. About
5% of the world population is affected by color vision deficiency (CVD),
also called color blindness. This visual impairment hampers color perception and ends up
limiting the overall perception that people with CVD have of the surrounding environment,
whether real or virtual. In fact, an individual with CVD may not distinguish between two different
colors, which often causes confusion or a biased understanding of reality, including web
environments, whose pages are full of media elements such as text, still images, video,
sprites, and so on.
Aware of the difficulties that color-blind people may face in interpreting colored content,
researchers have proposed a significant number of recoloring algorithms in the literature with the
purpose of improving the visual perception of those people. However, most of those
algorithms lack a systematic subjective assessment, which undermines their validity, not
to say their usefulness. Thus, the central question behind the research work of this Ph.D. thesis
is whether recoloring algorithms are actually useful and
helpful for color-blind people.
With this in mind, we conceived a few preliminary recoloring algorithms that were published in
conference proceedings. Except for the algorithm detailed in Chapter 3, these conference
algorithms are not described in this thesis, though they were important in developing
those presented here. The first algorithm (Chapter 3) was designed and implemented for people
with dichromacy to improve their color perception. The idea is to project the reddish hues onto
other hues that are perceived more regularly by dichromat people.
The second algorithm (Chapter 4) is also intended for people with dichromacy to improve their
perception of color, but its applicability covers the adaptation of text and images in HTML5-compliant
web environments. This enhancement of the color contrast of text and images in web
pages is done while keeping the naturalness of color as much as possible. Also, to the best of our
knowledge, this is the first web recoloring approach targeted at dichromat people that takes
into consideration both text and image recoloring in an integrated manner.
The third algorithm (Chapter 5) primarily focuses on the enhancement of some of the object
contours in still images, instead of recoloring the pixels of the regions bounded by such contours.
Enhancing contours is particularly suited to increasing contrast in images that contain adjacent
regions whose colors are indistinguishable from a dichromat’s point of view. To the best of our knowledge,
this is one of the first algorithms that take advantage of image analysis and processing techniques
for region contours.
After rigorous subjective assessment studies with color-blind people, we concluded that the CVD
adaptation methods are useful in general. Nevertheless, no single method is effective enough to
adapt all sorts of images; that is, the adequacy of each method depends on the type of image
(photo-images, graphical representations, etc.).
Furthermore, we noted that the experience-based perceptual learning of color-blind people
throughout their lives determines their visual perception. That is, color adaptation algorithms
must satisfy requirements such as color naturalness and consistency, to ensure that dichromat
people improve their visual perception without artifacts. On the other hand, CVD adaptation
algorithms should be object-oriented, instead of pixel-oriented (as typically done), to select
judiciously the pixels that should be adapted. This perspective opens a window of opportunity for
future research in color accessibility in the field of human-computer interaction (HCI).
Whole Word Phonetic Displays for Speech Articulation Training
The main objective of this dissertation is to investigate and develop speech recognition technologies for speech training for people with hearing impairments. During the course of this work, a computer-aided speech training system for articulation speech training was also designed and implemented. The speech training system places emphasis on displays to improve children's pronunciation of isolated Consonant-Vowel-Consonant (CVC) words, with displays at both the phonetic level and whole word level. This dissertation presents two hybrid methods for combining Hidden Markov Models (HMMs) and Neural Networks (NNs) for speech recognition. The first method uses NN outputs as posterior probability estimators for HMMs. The second method uses NNs to transform the original speech features to normalized features with reduced correlation. Based on experimental testing, both of the hybrid methods give higher accuracy than standard HMM methods. The second method, using the NN to create normalized features, outperforms the first method in terms of accuracy. Several graphical displays were developed to provide real-time visual feedback to users, to help them improve and correct their pronunciation.
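As a rough illustration of the first hybrid method, a common way to use NN posteriors inside an HMM is to divide them by the state priors, which gives likelihoods up to a state-independent factor. The abstract does not say whether the dissertation does exactly this, so the sketch below is an assumption; the function name, the `eps` guard, and the example numbers are hypothetical.

```python
import numpy as np

def scaled_log_likelihoods(posteriors, priors, eps=1e-10):
    """Convert NN state posteriors P(q|x) into scaled log-likelihoods.

    By Bayes' rule p(x|q) is proportional to P(q|x) / P(q), so the result
    can stand in for HMM emission scores during decoding.
    posteriors: (frames, states) array, rows sum to 1.
    priors:     (states,) array of state frequencies from training data.
    """
    return np.log(posteriors + eps) - np.log(priors + eps)

# Two frames, two HMM states (made-up numbers for illustration).
post = np.array([[0.7, 0.3],
                 [0.2, 0.8]])
prior = np.array([0.5, 0.5])
emission_scores = scaled_log_likelihoods(post, prior)
```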
Computer mediated colour fidelity and communication
Developments in technology have meant that computer-controlled
imaging devices are becoming more powerful and more
affordable. Despite their increasing prevalence, computer-aided
design and desktop publishing software has failed to keep pace, leading
to disappointing colour reproduction across different devices.
Although there has been a recent drive to incorporate colour management
functionality into modern computer systems, in general this
is limited in scope and fails to properly consider the way in which
colours are perceived. Furthermore, differences in viewing conditions
or representation severely impede the communication of colour
between groups of users.
The approach proposed here is to provide WYSIWYG colour
across a range of imaging devices through a combination of existing
device characterisation and colour appearance modeling techniques.
In addition, to further facilitate colour communication, various common
colour notation systems are defined by a series of mathematical
mappings. This enables both the implementation of computer-based
colour atlases (which have a number of practical advantages over
physical specifiers) and also the interrelation of colour represented in
hitherto incompatible notations.
Together with the proposed solution, details are given of a computer
system which has been implemented. The system was used by
textile designers for a real task. Prior to undertaking this work,
designers were interviewed in order to ascertain where colour played
an important role in their work and where it was found to be a problem.
A summary of the findings of these interviews, together with a
survey of existing approaches to the problems of colour fidelity and
communication in colour computer systems, is also given. As background
to this work, the topics of colour science and colour imaging
are introduced.
Introduction to Facial Micro Expressions Analysis Using Color and Depth Images: A Matlab Coding Approach (Second Edition, 2023)
This book offers a gentle introduction to the field of Facial
Micro Expressions Recognition (FMER) using Color and Depth images, with the aid
of the MATLAB programming environment. FMER is a subset of image processing and
a multidisciplinary topic of analysis, so it requires familiarity with
other topics of Artificial Intelligence (AI) such as machine learning, digital
image processing, psychology and more. It is therefore a great opportunity to write a
book which covers all of these topics for beginner to professional readers in
the field of AI, and even for readers without a background in AI. Our goal is to
provide a standalone introduction to the field of FMER analysis in the form of
theoretical descriptions for readers with no background in image processing, with
reproducible MATLAB practical examples. We also describe the basic definitions
for FMER analysis and the MATLAB libraries used in the text, which helps
the reader to apply the experiments in real-world applications. We
believe that this book is suitable for students, researchers, and professionals
alike, who need to develop practical skills, along with a basic understanding
of the field. We expect that, after reading this book, the reader will feel
comfortable with different key stages such as color and depth image processing,
color and depth image representation, classification, machine learning, facial
micro-expressions recognition, feature extraction and dimensionality reduction.
Comment: This is the second edition of the boo
Investigations into colour constancy by bridging human and computer colour vision
PhD Thesis. The mechanism of colour constancy within the human visual system has long been of great interest to researchers within the psychophysical and image processing communities. With the maturation of colour imaging techniques for both scientific and artistic applications, the importance of colour capture accuracy has consistently increased. Colour offers a great deal more information for the viewer than grayscale imagery, supporting tasks that range from object detection to food ripeness and health estimation, amongst many others.
However, these tasks rely upon the colour constancy process to discount scene illumination. Psychophysical studies have attempted to uncover the inner workings of this mechanism, which would allow it to be reproduced algorithmically. This would allow the development of devices which can eventually capture and perceive colour in the same manner as a human viewer.
These two communities have approached this challenge from opposite ends, and as such have taken very different and largely unconnected approaches. This thesis investigates the development of studies and algorithms which bridge the two communities. Findings from psychophysical studies are first used as inspiration to improve an existing image enhancement algorithm, and the results are compared to state-of-the-art methods. Further knowledge of, and inspiration from, the human visual system is then used to develop a novel colour constancy approach. This approach attempts to mimic the mechanism of colour constancy by investigating the use of a physiological colour space and specific scene contents to estimate illumination. The performance of the colour constancy mechanism within the visual system is then also investigated, tested across different scenes and under commonly and uncommonly encountered illuminations.
The importance of being able to bridge these two communities with a successful colour constancy method is then further illustrated with a case study investigating the human visual perception of agricultural produce, namely tomatoes.
EPSRC DTA:
Institute of Neuroscience, Newcastle University
Revisiting and evaluating colour constancy and colour stabilisation algorithms
When we capture a scene with a digital camera, the sensor generates a digital response, which is the Raw image. This response depends on the ambient light, the object reflectance and the sensitivity of the camera. The generated image is processed by the camera pipeline, a series of operations that process the colours of the image to make it more pleasant for the user. Further colour processing can also be performed on the pipeline output image. That said, processing the colours is important not only for aesthetic reasons, but also for various computer vision tasks where a faithful reproduction of the scene colours is needed, e.g. for object recognition and tracking. In this thesis, we focus on two important colour processing operations: colour constancy and colour stabilisation.
Colour constancy is the ability of a visual system to see an object with the same colour independently of the light colour; the camera processes the image so that the scene looks as if it were captured under a canonical light, usually a white light. This means that when we take two images of, let’s say, a green apple in sunlight and indoors under a tungsten light, we want the apple to appear green in both cases. To do that, one important step of the pipeline is to estimate the light colour in the scene and then discount it from the image.
In this thesis we first focus on the illuminant estimation problem, in particular on the performance evaluation of illuminant estimation algorithms on the benchmark ColorChecker dataset. More precisely, we show the importance of the accuracy of the ground-truth illuminants when evaluating algorithms and comparing them.
The following part of the thesis is about chromagenic illuminant estimation which is based on using two images of the scene: one filtered and one unfiltered where the two images need to be registered. We revisit the preprocessing step (colour correction) of the chromagenic method and we introduce the use of the Monge-Kantorovitch transform (MKT) that removes the need for the expensive registration task. We also introduce two new datasets of chromagenic images for the evaluation of illuminant estimation methods.
The last part of the thesis is about colour stabilisation which is particularly important in video processing, where consistency of colours is required across image frames. When the camera moves or when the shooting parameters change, the same object in the scene can appear with different colours in two consecutive frames. To solve for colour stabilisation given a pair of images of the same scene we need to process the first image to match the second. We propose using MKT to find the mapping. Our novel method gives competitive results compared to other recent methods while being less computationally expensive
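One widely used closed-form version of the Monge-Kantorovitch transform is the linear map that matches the mean and covariance of one colour distribution to another; because it depends only on these statistics, no pixel-wise registration is needed. The sketch below shows that linear form as an assumption, since the abstract does not specify which formulation the thesis uses. The function names, the `eps` regulariser, and the synthetic pixel data are hypothetical.

```python
import numpy as np

def sqrtm_psd(M):
    """Symmetric square root of a positive semi-definite matrix via eigh."""
    vals, vecs = np.linalg.eigh(M)
    vals = np.clip(vals, 0.0, None)
    return (vecs * np.sqrt(vals)) @ vecs.T

def mkt_map(src, tgt, eps=1e-8):
    """Linear MK map between two (N, 3) pixel sets.

    T = S^-1/2 (S^1/2 C S^1/2)^1/2 S^-1/2, with S and C the source and
    target covariances, so that T S T = C.
    """
    mu_s, mu_t = src.mean(axis=0), tgt.mean(axis=0)
    cov_s = np.cov(src, rowvar=False) + eps * np.eye(src.shape[1])
    cov_t = np.cov(tgt, rowvar=False) + eps * np.eye(tgt.shape[1])
    s_half = sqrtm_psd(cov_s)
    s_inv_half = np.linalg.inv(s_half)
    T = s_inv_half @ sqrtm_psd(s_half @ cov_t @ s_half) @ s_inv_half
    return T, mu_s, mu_t

def apply_mkt(pixels, T, mu_s, mu_t):
    """Map source pixels so their mean and covariance match the target."""
    return (pixels - mu_s) @ T.T + mu_t

# Synthetic frames: the second is a colour-shifted version of the first.
rng = np.random.default_rng(1)
frame_a = rng.random((500, 3))
A = np.array([[1.2, 0.1, 0.0],
              [0.0, 0.9, 0.1],
              [0.1, 0.0, 1.1]])
frame_b = frame_a @ A.T + np.array([0.05, -0.02, 0.03])

T, mu_a, mu_b = mkt_map(frame_a, frame_b)
stabilised = apply_mkt(frame_a, T, mu_a, mu_b)
```

The whole computation reduces to a few 3x3 eigendecompositions, which fits the abstract's remark that the approach is computationally inexpensive.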