Two Decades of Colorization and Decolorization for Images and Videos
Colorization is a computer-aided process that aims to give color to a gray
image or video. It can be used to enhance black-and-white images, including
black-and-white photos, old-fashioned films, and scientific imaging results.
Conversely, decolorization converts a color image or video into a
grayscale one. A grayscale image or video carries only
brightness information, without color information. It is the basis of some
downstream image processing applications such as pattern recognition, image
segmentation, and image enhancement. Unlike image decolorization, video
decolorization must not only preserve image contrast in each
video frame, but also respect the temporal and spatial consistency between
video frames. Researchers have worked to develop decolorization methods that
balance spatial-temporal consistency and algorithm efficiency. With the
prevalence of digital cameras and mobile phones, image and video
colorization and decolorization have received increasing attention from
researchers. This paper gives an overview of the progress of image and video
colorization and decolorization methods in the last two decades. Comment: 12 pages, 19 figures
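As a point of reference for the abstract above, the simplest form of decolorization is a fixed weighted sum of the color channels. The sketch below is not taken from the survey; it assumes gamma-encoded RGB values in [0, 1] and uses the Rec. 601 luma weights, and the names `luma` and `decolorize` are illustrative. It also shows why contrast preservation is a concern: two visibly different colors can map to the same gray level.

```python
# Minimal sketch of luma-based decolorization (an assumption, not the
# survey's method). Pixels are (r, g, b) tuples with values in [0, 1].

REC601_WEIGHTS = (0.299, 0.587, 0.114)  # Rec. 601 luma coefficients


def luma(pixel):
    """Return the weighted sum of the R, G, B components of one pixel."""
    r, g, b = pixel
    wr, wg, wb = REC601_WEIGHTS
    return wr * r + wg * g + wb * b


def decolorize(image):
    """Convert an image (list of rows of RGB tuples) to rows of gray values."""
    return [[luma(pixel) for pixel in row] for row in image]


if __name__ == "__main__":
    red = (1.0, 0.0, 0.0)
    # A green chosen so that its luma equals that of pure red.
    green = (0.0, 0.299 / 0.587, 0.0)
    gray = decolorize([[red, green]])
    # Both pixels end up at (almost exactly) the same gray level, so the
    # red/green contrast visible in the color image is lost.
    print(gray)
```

A fixed weighting like this is fast and trivially consistent across video frames, but it cannot preserve contrast between colors of equal luma, which is the kind of gap that contrast-preserving decolorization methods aim to close.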
Learning based image transformation using convolutional neural networks
We have developed a learning-based image transformation framework and successfully applied it to three common image transformation operations: downscaling, decolorization, and high dynamic range image tone mapping. We use a convolutional neural network (CNN) as a non-linear mapping function to transform an input image to a desired output. A separate CNN trained for a very large image classification task is used as a feature extractor to construct the training loss function of the image transformation CNN. Unlike similar applications in the related literature such as image super-resolution, none of the problems addressed in this paper have a known ground truth or target. For each problem, we reason about a suitable learning objective function and develop an effective solution. This is the first work that uses deep learning to solve and unify these three common image processing tasks. We present experimental results to demonstrate the effectiveness of the new technique and its state-of-the-art performance.
wEscore: quality assessment method of multichannel image visualization with regard to angular resolution
This work considers the problem of quality assessment of multichannel image visualization methods. One approach to such an assessment, the Escore quality measure, is studied. This measure, initially proposed for the evaluation of decolorization methods, can be generalized to the assessment of hyperspectral image visualization methods. It is shown that Escore does not account for the loss of local contrast at the supra-pixel scale. Human sensitivity to this loss depends on the observation conditions, so we propose a modified measure, wEscore, whose parameters allow the local contrast scale to be adjusted based on the angular resolution of the images. We also describe the adjustment of wEscore parameters for the evaluation of known decolorization algorithms applied to the images from the COLOR250 and the Cadik datasets under given observation conditions. When the results of these algorithms were ranked and the ranking was compared with one based on human perception, wEscore turned out to be more accurate than Escore. This work was supported by Russian Science Foundation (Project No. 20-61-47089).
Color2Hatch: conversion of color to hatching for low-cost printing
In this paper, we propose Color2Hatch, a decolorization method for business/presentation graphics. In Color2Hatch, each region represented as a closed path and uniformly colored in scalable vector graphics (SVG) is converted to a region hatched in black and white. Given the characteristics of business graphics, the hatching patterns are designed to represent mainly the hue of the region; lightness and saturation can also be reflected. To discriminate subtle differences between colors, attached short line segments, zigzag lines, and wave lines are used in hatching by analogy to a clock. Compared with existing decolorization methods such as grayscale conversion and texturing, our method is superior in the discrimination of regions and is suitable for low-cost black-and-white printing that meets real-world needs.
StyleDiffusion: Controllable Disentangled Style Transfer via Diffusion Models
Content and style (C-S) disentanglement is a fundamental problem and critical
challenge of style transfer. Existing approaches based on explicit definitions
(e.g., Gram matrix) or implicit learning (e.g., GANs) are neither interpretable
nor easy to control, resulting in entangled representations and less satisfying
results. In this paper, we propose a new C-S disentangled framework for style
transfer without using previous assumptions. The key insight is to explicitly
extract the content information and implicitly learn the complementary style
information, yielding interpretable and controllable C-S disentanglement and
style transfer. A simple yet effective CLIP-based style disentanglement loss
coordinated with a style reconstruction prior is introduced to disentangle C-S
in the CLIP image space. By further leveraging the powerful style removal and
generative ability of diffusion models, our framework achieves results superior
to the state of the art, along with flexible C-S disentanglement and trade-off control.
Our work provides new insights into C-S disentanglement in style transfer
and demonstrates the potential of diffusion models for learning
well-disentangled C-S characteristics. Comment: Accepted by ICCV 202
A Cookbook of Self-Supervised Learning
Self-supervised learning, dubbed the dark matter of intelligence, is a
promising path to advance machine learning. Yet, much like cooking, training
SSL methods is a delicate art with a high barrier to entry. While many
components are familiar, successfully training a SSL method involves a dizzying
set of choices from the pretext tasks to training hyper-parameters. Our goal is
to lower the barrier to entry into SSL research by laying the foundations and
latest SSL recipes in the style of a cookbook. We hope to empower the curious
researcher to navigate the terrain of methods, understand the role of the
various knobs, and gain the know-how required to explore how delicious SSL can
be.
Adaptive Methods for Color Vision Impaired Users
Color plays a key role in the understanding of information in computer environments.
About 5% of the world population is affected by color vision deficiency (CVD),
also called color blindness. This visual impairment hampers color perception and thereby
limits the overall perception that people with CVD have of their surrounding environment,
whether real or virtual. In fact, an individual with CVD may not distinguish between two different
colors, which often causes confusion or a biased understanding of reality, including web
environments, whose pages are full of media elements such as text, still images, video,
sprites, and so on.
Aware of the difficulties that color-blind people may face in interpreting colored content,
researchers have proposed a significant number of recoloring algorithms in the literature with the
purpose of improving the visual perception of those people. However, most of those
algorithms lack a systematic study of subjective assessment, which undermines their validity, not
to mention their usefulness. Thus, the central question that the research work behind this Ph.D. thesis
seeks to answer is whether recoloring algorithms are actually useful and
helpful for color-blind people.
With this in mind, we conceived a few preliminary recoloring algorithms that were published in
conference proceedings. Except for the algorithm detailed in Chapter 3, these conference
algorithms are not described in this thesis, though they were important in shaping
those presented here. The first algorithm (Chapter 3) was designed and implemented for people
with dichromacy to improve their color perception. The idea is to project the reddish hues onto
other hues that people with dichromacy perceive better.
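To make the idea of projecting reddish hues onto other hues concrete, here is a minimal sketch of a hue remapping in HSV space. It is not the algorithm of Chapter 3, which the abstract does not specify; the function name `remap_reddish_hue`, the width of the "reddish" band, and the bluish target range are all assumptions chosen only for illustration.

```python
import colorsys


def remap_reddish_hue(rgb, red_halfwidth=30.0, target_start=200.0, target_end=260.0):
    """Map hues near red onto a target hue range; leave other colors unchanged.

    rgb is an (r, g, b) tuple with values in [0, 1]. All default parameters
    are illustrative assumptions, not values from the thesis.
    """
    r, g, b = rgb
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    hue_deg = h * 360.0
    # Signed angular distance from pure red (0 degrees), in [-180, 180).
    offset = ((hue_deg + 180.0) % 360.0) - 180.0
    # Achromatic pixels have no meaningful hue; non-reddish hues are kept.
    if s == 0.0 or abs(offset) > red_halfwidth:
        return rgb
    # Linearly map the reddish band onto the target band, keeping the
    # original saturation and value so that lightness is unchanged.
    t = (offset + red_halfwidth) / (2.0 * red_halfwidth)
    new_hue = target_start + t * (target_end - target_start)
    return colorsys.hsv_to_rgb(new_hue / 360.0, s, v)


if __name__ == "__main__":
    print(remap_reddish_hue((1.0, 0.0, 0.0)))  # red moves into the blue range
    print(remap_reddish_hue((0.0, 1.0, 0.0)))  # green is returned unchanged
```

A per-pixel remapping like this ignores image content entirely, so it gives no guarantee of color naturalness or of consistency across regions.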
The second algorithm (Chapter 4) is also intended to improve the color perception of people with dichromacy,
but its applicability covers the adaptation of text and images in HTML5-compliant
web environments. It enhances the color contrast of text and images in web
pages while keeping the naturalness of color as much as possible. To the best of our
knowledge, this is the first web recoloring approach targeted at people with dichromacy that takes
into consideration both text and image recoloring in an integrated manner.
The third algorithm (Chapter 5) focuses primarily on enhancing some of the object
contours in still images, instead of recoloring the pixels of the regions bounded by such contours.
Enhancing contours is particularly suited to increasing contrast in images that contain adjacent
regions whose colors are indistinguishable from a dichromat’s point of view. To the best of our knowledge,
this is one of the first algorithms that take advantage of image analysis and processing techniques
for region contours.
After rigorous subjective assessment studies with color-blind people, we concluded that the CVD
adaptation methods are useful in general. Nevertheless, no single method is effective enough to
adapt all sorts of images; that is, the adequacy of each method depends on the type of image
(photographs, graphical representations, etc.).
Furthermore, we noted that the experience-based perceptual learning of color-blind people
throughout their lives determines their visual perception. That is, color adaptation algorithms
must satisfy requirements such as color naturalness and consistency to ensure that people with
dichromacy improve their visual perception without artifacts. In addition, CVD adaptation
algorithms should be object-oriented instead of pixel-oriented (as is typically done), so that
the pixels to be adapted are selected judiciously. This perspective opens a window of opportunity for
future research in color accessibility in the field of human-computer interaction (HCI).
Color plays a fundamental role in the understanding of information in computer environments.
However, about 5% of the world population is affected by color vision
deficiency (CVD), commonly known as color blindness. This
visual impairment hinders color perception, which limits the overall perception that individuals
have of their environment, whether real or virtual. Indeed, an individual with CVD sees different
colors as the same, which causes confusion or a distorted understanding of reality,
as well as of web environments, where colored media content abounds,
such as text, still images, and video, among others.
To mitigate the difficulties that people with CVD face in interpreting
colored content, a significant number of recoloring algorithms have been proposed in the literature
with the aim of improving, in some way, the visual perception of people
with CVD. However, most of these works lack a systematic study of subjective
assessment, which calls into question their validation, if not their usefulness. Thus, the main
question to be answered as a result of the research work underlying
this doctoral thesis is whether or not recoloring algorithms are of real use,
thereby providing effective help to people with color blindness.
With this question in mind, we conceived some preliminary recoloring algorithms that
were published in conference proceedings. With the exception of the algorithm described in Chapter 3,
those algorithms are not described in this thesis, notwithstanding their importance in the design
of those described in this dissertation. The first algorithm (Chapter 3) was designed and implemented
for people with dichromacy, in order to improve their color perception. The idea is
to project reddish hues onto hues that are better perceived by people
with the types of color blindness in question.
The second algorithm (Chapter 4) is also intended to improve color perception for
people with dichromacy, but its applicability covers the adaptation of text and images
in HTML5-compliant web environments. This is achieved by enhancing the color
contrast in text blocks and images in web pages while maintaining the naturalness of
color as much as possible. Moreover, as far as we know, this is the first
web recoloring approach for people with dichromacy that treats text and images in
an integrated manner.
The third algorithm (Chapter 5) focuses mainly on enhancing some of the contours
of objects in images, instead of recoloring the pixels of the regions bounded by
those contours. This approach is particularly suitable for increasing contrast in
images when there are adjacent regions whose colors are indistinguishable from the perspective of
observers with dichromacy. In this case too, to the best of our knowledge,
this is one of the first algorithms to use techniques for the analysis and processing of
region contours.
After rigorous subjective assessment studies with color-blind people, we concluded that the
CVD adaptation methods are useful in general. However, no single method is sufficiently
effective for all types of images; that is, the performance of each method depends on the type of image (photographs, graphical representations, etc.).
In addition, we noted that the experience-based perceptual learning of color-blind people
throughout their lives is decisive for how they perceive what they see. This means that
color adaptation algorithms must satisfy requirements such as color naturalness and
consistency, so as not to call into question what their intended users consider reasonable
to see in the real world. Furthermore, the approach followed in CVD adaptation should be object-oriented
rather than pixel-oriented (as has been done so far), in order
to allow a more judicious selection of the pixels to be subjected to the adaptation
process. This perspective opens a window of opportunity for future research in
color accessibility in the field of human-computer interaction (HCI).