Visual quality assessment for super-resolved images: database and method
Image super-resolution (SR) has been an active research problem that has recently received renewed interest due to the introduction of new technologies such as deep learning. However, the lack of suitable criteria for evaluating SR performance has hindered technology development. In this paper, we fill a gap in the literature by providing the first publicly available database as well as a new image quality assessment (IQA) method specifically designed for assessing the visual quality of super-resolved images (SRIs). In constructing the Quality Assessment Database for SRIs (QADS), we carefully selected 20 reference images and created 980 SRIs using 21 image SR methods. Mean opinion scores (MOS) for these SRIs were collected from 100 individuals participating in a suitably designed psychovisual experiment. Extensive numerical and statistical analysis is performed to show that the MOS of QADS has excellent suitability and reliability. The psychovisual experiment has led to the discovery that, unlike distortions encountered in other IQA databases, artifacts of the SRIs degenerate the image structure as well as the image texture. Moreover, the structural and textural degenerations have distinctive perceptual properties. Based on these insights, we propose a novel method to assess the visual quality of SRIs by separately considering the structural and textural components of images. Observing that textural degenerations are mainly attributed to dissimilar texture or checkerboard artifacts, we propose to measure the changes in textural distributions. We also observe that structural degenerations appear as blurring and jaggies artifacts in SRIs, and we develop separate similarity measures for the different types of structural degeneration. A new pooling mechanism is then used to fuse the different similarities into the final quality score for an SRI. Experiments conducted on the QADS demonstrate that our method significantly outperforms classical as well as current state-of-the-art IQA methods.
Deep Learning frameworks for Image Quality Assessment
Technology is advancing with the arrival of deep learning, which finds wide application in image processing as well. Deep learning by itself can be sufficient to outperform statistical methods. In this research work, I implemented image quality assessment techniques using deep learning. I propose two full-reference image quality assessment algorithms and two no-reference image quality assessment algorithms. Of the two algorithms in each category, one is supervised and the other is unsupervised.
The first proposed method is full-reference image quality assessment using an autoencoder. The existing literature shows that the statistical features of pristine images are altered in the presence of distortion. It is more advantageous if the algorithm itself learns the distortion-discriminating features. Because long feature vectors increase complexity, an autoencoder is trained on a large number of pristine images; it yields a compact, lower-dimensional representation of the input. It is shown that the encoded distance features have good distortion discrimination properties. The proposed algorithm delivers competitive performance on standard databases.
If both the reference and the distorted images are given to a model that learns by itself and outputs the scores, the burden of extracting features and post-processing is reduced. However, the model must be capable of discriminating the features by itself. The second method I propose is full-reference and no-reference image quality assessment using deep convolutional neural networks. A network is trained in a supervised manner with subjective scores as targets. The algorithm performs efficiently for the distortions that are learned while training the model.
The last proposed method is a classification-based no-reference image quality assessment. The distortion level in an image may vary from one region to another: distortion may not be visible in some parts yet be present in others. A classification model is able to tell whether a given input patch is of low quality or high quality. It is shown that the aggregate of the patch quality scores has a high correlation with the subjective scores.
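The patch-based idea in the last paragraph can be sketched as follows. This is a minimal illustration, not the author's implementation: the abstract does not say how patch scores are aggregated, so a plain average is assumed here, and `patch_classifier` is a hypothetical callable standing in for the trained classification model (it is expected to return the probability that a patch is of high quality).

```python
def split_into_patches(image, size):
    """Yield non-overlapping size x size patches from a 2-D list of pixel values.

    Rows and columns that do not fill a whole patch at the border are dropped.
    """
    rows, cols = len(image), len(image[0])
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            yield [row[c:c + size] for row in image[r:r + size]]


def image_quality(image, size, patch_classifier):
    """Average the per-patch 'high quality' probabilities into one image score.

    Mean pooling is an assumption; the source only states that patch scores
    are aggregated.
    """
    scores = [patch_classifier(patch) for patch in split_into_patches(image, size)]
    if not scores:
        raise ValueError("image is smaller than one patch")
    return sum(scores) / len(scores)


def toy_classifier(patch):
    """Hypothetical stand-in for a trained model: bright patches count as high quality."""
    pixels = [value for row in patch for value in row]
    return 1.0 if sum(pixels) / len(pixels) >= 128 else 0.0
```

Because every patch is scored independently, a locally distorted region lowers the image score in proportion to the area it covers, which matches the motivation that distortion can vary across an image.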
Deep Image Prior Amplitude SAR Image Anonymization
This paper presents an extensive evaluation of the Deep Image Prior (DIP) technique for image inpainting on Synthetic Aperture Radar (SAR) images. SAR images are gaining popularity in various applications, but there may be a need to conceal certain regions of them. Image inpainting provides a solution for this. However, not all inpainting techniques are designed to work on SAR images. Some are intended for use on photographs, while others have to be specifically trained on top of a huge set of images. In this work, we evaluate the performance of the DIP technique that is capable of addressing these challenges: it can adapt to the image under analysis including SAR imagery; it does not require any training. Our results demonstrate that the DIP method achieves great performance in terms of objective and semantic metrics. This indicates that the DIP method is a promising approach for inpainting SAR images, and can provide high-quality results that meet the requirements of various applications
Beyond the pixels: learning and utilising video compression features for localisation of digital tampering.
Video compression is pervasive in digital society. With rising usage of deep convolutional neural networks (CNNs) in the fields of computer vision, video analysis and video tampering detection, it is important to investigate how patterns invisible to human eyes may be influencing modern computer vision techniques and how they can be used advantageously. This work thoroughly explores how video compression influences accuracy of CNNs and shows how optimal performance is achieved when compression levels in the training set closely match those of the test set. A novel method is then developed, using CNNs, to derive compression features directly from the pixels of video frames. It is then shown that these features can be readily used to detect inauthentic video content with good accuracy across multiple different video tampering techniques. Moreover, the ability to explain these features allows predictions to be made about their effectiveness against future tampering methods. The problem is motivated with a novel investigation into recent video manipulation methods, which shows that there is a consistent drive to produce convincing, photorealistic, manipulated or synthetic video. Humans, blind to the presence of video tampering, are also blind to the type of tampering. New detection techniques are required and, in order to compensate for human limitations, they should be broadly applicable to multiple tampering types. This thesis details the steps necessary to develop and evaluate such techniques
Blind Multimodal Quality Assessment of Low-light Images
Blind image quality assessment (BIQA) aims at automatically and accurately
forecasting objective scores for visual signals, which has been widely used to
monitor product and service quality in low-light applications, covering
smartphone photography, video surveillance, autonomous driving, etc. Recent
developments in this field are dominated by unimodal solutions inconsistent
with human subjective rating patterns, where human visual perception is
simultaneously reflected by multiple sensory information. In this article, we
present a unique blind multimodal quality assessment (BMQA) of low-light images
from subjective evaluation to objective score. To investigate the multimodal
mechanism, we first establish a multimodal low-light image quality (MLIQ)
database with authentic low-light distortions, containing image-text modality
pairs. Further, we specially design the key modules of BMQA, considering
multimodal quality representation, latent feature alignment and fusion, and
hybrid self-supervised and supervised learning. Extensive experiments show that
our BMQA yields state-of-the-art accuracy on the proposed MLIQ benchmark
database. In particular, we also build an independent single-image modality
Dark-4K database, which is used to verify its applicability and generalization
performance in mainstream unimodal applications. Qualitative and quantitative
results on Dark-4K show that BMQA achieves superior performance to existing
BIQA approaches as long as a pre-trained model is provided to generate text
description. The proposed framework and two databases, as well as the collected
BIQA methods and evaluation metrics, are made publicly available.
Automatic Increase of Image Memorability (Automatsko povećanje pamtljivosti slika)
The dissertation considers the problem of automatically increasing image memorability. The problem-solving approach is based on the editing-by-applying-filters paradigm. Given an arbitrary input image, the proposed deep learning model is able to automatically retrieve a set of “style seeds”, i.e., a set of style images which, applied to the input image through a neural style transfer algorithm, provide the highest increase in memorability. We show the effectiveness of the approach with experiments, performing both a quantitative evaluation and a user study.
Scene-Dependency of Spatial Image Quality Metrics
This thesis is concerned with the measurement of spatial imaging performance and the modelling of spatial image quality in digital capturing systems. Spatial imaging performance and image quality relate to the objective and subjective reproduction of luminance contrast signals by the system, respectively; they are critical to overall perceived image quality.
The Modulation Transfer Function (MTF) and Noise Power Spectrum (NPS) describe the signal (contrast) transfer and noise characteristics of a system, respectively, with respect to spatial frequency. They are both, strictly speaking, only applicable to linear systems since they are founded upon linear system theory. Many contemporary capture systems use adaptive image signal processing, such as denoising and sharpening, to optimise output image quality. These non-linear processes change their behaviour according to characteristics of the input signal (i.e. the scene being captured). This behaviour renders system performance “scene-dependent” and difficult to measure accurately. The MTF and NPS are traditionally measured from test charts containing suitable predefined signals (e.g. edges, sinusoidal exposures, noise or uniform luminance patches). These signals trigger adaptive processes at uncharacteristic levels since they are unrepresentative of natural scene content. Thus, for systems using adaptive processes, the resultant MTFs and NPSs are not representative of performance “in the field” (i.e. capturing real scenes).
Spatial image quality metrics for capturing systems aim to predict the relationship between MTF and NPS measurements and subjective ratings of image quality. They cascade both measures with contrast sensitivity functions that describe human visual sensitivity with respect to spatial frequency. The most recent metrics designed for adaptive systems use MTFs measured using the dead leaves test chart that is more representative of natural scene content than the abovementioned test charts. This marks a step toward modelling image quality with respect to real scene signals.
This thesis presents novel scene-and-process-dependent MTFs (SPD-MTFs) and NPSs (SPD-NPSs). They are measured from imaged pictorial scene (or dead leaves target) signals to account for system scene-dependency. Further, a number of spatial image quality metrics are revised to account for capture system and visual scene-dependency. Their MTF and NPS parameters were substituted for SPD-MTFs and SPD-NPSs. Likewise, their standard visual functions were substituted for contextual detection (cCSF) or discrimination (cVPF) functions. In addition, two novel spatial image quality metrics are presented (the log Noise Equivalent Quanta (NEQ) and Visual log NEQ) that implement SPD-MTFs and SPD-NPSs.
The metrics, SPD-MTFs and SPD-NPSs were validated by analysing measurements from simulated image capture pipelines that applied either linear or adaptive image signal processing. The SPD-NPS measures displayed little evidence of measurement error, and the metrics performed most accurately when they used SPD-NPSs measured from images of scenes. The benefit of deriving SPD-MTFs from images of scenes was traded off, however, against measurement bias. Most metrics performed most accurately with SPD-MTFs derived from dead leaves signals. Implementing the cCSF or cVPF did not increase metric accuracy.
The log NEQ and Visual log NEQ metrics proposed in this thesis were highly competitive, outperforming metrics of the same genre. They were also more consistent than the IEEE P1858 Camera Phone Image Quality (CPIQ) metric when their input parameters were modified. The advantages and limitations of all performance measures and metrics were discussed, as well as their practical implementation and relevant applications
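For orientation, the quantity underlying the log NEQ metric can be sketched from the textbook definition of Noise Equivalent Quanta, which combines the MTF and NPS at each spatial frequency. The code below is a minimal sketch under that assumption only: the abstract does not state how the thesis pools the values into a single score or how the Visual log NEQ applies visual weighting, so neither is attempted. The function names and the `mean_signal` parameter are illustrative.

```python
import math


def neq(mtf, nps, mean_signal=1.0):
    """Textbook NEQ per spatial frequency: (mean_signal^2 * MTF^2) / NPS.

    mtf and nps are equal-length sequences sampled at the same frequencies.
    """
    if len(mtf) != len(nps):
        raise ValueError("mtf and nps must be sampled at the same frequencies")
    return [(mean_signal ** 2) * m ** 2 / n for m, n in zip(mtf, nps)]


def log_neq(mtf, nps, mean_signal=1.0):
    """Base-10 logarithm of the NEQ at each spatial frequency."""
    return [math.log10(value) for value in neq(mtf, nps, mean_signal)]
```

In the thesis's framing, the `mtf` and `nps` inputs would be the scene-and-process-dependent measurements (SPD-MTF and SPD-NPS) rather than chart-based ones.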
Segmentation and Characterization of Small Retinal Vessels in Fundus Images Using the Tensor Voting Approach
SUMMARY
The retina provides an easy view of part of the human vascular network. It thus offers direct insight into the development and outcome of certain diseases related to the vascular network as a whole. Every complication visible on the retina can affect the patient's visual capacity. The smallest blood vessels are among the first anatomical structures affected as a disease progresses, so being able to analyse them is crucial. Changes in the state, appearance, morphology, functionality, or even growth of the small vessels indicate the severity of the diseases.
Diabetes is a metabolic disease that affects millions of people around the world. It affects the blood glucose level and causes pathological changes in various organs of the human body. Diabetic retinopathy describes the set of conditions and consequences of diabetes at the level of the retina. Small vessels play a role in the onset, development, and consequences of retinopathy. In the late stages of the disease, the growth of new small vessels, called neovascularization, poses a significant risk of blindness. It is therefore crucial to detect all the changes that take place in the small retinal vessels in order to characterize healthy and abnormal vessels. The characterization itself can facilitate the local detection of a specific retinopathy.
Automatic segmentation of anatomical structures such as the vascular network is a crucial step. This information can be provided to a physician to be considered during diagnosis. In computer-aided diagnosis systems, the role of small vessels is significant. Failing to detect them automatically can lead to an inflated false positive rate for red lesions in subsequent steps. Research efforts so far have concentrated on the accurate localization of medium-sized vessels. Existing models have much more difficulty extracting small blood vessels. They are not robust to the large variance in vessel appearance or to interference from the background. The models in the existing literature assume a general shape that is not sufficient to adapt to the narrow width and the curvature that characterize small blood vessels. In addition, the contrast with the background in small-vessel regions is very low. Segmentation or tracking methods produce fragmented or discontinuous results. Moreover, small-vessel segmentation is generally achieved at the expense of noise amplification. Deformable models are inadequate for segmenting small vessels. The forces used are not flexible enough to compensate for the low contrast, the width, and the variance of the vessels. Finally, machine learning approaches require training with a labelled database. Such databases are very difficult to obtain for small vessels.
This thesis extends previous research by providing a new method for segmenting small retinal vessels. Multi-scale line detection (MSLD) is a recent method that demonstrates good segmentation performance on retinal images, while tensor voting is a method proposed for reconnecting pixels. An approach combining a line detection algorithm with tensor voting is proposed. Line detectors have proven effective at segmenting medium-sized vessels. In addition, perceptual organization approaches such as tensor voting have demonstrated better robustness by combining neighbouring information in a hierarchical manner. Tensor voting is closer to human perception than other standard models. As demonstrated in this manuscript, it is a more powerful tool for segmenting small vessels than existing methods. This specific combination allows us to overcome the fragmentation problems that deformable-model methods experience with small vessels. We also propose applying an adaptive threshold to the response of the line detection algorithm for greater robustness to non-uniform images. We also illustrate how a combination of the two individual methods, at several scales, is able to reconnect vessels over variable distances. A vessel reconstruction algorithm is also proposed. This last step is necessary because complete geometric information is required in order to use the segmentation in a computer-aided diagnosis system.
The segmentation was validated on a database of high-resolution fundus images. This database contains images showing diabetic retinopathy. The segmentation is evaluated using standard discrepancy measures as well as perception-based measures. Considering only the small vessels in the database images, our method improves the sensitivity rate by 6.47% over the standard multi-scale line detection method. Using the perception-based measures, the improvement is 7.8%.
In the second part of the manuscript, we also propose a method for characterizing retinas as healthy or abnormal. Some images contain neovascularization. Characterizing vessels as healthy or abnormal is an essential step in developing a computer-aided diagnosis system. In addition to the challenges posed by healthy small vessels, neovessels exhibit an even higher degree of complexity. They form networks of vessels with complex and unusual morphology, often thin and strongly curved. Existing work is limited to the use of first-order features extracted from the segmented small vessels. Our contribution is to use tensor voting to isolate the vascular junctions and to use these junctions as points of interest. We then use a second-order spatial statistic computed on the junctions to characterize the vessels as healthy or pathological. Our method improves the characterization sensitivity by 9.09% over a state-of-the-art method.
The developed method proved effective for segmenting retinal vessels. Higher-order tensors, as well as an implementation of tensor voting via steerable filtering, could be studied to further reduce execution time and to resolve the challenges that remain at vascular junctions. In addition, the characterization could be improved for the detection of proliferative retinopathy by using supervised learning that includes cases of non-proliferative diabetic retinopathy or other pathologies. Finally, incorporating the proposed methods into computer-aided diagnosis systems could support regular screening for the early detection of retinopathies and other ocular pathologies, with the aim of reducing blindness in the population.
ABSTRACT
As an easily accessible site for the direct observation of the circulatory system, the human retina can offer unique insight into disease development or outcome. Retinal vessels are representative of the general condition of the whole systemic circulation, and thus can act as a "window" onto the status of the vascular network in the whole body. Each complication on the retina can have an adverse impact on the patient’s sight. In this respect, small vessels are highly relevant, as they are among the first anatomical structures affected as diseases progress. Moreover, changes in the small vessels’ state, appearance, morphology, functionality, or even growth indicate the severity of the diseases.
This thesis focuses on retinal lesions due to diabetes, a serious metabolic disease affecting millions of people around the world. This disorder disturbs natural blood glucose levels, causing various pathophysiological changes in different systems across the human body. Diabetic retinopathy is the medical term for the condition in which the fundus and the retinal vessels are affected by diabetes. As in other diseases, small vessels play a crucial role in the onset, development, and outcome of the retinopathy. More importantly, at the latest stage, the growth of new small vessels, or neovascularization, constitutes a significant risk factor for blindness. Therefore, there is a need to detect all the changes that occur in the small retinal vessels, with the aim of characterizing the vessels as healthy or abnormal. The characterization, in turn, can facilitate the local detection of a specific retinopathy, such as sight-threatening proliferative diabetic retinopathy.
Segmentation techniques can automatically isolate important anatomical structures such as the vessels and provide this information to the physician to assist in the final decision. In comprehensive systems for automated diabetic retinopathy (DR) detection, the role of small vessels is significant, as missing them early in a computer-aided diagnosis (CAD) pipeline might lead to an increase in the false positive rate of red lesions in subsequent steps. So far, efforts have concentrated mostly on the accurate localization of medium-range vessels. In contrast, existing models are weak in the case of small vessels. The generalization required to adapt an existing model does not allow the approaches to be flexible yet robust enough to compensate for the increased variability in appearance as well as the interference from the background. The current template models (matched filtering, line detection, and morphological processing) assume a general shape for the vessels that is not sufficient to approximate the narrow, curved characteristics of the small vessels. Additionally, due to the weak contrast in small-vessel regions, current segmentation and tracking methods produce fragmented or discontinuous results. Alternatively, small-vessel segmentation can be accomplished at the expense of background noise magnification, as with thresholding or image-derivative methods. Furthermore, the proposed deformable models are not able to propagate a contour to the full extent of the vasculature in order to enclose all the small vessels. The external forces of deformable models are ineffective at compensating for the low contrast, the small width, the high variability in small-vessel appearance, and the discontinuities. Internal forces, likewise, are not able to impose a global shape constraint on the contour that could approximate the variability in the appearance of the vasculature across different categories of vessels. Finally, machine learning approaches require training a classifier on a labelled set. Such sets are difficult to obtain, especially for the smallest vessels. In the case of unsupervised methods, the user has to predefine the number of clusters and perform an effective initialization of the cluster centers in order to converge to the global minimum.
This dissertation extends previous research and provides a new segmentation method for the smallest retinal vessels. Multi-scale line detection (MSLD) is a recent method that demonstrates good segmentation performance on retinal images, while tensor voting is a method first proposed for reconnecting pixels. For the first time, we combined line detection with the tensor voting framework. Line detectors have proven an effective way to segment medium-sized vessels. Additionally, perceptual organization approaches such as tensor voting demonstrate increased robustness by combining information from the neighborhood in a hierarchical way. Tensor voting is closer than standard models to the way human perception functions. As we show, it is a more powerful tool for segmenting small vessels than existing methods. This specific combination allows us to overcome the apparent fragmentation challenge of the template methods at the smallest vessels. Moreover, we thresholded the line detection response adaptively to compensate for non-uniform images. We also combined the two individual methods in a multi-scale scheme in order to reconnect vessels at variable distances. Finally, we reconstructed the vessels from their extracted centerlines based on pixel painting, as complete geometric information is required in order to use the segmentation in a CAD system.
The segmentation was validated on a high-resolution fundus image database that includes diabetic retinopathy images of varying stages, using standard discrepancy as well as perceptual-based measures. When only the smallest vessels are considered, the improvement in the sensitivity rate for the database over the standard multi-scale line detection method is 6.47%. For the perceptual-based measure, the improvement is 7.8% over the basic method.
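The sensitivity rate reported above is the fraction of true vessel pixels that a segmentation recovers. A minimal sketch of that standard measure follows, assuming binary masks stored as nested lists of 0/1 values; the function name is illustrative, and restricting the ground truth to the smallest vessels, as the thesis does, would be a separate masking step that the abstract does not detail.

```python
def sensitivity(predicted, ground_truth):
    """Sensitivity (recall) of a binary segmentation: TP / (TP + FN).

    Both arguments are 2-D lists of 0/1 values with the same shape, where 1
    marks a vessel pixel.
    """
    true_positives = false_negatives = 0
    for predicted_row, truth_row in zip(predicted, ground_truth):
        for p, g in zip(predicted_row, truth_row):
            if g:  # only ground-truth vessel pixels enter the measure
                if p:
                    true_positives += 1
                else:
                    false_negatives += 1
    if true_positives + false_negatives == 0:
        raise ValueError("ground truth contains no vessel pixels")
    return true_positives / (true_positives + false_negatives)
```

Sensitivity ignores false positives, which is why it is usually reported alongside other discrepancy measures rather than on its own.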
The second objective of the thesis was to implement a method for characterizing isolated retinal areas as healthy or abnormal. Some of the original images from which these patches are extracted contain neovascularization. Investigating image features for characterizing vessels as healthy or abnormal is an essential step toward developing a CAD system for automated DR screening. Given that the amount of data will increase significantly under CAD systems, a focus on this category of vessels can facilitate the referral of sight-threatening cases for early treatment. In addition to the challenges that small healthy vessels pose, neovessels demonstrate an even higher degree of complexity, as they form networks of convoluted, twisted, looped thin vessels. Existing work is limited to first-order characteristics extracted from the small segmented vessels, which limits the study of patterns. Our contribution is, first, to use the tensor voting framework to isolate the retinal vascular junctions and to use those junctions as points of interest. Second, we exploited second-order statistics computed on the spatial distribution of the junctions to characterize the vessels as healthy or as neovascularizations. The second-order spatial statistics extracted from the junction distribution are combined with widely used features to improve the characterization sensitivity by 9.09% over the state of the art.
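To make "second-order statistics on the junction spatial distribution" concrete, the sketch below computes Ripley's K function, a common second-order statistic for point patterns. This choice is an assumption: the abstract does not name the statistic the thesis uses. The estimator is the naive one without edge correction, and the function name and the `area` parameter (the area of the observed patch) are illustrative.

```python
import math


def ripley_k(points, radius, area):
    """Naive Ripley's K estimate (no edge correction) for 2-D points.

    Counts ordered pairs of distinct points lying within `radius` of each
    other and normalises by n * (n - 1) / area.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are required")
    close_pairs = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= radius:
                close_pairs += 1
    return area * close_pairs / (n * (n - 1))
```

For a completely random pattern, K(r) is close to pi * r^2, so values well above that indicate clustering at scale r. Dense tangles of junctions, as might be expected in neovascular networks, would show up as such an excess, though whether the thesis relies on this particular signal is not stated.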
The developed method proved effective for the segmentation of retinal vessels. Higher-order tensors, along with an implementation of tensor voting via steerable filtering, could be employed to further reduce the execution time and resolve the challenges at vascular junctions. Moreover, the characterization could be advanced to the detection of proliferative retinopathy by extending the supervised learning to include non-proliferative diabetic retinopathy cases or other pathologies. Ultimately, incorporating the methods into CAD systems could facilitate screening to effectively reduce rates of vision-threatening diabetic retinopathy, or support the early detection of pathologies other than ocular ones.