18 research outputs found

    Automatic example-based image colorization using location-aware cross-scale matching

    Given a reference colour image and a destination grayscale image, this paper presents a novel automatic colourisation algorithm that transfers colour information from the reference image to the destination image. Since the reference and destination images may contain content at different or even varying scales (due to changes of distance between objects and the camera), existing texture matching based methods can often perform poorly. We propose a novel cross-scale texture matching method to improve the robustness and quality of the colourisation results. Suitable matching scales are considered locally, which are then fused using global optimisation that minimises both the matching errors and spatial change of scales. The minimisation is efficiently solved using a multi-label graph-cut algorithm. Since only low-level texture features are used, texture matching based colourisation can still produce semantically incorrect results, such as meadow appearing above the sky. We consider a class of semantic violation where the statistics of up-down relationships learnt from the reference image are violated and propose an effective method to identify and correct unreasonable colourisation. Finally, a novel nonlocal ℓ1 optimisation framework is developed to propagate high confidence micro-scribbles to regions of lower confidence to produce a fully colourised image. Qualitative and quantitative evaluations show that our method outperforms several state-of-the-art methods

    Understanding and advancing PDE-based image compression

    This thesis is dedicated to image compression with partial differential equations (PDEs). PDE-based codecs store only a small number of image points and propagate their information into the unknown image areas during the decompression step. For certain classes of images, PDE-based compression can already outperform the current quasi-standard, JPEG2000. However, the reasons for this success are not yet fully understood, and PDE-based compression is still in a proof-of-concept stage. With a probabilistic justification for anisotropic diffusion, we contribute to a deeper insight into design principles for PDE-based codecs. Moreover, by analysing the interaction between efficient storage methods and image reconstruction with diffusion, we can rank PDEs according to their practical value in compression. Based on these observations, we advance PDE-based compression towards practical viability. First, we present a new hybrid codec that combines PDE-based and patch-based interpolation to deal with highly textured images. Furthermore, a new video player demonstrates the real-time capabilities of PDE-based image interpolation, and a new region-of-interest coding algorithm represents important image areas with high accuracy. Finally, we propose a new framework for diffusion-based image colourisation that we use to build an efficient codec for colour images. Experiments on real-world image databases show that our new method is qualitatively competitive with current state-of-the-art codecs.
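
    The store-and-propagate idea described in this abstract can be illustrated with a minimal sketch. It assumes the simplest possible model, homogeneous diffusion (the Laplace equation) solved by Gauss-Seidel sweeps, which is not the anisotropic diffusion the thesis studies. The function name `diffusion_inpaint` and its parameters are hypothetical and do not come from the thesis.

```python
def diffusion_inpaint(image, mask, iterations=500):
    """Fill unknown pixels by homogeneous diffusion (a simplifying assumption).

    image: 2D list of floats; values at unknown pixels are ignored.
    mask:  2D list of bools; True marks a stored (known) pixel.
    """
    h, w = len(image), len(image[0])
    known = [v for row, mrow in zip(image, mask) for v, m in zip(row, mrow) if m]
    if not known:
        raise ValueError("at least one stored pixel is required")
    mean = sum(known) / len(known)
    # Start unknown pixels at the mean of the stored pixels.
    u = [[image[y][x] if mask[y][x] else mean for x in range(w)] for y in range(h)]
    for _ in range(iterations):
        for y in range(h):
            for x in range(w):
                if mask[y][x]:
                    continue  # stored pixels act as fixed boundary data
                # Average the 4-neighbours that lie inside the image
                # (reflecting boundary conditions at the image border).
                neighbours = [
                    u[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < h and 0 <= nx < w
                ]
                u[y][x] = sum(neighbours) / len(neighbours)
    return u


# Usage: a single row where only the two end pixels were stored.
row = [[0.0, 0.0, 0.0, 0.0, 4.0]]
stored = [[True, False, False, False, True]]
reconstructed = diffusion_inpaint(row, stored)
```

    In this toy case the reconstruction converges to a linear ramp between the two stored values, which is the smooth interpolation that homogeneous diffusion produces in one dimension.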

    Learning visual representations of style

    Learning Visual Representations of Style, by Nanne van Noord. An artist's style is visible in their work: regardless of the form or subject of an artwork, art experts can recognise this style. Whether it is a landscape or a portrait, the connoisseurship of art experts enables them to recognise the artist's style. Translating this capacity for connoisseurship to a computer, so that the computer can recognise an artist's style and (re)produce artworks in that style, is central to this research. For the visual analysis of artworks, computers use image processing techniques. Traditionally, these techniques consist of algorithms, developed by computer scientists, that recognise predefined visual features. Because these features were developed for analysing the content of photographs, they are of limited use for analysing the style of visual art. Moreover, there is no definitive answer as to which visual features are indicative of style. To overcome these limitations, this research uses Deep Learning, a methodology that has revolutionised the field of image processing in recent years. The strength of Deep Learning stems from its self-learning capacity: rather than depending on predefined features, the computer can learn for itself what the right features are. In this research we developed algorithms with the aim of enabling the computer to 1) learn by itself to recognise the style of an artist, and 2) generate new images in the style of an artist.
Based on the work presented in the thesis, we can conclude that the computer is indeed able to learn to recognise an artist's style, even in a challenging setting with thousands of artworks and several hundred artists. We can also conclude that it is possible to generate new artworks in the artist's style on the basis of existing artworks. Specifically, a colourless image of an artwork can be coloured in the style of the artist, and when parts of an artwork are missing, the missing pieces can be filled in (retouched). Although we are not yet able to generate entirely new artworks, this research is a major step in that direction. Furthermore, the techniques and methods developed in this research are promising as digital tools to support art experts and restorers

    An evaluation of partial differential equations based digital inpainting algorithms

    Partial differential equations (PDEs) have been used to model various phenomena/tasks in different scientific and engineering endeavours. This thesis is devoted to modelling image inpainting by numerical implementations of certain PDEs. The main objectives of image inpainting include reconstructing damaged parts and filling in regions in which data/colour information is missing. Different automatic and semi-automatic approaches to image inpainting have been developed, including PDE-based, texture synthesis-based, exemplar-based, and hybrid approaches. Various challenges remain unresolved in reconstructing large missing regions and/or missing areas with highly textured surroundings. Our main aim is to address such challenges by developing new advanced schemes with particular focus on using PDEs of different orders to preserve continuity of textural and geometric information in the surroundings of missing regions. We first investigated the problem of partial colour restoration in an image region whose greyscale channel is intact. A known PDE-based solution is modelled as minimising the total variation of gradients in the different colour channels. We extend the applicability of this model to partial inpainting in other three-channel colour spaces (such as RGB, where information is missing in any two of the colours) by exploiting the known linear/affine relationships between different colour models in the derivation of a modified PDE solution obtained through Euler-Lagrange minimisation of the corresponding gradient Total Variation (TV). We also developed two TV models on the relations between greyscale and colour channels using the Laplacian operator and the directional derivatives of gradients. The corresponding Euler-Lagrange minimisation yields two new PDEs of different orders for partial colourisation. We implemented these solutions in both spatial and frequency domains. 
We measure the success of these models by evaluating known image quality measures in inpainted regions for sufficiently large datasets and scenarios. The results reveal that our schemes compare well with existing algorithms, but inpainting large regions remains a challenge. Secondly, we investigate the Total Inpainting (TI) problem, where all colour channels are missing in an image region. Reviewing and implementing existing PDE-based total inpainting methods reveals that high-order PDEs, applied to each colour channel separately, perform well but are influenced by the size of the region and the quantity of texture surrounding it. Here we developed a TI scheme that benefits from our partial inpainting approach and applies two PDE methods to recover the missing regions in the image. First, we extract the (Y, Cb, Cr) channels of the image outside the missing region, apply the above PDE methods to reconstruct the missing regions in the luminance channel (Y), and then use the colourisation method to recover the missing (Cb, Cr) colours in the region. We shall demonstrate that, compared to existing TI algorithms, our proposed method (using two PDE methods) performs well when tested on large datasets of natural and face images. Furthermore, this helps in understanding the impact of the texture in the surrounding areas on inpainting and opens new research directions. Thirdly, we investigate existing Exemplar-Based Inpainting (EBI) methods that do not use PDEs but simultaneously propagate the texture and structure into the missing region by finding similar patches within the rest of the image and copying them into the boundary of the missing region. The order of patch propagation is determined by a priority function, and the similarity is determined by matching criteria. We shall exploit recently emerging Topological Data Analysis (TDA) tools to create innovative EBI schemes, referred to as TEBI. 
TDA studies shapes of data/objects to quantify image texture in terms of connectivity and closeness properties of certain data landmarks. Such quantifications help determine the appropriate size of patch propagation and will be used to modify the patch propagation priority function using the geometrical properties of curvature of isophotes, and to improve the matching criteria of patches by calculating the correlation coefficients from the spatial, gradient and Laplacian domains. The performance of this TEBI method will be tested by applying it to natural dataset images, resulting in improved inpainting when compared with other EBI methods. Fourthly, the recent hybrid-based inpainting techniques are reviewed and a number of highly performing innovative hybrid techniques that combine the use of high order PDE methods with the TEBI method for the simultaneous rebuilding of the missing texture and structure regions in an image are proposed. Such a hybrid scheme first decomposes the image into texture and structure components, and then the missing regions in these components are recovered by TEBI and PDE based methods respectively. The performance of our hybrid schemes will be compared with two existing hybrid algorithms. Fifthly, we turn our attention to inpainting large missing regions, and develop an innovative inpainting scheme that uses the concept of seam carving to reduce this problem to that of inpainting a smaller size missing region that can be dealt with efficiently using the inpainting schemes developed above. Seam carving resizes images based on content-awareness of the image for both reduction and expansion without affecting those image regions that have rich information. The missing region of the seam-carved version will be recovered by the TEBI method, original image size is restored by adding the removed seams and the missing parts of the added seams are then repaired using a high order PDE inpainting scheme. 
The benefits of this approach in dealing with large missing regions are demonstrated. The extensive performance testing of the developed inpainting methods shows that these methods significantly outperform existing inpainting methods for such a challenging task. However, the performance is still not acceptable in recovering large missing regions in images with high texture and structure, and hence we shall identify remaining challenges to be investigated in the future. We shall also extend our work by investigating recently developed deep learning based image/video colourisation, with the aim of overcoming its limitations and shortcomings. Finally, we shall also describe our ongoing research into using TDA to detect the growing and serious “malicious” use of inpainting to create fake images/videos

    Rethinking auto-colourisation of natural Images in the context of deep learning

    Auto-colourisation is the ill-posed problem of creating a plausible full-colour image from a grey-scale prior. The current state of the art utilises image-to-image Generative Adversarial Networks (GANs). The standard method for training colourisation is reformulating RGB images into a luminance prior and two-channel chrominance supervisory signal. However, progress in auto-colourisation is inherently limited by multiple prerequisite dilemmas, where unsolved problems are mutual prerequisites. This thesis advances the field of colourisation on three fronts: architecture, measures, and data. Changes are recommended to common GAN colourisation architectures. Firstly, removing batch normalisation from the discriminator to allow the discriminator to learn the primary statistics of plausible colour images. Secondly, eliminating the direct L1 loss on the generator as L1 will limit the discovery of the plausible colour manifold. The lack of an objective measure of plausible colourisation necessitates resource-intensive human evaluation and repurposed objective measures from other fields. There is no consensus on the best objective measure due to a knowledge gap regarding how well objective measures model the mean human opinion of plausible colourisation. An extensible data set of human-evaluated colourisations, the Human Evaluated Colourisation Dataset (HECD) is presented. The results from this dataset are compared to the commonly-used objective measures and uncover a poor correlation between the objective measures and mean human opinion. The HECD can assess the future appropriateness of proposed objective measures. An interactive tool supplied with the HECD allows for a first exploration of the space of plausible colourisation. 
Finally, it will be shown that the luminance channel is not representative of the legacy black-and-white images that will be presented to models when deployed. This leads to out-of-distribution errors in all three channels of the final colour image. A novel technique is proposed to simulate priors that match any black-and-white media for which the spectral response is known
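
    The standard training setup mentioned in this abstract, reformulating an RGB image into a luminance prior and a two-channel chrominance signal, can be sketched as follows. The abstract does not name a colour space, so the BT.601 luma weights and the Y/Cb/Cr form used here are assumptions, and the function names are hypothetical.

```python
# BT.601 luma weights (an assumption; the abstract names no colour space).
KR, KG, KB = 0.299, 0.587, 0.114


def split_luma_chroma(rgb):
    """Split one RGB pixel (floats in [0, 1]) into luminance and two chroma values."""
    r, g, b = rgb
    y = KR * r + KG * g + KB * b       # luminance: the grey-scale prior
    cb = (b - y) / (2.0 * (1.0 - KB))  # blue-difference chroma
    cr = (r - y) / (2.0 * (1.0 - KR))  # red-difference chroma
    return y, cb, cr


def merge_luma_chroma(y, cb, cr):
    """Invert split_luma_chroma: rebuild RGB from luminance and chroma."""
    r = y + 2.0 * (1.0 - KR) * cr
    b = y + 2.0 * (1.0 - KB) * cb
    g = (y - KR * r - KB * b) / KG
    return r, g, b


# Usage: the model would see only y and be supervised on (cb, cr).
y, cb, cr = split_luma_chroma((0.2, 0.6, 0.9))
restored = merge_luma_chroma(y, cb, cr)
```

    A colourisation model trained this way receives `y` as input and is supervised on `(cb, cr)`. The abstract's final point is that a luminance channel computed like this may not match real legacy black-and-white media.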

    Article Segmentation in Digitised Newspapers

    Digitisation projects preserve and make available vast quantities of historical text. Among these, newspapers are an invaluable resource for the study of human culture and history. Article segmentation identifies each region in a digitised newspaper page that contains an article. Digital humanities, information retrieval (IR), and natural language processing (NLP) applications over digitised archives improve access to text and allow automatic information extraction. The lack of article segmentation impedes these applications. We contribute a thorough review of the existing approaches to article segmentation. Our analysis reveals divergent interpretations of the task, and inconsistent and often ambiguously defined evaluation metrics, making comparisons between systems challenging. We solve these issues by contributing a detailed task definition that examines the nuances and intricacies of article segmentation that are not immediately apparent. We provide practical guidelines on handling borderline cases and devise a new evaluation framework that allows insightful comparison of existing and future approaches. Our review also reveals that the lack of large datasets hinders meaningful evaluation and limits machine learning approaches. We solve these problems by contributing a distant supervision method for generating large datasets for article segmentation. We manually annotate a portion of our dataset and show that our method produces article segmentations over characters nearly as well as costly human annotators. We reimplement the seminal textual approach to article segmentation (Aiello and Pegoretti, 2006) and show that it does not generalise well when evaluated on a large dataset. We contribute a framework for textual article segmentation that divides the task into two distinct phases: block representation and clustering. 
We propose several techniques for block representation and contribute a novel highly-compressed semantic representation called similarity embeddings. We evaluate and compare different clustering techniques, and innovatively apply label propagation (Zhu and Ghahramani, 2002) to spread headline labels to similar blocks. Our similarity embeddings and label propagation approach substantially outperforms Aiello and Pegoretti but still falls short of human performance. Exploring visual approaches to article segmentation, we reimplement and analyse the state-of-the-art Bansal et al. (2014) approach. We contribute an innovative 2D Markov model approach that captures reading order dependencies and reduces the structured labelling problem to a Markov chain that we decode with Viterbi (1967). Our approach substantially outperforms Bansal et al., achieves accuracy as good as human annotators, and establishes a new state of the art in article segmentation. Our task definition, evaluation framework, and distant supervision dataset will encourage progress in the task of article segmentation. Our state-of-the-art textual and visual approaches will allow sophisticated IR and NLP applications over digitised newspaper archives, supporting research in the digital humanities
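
    The abstract mentions decoding a Markov chain with Viterbi. The following is a generic sketch of Viterbi decoding in log space, not the thesis's 2D Markov model. The states ("start" for a block that begins an article, "cont" for a block that continues one), the observations, and all probabilities are hypothetical toy values chosen for illustration.

```python
import math


def to_log(table):
    """Convert a (possibly nested) dict of probabilities to log-probabilities."""
    return {k: to_log(v) if isinstance(v, dict) else math.log(v)
            for k, v in table.items()}


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most probable state sequence for the observations."""
    # best[-1][s] is the best log-probability of any path ending in state s.
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t][s] is the predecessor of s on the best path
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy model (all values are illustrative assumptions).
states = ["start", "cont"]
log_start = to_log({"start": 0.9, "cont": 0.1})
log_trans = to_log({"start": {"start": 0.1, "cont": 0.9},
                    "cont": {"start": 0.3, "cont": 0.7}})
log_emit = to_log({"start": {"headline": 0.8, "body": 0.2},
                   "cont": {"headline": 0.1, "body": 0.9}})
blocks = ["headline", "body", "body", "headline", "body"]
labels = viterbi(blocks, states, log_start, log_trans, log_emit)
```

    With these toy values the decoded labels mark the two headline blocks as article starts and the body blocks as continuations.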

    Specialised global methods for binocular and trinocular stereo matching

    The problem of estimating depth from two or more images is a fundamental problem in computer vision, commonly referred to as stereo matching. The applications of stereo matching range from 3D reconstruction to autonomous robot navigation. Stereo matching is particularly attractive for real-life applications because of its simplicity and low cost, especially compared to costly laser range finders/scanners, as in the case of 3D reconstruction. However, stereo matching has its own problems, such as convergence issues in the optimisation methods and the difficulty of finding matches accurately under changes in lighting conditions, occluded areas, noisy images, etc. It is precisely because of these challenges that stereo matching continues to be a very active field of research. In this thesis we develop a binocular stereo matching algorithm that works with rectified images (i.e. scan lines in two images are aligned) to find a real-valued displacement (i.e. disparity) that best matches two pixels. To accomplish this, our research has developed techniques to efficiently explore a 3D space and compare potential matches, and an inference algorithm to assign the optimal disparity to each pixel in the image. The proposed approach is also extended to the trinocular case. In particular, the trinocular extension deals with a binocular set of images captured at the same time and a third image displaced in time. This approach is referred to as t+1 trinocular stereo matching, and poses the challenge of recovering camera motion, which is addressed by a novel technique we call baseline recovery. We have extensively validated our binocular and trinocular algorithms using the well-known KITTI and Middlebury data sets. The performance of our algorithms is consistent across different data sets, and they rank among the top performers on the KITTI and Middlebury data sets. 
The time-stamped results of our algorithms as reported in this thesis can be found at:
    • LCU on Middlebury V2 (https://web.archive.org/web/20150106200339/http://vision.middlebury.edu/stereo/eval/).
    • LCU on Middlebury V3 (https://web.archive.org/web/20150510133811/http://vision.middlebury.edu/stereo/eval3/).
    • LPU on Middlebury V3 (https://web.archive.org/web/20161210064827/http://vision.middlebury.edu/stereo/eval3/).
    • LPU on KITTI 2012 (https://web.archive.org/web/20161106202908/http://cvlibs.net/datasets/kitti/eval_stereo_flow.php?benchmark=stereo).
    • LPU on KITTI 2015 (https://web.archive.org/web/20161010184245/http://cvlibs.net/datasets/kitti/eval_scene_flow.php?benchmark=stereo).
    • TBR on KITTI 2012 (https://web.archive.org/web/20161230052942/http://cvlibs.net/datasets/kitti/eval_stereo_flow.php?benchmark=stereo)