
    Spectral methods for multimodal data analysis

    Spectral methods have proven themselves an important and versatile tool in a wide range of problems in the fields of computer graphics, machine learning, pattern recognition, and computer vision, where many important problems boil down to constructing a Laplacian operator and finding a few of its eigenvalues and eigenfunctions. Classical examples include the computation of diffusion distances on manifolds in computer graphics, and Laplacian eigenmaps and spectral clustering in machine learning. In many cases, one has to deal with multiple data spaces simultaneously. For example, clustering multimedia data in machine learning applications involves various modalities or "views" (e.g., text and images), and finding correspondence between shapes in computer graphics problems is an operation performed between two or more modalities. In this thesis, we develop a generalization of spectral methods to deal with multiple data spaces and apply it to problems from the domains of computer graphics, machine learning, and image processing. Our main construction is based on simultaneous diagonalization of Laplacian operators. We present an efficient numerical technique for computing joint approximate eigenvectors of two or more Laplacians in challenging noisy scenarios, which also appears to be the first general non-smooth manifold optimization method. Finally, we use the relation between joint approximate diagonalizability and approximate commutativity of operators to define a structural similarity measure for images. We use this measure to perform structure-preserving color manipulations of a given image.
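
    The core computational steps described above, building a Laplacian and extracting a few eigenvectors, and the multimodal extension, finding a basis that approximately diagonalises several Laplacians at once, can be illustrated with a short sketch. The off-diagonal cost below is a generic illustration of joint approximate diagonalization, not the thesis's exact formulation or its manifold optimization solver; all names and the toy two-view data are assumptions.

```python
# Minimal sketch (not the thesis code): build a graph Laplacian from an
# affinity matrix, compute a Laplacian-eigenmaps embedding for one modality,
# and evaluate a naive joint-diagonalisation cost over two modalities.
import numpy as np

def laplacian(W):
    """Unnormalised graph Laplacian L = D - W from a symmetric affinity W."""
    return np.diag(W.sum(axis=1)) - W

def spectral_embedding(W, k):
    """First k non-trivial eigenvectors of the Laplacian (Laplacian eigenmaps)."""
    vals, vecs = np.linalg.eigh(laplacian(W))
    return vecs[:, 1:k + 1]          # skip the constant eigenvector

def joint_offdiag_cost(V, laplacians):
    """Sum of squared off-diagonal entries of V^T L V over all modalities:
    small values mean V approximately jointly diagonalises the operators."""
    cost = 0.0
    for L in laplacians:
        M = V.T @ L @ V
        cost += np.sum(M ** 2) - np.sum(np.diag(M) ** 2)
    return cost

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 2))

    def affinity(P, sigma=1.0):
        """Gaussian affinities between all pairs of points."""
        D = np.sum((P[:, None] - P[None, :]) ** 2, axis=-1)
        return np.exp(-D / (2 * sigma ** 2))

    # Two toy "views": affinities of the same points under different noise.
    W1 = affinity(X)
    W2 = affinity(X + 0.1 * rng.normal(size=X.shape))
    V = spectral_embedding(W1, k=3)
    print("off-diagonal cost:", joint_offdiag_cost(V, [laplacian(W1), laplacian(W2)]))
```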

    Estimating varying illuminant colours in images

    Colour constancy is the ability to perceive colours independently of varying illumination colour. A human could tell that a white t-shirt was indeed white, even under the presence of blue or red illumination. These illuminant colours would actually make the reflected colour of the t-shirt bluish or reddish. Humans can, to a good extent, see colours constantly. Getting a computer to achieve the same goal with a high level of accuracy has proven problematic, particularly if we want to use colour as a main cue in object recognition. If we trained a system on object colours under one illuminant and then tried to recognise the objects under another illuminant, the system would likely fail. Early colour constancy algorithms assumed that an image contains a single uniform illuminant. They would then attempt to estimate the colour of the illuminant and apply a single correction to the entire image. It is not hard to imagine a scenario where a scene is lit by more than one illuminant. If we take the case of an outdoor scene on a typical summer's day, we would see objects brightly lit by sunlight and others that are in shadow. The ambient light in shadows is known to be a different colour from that of direct sunlight (bluish and yellowish respectively). This means that there are at least two illuminant colours to be recovered in this scene. This thesis focuses on the harder case of recovering the illuminant colours when more than one is present in a scene. Early work on this subject made the empirical observation that illuminant colours are actually very predictable compared to surface colours. Real-world illuminants tend not to be greens or purples, but rather blues, yellows and reds. We can think of an illuminant mapping as the function which takes a scene from some unknown illuminant to a known illuminant. We model this mapping as a simple multiplication of the Red, Green and Blue channels of a pixel. It turns out that the set of realistic mappings approximately lies on a line segment in chromaticity space. We propose an algorithm that uses this knowledge and only requires two pixels of the same surface under two illuminants as input. We can then recover an estimate for the surface reflectance colour, and subsequently the two illuminants. Additionally in this thesis, we propose a more robust algorithm that can use varying surface reflectance data in a scene. One of the most successful colour constancy algorithms, known as Gamut Mapping, was developed by Forsyth (1990). He argued that the illuminant colour of a scene naturally constrains the surface colours that are possible to perceive. We could not perceive a very chromatic red under a deep blue illuminant. We introduce our multiple-illuminant constraint in a Gamut Mapping context and are able to further improve its performance. The final piece of work proposes a method for detecting shadow edges, so that we can automatically recover estimates for the illuminant colours in and out of shadow. We also formulate our illuminant estimation algorithm in a voting scheme that probabilistically chooses an illuminant estimate on both sides of the shadow edge. We test the performance of all our algorithms experimentally on well-known datasets, as well as on our newly proposed shadow datasets.
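
    A minimal sketch of the diagonal (per-channel multiplication) illuminant model and the two-pixel idea described above is given below. The line-segment endpoints, the brute-force search, and all names are illustrative assumptions, not the algorithm from the thesis; the point is only that constraining illuminants to a segment can help make the two-pixel problem well posed.

```python
# Illustrative sketch only: an illuminant change is modelled as a per-channel
# (diagonal) scaling of RGB, and plausible illuminants are constrained to lie
# on a line segment. The segment endpoints and the brute-force search are
# hypothetical stand-ins for the actual method.
import numpy as np

def apply_illuminant(reflectance_rgb, illuminant_rgb):
    """Diagonal (von Kries-like) model: observed = reflectance * illuminant."""
    return reflectance_rgb * illuminant_rgb

def candidate_illuminants(n=101):
    """Sample candidate illuminant RGBs along a bluish-to-reddish segment
    (hypothetical endpoints standing in for a learned illuminant locus)."""
    blue, red = np.array([0.8, 1.0, 1.3]), np.array([1.3, 1.0, 0.8])
    t = np.linspace(0.0, 1.0, n)[:, None]
    return (1 - t) * blue + t * red

def recover_two_illuminants(p1, p2):
    """Given two pixels of the SAME surface under two unknown illuminants,
    pick the pair of candidates whose ratio best explains p1 / p2
    (the surface reflectance cancels in the ratio)."""
    cands = candidate_illuminants()
    observed_ratio = p1 / p2
    best, best_err = None, np.inf
    for e1 in cands:
        for e2 in cands:
            err = np.linalg.norm(observed_ratio - e1 / e2)
            if err < best_err:
                best, best_err = (e1, e2), err
    return best

if __name__ == "__main__":
    surface = np.array([0.4, 0.5, 0.3])                      # unknown reflectance
    e1_true, e2_true = np.array([0.9, 1.0, 1.2]), np.array([1.2, 1.0, 0.9])
    p1 = apply_illuminant(surface, e1_true)
    p2 = apply_illuminant(surface, e2_true)
    e1, e2 = recover_two_illuminants(p1, p2)
    print("estimated illuminants:", e1, e2)
    print("estimated reflectance:", p1 / e1)
```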

    Semantic color constancy

    Color constancy aims to perceive the actual color of an object, disregarding the effect of the light source. Recent works showed that utilizing the semantic information in an image enhances the performance of computational color constancy methods. Considering the recent success of segmentation methods and the increased number of labeled images, we propose a color constancy method that combines individual illuminant estimations of detected objects, computed using the classes of the objects and their associated colors. We then introduce a weighting system that values the applicability of each object class to the color constancy problem, and another metric expressing how well a detected object fits the learned model of its class. Finally, we evaluate the proposed method on a popular color constancy dataset, confirming that each weight addition enhances the performance of the global illuminant estimation. Experimental results are promising, outperforming conventional methods while competing with state-of-the-art methods. -- M.S. - Master of Science
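
    The combination step described above, per-object illuminant estimates merged under class-applicability and model-fit weights, might be sketched as follows. The field names, weight values, and normalisation are hypothetical; they only illustrate the weighted-combination idea, not the method's actual implementation.

```python
# Hedged sketch of the general idea (not the thesis implementation): each
# detected object contributes an illuminant estimate, and the global estimate
# is a weighted combination where the weights reflect (a) how useful the
# object's class is for colour constancy and (b) how well the detection fits
# the learned colour model of that class. Field names are illustrative.
import numpy as np

def combine_illuminants(detections):
    """detections: list of dicts with keys
       'illuminant'   - per-object RGB illuminant estimate,
       'class_weight' - learned usefulness of the object class,
       'fit_score'    - how well the object matches its class colour model."""
    estimates = np.array([d["illuminant"] for d in detections], dtype=float)
    weights = np.array([d["class_weight"] * d["fit_score"] for d in detections])
    weights = weights / weights.sum()
    global_estimate = weights @ estimates
    return global_estimate / np.linalg.norm(global_estimate)  # unit-norm RGB

if __name__ == "__main__":
    detections = [
        {"illuminant": [0.62, 0.58, 0.53], "class_weight": 0.9, "fit_score": 0.80},
        {"illuminant": [0.60, 0.57, 0.56], "class_weight": 0.4, "fit_score": 0.95},
        {"illuminant": [0.70, 0.55, 0.45], "class_weight": 0.2, "fit_score": 0.30},
    ]
    print("global illuminant estimate:", combine_illuminants(detections))
```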

    Texture and Colour in Image Analysis

    Research in colour and texture has experienced major changes in the last few years. This book presents some recent advances in the field, specifically in the theory and applications of colour texture analysis. This volume also features benchmarks, comparative evaluations, and reviews.

    Stereoscopic high dynamic range imaging

    Two modern technologies show promise to dramatically increase immersion in virtual environments. Stereoscopic imaging captures two images representing the views of both eyes and allows for better depth perception. High dynamic range (HDR) imaging accurately represents real-world lighting, as opposed to traditional low dynamic range (LDR) imaging. HDR provides better contrast and more natural-looking scenes. The combination of the two technologies in order to gain the advantages of both has been, until now, mostly unexplored due to the current limitations in the imaging pipeline. This thesis reviews both fields, proposes a stereoscopic high dynamic range (SHDR) imaging pipeline outlining the challenges that need to be resolved to enable SHDR, and focuses on the capture and compression aspects of that pipeline. The problems of capturing SHDR images, which would potentially require two HDR cameras and introduce ghosting, are mitigated by capturing an HDR and LDR pair and using it to generate SHDR images. A detailed user study compared four different methods of generating SHDR images. Results demonstrated that one of the methods may produce images perceptually indistinguishable from the ground truth. Insights obtained while developing static image operators guided the design of SHDR video techniques. Three methods for generating SHDR video from an HDR-LDR video pair are proposed and compared to the ground-truth SHDR videos. Results showed little overall error and identified a method with the least error. Once captured, SHDR content needs to be efficiently compressed. Five SHDR compression methods that are backward compatible are presented. The proposed methods can encode SHDR content at little more than the size of a traditional single LDR image (18% larger for one method), and the backward compatibility property encourages early adoption of the format. The work presented in this thesis has introduced and advanced capture and compression methods for the adoption of SHDR imaging. In general, this research paves the way for the novel field of SHDR imaging, which should lead to improved and more realistic representation of captured scenes.

    The role of chromatic texture and 3D shape in colour discrimination, memory colour, and colour constancy of natural objects

    The primary goal of this work was to investigate colour perception in a natural environment and to contribute to the understanding of how cues to familiar object identity influence colour appearance. A large number of studies on colour appearance employ 2D uniformly coloured patches, discarding perceptual cues such as binocular disparity, 3D luminance shading, mutual reflection, and glossy highlights, which are an integral part of a natural scene. Moreover, natural objects possess specific cues that help our recognition (shape, surface texture or colour distribution). The aim of the first main experiment presented in this thesis was to understand the effect of shape on (1) memory colour under constant and varying illumination and on (2) colour constancy for uniformly coloured stimuli. The results demonstrated the existence of a range of memory colours associated with a familiar object, the size of which was strongly object-shape-dependent. For all objects, memory retrieval was significantly faster for the object-diagnostic shape relative to generic shapes. Based on two successive controls, the author suggests that shape cues to the object identity affect the range of memory colour proportionally to the original object's chromatic distribution. The second experiment examined the subjects' accuracy and precision in adjusting a stimulus colour to its typical appearance. Independently of the illuminant, results showed that memory colour accuracy and precision were enhanced by the presence of chromatic textures, diagnostic shapes, or 3D configurations, with a strong interaction between diagnosticity and dimensionality of the shape. Hence, more cues to the object identity and more natural stimuli facilitate the observers in accessing their colour information from memory. A direct relationship was demonstrated between chromatic surface representation, the object's physical properties, and the identifiability and dimensionality of shape on the one hand and memory colour accuracy on the other, suggesting high-level mechanisms. Chromatic textures facilitated colour constancy. The third and fourth experiments tested the subjects' ability to discriminate between two chromatic stimuli in a simultaneous and a successive 2AFC task, respectively. Simultaneous discrimination thresholds for polychromatic surfaces were due only to low-level mechanisms of the stimulus, whereas in successive discrimination, i.e. when memory is involved, high-level mechanisms came into play. The effect of shape was strongly task-dependent and was modulated by the object's memory colour. These findings, together with the strong interaction between chromatic cues and shape cues to the object identity, lead to the conclusion that high-level mechanisms linked to object recognition facilitated both tasks. Hence, the current thesis presents new findings on memory colour and colour constancy in a natural context and demonstrates the effect of high-level mechanisms in chromatic discrimination as a function of cues to the object identity such as shape and texture. This work contributes to a deeper understanding of colour perception and object recognition in the natural world.

    Visualisation of Long in Time Dynamic Networks on Large Touch Displays

    Any dataset containing information about relationships between entities can be modelled as a network. This network can be static, where the entities/relationships do not change over time, or dynamic, where the entities/relationships change over time. Network data that changes over time, dynamic network data, is a powerful resource when studying many important phenomena, across wide-ranging fields from travel networks to epidemiology. However, it is very difficult to analyse this data, especially if it covers a long period of time (e.g. one month) with respect to its temporal resolution (e.g. seconds). In this thesis, we address the problem of visualising long in time dynamic networks: networks that may not be particularly large in terms of the number of entities or relationships, but are long in terms of the length of time they cover when compared to their temporal resolution. We first introduce Dynamic Network Plaid, a system for the visualisation and analysis of long in time dynamic networks. We design and build it for an 84" vertically-mounted touch-screen display, as existing work reports positive results for the use of such displays in a visualisation context and finds them useful for collaboration. The Plaid integrates multiple views and prioritises the visualisation of interaction provenance. In this system we also introduce a novel method of time exploration called ‘interactive timeslicing’. This allows the selection and comparison of points that are far apart in time, a feature not offered by existing visualisation systems. The Plaid is validated through an expert user evaluation with three public health researchers. To confirm observations of the expert user evaluation, we then carry out a formal laboratory study with a large touch-screen display to verify our novel method of time navigation against existing animation and small-multiples approaches. From this study, we find that interactive timeslicing outperforms animation and small multiples for complex tasks requiring a comparison between multiple points that are far apart in time. We also find that small multiples is best suited to comparisons of multiple sequential points in time across a time interval. To generalise the results of this experiment, we later run a second formal laboratory study in the same format as the first, but this time using standard-sized displays with indirect mouse input. The second study reaffirms the results of the first, showing that our novel method of time navigation can facilitate the visual comparison of points that are distant in time in a way that existing approaches, small multiples and animation, cannot. The study demonstrates that our previous results generalise across display size and interaction type (touch vs mouse). In this thesis we introduce novel representations and time interaction techniques to improve the visualisation of long in time dynamic networks, and experimentally show that our novel method of time interaction outperforms other popular methods for some task types.
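
    As a rough illustration of the timeslicing idea (not the Dynamic Network Plaid code), the sketch below extracts snapshot graphs around user-selected time points from a timestamped edge list, so that slices far apart in time can be compared side by side. The window size and data layout are assumptions.

```python
# Minimal sketch: given a long dynamic network as a list of timestamped edges,
# build snapshot adjacencies at a few selected, possibly far-apart time points.
from collections import defaultdict

def timeslice(edges, centre, half_window):
    """Return the adjacency of the snapshot covering [centre - w, centre + w].
    edges: iterable of (node_u, node_v, timestamp)."""
    adj = defaultdict(set)
    for u, v, t in edges:
        if centre - half_window <= t <= centre + half_window:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)

if __name__ == "__main__":
    # Toy contact network covering a long time span at one-second resolution.
    edges = [("a", "b", 10), ("b", "c", 12),
             ("a", "c", 2_000_000), ("c", "d", 2_000_005)]
    # Compare two selected slices that are far apart in time.
    for centre in (11, 2_000_002):
        print(centre, timeslice(edges, centre, half_window=5))
```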

    Cross Dynamic Range And Cross Resolution Objective Image Quality Assessment With Applications

    In recent years, image and video signals have become an indispensable part of human life. There has been an increasing demand for high-quality image and video products and services. To monitor, maintain and enhance image and video quality, objective image and video quality assessment tools play crucial roles in a wide range of applications throughout the field of image and video processing, including image and video acquisition, communication, interpolation, retrieval, and display. A number of objective image and video quality measures have been introduced in recent decades, such as mean square error (MSE), peak signal-to-noise ratio (PSNR), and the structural similarity index (SSIM). However, they are not applicable when the dynamic range or spatial resolution of the images being compared is different from that of the corresponding reference images. In this thesis, we aim to tackle these two main problems in the field of image quality assessment. Tone mapping operators (TMOs) that convert high dynamic range (HDR) to low dynamic range (LDR) images provide practically useful tools for the visualization of HDR images on standard LDR displays. Most TMOs have been designed in the absence of a well-established and subject-validated image quality assessment (IQA) model, without which fair comparisons and further improvement are difficult. We propose an objective quality assessment algorithm for tone-mapped images using HDR images as references by combining 1) a multi-scale signal fidelity measure based on a modified structural similarity (SSIM) index; and 2) a naturalness measure based on intensity statistics of natural images. To evaluate the proposed Tone-Mapped image Quality Index (TMQI), we study its performance in several applications and optimization problems. Specifically, the main component of TMQI, known as structural fidelity, is modified and adopted to enhance the visualization of HDR medical images on standard displays. Moreover, a substantially different approach to designing TMOs is presented, where instead of using any pre-defined systematic computational structure (such as image transformation or contrast/edge enhancement) for tone mapping, we navigate in the space of all LDR images, searching for the image that maximizes structural fidelity or TMQI. An increasing number of image interpolation and image super-resolution (SR) algorithms have been proposed recently to create images with higher spatial resolution from low-resolution (LR) images. However, the evaluation of such SR and interpolation algorithms is cumbersome. Most existing image quality measures are not applicable because the LR and resultant high-resolution (HR) images have different spatial resolutions. We make one of the first attempts to develop objective quality assessment methods to compare LR and HR images. Our method adopts a framework based on natural scene statistics (NSS), where image quality degradation is gauged by the deviation of its statistical features from NSS models trained upon high-quality natural images. In particular, we extract frequency energy falloff, dominant orientation and spatial continuity statistics from natural images and build statistical models to describe these statistics. These models are then used to measure the statistical naturalness of interpolated images. We carried out subjective tests to validate our approach, which demonstrates promising results. The performance of the proposed measure is further evaluated when applied to parameter tuning in image interpolation algorithms.
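
    A hedged sketch of how the two ingredients named above, a structural fidelity score and a statistical naturalness score, can be combined into a single tone-mapped image quality index is shown below. The mixing weight and exponents are illustrative placeholders rather than the published TMQI constants; a score of this form can also serve as the objective when searching the space of LDR images for the best tone mapping.

```python
# Hedged sketch of combining the two components named in the abstract: a
# structural fidelity score S and a statistical naturalness score N, both
# assumed to lie in [0, 1]. The weight and exponents are illustrative
# placeholders, not the published TMQI constants.
def tmqi_like_score(structural_fidelity, naturalness, a=0.8, alpha=0.3, beta=0.7):
    """Higher is better; inputs are clipped to [0, 1] before combining."""
    s = max(0.0, min(1.0, structural_fidelity))
    n = max(0.0, min(1.0, naturalness))
    return a * (s ** alpha) + (1.0 - a) * (n ** beta)

if __name__ == "__main__":
    # A tone mapping that preserves structure well but looks slightly unnatural.
    print(tmqi_like_score(structural_fidelity=0.92, naturalness=0.60))
```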

    Physically based geometry and reflectance recovery from images

    An image is a projection of the three-dimensional world taken at an instant in space and time. Its formation involves a complex interplay between the geometry, illumination and material properties of objects in the scene. Given image data and knowledge of some scene properties, the recovery of the remaining components can be cast as a set of physically based inverse problems. This thesis investigates three inverse problems on the recovery of scene properties and discusses how we can develop appropriate physical constraints and build them into effective algorithms. Firstly, we study the problem of geometry recovery from a single image with repeated texture. Our technique leverages the PatchMatch algorithm to detect and match repeated patterns undergoing geometric transformations. This allows effective enforcement of the translational symmetry constraint in the recovery of the texture lattice. Secondly, we study the problem of computational relighting using RGB-D data, where the depth data is acquired through a Kinect sensor and is often noisy. We show how the inclusion of noisy depth input helps to resolve ambiguities in the recovery of shape and reflectance in the inverse rendering problem. Our results show that the complementary nature of RGB and depth is highly beneficial for a practical relighting system. Lastly, in the third problem, we exploit the use of geometric constraints relating two views to address a challenging problem in Internet image matching. Our solution is robust to geometric and photometric distortions over wide baselines. It also accommodates repeated structures that are commonly found in our modern environment. Building on the image correspondence, we also investigate the use of color transfer as an additional global constraint in relating Internet images. It shows promising results in obtaining more accurate and denser correspondences.
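
    The physically based inverse-problem framing above can be illustrated with the simplest possible forward model, Lambertian shading, where intensity = albedo * max(0, n.l). The sketch below inverts this model for one pixel when the normal (e.g. estimated from noisy depth) and the light direction are known; it is a toy illustration under stated assumptions, not the thesis's relighting system, and all names are assumptions.

```python
# Toy sketch: Lambertian forward model and its per-pixel inversion. This
# ignores the regularisation and ambiguity handling discussed in the thesis.
import numpy as np

def shade_lambertian(albedo, normal, light_dir):
    """Forward model: image intensity from reflectance, geometry and lighting."""
    n = normal / np.linalg.norm(normal)
    l = light_dir / np.linalg.norm(light_dir)
    return albedo * max(0.0, float(n @ l))

def recover_albedo(intensity, normal, light_dir, eps=1e-6):
    """Inverse problem for one pixel when geometry and illumination are known."""
    shading = shade_lambertian(1.0, normal, light_dir)
    return intensity / max(shading, eps)

if __name__ == "__main__":
    normal = np.array([0.1, 0.2, 0.97])      # e.g. estimated from depth data
    light = np.array([0.3, 0.3, 0.9])
    observed = shade_lambertian(0.45, normal, light)
    print("recovered albedo:", recover_albedo(observed, normal, light))
```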