
    Leveraging colour-based pseudo-labels to supervise saliency detection in hyperspectral image datasets

    Saliency detection mimics the natural visual attention mechanism, which identifies an image region as salient when it attracts more visual attention than the background. This image analysis task has many important applications in fields such as military science, ocean research, resource exploration, and disaster and land-use monitoring. Although hundreds of models have been proposed for saliency detection in colour images, there is still considerable room for improving saliency detection performance in hyperspectral image analysis. In the present study, an ensemble learning methodology for saliency detection in hyperspectral imagery datasets is presented. It enhances the saliency assignments yielded by a robust colour-based technique with new saliency information extracted by taking advantage of the abundant spectral information in multiple hyperspectral images. Experiments with the proposed methodology provide encouraging results, also in comparison with several competing methods.
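As an illustrative sketch of the colour-based pseudo-label idea (not the paper's actual algorithm; the function name, the band-grouping scheme and the threshold are all assumptions), one can collapse the hyperspectral cube to a pseudo-colour image, score each pixel by its colour contrast against the scene average, and threshold the result into binary saliency pseudo-labels:

```python
import numpy as np

def colour_pseudo_labels(hsi, threshold=0.5):
    """Derive binary saliency pseudo-labels from a hyperspectral cube.

    hsi: (H, W, B) array. Bands are grouped into three thirds to form a
    pseudo-RGB image; saliency is the distance of each pixel's colour
    from the global mean colour (a simple colour-contrast cue).
    """
    h, w, b = hsi.shape
    # Average each third of the spectrum into one pseudo-colour channel.
    rgb = np.stack([hsi[:, :, i * b // 3:(i + 1) * b // 3].mean(axis=2)
                    for i in range(3)], axis=2)
    # Colour contrast against the mean image colour.
    dist = np.linalg.norm(rgb - rgb.reshape(-1, 3).mean(axis=0), axis=2)
    sal = (dist - dist.min()) / (dist.max() - dist.min() + 1e-12)
    return sal, (sal >= threshold)       # saliency map + pseudo-labels

# Synthetic cube: a bright block against a dark background.
hsi = np.zeros((8, 8, 30))
hsi[2:4, 2:4, :15] = 1.0
sal, labels = colour_pseudo_labels(hsi)
```

Such pseudo-labels could then supervise a learner on the full spectral data, which is the ensemble idea the abstract describes.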

    Non-parametric Methods for Automatic Exposure Control, Radiometric Calibration and Dynamic Range Compression

    Imaging systems are essential to a wide range of modern-day applications. With the continuous advancement in imaging systems, there is an ongoing need to adapt and improve the imaging pipeline running inside them. In this thesis, methods are presented to improve the imaging pipeline of digital cameras. We present three methods to improve important phases of the imaging process: (i) automatic exposure adjustment, (ii) radiometric calibration and (iii) high dynamic range compression. These contributions touch the initial, intermediate and final stages of the imaging pipeline of digital cameras. For exposure control, we propose two methods. The first makes use of CCD-based equations to formulate the exposure control problem. To estimate the exposure time, an initial image is acquired for each wavelength channel, to which contrast adjustment techniques are applied. This helps to recover a reference cumulative distribution function of image brightness at each channel. The second method for automatic exposure control is an iterative method applicable to a broad range of imaging systems. It uses spectral sensitivity functions, such as the photopic response functions, to generate a spectral power image of the captured scene. A target image is then generated from the spectral power image by applying histogram equalization. The exposure time is calculated iteratively by minimizing the squared difference between the target and the current spectral power image. We further analyze the method's stability and controllability using a state-space representation from control theory. The applicability of the proposed method for exposure time calculation is shown on real-world scenes using cameras with varying architectures. Radiometric calibration is the estimation of the non-linear mapping of the input radiance map to the output brightness values.
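The iterative exposure-control loop described above can be sketched very simply. This is a toy stand-in, not the thesis's method: instead of matching a histogram-equalised target image, it drives the mean brightness towards a target value with a multiplicative update; the function names and the linear sensor model are assumptions.

```python
import numpy as np

def adjust_exposure(capture, t0=1.0, target=0.5, iters=20, tol=1e-4):
    """Iteratively adjust exposure time so the captured image's mean
    brightness approaches `target` (a crude stand-in for minimising the
    squared difference to a histogram-equalised target image).

    capture: function t -> image in [0, 1] for exposure time t.
    """
    t = t0
    for _ in range(iters):
        mean = capture(t).mean()
        if abs(mean - target) < tol:
            break
        # Multiplicative update: brightness scales roughly linearly
        # with exposure time below saturation.
        t *= target / max(mean, 1e-6)
    return t

# Toy linear sensor: brightness = radiance * t, clipped at saturation.
scene = np.random.default_rng(0).uniform(0.0, 1.0, (64, 64))
cam = lambda t: np.clip(scene * t, 0.0, 1.0)
t_opt = adjust_exposure(cam, t0=0.05)
```

The fixed point of this update is the exposure at which the captured mean equals the target, mirroring the thesis's squared-difference minimisation at a much coarser level.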
    The radiometric mapping is represented by the camera response function, with which the radiance map of the scene is estimated. Our radiometric calibration method employs an L1 cost function by taking advantage of the Weiszfeld optimization scheme. The proposed calibration works with multiple input images of the scene at varying exposures. It can also perform calibration from a single input image under a few constraints. The proposed method outperforms, quantitatively and qualitatively, various alternative methods found in the radiometric calibration literature. Finally, to realistically represent the estimated radiance maps on low dynamic range (LDR) display devices, we propose a method for dynamic range compression. Radiance maps generally have a higher dynamic range (HDR) than widely used display devices can reproduce. Thus, for display purposes, dynamic range compression is required for HDR images. Our method generates a few LDR images from the HDR radiance map by clipping its values at different exposures. Using the contrast information of each generated LDR image, the method uses an energy minimization approach to estimate a probability map for each LDR image. These probability maps are then used as a label set to form the final compressed dynamic range image for the display device. The results of our method were compared qualitatively and quantitatively with those produced by widely cited and professionally used methods.
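The Weiszfeld scheme behind the L1 calibration is easiest to see in its classic form, the geometric median, where the L1 cost is minimised by iteratively reweighted averaging. The sketch below shows only this reweighting idea, not the calibration method itself:

```python
import numpy as np

def weiszfeld(points, iters=100, eps=1e-9):
    """Geometric median by Weiszfeld's iteratively reweighted scheme.
    Minimises the L1-type cost sum_i ||x - p_i||; each iteration is a
    weighted mean with inverse-distance weights, the same reweighting
    idea applied to L1 residuals in the calibration method."""
    x = points.mean(axis=0)                 # least-squares initialisation
    for _ in range(iters):
        d = np.linalg.norm(points - x, axis=1)
        w = 1.0 / np.maximum(d, eps)        # inverse-distance weights
        x_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(x_new - x) < eps:
            break
        x = x_new
    return x

# Three coincident points and one outlier: the L1 solution sits at the
# majority location, unlike the mean.
pts = np.array([[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [10.0, 10.0]])
median = weiszfeld(pts)
```

The robustness to the outlier is exactly why an L1 cost is attractive for calibration data with gross errors.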

    Illumination Invariant Deep Learning for Hyperspectral Data

    Motivated by the variability in hyperspectral images due to illumination and the difficulty of acquiring labelled data, this thesis proposes different approaches for learning illumination-invariant feature representations and classification models for hyperspectral data captured outdoors, under natural sunlight. The approaches integrate domain knowledge into the learning algorithms and hence do not rely on a priori knowledge of atmospheric parameters, additional sensors or large amounts of labelled training data. Hyperspectral sensors record rich semantic information from a scene, making them useful for robotics or remote sensing applications where perception systems are used to gain an understanding of the scene. Images recorded by hyperspectral sensors can, however, be affected to varying degrees by intrinsic factors relating to the sensor itself (keystone, smile, noise, particularly at the limits of the sensed spectral range) but also by extrinsic factors such as the way the scene is illuminated. The appearance of the scene in the image is tied to the incident illumination, which depends on variables such as the position of the sun, the geometry of the surface and the prevailing atmospheric conditions. Effects like shadows can cause the appearance and spectral characteristics of identical materials to differ significantly. This degrades the performance of high-level algorithms that use hyperspectral data, such as those that perform classification and clustering. If sufficient training data is available, learning algorithms such as neural networks can capture variability in the scene appearance and be trained to compensate for it. Learning algorithms are advantageous for this task because they do not require a priori knowledge of the prevailing atmospheric conditions or data from additional sensors.
    Labelling of hyperspectral data is, however, difficult and time-consuming, so acquiring enough labelled samples for the learning algorithm to adequately capture the scene appearance is challenging. Hence, there is a need for techniques that are invariant to the effects of illumination and do not require large amounts of labelled data. In this thesis, an approach to learning a representation of hyperspectral data that is invariant to the effects of illumination is proposed. This approach combines a physics-based model of the illumination process with an unsupervised deep learning algorithm, and thus requires no labelled data. Datasets that vary both temporally and spatially are used to compare the proposed approach to other similar state-of-the-art techniques. The results show that the learnt representation is more invariant to shadows in the image and to variations in brightness due to changes in the scene topography or the position of the sun in the sky. The results also show that a supervised classifier can predict class labels more accurately and more consistently across time when images are represented using the proposed method. Additionally, this thesis proposes methods to train supervised classification models to be more robust to variations in illumination where only limited amounts of labelled data are available. The transfer of knowledge from well-labelled datasets to poorly labelled datasets for classification is investigated. A method is also proposed for enabling small amounts of labelled samples to capture the variability in spectra across the scene. These samples are then used to train a classifier to be robust to the variability in the data caused by variations in illumination. The results show that these approaches make convolutional neural network classifiers more robust and achieve better performance when there is limited labelled training data.
    A case study is presented where a pipeline is proposed that incorporates the methods proposed in this thesis for learning robust feature representations and classification models. A scene is clustered using no labelled data. The results show that the pipeline groups the data into clusters that are consistent with the spatial distribution of the classes in the scene, as determined from the ground truth.
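A far simpler baseline conveys what "invariance to illumination" means here. Under the crude assumption that shadow scales a pixel's spectrum by a single factor, normalising each spectrum to unit length preserves spectral shape while the scaling cancels. The thesis instead combines a physics-based illumination model with unsupervised deep learning; this sketch (names and model assumed) is only an illustrative baseline:

```python
import numpy as np

def brightness_invariant(spectra, eps=1e-12):
    """Normalise each pixel spectrum (last axis) to unit L2 length.
    A multiplicative brightness change, e.g. a shadow modelled as a
    single attenuation factor, then maps to the same representation."""
    norms = np.linalg.norm(spectra, axis=-1, keepdims=True)
    return spectra / np.maximum(norms, eps)

# A sunlit and a shadowed sample of the same material differ only by a
# scale factor, so their normalised spectra coincide.
material = np.array([0.2, 0.5, 0.9, 0.4])
sunlit = material
shadowed = 0.3 * material
```

Real shadows also change the *shape* of the incident spectrum (skylight vs. direct sunlight), which is exactly why the thesis needs a richer, learned invariance than this scaling argument provides.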

    Multisensory Imagery Cues for Object Separation, Specularity Detection and Deep Learning based Inpainting

    Multisensory imagery cues have been actively investigated in diverse applications in the computer vision community to provide additional geometric information that is either absent or difficult to capture with mainstream two-dimensional imaging. The inherent features of multispectral polarimetric light field imagery (MSPLFI) include object distribution over spectra, surface properties, shape, shading and pixel flow in light space. The aim of this dissertation is to explore these inherent properties to develop new structures and methodologies for the tasks of object separation, specularity detection and deep learning-based inpainting in MSPLFI. In the first part of this research, an application that separates foreground objects from the background in both outdoor and indoor scenes using multispectral polarimetric imagery (MSPI) cues is examined. Based on the pixel neighbourhood relationship, an on-demand clustering technique is proposed and implemented to separate artificial objects from the natural background in a complex outdoor scene. However, because indoor scenes contain only artificial objects, with vast variations in energy levels among spectra, a multiband fusion technique followed by a background segmentation algorithm is proposed to separate the foreground from the background. In this regard, first, each spectrum is decomposed into low and high frequencies using the fast Fourier transform (FFT). Second, principal component analysis (PCA) is applied to both frequency images of each spectrum, and the first principal components are then combined into a fused image. Finally, a polarimetric background segmentation (BS) algorithm based on the Stokes vector is proposed and implemented on the fused image. The performance of the proposed approaches is evaluated and compared using publicly available MSPI datasets and the dice similarity coefficient (DSC).
    The proposed multiband fusion and BS methods demonstrate better fusion quality and higher segmentation accuracy than other studies on several metrics, including mean absolute percentage error (MAPE), peak signal-to-noise ratio (PSNR), Pearson correlation coefficient (PCOR), mutual information (MI), accuracy, geometric mean (G-mean), precision, recall and F1-score. In the second part of this work, a twofold framework for specular reflection detection (SRD) and specular reflection inpainting (SRI) in transparent objects is proposed. The SRD algorithm is based on the mean, the covariance and the Mahalanobis distance for predicting anomalous pixels in MSPLFI. The SRI algorithm first selects four-connected neighbouring pixels from sub-aperture images and then replaces the SRD pixel with the closest matched pixel. For both algorithms, because no dataset of this kind was available, a 6D MSPLFI transparent-object dataset is captured from multisensory imagery cues. The experimental results demonstrate that the proposed algorithms achieve higher SRD accuracy and better SRI quality than the existing approaches reported in this part in terms of F1-score, G-mean, accuracy, the structural similarity index (SSIM), the PSNR, the mean squared error (IMMSE) and the mean absolute deviation (MAD). However, because it synthesises SRD pixels from the pixel neighbourhood relationship, the inpainting method proposed in this research produces artefacts and errors when inpainting large specularity areas with irregular holes. Therefore, in the last part of this research, the emphasis is on inpainting large specularity areas with irregular holes based on deep feature extraction from multisensory imagery cues. The proposed six-stage deep learning inpainting (DLI) framework is based on the generative adversarial network (GAN) architecture and consists of a generator network and a discriminator network.
    First, the pixels' global flow in the sub-aperture images is calculated by applying the large displacement optical flow (LDOF) method. The proposed training algorithm combines global flow with local flow and with coarse inpainting results predicted by the baseline method. The generator attempts to generate best-matched features, while the discriminator seeks to predict the maximum difference between the predicted results and the actual results. The experimental results demonstrate that, in terms of the PSNR, MSSIM, IMMSE and MAD, the proposed DLI framework achieves superior inpainting quality to the baseline method and to the previous part of this research.
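The core of the SRD step (mean, covariance, Mahalanobis distance) can be sketched as plain anomaly detection on pixel feature vectors. This is the general statistical idea, not the dissertation's full MSPLFI pipeline; the function name, the regularisation term and the threshold value are assumptions:

```python
import numpy as np

def specular_mask(img, thresh=3.0):
    """Flag pixels whose feature vector is anomalous under a Gaussian
    model of the image: Mahalanobis distance from the mean colour
    exceeding `thresh` standard deviations (an illustrative default)."""
    x = img.reshape(-1, img.shape[-1]).astype(float)
    mu = x.mean(axis=0)
    cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(x.shape[1])
    inv = np.linalg.inv(cov)
    diff = x - mu
    d2 = np.einsum('ij,jk,ik->i', diff, inv, diff)   # squared distances
    return (np.sqrt(d2) > thresh).reshape(img.shape[:-1])

# Toy example: one saturated outlier pixel in an otherwise uniform patch.
rng = np.random.default_rng(0)
img = 0.3 + 0.01 * rng.standard_normal((16, 16, 3))
img[5, 5] = 1.0
mask = specular_mask(img, thresh=5.0)
```

Detected pixels would then be handed to the inpainting stage, which in the dissertation replaces them using neighbouring sub-aperture pixels.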

    Precision modulation in predictive coding hierarchies: theoretical, behavioural and neuroimaging investigations

    Estimation of uncertainty is an important aspect of perception and a prerequisite for effective action. This thesis explores the implementation of uncertainty estimation as precision modulation within a predictive coding hierarchy, optimised within a neurobiologically plausible message-passing scheme via the minimisation of free energy. This thesis consists of six chapters. The first presents a new model of a classic visual illusion, the Cornsweet illusion, which demonstrates that the illusion is a natural consequence of Bayes-optimal perception under the free-energy principle, and that increasing contrast can be modelled by increasing the signal-to-noise ratio. The second chapter describes dynamic causal modelling of EEG data collected from participants viewing the Cornsweet illusion, demonstrating that a reduction in precision, or superficial pyramidal cell gain, in lower visual hierarchical levels is sufficient to explain contrast-dependent changes in ERPs. The third describes a model of a simple attentional paradigm, the Posner paradigm, recasting attention as the optimal modulation of precision in sensory channels. The fourth describes an MEG study of the Posner paradigm, using Bayesian model selection to explore the role of changes in backwards and modulatory connections, and of changes in local superficial pyramidal cell gain, in producing the electrophysiological and behavioural correlates of the Posner paradigm. The fifth chapter recasts the Posner paradigm in the motor domain to investigate the level (intrinsic vs. extrinsic) at which precision is modulated by motor cues. The sixth describes a new model of sensory attenuation based on using precision modulation to balance the imperatives to act and to perceive. I hope to demonstrate that precision modulation within predictive coding hierarchies, under the free-energy principle, is a flexible and powerful way of describing and explaining both behavioural and neuroimaging data.
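The role of precision in this scheme can be illustrated with a minimal single-level model, far simpler than the hierarchical simulations in the thesis: belief updating is gradient descent on free energy for a Gaussian generative model, and the posterior settles at the precision-weighted average of data and prior. Raising sensory precision, as attention is cast here, pulls the estimate towards the data. All names and values below are illustrative assumptions:

```python
def infer(y, prior, pi_y, pi_p, lr=0.05, steps=500):
    """Minimal single-level predictive coding: gradient descent on free
    energy for a Gaussian model with observation y, prior mean `prior`,
    sensory precision pi_y and prior precision pi_p. The belief mu
    converges to (pi_y * y + pi_p * prior) / (pi_y + pi_p)."""
    mu = prior
    for _ in range(steps):
        eps_y = y - mu           # sensory prediction error
        eps_p = mu - prior       # prior prediction error
        # Precision-weighted error drives the belief update.
        mu += lr * (pi_y * eps_y - pi_p * eps_p)
    return mu

# Equal precisions split the difference; high sensory precision
# ("attended" input) lets the data dominate.
balanced = infer(1.0, 0.0, pi_y=1.0, pi_p=1.0)
attended = infer(1.0, 0.0, pi_y=9.0, pi_p=1.0)
```

In the thesis's hierarchical setting the same precision terms correspond to the gain of superficial pyramidal cells carrying prediction errors.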

    Realistic visualisation of cultural heritage objects

    This research investigation used digital photography in a hemispherical dome, enabling a set of 64 photographic images of an object to be captured in perfect pixel register, with each image illuminated from a different direction. This representation turns out to be much richer than a single 2D image, because it contains information at each point about both the 3D shape of the surface (gradient and local curvature) and the directionality of reflectance (gloss and specularity). It thereby enables not only interactive visualisation through viewer software, giving the illusion of 3D, but also the reconstruction of an actual 3D surface and highly realistic rendering of a wide range of materials. The following seven outcomes of the research are claimed as novel, and therefore as representing contributions to knowledge in the field: a method for determining the geometry of an illumination dome; an adaptive method for finding surface normals by bounded regression; generating 3D surfaces from photometric stereo; the relationship between surface normals and specular angles; modelling surface specularity by a modified Lorentzian function; determining the optimal wavelengths of colour laser scanners; and characterising colour devices by synthetic reflectance spectra.
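The photometric stereo underlying several of these outcomes has a compact classical form: with one intensity per known illumination direction and a Lambertian surface, each pixel's scaled normal is recovered by least squares. The bounded-regression method in the thesis refines this basic scheme; the sketch below shows only the textbook version, with assumed names:

```python
import numpy as np

def photometric_normals(intensities, lights):
    """Classic Lambertian photometric stereo for one pixel.

    intensities: (K,) observed brightness under K illuminations.
    lights: (K, 3) unit illumination directions.
    Solves I = L @ (rho * n) by least squares, then splits the albedo
    rho from the unit surface normal n.
    """
    g, *_ = np.linalg.lstsq(lights, intensities, rcond=None)
    rho = np.linalg.norm(g)
    return g / rho, rho          # unit normal, albedo

# Synthetic check: a known normal and albedo under three axis-aligned
# lights reproduce themselves exactly.
n_true = np.array([0.0, 0.6, 0.8])
lights = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
I = 0.7 * lights @ n_true        # Lambertian intensities, albedo 0.7
n_est, rho_est = photometric_normals(I, lights)
```

With 64 dome directions the system is heavily overdetermined, which is what makes robust variants (such as bounded regression that discards shadowed or specular samples) worthwhile.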

    Variational image fusion

    The main goal of this work is the fusion of multiple images into a single composite that offers more information than the individual input images. We approach these fusion tasks within a variational framework. First, we present iterative schemes that are well suited to such variational problems and related tasks. They lead to efficient algorithms that are simple to implement and easy to parallelise. Next, we design a general fusion technique that aims for an image with optimal local contrast. This is the key to a versatile method that performs well in many application areas, such as multispectral imaging, decolourisation and exposure fusion. To handle motion within an exposure set, we present the following two-step approach: first, we introduce the complete rank transform to design an optic flow approach that is robust against severe illumination changes; second, we eliminate remaining misalignments by means of brightness transfer functions that relate the brightness values between frames. Additional knowledge about the exposure set enables us to propose the first fully coupled method that jointly computes an aligned high dynamic range image and dense displacement fields. Finally, we present a technique that infers depth information from differently focused images. In this context, we additionally introduce a novel second-order regulariser that adapts to the image structure in an anisotropic way.
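The contrast-driven fusion idea can be illustrated with a deliberately simple per-pixel scheme: weight each input by a local-contrast measure (here the absolute Laplacian) and blend. The thesis optimises local contrast within a variational framework rather than per pixel, so this sketch (names and the contrast measure assumed) is only a conceptual stand-in:

```python
import numpy as np

def fuse_exposures(stack, eps=1e-12):
    """Fuse a list of aligned same-size images by weighting each pixel
    with the absolute discrete Laplacian of its image, so that locally
    contrasty inputs dominate the composite."""
    weights = []
    for img in stack:
        lap = np.abs(4 * img
                     - np.roll(img, 1, 0) - np.roll(img, -1, 0)
                     - np.roll(img, 1, 1) - np.roll(img, -1, 1))
        weights.append(lap + eps)            # eps avoids zero weights
    w = np.stack(weights)
    w /= w.sum(axis=0, keepdims=True)        # normalise over the stack
    return (w * np.stack(stack)).sum(axis=0)

# A flat image contributes nothing where a detailed image has contrast.
flat = np.full((8, 8), 0.5)
checker = (np.indices((8, 8)).sum(axis=0) % 2).astype(float)
fused = fuse_exposures([flat, checker])
```

Per-pixel weighting like this produces seams in real exposure sets, which is one motivation for formulating fusion variationally with smoothness terms instead.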

    Computational Models for Automated Histopathological Assessment of Colorectal Liver Metastasis Progression

    Histopathology imaging is a type of microscopy imaging commonly used for the micro-level clinical examination of a patient's pathology. Due to the extremely large size of histopathology images, especially whole slide images (WSIs), it is difficult for pathologists to make a quantitative assessment by inspecting the details of a WSI. Hence, a computer-aided system is necessary to provide an objective and consistent assessment of the WSI for personalised treatment decisions. In this thesis, a deep learning framework for the automatic analysis of whole slide histopathology images is presented for the first time, which aims to address the challenging task of assessing and grading colorectal liver metastasis (CRLM). Quantitative evaluations of a patient's condition with CRLM are conducted by quantifying the different tissue components in resected tumorous specimens. This study mimics the visual examination process of human experts by focusing on three levels of information, the tissue level, cell level and pixel level, to achieve the step-by-step segmentation of histopathology images. At the tissue level, patches with category information are utilised to analyse the WSIs. Both classification-based and segmentation-based approaches are investigated to locate the metastasis region and quantify the different components of the WSI. For the classification-based method, different factors that might affect the classification accuracy are explored using state-of-the-art deep convolutional neural networks (DCNNs). Furthermore, a novel network is proposed to merge information from different magnification levels, incorporating contextual information to support the final decision. For the segmentation-based method, edge information from the image is integrated into the proposed fully convolutional neural network to further enhance the segmentation results.
    At the cell level, nuclei-related information is examined to tackle the challenge of inadequate annotations. The problem is approached from two aspects: a weakly supervised nuclei detection and classification method is presented to model the nuclei in CRLM by integrating a traditional image processing method and a variational auto-encoder (VAE), and a novel nuclei instance segmentation framework is proposed to boost the accuracy of nuclei detection and segmentation using the idea of transfer learning. Afterwards, a fusion framework is proposed to enhance the tissue-level segmentation results by leveraging the statistical and spatial properties of the cells. At the pixel level, the segmentation problem is tackled by introducing information from immunohistochemistry (IHC) stained images. First, two data augmentation approaches, synthesis-based and transfer-based, are proposed to address the lack of pixel-level annotations. With paired images and masks thus obtained, an end-to-end model is trained to achieve pixel-level segmentation. Second, another novel weakly supervised approach based on the generative adversarial network (GAN) is proposed to explore the feasibility of transforming unpaired haematoxylin and eosin (HE) stained images into IHC stained images. Extensive experiments reveal that the virtually stained images can also be used for pixel-level segmentation.
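The patch-based tissue-level analysis rests on a simple mechanical step: a WSI is far too large for a DCNN to ingest whole, so it is tiled into fixed-size patches that are classified individually and reassembled into a slide-level map. A minimal tiling sketch (function name, patch size and stride are illustrative choices, not the thesis's settings):

```python
import numpy as np

def extract_patches(wsi, size=256, stride=256):
    """Tile an image into non-overlapping fixed-size patches, returning
    the patch stack and the top-left coordinate of each patch so that
    per-patch predictions can be mapped back onto the slide."""
    h, w = wsi.shape[:2]
    patches, coords = [], []
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            patches.append(wsi[y:y + size, x:x + size])
            coords.append((y, x))
    return np.stack(patches), coords

# Stand-in for a (very small) slide; real WSIs are gigapixel-sized and
# are read lazily, tile by tile, rather than loaded into memory.
wsi = np.zeros((512, 768, 3))
patches, coords = extract_patches(wsi)
```

An overlapping stride (stride < size) is a common variant that smooths the reassembled prediction map at patch borders.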