51 research outputs found

    Graph-based Estimation of Information Divergence Functions

    Information divergence functions, such as the Kullback-Leibler divergence or the Hellinger distance, play a critical role in statistical signal processing and information theory; however, estimating them can be challenging. Most often, parametric assumptions are made about the two distributions to estimate the divergence of interest. In cases where no parametric model fits the data, non-parametric density estimation is used. In statistical signal processing applications, Gaussianity is usually assumed, since closed-form expressions for common divergence measures have been derived for this family of distributions. Parametric assumptions are preferred when it is known that the data follow the model; however, this is rarely the case in real-world scenarios. Non-parametric density estimators are characterized by a very large number of parameters that have to be tuned with costly cross-validation. In this dissertation we focus on a specific family of non-parametric estimators, called direct estimators, that bypass density estimation completely and directly estimate the quantity of interest from the data. We introduce a new divergence measure, the D_p-divergence, that can be estimated directly from samples without parametric assumptions on the distribution. We show that the D_p-divergence bounds the binary, cross-domain, and multi-class Bayes error rates and, in certain cases, provides provably tighter bounds than the Hellinger divergence. In addition, we propose a new methodology that allows the experimenter to construct direct estimators for existing divergence measures or to construct new divergence measures with custom properties that are tailored to the application. To examine the practical efficacy of these new methods, we evaluate them in a statistical learning framework on a series of real-world data science problems involving speech-based monitoring of neuro-motor disorders. Dissertation/Thesis. Doctoral Dissertation, Electrical Engineering 201
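
    As a rough illustration of what a direct, graph-based estimator can look like, the sketch below assumes the D_p estimate takes the Friedman-Rafsky form: build a Euclidean minimum spanning tree over the pooled samples, count the edges that join points from different samples, and rescale that count. The abstract does not state this formula, so the formula, the function names and the clipping at zero are all assumptions made for illustration, not the dissertation's implementation.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j) index pairs."""
    n = len(points)
    in_tree = [False] * n
    best_dist = [math.inf] * n
    best_from = [-1] * n
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        # Relax distances from the newly added vertex.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


def dp_divergence_estimate(sample_f, sample_g):
    """Assumed estimator: 1 - R * (m + n) / (2 * m * n), clipped at 0.

    R is the number of MST edges connecting a point of sample_f to a point
    of sample_g. The clipping is an illustrative choice.
    """
    m, n = len(sample_f), len(sample_g)
    pooled = list(sample_f) + list(sample_g)
    labels = [0] * m + [1] * n
    cross = sum(1 for i, j in mst_edges(pooled) if labels[i] != labels[j])
    return max(0.0, 1.0 - cross * (m + n) / (2.0 * m * n))
```

    Under this assumed form, two well-separated samples share a single cross-sample edge and the estimate approaches 1, while thoroughly interleaved samples share many such edges and the estimate falls towards 0.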

    The effectiveness of features in pattern recognition

    Imperial Users only

    New perspectives and methods for stream learning in the presence of concept drift.

    153 p. Applications that generate data in the form of fast streams from non-stationary environments, that is, those where the underlying phenomena change over time, are becoming increasingly prevalent. In this kind of environment the probability density function of the data-generating process may change over time, producing a drift. As a result, predictive models trained on these data streams become obsolete and do not adapt suitably to the new distribution. Especially in online learning scenarios, there is a pressing need for new algorithms that adapt to this change as fast as possible while maintaining good performance scores. Examples of these applications include making inferences or predictions based on financial data, energy demand and climate data analysis, web usage or sensor network monitoring, and malware/spam detection, among many others. Online learning and concept drift are two of the hottest topics in the recent literature due to their relevance for the so-called Big Data paradigm, where an increasing number of applications rely on continuously available training data, known as data streams. Thus, learning in non-stationary environments requires adaptive or evolving approaches that can monitor and track the underlying changes and adapt a model to accommodate those changes accordingly. To this end, I provide in this thesis a comprehensive review of state-of-the-art approaches and identify the most relevant open challenges in the literature, focusing on addressing three of them by providing innovative perspectives and methods. This thesis provides a complete overview of several related fields and tackles several open challenges that have been identified in the very recent state of the art. Concretely, it presents an innovative way to generate artificial diversity in ensembles, a set of necessary adaptations and improvements for spiking neural networks so that they can be used in online learning scenarios, and finally, a drift detector based on the aforementioned algorithm. All of these approaches together constitute an innovative work aimed at presenting new perspectives and methods for the field.

    A Study of Synchronization Techniques for Optical Communication Systems

    A study of synchronization techniques and related topics in the design of high-data-rate, deep-space optical communication systems is reported. Data cover: (1) effects of timing errors in narrow-pulsed digital optical systems, (2) accuracy of microwave timing systems operating in low-powered optical systems, (3) development of improved tracking systems for the optical channel and determination of their tracking performance, (4) development of usable photodetector mathematical models for application to analysis and performance design in communication receivers, and (5) application of multi-level block encoding to optical transmission of digital data.

    Novelty, distillation, and federation in machine learning for medical imaging

    The practical application of deep learning methods in the medical domain has many challenges. Pathologies are diverse and very few examples may be available for rare cases. Where data are collected, they may lie in multiple institutions and cannot be pooled for practical and ethical reasons. Deep learning is powerful for image segmentation problems, but ultimately its output must be interpretable at the patient level. Although clearly not an exhaustive list, these are the three problems tackled in this thesis. To address the rarity of pathology, I investigate novelty detection algorithms to find outliers from normal anatomy. The problem is structured as first finding a low-dimensional embedding and then detecting outliers in that embedding space. I evaluate several unsupervised embedding and outlier detection methods for speed and accuracy. Data consist of Magnetic Resonance Imaging (MRI) for interstitial lung disease for which healthy and pathological patches are available; only the healthy patches are used in model training. I then explore the clinical interpretability of a model output. I take related work by the Canon team, a model providing voxel-level detection of acute ischemic stroke signs, and use it to deliver the Alberta Stroke Programme Early CT Score (ASPECTS, a measure of stroke severity). The data are acute head computed tomography volumes of suspected stroke patients. I convert from the voxel level to the brain region level and then to the patient level through a series of rules. Due to the real-world clinical complexity of the problem, there are multiple sources of “truth” at each level (voxel, region and patient); I evaluate my results appropriately against these truths. Finally, federated learning is used to train a model on data that are divided between multiple institutions. I introduce a novel evolution of this algorithm, dubbed “soft federated learning”, that avoids the central coordinating authority and takes into account domain shift (covariate shift) and dataset size. I first demonstrate the key properties of these two algorithms on a series of MNIST (handwritten digits) toy problems. Then I apply the methods to the BraTS medical dataset, which contains MRI brain glioma scans from multiple institutions, to compare these algorithms in a realistic setting.
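
    The voxel-to-region-to-patient conversion mentioned above can be pictured with a small sketch. It assumes the standard ASPECTS convention of ten regions in the affected hemisphere, with one point subtracted from 10 for each region showing early ischemic change. The thresholds (`voxel_threshold`, `min_fraction`), the function names and the rule itself are hypothetical placeholders; the thesis's actual rules are not given in the abstract.

```python
# The ten standard ASPECTS regions: caudate, lentiform nucleus, internal
# capsule, insular ribbon, and cortical regions M1 to M6.
ASPECTS_REGIONS = ("C", "L", "IC", "I", "M1", "M2", "M3", "M4", "M5", "M6")


def region_is_affected(voxel_probs, voxel_threshold=0.5, min_fraction=0.1):
    """Hypothetical rule: a region counts as affected when at least
    `min_fraction` of its voxels have probability >= `voxel_threshold`."""
    if not voxel_probs:
        return False
    flagged = sum(1 for p in voxel_probs if p >= voxel_threshold)
    return flagged / len(voxel_probs) >= min_fraction


def aspects_score(voxels_by_region, voxel_threshold=0.5, min_fraction=0.1):
    """Patient-level score: 10 minus the number of affected regions.

    `voxels_by_region` maps a region name to a list of voxel-level
    probabilities; regions missing from the mapping count as unaffected.
    """
    affected = [
        region
        for region in ASPECTS_REGIONS
        if region_is_affected(
            voxels_by_region.get(region, []), voxel_threshold, min_fraction
        )
    ]
    return 10 - len(affected), affected


# Example: two of the ten regions exceed the hypothetical thresholds.
example = {region: [0.05] * 20 for region in ASPECTS_REGIONS}
example["I"] = [0.9] * 5 + [0.1] * 15
example["M2"] = [0.7] * 10 + [0.2] * 10
score, affected_regions = aspects_score(example)
```

    A fixed-fraction rule like this one is only the simplest possible choice; it is used here to show the shape of the pipeline, not to suggest clinically meaningful thresholds.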

    Width variations in river meandering evolution and chute cutoff process

    Many models have been proposed to simulate and understand the long-term evolution of meandering rivers. Nevertheless, some modeling problems still need to be solved, e.g., the stability of long-term simulations when width variations are accounted for. The present thesis proposes a physics- and statistics-based approach to simulate river bank evolution, such that erosion and deposition processes act independently, with a specific shear stress threshold for each of them. In addition, the width evolution is linked with a river-specific parametric probability distribution. The analysis of a representative sample of meandering configurations, extracted from Lidar images, indicates that the Generalized Extreme Value (GEV) probability density function describes the along-channel distribution of cross-section widths well. For a given river, the parameters of the distribution remain almost constant in time, with significant variations observed only after cutoff events that significantly shorten the river. Constraining the river width on the assumption of a GEV probability distribution ensures the stability of long-term simulations as the river moves across the floodplain and adapts its width. The application of the model to a reach of the Ucayali river appears to reproduce the planform evolution of the river satisfactorily and yields realistic values of the cross-section widths. The second topic considered in the thesis is the formation of chute cutoffs, which produce substantial and non-local changes in the river planform, thereby affecting the morphological evolution. The occurrence of this type of cutoff is one of the least predictable events in the evolution of rivers, as a multiplicity of control factors are involved in their formation and maintenance. Significant contributions have appeared in the literature in recent years, shedding light on the complex mechanisms that first lead to the incision of chutes through the floodplain and that eventually determine the fate of both the cutoff bend and the new channel. However, the subject is not yet settled, and a systematic physics-based framework is still missing. In this thesis, two different forcing factors leading to chute cutoffs are highlighted: the channelized flow inertia and the topographic and sedimentary heterogeneity of the floodplain. Using two hydrodynamic models, the general features of the processes leading to chute cutoffs are investigated by assessing a few representative case studies.

    Impedance spectroscopy techniques for condition monitoring of polymer electrolyte membrane fuel cells

    Energy remains the spine of all human development. As we continue to make advances at various levels, the need for energy in quantity and, more recently, quality continues to increase. The fuel cell presents itself as a promising prospect for solving one of mankind’s current challenges: clean energy. The fuel cell is essentially an electrochemical conversion system which takes in a fuel supply to produce electricity. Some key features make the fuel cell attractive as a power source. Firstly, its efficiency in practical applications is approximately 50%, compared to a typical efficiency of 40% for an internal combustion engine [1]. Secondly, unlike systems such as the internal combustion engine, which typically releases carbon dioxide, a major greenhouse gas, the typical fuel cell system produces just water and heat alongside the useful electrical energy. These characteristics make it attractive as a clean energy supply capable of replacing the fossil-based supplies that are currently the mainstay. Unfortunately, the fuel cell is a far cry from an ideal system. Despite its significant advantages as a power supply, various challenges still exist which have hindered its widespread acceptance and deployment. The fuel cell at its core is a highly multi-physics system, and its operational intricacies make it highly prone to a series of fault conditions. This raises the question of durability, an important requirement of a viable power source. Another challenge is that humanity currently lacks an efficient method of producing hydrogen, which is the fuel of choice for the fuel cell. Given the promise of the fuel cell, however, research efforts continue to increase to further improve its viability as an energy source competitive enough to meet mankind’s need for clean energy. This work presents results on efficient diagnostic approaches for the fuel cell, aimed at improving its durability. In particular, two techniques targeted at improving the popular Electrochemical Impedance Spectroscopy (EIS) are presented. Conventional EIS takes a significant amount of time, rendering it unsuitable for real-time diagnostics. Multi-frequency perturbation signals have been proposed to address this challenge. These, however, introduce concerns surrounding the accuracy of the resulting impedance measurement. Part of this work addresses some of the challenges with fuel cell multi-sine impedance spectroscopy, such as measurement accuracy, by defining an optimized signal synthesis formulation. The proposed approach is validated in simulation and compared to the popular exponential frequency distribution approach using an appropriately defined error metric. Secondly, the chirp, a frequency-rich signal, is investigated as an alternative perturbation signal. Consequently, the use of the wavelet transform as the analysis tool of choice is presented. The characteristic nature of the chirp signal makes a broadband frequency sweep over time possible, hence enabling faster impedance estimation. The resulting decomposition is harnessed for impedance calculation. The approach is tested in simulation and results for equivalent circuits are presented. It is shown that the resulting impedance spectrum closely approximates the theoretical values. To further validate both techniques in practice, a low-cost active load is designed and built. The active load enables the injection of an arbitrary signal using the load modulation technique. The device is tested and benchmarked against a commercial frequency response analyzer (FRA) using the conventional single-sine EIS technique. Both approaches developed, the improved multi-sine scheme and the chirp signal perturbation, are demonstrated with the aid of the active load on a single-cell fuel cell station. Outcomes of the experiment show that the two techniques are accurate in comparison with results obtained from the FRA equipment, which implements the single-sine technique. In addition, the two schemes enabled impedance results to be taken in a few seconds, compared to conventional single-sine EIS, which takes several minutes. Impedance measurements are also carried out in the presence of two prominent fault conditions, flooding and drying, using the developed techniques. This demonstrates the capability of the proposed system to perform real-time diagnostics of the PEMFC using impedance information.
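
    The reason a multi-sine perturbation is faster than single-sine EIS can be seen in a small simulation: all frequencies are injected at once, and the impedance at each is recovered as the ratio of the voltage and current spectra at that frequency. The sketch below is a generic textbook version under stated assumptions, not the optimized signal synthesis or the wavelet analysis described in the abstract. The equivalent circuit (a resistor in series with a parallel RC pair), its component values, the tone frequencies and the phases are all illustrative choices.

```python
import cmath
import math


def circuit_impedance(freq_hz, r0=0.01, r1=0.02, c1=1.0):
    """Assumed equivalent circuit: R0 in series with (R1 parallel to C1)."""
    omega = 2.0 * math.pi * freq_hz
    return r0 + r1 / (1.0 + 1j * omega * r1 * c1)


def single_bin_dft(samples, freq_hz, fs):
    """Discrete Fourier transform evaluated at one frequency only."""
    return sum(
        x * cmath.exp(-2j * math.pi * freq_hz * n / fs)
        for n, x in enumerate(samples)
    )


def multisine_impedance(freqs_hz, fs=1000.0, duration_s=1.0, amplitude=1.0):
    """Simulate one multi-sine measurement and estimate Z at every tone.

    Each tone must complete an integer number of cycles in the window so
    that the tones stay orthogonal and do not leak into each other.
    """
    n_samples = int(round(fs * duration_s))
    phases = [float(k) for k in range(len(freqs_hz))]  # arbitrary phases
    current, voltage = [], []
    for n in range(n_samples):
        t = n / fs
        i_t = v_t = 0.0
        for f, phi in zip(freqs_hz, phases):
            z = circuit_impedance(f)
            i_t += amplitude * math.cos(2.0 * math.pi * f * t + phi)
            # Steady-state response: scaled by |Z| and shifted by arg(Z).
            v_t += amplitude * abs(z) * math.cos(
                2.0 * math.pi * f * t + phi + cmath.phase(z)
            )
        current.append(i_t)
        voltage.append(v_t)
    return {
        f: single_bin_dft(voltage, f, fs) / single_bin_dft(current, f, fs)
        for f in freqs_hz
    }


estimates = multisine_impedance([1.0, 2.0, 5.0, 10.0])
```

    In this noise-free simulation the estimates match the circuit model to within floating-point error; in a real measurement, noise, nonlinearity and the crest factor of the summed signal all affect accuracy, which is the kind of concern the optimized synthesis in the abstract is meant to address.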

    Colour local feature fusion for image matching and recognition

    This thesis investigates the use of colour information for local image feature extraction. The work is motivated by an inherent limitation of the most widely used state-of-the-art local feature techniques: their disregard of colour information. Colour contains important information that improves the description of the world around us, and when it is disregarded, chromatic edges may be lost, decreasing the saliency and distinctiveness of the resulting grayscale image. This thesis addresses the question of whether colour can improve the distinctive and descriptive capabilities of local features, and whether this leads to better performance in image feature matching and object recognition applications. To ensure that the developed local colour features are robust to general imaging conditions and suitable for real-world applications, this work utilises the most prominent photometric colour invariant gradients from the literature. The research addresses several limitations of previous studies that used colour invariants by implementing robust local colour features in the form of a Harris-Laplace interest region detector and a SIFT descriptor which characterises the detected image region. Additionally, a comprehensive and rigorous evaluation is performed that compares more colour invariants than any previous study. This research provides, for the first time, conclusive findings on the capability of the chosen colour invariants for practical real-world computer vision tasks. The last major aspect of the research involves the proposal of a feature fusion extraction strategy that uses grayscale intensity and colour information conjointly. Two separate fusion approaches are implemented and evaluated, one for local feature matching tasks and another for object recognition. Results from the fusion analysis strongly indicate that the colour invariants contain unique and useful information that can enhance the performance of techniques that use grayscale-only features.

    Nonlinear System Identification and Its Applications in Fault Detection and Diagnosis
