268 research outputs found

    Representation learning for uncertainty-aware clinical decision support

    Get PDF
Over the last decade, there has been an increasing trend towards digitalization in healthcare, where a growing amount of patient data is collected and stored electronically. These recorded data are known as electronic health records. They form the basis for state-of-the-art research on clinical decision support, so that better patient care can be delivered with the help of advanced analytical techniques such as machine learning. Within machine learning, representation learning is concerned with learning representations from raw data that extract useful information for downstream prediction tasks. Deep learning, a crucial class of methods in representation learning, has achieved great success in fields such as computer vision and natural language processing, and these technical breakthroughs can be expected to further advance data analytics in healthcare. This thesis addresses clinically relevant research questions by developing algorithms based on state-of-the-art representation learning techniques. For an individual hospital visit, a physician suggests a treatment in a deterministic manner; uncertainty comes into play once past treatment decisions from various physicians are analyzed, since different physicians may suggest different treatments depending on their training and experience. This uncertainty in clinical decision-making processes is the focus of the thesis, so the models developed to support these processes have a probabilistic nature: the predictions are predictive distributions in regression tasks and probability distributions over, e.g., different treatment decisions in classification tasks. The first part of the thesis is concerned with prescriptive analytics to provide treatment recommendations. In addition to patient information and treatment decisions, the outcome observed after each treatment is used in learning treatment suggestions. This problem setting is known as learning individualized treatment rules and is formulated as a contextual bandit problem. A general framework for learning individualized treatment rules from observational data is presented, based on state-of-the-art representation learning techniques. Using various offline evaluation methods, it is shown that the treatment policy learned in the proposed framework outperforms both physicians and competitive baselines. Subsequently, uncertainty-aware regression models for diagnostic and predictive analytics are studied. Uncertainty-aware deep kernel learning models are proposed, which estimate predictive uncertainty through a pipeline of neural networks feeding a sparse Gaussian process. By taking the input data structure into account, respective models are developed for diagnostic medical image data and sequential electronic health records. Various pre-training methods from representation learning are adapted to investigate their impact on the proposed models. Extensive experiments show that the proposed models deliver better performance than common architectures in most cases. More importantly, the uncertainty awareness of the proposed models is illustrated: they systematically express higher confidence in more accurate predictions and lower confidence in less accurate ones. The last part of the thesis concerns missing-data imputation in descriptive analytics, which provides essential evidence for subsequent decision-making processes. Instead of traditional mean and median imputation, a more advanced solution based on generative adversarial networks is proposed. The method takes the categorical nature of patient features into account, which stabilizes the adversarial training. It is shown that the proposed method improves downstream predictive accuracy compared to traditional imputation baselines.
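    The individualized-treatment-rule part above casts learning as a contextual bandit evaluated offline. As a minimal, hypothetical sketch of one standard offline estimator of this kind (inverse propensity scoring, not necessarily the thesis's exact evaluator), assuming logged physician decisions with known or estimated propensities:

    ```python
    import numpy as np

    def ips_value(contexts, actions, rewards, propensities, policy):
        """Inverse propensity scoring (IPS) estimate of a policy's value
        from logged bandit data (one standard offline evaluation method).

        contexts:     (n, d) patient features
        actions:      (n,)   treatments chosen by the logging physicians
        rewards:      (n,)   observed outcomes after those treatments
        propensities: (n,)   estimated probability that the logging policy
                             chose the logged action
        policy:       callable mapping a context to a treatment
        """
        chosen = np.array([policy(x) for x in contexts])
        # Re-weight outcomes of logged decisions that the evaluated policy
        # would have repeated; clip propensities for variance control.
        weights = (chosen == actions) / np.clip(propensities, 1e-3, None)
        return float(np.mean(weights * rewards))

    # Toy usage with synthetic logs (illustration only).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 4))
    A = rng.integers(0, 2, size=500)          # two candidate treatments
    R = (A == (X[:, 0] > 0)).astype(float)    # reward when treatment "fits"
    P = np.full(500, 0.5)                     # uniform logging policy
    print(ips_value(X, A, R, P, lambda x: int(x[0] > 0)))  # near 1.0
    print(ips_value(X, A, R, P, lambda x: 0))              # near 0.5
    ```

    Clipping the propensities bounds the importance weights, trading a small bias for a large reduction in variance, a common practical choice in offline policy evaluation.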

State-of-the-art report on nonlinear representation of sources and channels

    Get PDF
This report consists of two complementary parts, related to the modeling of two important sources of nonlinearities in a communications system. In the first part, an overview of important past work related to the estimation, compression, and processing of sparse data through the use of nonlinear models is provided. In the second part, the current state of the art on the representation of wireless channels in the presence of nonlinearities is summarized. In addition to the characteristics of the nonlinear wireless fading channel, some information is also provided on recent approaches to the sparse representation of such channels.
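    The first part of the report concerns estimating and compressing sparse data. As a minimal illustration of the kind of sparse-representation machinery surveyed there, here is a hedged sketch of orthogonal matching pursuit, a standard greedy recovery algorithm (dictionary and signal below are synthetic placeholders):

    ```python
    import numpy as np

    def omp(D, y, k):
        """Orthogonal matching pursuit: greedily select k atoms of the
        dictionary D (columns, assumed unit-norm) to approximate y."""
        residual, support = y.copy(), []
        for _ in range(k):
            # Pick the atom most correlated with the current residual.
            j = int(np.argmax(np.abs(D.T @ residual)))
            support.append(j)
            # Re-fit coefficients on the selected support (least squares).
            coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
            residual = y - D[:, support] @ coef
        x = np.zeros(D.shape[1])
        x[support] = coef
        return x

    # Toy usage: recover a 3-sparse signal from a random dictionary.
    rng = np.random.default_rng(1)
    D = rng.normal(size=(64, 256))
    D /= np.linalg.norm(D, axis=0)
    x_true = np.zeros(256)
    x_true[[10, 50, 200]] = [1.0, -2.0, 0.5]
    x_hat = omp(D, D @ x_true, k=3)
    print(np.nonzero(x_hat)[0])  # typically recovers {10, 50, 200}
    ```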

    Multi-view machine learning methods to uncover brain-behaviour associations

    Get PDF
The heterogeneity of neurological and mental disorders has been a key confound in disease understanding and treatment outcome prediction, as the study of patient populations typically includes multiple subgroups that do not align with the diagnostic categories. The aim of this thesis is to investigate and extend classical multivariate methods, such as Canonical Correlation Analysis (CCA), and latent variable models, e.g., Group Factor Analysis (GFA), to uncover associations between brain and behaviour that may characterize patient populations and subgroups of patients. In the first contribution of this thesis, I applied CCA to investigate brain-behaviour associations in a sample of healthy and depressed adolescents and young adults, finding two positive-negative brain-behaviour modes of covariation that capture externalisation/internalisation symptoms and well-being/distress. In the second contribution, I applied sparse CCA to the same dataset as a regularised approach to investigating brain-behaviour associations in high-dimensional datasets. Here, I compared two approaches to optimising the regularisation parameters of sparse CCA and showed that the choice of optimisation strategy can have an impact on the results. In the third contribution, I extended the GFA model to mitigate some limitations of CCA, such as handling missing data. I applied the extended GFA model to investigate links between high-dimensional brain imaging and non-imaging data from the Human Connectome Project, and to predict non-imaging measures from brain functional connectivity. The results were consistent between complete and incomplete data, and replicated previously reported findings. In the final contribution of this thesis, I proposed two extensions of GFA to uncover brain-behaviour associations that characterize subgroups of subjects in an unsupervised and a supervised way, as well as to explore within-group variability at the individual level. These extensions were demonstrated on a dataset of patients with genetic frontotemporal dementia. In summary, this thesis presents multi-view methods that can deepen our understanding of the latent dimensions of disease in mental/neurological disorders and potentially enable patient stratification.
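    Since CCA is the starting point for all of the thesis's contributions, a small synthetic sketch may help fix ideas; this uses scikit-learn's CCA on toy "brain" and "behaviour" views sharing one latent mode (all data and dimensions are hypothetical):

    ```python
    import numpy as np
    from sklearn.cross_decomposition import CCA

    # Synthetic "brain" (X) and "behaviour" (Y) views sharing one latent
    # factor, standing in for imaging and clinical measures.
    rng = np.random.default_rng(0)
    z = rng.normal(size=(200, 1))                       # shared latent mode
    X = z @ rng.normal(size=(1, 30)) + 0.5 * rng.normal(size=(200, 30))
    Y = z @ rng.normal(size=(1, 10)) + 0.5 * rng.normal(size=(200, 10))

    # CCA finds weight vectors for each view whose projections
    # (canonical variates) are maximally correlated.
    cca = CCA(n_components=2).fit(X, Y)
    U, V = cca.transform(X, Y)
    for i in range(2):
        r = np.corrcoef(U[:, i], V[:, i])[0, 1]
        print(f"canonical correlation {i + 1}: {r:.2f}")  # mode 1 high, mode 2 low
    ```

    Sparse CCA, used in the second contribution, additionally penalises the weight vectors so that only a subset of variables enters each mode, which is what makes the regularisation-parameter choice discussed above consequential.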

    Robust speech recognition with spectrogram factorisation

    Get PDF
Communication by speech is intrinsic for humans. Since the breakthrough of mobile devices and wireless communication, digital transmission of speech has become ubiquitous, and the distribution and storage of audio and video data have likewise increased rapidly. However, despite being technically capable of recording and processing audio signals, only a fraction of digital systems and services are actually able to work with spoken input, that is, to operate on the lexical content of speech. One persistent obstacle to the practical deployment of automatic speech recognition systems is inadequate robustness against noise and other interferences, which regularly corrupt signals recorded in real-world environments. Speech and diverse noises are both complex signals, which are not trivially separable. Despite decades of research and a multitude of different approaches, the problem has not been solved to a sufficient extent. Especially the mathematically ill-posed problem of separating multiple sources from a single-channel input requires advanced models and algorithms to be solvable. One promising path is to use a composite model of long-context atoms to represent a mixture of non-stationary sources based on their spectro-temporal behaviour. Algorithms derived from the family of non-negative matrix factorisations have been applied to such problems to separate and recognise individual sources such as speech. This thesis describes a set of tools developed for the non-negative modelling of audio spectrograms, especially involving speech and real-world noise sources. An overview is provided of the complete framework, starting from model and feature definitions, advancing to factorisation algorithms, and finally describing different routes for separation, enhancement, and recognition tasks. Current issues and their potential solutions are discussed both theoretically and from a practical point of view. The included publications describe factorisation-based recognition systems, which have been evaluated on publicly available speech corpora to determine the efficiency of various separation and recognition algorithms. Several variants and system combinations that have been proposed in the literature are also discussed. The work covers a broad span of factorisation-based system components, which together aim at providing a practically viable solution to the robust processing and recognition of speech in everyday situations.
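    As a minimal sketch of the spectrogram factorisation underlying such systems: non-negative matrix factorisation with the classic Lee-Seung multiplicative updates for the generalised KL divergence, commonly applied to magnitude spectrograms. The "spectrogram" below is random; a real system would stack speech and noise atoms in W and reconstruct each source from its activations.

    ```python
    import numpy as np

    def nmf_kl(V, rank, n_iter=200, eps=1e-9):
        """Factorise a non-negative (magnitude) spectrogram V ~ W @ H with
        multiplicative updates for the generalised KL divergence. Columns
        of W act as spectral atoms; rows of H are their activations."""
        f, t = V.shape
        rng = np.random.default_rng(0)
        W = rng.random((f, rank)) + eps
        H = rng.random((rank, t)) + eps
        for _ in range(n_iter):
            WH = W @ H + eps
            H *= (W.T @ (V / WH)) / (W.T @ np.ones_like(V) + eps)
            WH = W @ H + eps
            W *= ((V / WH) @ H.T) / (np.ones_like(V) @ H.T + eps)
        return W, H

    # Toy usage: a random "spectrogram" standing in for |STFT| of audio.
    V = np.random.default_rng(1).random((257, 100))
    W, H = nmf_kl(V, rank=16)
    print(np.abs(V - W @ H).mean())  # reconstruction error after the updates
    ```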

    Bayesian Methods in Tensor Analysis

    Full text link
Tensors, also known as multidimensional arrays, are useful data structures in machine learning and statistics. In recent years, Bayesian methods have emerged as a popular direction for analyzing tensor-valued data since they provide a convenient way to introduce sparsity into the model and conduct uncertainty quantification. In this article, we provide an overview of frequentist and Bayesian methods for solving tensor completion and regression problems, with a focus on Bayesian methods. We review common Bayesian tensor approaches including model formulation, prior assignment, posterior computation, and theoretical properties. We also discuss potential future directions in this field.
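    To ground the review's subject matter, here is a hedged sketch of the frequentist CP decomposition fitted by alternating least squares; the Bayesian approaches surveyed typically keep this factorised form but place priors on the factor matrices and infer them with MCMC or variational methods. The tensor and rank below are synthetic.

    ```python
    import numpy as np

    def cp_als(T, rank, n_iter=50):
        """Rank-R CP decomposition of a 3-way tensor by alternating least
        squares: T[i,j,k] ~ sum_r A[i,r] * B[j,r] * C[k,r]."""
        I, J, K = T.shape
        rng = np.random.default_rng(0)
        A, B, C = (rng.normal(size=(d, rank)) for d in (I, J, K))

        def khatri_rao(X, Y):  # column-wise Kronecker product
            return np.einsum("ir,jr->ijr", X, Y).reshape(-1, X.shape[1])

        for _ in range(n_iter):
            # Solve each factor in turn against the matricised tensor.
            A = np.linalg.lstsq(khatri_rao(B, C), T.reshape(I, -1).T, rcond=None)[0].T
            B = np.linalg.lstsq(khatri_rao(A, C), T.transpose(1, 0, 2).reshape(J, -1).T, rcond=None)[0].T
            C = np.linalg.lstsq(khatri_rao(A, B), T.transpose(2, 0, 1).reshape(K, -1).T, rcond=None)[0].T
        return A, B, C

    # Toy usage: recover an exact rank-3 tensor.
    rng = np.random.default_rng(1)
    A0, B0, C0 = (rng.normal(size=(d, 3)) for d in (8, 9, 10))
    T = np.einsum("ir,jr,kr->ijk", A0, B0, C0)
    A, B, C = cp_als(T, rank=3)
    print(np.abs(T - np.einsum("ir,jr,kr->ijk", A, B, C)).max())  # ~0
    ```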

    Latent representation for the characterisation of mental diseases

    Get PDF
Machine learning (ML) techniques are becoming crucial in the field of health and, in particular, in the analysis of mental diseases. These are usually studied with neuroimaging, which is characterised by a large number of input variables compared to the number of samples available. The main objective of this PhD thesis is to propose different ML techniques to analyse mental diseases from neuroimaging data, including extensions of these models that adapt them to the neuroscience scenario. In particular, this thesis focuses on using brain-imaging latent representations, since they endow the problem with a reduced, low-dimensional representation while offering better insight into the internal relations between the disease and the available data. The overall aim is thus to provide interpretable results that are competitive with the state of the art in the analysis of mental diseases. The thesis starts by proposing a model based on classic latent representation formulations, Regularised Bagged Canonical Correlation Analysis (RB-CCA), which relies on a bagging process to obtain the relevance of each brain-imaging voxel. The learnt relevance is combined with a statistical test to obtain a selection of features. Moreover, the proposal obtains a class-wise selection which, in turn, further improves the analysis of the effect of each brain area on the stages of the mental disease. In addition, RB-CCA uses the relevance measure to guide the feature extraction process, penalising the least informative voxels when obtaining the low-dimensional representation. Results obtained on two databases for the characterisation of Alzheimer's disease and Attention Deficit Hyperactivity Disorder show that the model performs as well as or better than the baselines while providing interpretable solutions. Subsequently, the thesis continues with a second model that uses Bayesian approximations to obtain a latent representation. Specifically, this model focuses on providing different functionalities for building a common representation from data sources with different particularities. For this purpose, the proposed generative model, Sparse Semi-supervised Heterogeneous Interbattery Bayesian Factor Analysis (SSHIBA), can learn feature relevance to perform feature selection, as well as automatically select the number of latent factors. In addition, it can model heterogeneous data (real, multi-label, and categorical), work with kernels, and use a semi-supervised formulation, which naturally imputes missing values by sampling from the learnt distributions. Results using this model demonstrate the versatility of the formulation, which allows these extensions to be combined interchangeably, expanding the scenarios in which the model can be applied and improving the interpretability of the results. Finally, the thesis includes a comparison of the proposed models on the Alzheimer's disease dataset, where both provide similar results in terms of performance; however, RB-CCA provides a more robust and more easily interpretable analysis of mental diseases. On the other hand, while RB-CCA is limited to more specific scenarios, the SSHIBA formulation allows a wider variety of data to be combined and is easily adapted to more complex real-life scenarios.
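    As a loose illustration of the bagging idea attributed to RB-CCA above (a hypothetical simplification, not the published model, which also includes regularisation, a statistical test, and class-wise selection): fit CCA on bootstrap resamples and aggregate the absolute weights into a per-voxel relevance score.

    ```python
    import numpy as np
    from sklearn.cross_decomposition import CCA

    def bagged_cca_relevance(X, Y, n_bags=50, seed=0):
        """Crude sketch of a bagging step: fit CCA on bootstrap resamples
        and average the absolute first-component weights of the X view as
        a per-feature (per-voxel) relevance score."""
        rng = np.random.default_rng(seed)
        n = X.shape[0]
        relevance = np.zeros(X.shape[1])
        for _ in range(n_bags):
            idx = rng.integers(0, n, size=n)            # bootstrap resample
            cca = CCA(n_components=1).fit(X[idx], Y[idx])
            relevance += np.abs(cca.x_weights_[:, 0])
        return relevance / n_bags

    # Toy usage: only the first 5 "voxels" carry the shared signal.
    rng = np.random.default_rng(2)
    z = rng.normal(size=(300, 1))
    X = rng.normal(size=(300, 20))
    X[:, :5] += z
    Y = z + 0.1 * rng.normal(size=(300, 1))
    print(np.round(bagged_cca_relevance(X, Y), 2))  # first 5 scores dominate
    ```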

Sparse Predictive Modeling: A Cost-Effective Perspective

    Get PDF
Many real-life problems encountered in industry, economics, or engineering are complex and difficult to model by conventional mathematical methods. Machine learning provides a wide variety of methods and tools for solving such problems by learning mathematical models from data. Methods from the field have found their way into applications such as medical diagnosis, financial forecasting, and web-search engines. The predictions made by a learned model are based on a vector of feature values describing the input to the model. However, predictions do not come for free in real-world applications, since the feature values of the input have to be bought, measured, or produced before the model can be used. Feature selection is the process of eliminating irrelevant and redundant features from the model. Traditionally, it has been applied to achieve interpretable and more accurate models, while the possibility of lowering prediction costs has received much less attention in the literature. In this thesis we consider novel feature selection techniques for reducing prediction costs. The contributions of this thesis are as follows. First, we propose several cost types characterizing the cost of performing prediction with a trained model. In particular, we consider costs emerging from multi-target prediction problems as well as a number of cost types arising when the feature extraction process is structured. Second, we develop greedy regularized least-squares methods to maximize the predictive performance of the models under given budget constraints. Empirical evaluations are performed on numerous benchmark data sets as well as on a novel water quality analysis application. The results demonstrate that in settings where the considered cost types apply, the proposed methods lead to substantial cost savings compared to conventional methods.
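    As a hedged sketch of the cost-aware selection problem described above (a simplified stand-in using ridge regression and validation error, not the thesis's exact greedy regularized least-squares algorithms): greedily add the affordable feature that most reduces error until the budget is exhausted.

    ```python
    import numpy as np
    from sklearn.linear_model import Ridge

    def greedy_budgeted_selection(Xtr, ytr, Xval, yval, costs, budget, alpha=1.0):
        """Greedy forward selection under a prediction-cost budget: at each
        step add the affordable feature that most reduces the validation
        error of a regularized least-squares (ridge) model."""
        selected, spent = [], 0.0
        best_err = np.mean((yval - ytr.mean()) ** 2)   # constant predictor
        improved = True
        while improved:
            improved, best_j = False, None
            for j in range(Xtr.shape[1]):
                if j in selected or spent + costs[j] > budget:
                    continue                            # too expensive or taken
                cols = selected + [j]
                model = Ridge(alpha=alpha).fit(Xtr[:, cols], ytr)
                err = np.mean((yval - model.predict(Xval[:, cols])) ** 2)
                if err < best_err:
                    best_err, best_j, improved = err, j, True
            if improved:
                selected.append(best_j)
                spent += costs[best_j]
        return selected, spent, best_err

    # Toy usage: features 0-2 are informative, but feature 2 is expensive.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(400, 10))
    y = X[:, 0] + X[:, 1] + 2 * X[:, 2] + 0.1 * rng.normal(size=400)
    costs = np.ones(10)
    costs[2] = 5.0
    print(greedy_budgeted_selection(X[:300], y[:300], X[300:], y[300:],
                                    costs, budget=3.0))
    ```

    With the budget set below the cost of the expensive feature, the procedure settles for the cheaper informative features, which is exactly the accuracy-versus-cost trade-off the thesis studies.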