
    Distribution Regression: Theory and Application

    In this dissertation we discuss the problem of distribution regression, that is, the problem of using distributional covariates to predict scalar outcomes. We first present an application in neuroimaging that relates functional connectivity measurements, viewed as statistical distributions, to outcomes. We consider 47 primary progressive aphasia (PPA) patients with various levels of language ability. These patients were randomly assigned to two treatment arms, tDCS (transcranial direct-current stimulation plus language therapy) vs. sham (language therapy only), in a clinical trial. We analyze the effect of direct stimulation on functional connectivity by treating connectivity measures as samples from individual distributions. Specifically, we estimate the density of correlations among the regions of interest (ROIs) and study the difference in the post-intervention density between treatment arms. This distributional approach drastically reduces the number of multiple comparisons relative to classic edge-wise analysis. In addition, it allows investigation of the impact of functional connectivity on outcomes when the connectivity is not geometrically localized. We next propose and study the theoretical properties of a related functional expectation model, for which we show that optimal information rate bounds can be achieved by distributional Gaussian process regression without estimating any individual densities. The model admits closed-form posterior inference via a Gaussian process prior on the regression function. We also propose a low-rank approximation method to accelerate inference in real applications. In the next chapter, we include a less closely related work that reviews state-of-the-art algorithms for accelerating the convergence of fixed-point iterations. Fixed-point iteration algorithms have a wide range of applications in statistics and data science. We propose a modified restarted Nesterov accelerated gradient algorithm that can also be used for black-box acceleration of general fixed-point iteration problems and show that it works well in practice across six different tasks.
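
    The black-box acceleration of fixed-point iterations mentioned above can be illustrated with a short sketch: apply Nesterov-style momentum to the iterates and restart the momentum whenever the fixed-point residual grows. This is a generic simplification for illustration only, not the dissertation's exact restart scheme; the function name and the toy problem x = cos(x) are assumptions.

```python
import numpy as np

def accelerated_fixed_point(G, x0, max_iter=500, tol=1e-10):
    """Black-box acceleration of x_{k+1} = G(x_k) with Nesterov-style momentum
    and an adaptive restart on the fixed-point residual (illustrative sketch)."""
    x_prev = np.asarray(x0, dtype=float)
    y = x_prev.copy()
    k = 0
    res_prev = np.inf
    for _ in range(max_iter):
        x = G(y)                      # plain fixed-point step from the extrapolated point
        res = np.linalg.norm(x - y)   # fixed-point residual
        if res < tol:
            return x
        if res > res_prev:            # residual grew: restart the momentum
            k = 0
            y = x
        else:                         # Nesterov-style extrapolation
            k += 1
            y = x + (k / (k + 3.0)) * (x - x_prev)
        x_prev, res_prev = x, res
    return x_prev

# Example: solve x = cos(x) (fixed point near 0.739) faster than plain iteration.
root = accelerated_fixed_point(np.cos, x0=np.array([0.0]))
```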

    A Unified Framework for Gradient-based Hyperparameter Optimization and Meta-learning

    Machine learning algorithms and systems are progressively becoming part of our societies, leading to a growing need to build a multitude of accurate, reliable, and interpretable models that can exploit similarities among tasks. Automating parts of machine learning itself is a natural step toward delivering increasingly capable systems that perform well in both the big-data and the few-shot learning regimes. Hyperparameter optimization (HPO) and meta-learning (MTL) constitute two building blocks of this growing effort. We explore these two topics under a unifying perspective, presenting a mathematical framework linked to bilevel programming that captures existing similarities and translates into procedures of practical interest rooted in algorithmic differentiation. We discuss the derivation, applicability, and computational complexity of these methods and establish several approximation properties for a class of objective functions of the underlying bilevel programs. In HPO, these algorithms generalize and extend previous work on gradient-based methods. In MTL, the resulting framework subsumes classic and emerging strategies and provides a basis from which to build and analyze novel techniques. A series of examples and numerical simulations offer insight and highlight some limitations of these approaches. Experiments on larger-scale problems show the potential gains of the proposed methods in real-world applications. Finally, we develop two extensions of the basic algorithms: one to optimize a class of discrete hyperparameters (graph edges) in an application to relational learning, and one to tune online learning rate schedules for training neural network models, an old but crucially important issue in machine learning.
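
    A minimal sketch of the bilevel structure behind gradient-based HPO: for a ridge-regression inner problem, the hypergradient of a validation loss with respect to the regularization strength has a closed form via the implicit function theorem. The toy data, variable names, and the choice of ridge regression are illustrative assumptions; the methods in the thesis handle far more general inner problems through algorithmic differentiation.

```python
import numpy as np

rng = np.random.default_rng(0)
X_tr, y_tr = rng.normal(size=(50, 5)), rng.normal(size=50)    # toy training split
X_val, y_val = rng.normal(size=(30, 5)), rng.normal(size=30)  # toy validation split

def hypergradient(lam):
    """Gradient of the validation loss w.r.t. the ridge penalty lam,
    obtained by implicitly differentiating the inner ridge solution."""
    A = X_tr.T @ X_tr + lam * np.eye(X_tr.shape[1])
    w = np.linalg.solve(A, X_tr.T @ y_tr)      # inner solution w*(lam)
    dw_dlam = -np.linalg.solve(A, w)           # implicit function theorem: dA/dlam = I
    resid = X_val @ w - y_val
    return resid @ (X_val @ dw_dlam)           # chain rule through the outer (validation) loss

# Hyperparameter descent on lam, projected to stay positive.
lam = 1.0
for _ in range(100):
    lam = max(1e-6, lam - 0.05 * hypergradient(lam))
```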

    Advances in electron microscopy with deep learning

    Following decades of exponential increases in computational capability and widespread data availability, deep learning is readily enabling new science and technology. This thesis starts with a review of deep learning in electron microscopy, which offers a practical perspective aimed at developers with limited familiarity. To help electron microscopists get started with deep learning, large new electron microscopy datasets are introduced for machine learning. Further, new approaches to variational autoencoding are introduced to embed datasets in low-dimensional latent spaces, which are used as the basis of electron microscopy search engines. Encodings are also used to investigate electron microscopy data visualization by t-distributed stochastic neighbour embedding. Neural networks that process large electron microscopy images may need to be trained with small batch sizes to fit into computer memory. Consequently, adaptive learning rate clipping is introduced to prevent learning from being destabilized by loss spikes associated with small batch sizes. This thesis presents three applications of deep learning to electron microscopy. Firstly, electron beam exposure can damage some specimens, so generative adversarial networks were developed to complete realistic images from sparse spiral, grid-like, and uniformly spaced scans. Further, recurrent neural networks were trained by reinforcement learning to dynamically adapt sparse scans to specimens. Sparse scans can decrease electron beam exposure and scan time by 10-100× with minimal information loss. Secondly, a large encoder-decoder network was developed to improve the signal-to-noise ratio of transmission electron micrographs. Thirdly, conditional generative adversarial networks were developed to recover exit wavefunction phases from single images. Phase recovery with deep learning overcomes existing limitations, as it is suitable for live applications and does not require microscope modification. To encourage further investigation, the scientific publications and their source files, source code, pretrained models, datasets, and other research outputs covered by this thesis are openly accessible.
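
    The idea of guarding small-batch training against loss spikes can be sketched in a few lines: losses far above a running estimate of their typical magnitude are capped before they drive an update. This is a generic simplification with arbitrary constants, not the adaptive learning rate clipping algorithm developed in the thesis.

```python
class LossSpikeClipper:
    """Cap losses that exceed a multiple of their running mean, so a single
    spike cannot destabilize small-batch training. Illustrative sketch only;
    the threshold and decay values are arbitrary choices."""
    def __init__(self, threshold=3.0, decay=0.99):
        self.threshold = threshold
        self.decay = decay
        self.running_mean = None

    def __call__(self, loss):
        if self.running_mean is None:
            self.running_mean = loss
        limit = self.threshold * self.running_mean
        clipped = min(loss, limit)   # value actually used for the update
        self.running_mean = self.decay * self.running_mean + (1 - self.decay) * clipped
        return clipped

clipper = LossSpikeClipper()
losses = [1.0, 0.9, 1.1, 25.0, 1.0]          # a spike at the fourth step
clipped_losses = [clipper(l) for l in losses]
```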

    Detection of COVID-19 in X-Ray images using Neural Networks

    The COVID-19 pandemic is a very pressing issue that continues to affect the lives of people around the globe. To combat and overcome the disease, it is necessary for infected patients to be quickly identified and isolated to prevent the virus from spreading. The traditional detection techniques based on molecular diagnosis, such as RT-PCR, are expensive, time-consuming, and their reliability has been shown to fluctuate. In this thesis, we research the detection of COVID-19 in chest X-ray images using convolutional neural networks. We use our findings to implement a prototype that performs binary detection of the disease, evaluate its performance on a collection of open data repositories available online, and compare its results to existing models. Our proposed lightweight architecture, called BaseNet, achieves an accuracy of 95.50 % on the chosen test set, with a COVID-19 sensitivity of 93.00 %. We further assemble an ensemble of BaseNet along with several other fine-tuned architectures, whose combined classification accuracy is 99.25 % with a measured sensitivity of 98.50 %.
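
    As a rough illustration of the kind of model described, a minimal Keras sketch of a small binary chest X-ray classifier is shown below. The layer widths, depth, and input shape are assumptions made for illustration and do not reproduce the BaseNet architecture or the ensemble.

```python
import tensorflow as tf

def build_small_cnn(input_shape=(224, 224, 1)):
    """A small convolutional binary classifier; layer sizes are illustrative
    assumptions, not the BaseNet described in the thesis."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # COVID-19 vs. non-COVID-19
    ])

model = build_small_cnn()
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy", tf.keras.metrics.Recall(name="sensitivity")])
# model.fit(train_ds, validation_data=val_ds, epochs=20)  # datasets assumed available
```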

    Deep Learning in Medical Image Analysis

    The accelerating power of deep learning in diagnosing diseases will empower physicians and speed up decision-making in clinical environments. Applications of modern medical instruments and the digitalization of medical care have generated enormous amounts of medical images in recent years. In this big-data arena, new deep learning methods and computational models for efficient data processing, analysis, and modeling of the generated data are crucially important for clinical applications and for understanding the underlying biological processes. This book presents and highlights novel algorithms, architectures, techniques, and applications of deep learning for medical image analysis.

    Predicting Financial Markets using Text on the Web


    Statistical Machine Learning Methodology for Individualized Treatment Rule Estimation in Precision Medicine

    Precision medicine aims to deliver optimal, individualized treatments for patients by accounting for their unique characteristics. With a foundation in reinforcement learning, decision theory, and causal inference, the field of precision medicine has seen many advancements in recent years. Significant focus has been placed on creating algorithms to estimate individualized treatment rules (ITRs), which map from patient covariates to the space of available treatments with the goal of maximizing patient outcomes. In Chapter 1, we extend ITR estimation methodology to the scenario where the variance of the outcome is heterogeneous with respect to treatment and covariates. Accordingly, we propose Stabilized Direct Learning (SD-Learning), which utilizes heteroscedasticity in the error term through a residual reweighting framework that models residual variance via flexible machine learning algorithms such as XGBoost and random forests. We also develop an internal cross-validation scheme which determines the best residual model among competing models, and we extend this methodology to multi-arm treatment scenarios. In Chapter 2, we develop ITR estimation methodology for situations where clinical decision-making involves balancing multiple outcomes of interest. Our proposed framework estimates an ITR which maximizes a combination of the multiple clinical outcomes, accounting for the fact that patients may ascribe importance to outcomes differently (utility heterogeneity). This approach employs inverse reinforcement learning (IRL) techniques through an expert-augmentation solution, whereby physicians provide input to guide the utility estimation and ITR learning processes. In Chapter 3, we apply an end-to-end precision medicine workflow to novel data from older adults with Type 1 Diabetes in order to understand the heterogeneous treatment effects of continuous glucose monitoring (CGM) and to develop an interpretable ITR that reveals the patients for whom CGM confers a major safety benefit. The results of this analysis elucidate the demographic and clinical markers that moderate CGM's success, provide a basis for using diagnostic CGM to inform therapeutic CGM decisions, and serve to augment clinical decision-making. Finally, in Chapter 4, as a future research direction, we propose a deep autoencoder framework which simultaneously performs feature selection and ITR optimization, contributing methodology built for direct consumption of unstructured, high-dimensional data in the precision medicine pipeline.
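
    For readers unfamiliar with ITRs, a minimal sketch of the basic regression-based route is shown below: fit an outcome model per treatment arm and recommend the arm with the better predicted outcome. This illustrates only the general ITR concept on simulated data; it is not SD-Learning, its residual-reweighting scheme, or the multi-outcome and CGM analyses described above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))                       # patient covariates (simulated)
A = rng.integers(0, 2, size=500)                    # binary treatment assignment
Y = X[:, 0] * (2 * A - 1) + rng.normal(size=500)    # outcome with a treatment interaction

# Fit one outcome model per arm (a simple regression-based ITR, not SD-Learning).
models = {a: RandomForestRegressor(n_estimators=200, random_state=0).fit(X[A == a], Y[A == a])
          for a in (0, 1)}

def itr(x_new):
    """Recommend the arm with the larger predicted outcome (larger is better)."""
    preds = np.column_stack([models[a].predict(x_new) for a in (0, 1)])
    return preds.argmax(axis=1)

recommended = itr(X[:5])
```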

    Irish Machine Vision and Image Processing Conference Proceedings 2017


    Applications of Deep Neural Networks

    Deep learning is a group of exciting new technologies for neural networks. Through a combination of advanced training techniques and neural network architectural components, it is now possible to create neural networks that can handle tabular data, images, text, and audio as both input and output. Deep learning allows a neural network to learn hierarchies of information in a way that resembles the function of the human brain. This course introduces the student to classic neural network structures, Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), Generative Adversarial Networks (GAN), and reinforcement learning. Applications of these architectures to computer vision, time series, security, natural language processing (NLP), and data generation are covered. High-Performance Computing (HPC) aspects demonstrate how deep learning can be leveraged both on graphics processing units (GPUs) and on grids. The focus is primarily on the application of deep learning to problems, with some introduction to mathematical foundations. Readers will use the Python programming language to implement deep learning using Google TensorFlow and Keras. It is not necessary to know Python prior to this book; however, familiarity with at least one programming language is assumed.
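
    Since the material is implemented in Python with TensorFlow and Keras, a minimal sketch of one of the listed architectures, an LSTM applied to a toy time-series regression task, indicates the style of model covered. The shapes, layer sizes, and task are illustrative assumptions rather than examples taken from the course.

```python
import numpy as np
import tensorflow as tf

# Toy sequence-regression task: predict the mean of a 10-step univariate series.
X = np.random.rand(256, 10, 1).astype("float32")
y = X.mean(axis=(1, 2))

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10, 1)),
    tf.keras.layers.LSTM(16),          # one of the recurrent architectures covered
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
```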