
    Uncertainty-Aware Attention for Reliable Interpretation and Prediction

    Department of Computer Science and Engineering

    Attention mechanisms are effective both at focusing deep learning models on relevant features and at interpreting them. However, attentions may be unreliable, since the networks that generate them are often trained in a weakly supervised manner. To overcome this limitation, we introduce the notion of input-dependent uncertainty to the attention mechanism, such that it generates attention for each feature with varying degrees of noise based on the given input, learning larger variance on instances it is uncertain about. We learn this Uncertainty-aware Attention (UA) mechanism using variational inference, and validate it on various risk prediction tasks from electronic health records, on which our model significantly outperforms existing attention models. The analysis of the learned attentions shows that our model generates attentions that comply with clinicians' interpretation and provides richer interpretation via the learned variance. Further evaluation of both the accuracy of the uncertainty calibration and the prediction performance with an "I don't know" decision shows that UA yields networks with high reliability as well.
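The idea of attention with input-dependent noise, combined with an "I don't know" decision, can be illustrated with a minimal NumPy sketch. This is not the authors' architecture or training procedure: the linear maps `w_mu`, `w_sigma`, and `w_out`, the sigmoid attention, the threshold `max_std`, and both function names are hypothetical choices made for illustration. The variational training step is omitted entirely, so the weights here are simply given.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softplus(x):
    # Numerically stable log(1 + exp(x)); keeps the standard deviation positive.
    return np.logaddexp(0.0, x)


def ua_attention_samples(features, w_mu, w_sigma, n_samples, rng):
    """Sample attention weights whose noise level depends on the input.

    features: (n_features, dim) array, one row per input feature.
    w_mu, w_sigma: (dim,) hypothetical weight vectors for the attention
    mean and for its input-dependent standard deviation.
    Returns (attention samples of shape (n_samples, n_features), sigma).
    """
    mu = features @ w_mu                       # mean attention logit per feature
    sigma = softplus(features @ w_sigma)       # input-dependent std per feature
    eps = rng.standard_normal((n_samples, mu.shape[0]))
    logits = mu + sigma * eps                  # reparameterized Gaussian samples
    return sigmoid(logits), sigma


def predict_with_abstention(features, w_mu, w_sigma, w_out,
                            n_samples=200, max_std=0.1, seed=0):
    """Monte Carlo prediction that abstains when the samples disagree."""
    rng = np.random.default_rng(seed)
    attn, _ = ua_attention_samples(features, w_mu, w_sigma, n_samples, rng)
    context = attn @ features                  # (n_samples, dim) weighted sums
    probs = sigmoid(context @ w_out)           # one risk estimate per sample
    mean, std = float(probs.mean()), float(probs.std())
    decision = "I don't know" if std > max_std else int(mean >= 0.5)
    return mean, std, decision
```

Calling `predict_with_abstention` on a feature matrix returns the mean predicted risk, its spread across attention samples, and either a 0/1 label or the string "I don't know". The spread-based threshold is only one possible abstention rule, chosen here for brevity.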

    Learning deep patient representations for the teleICU

    By Ini Oguntola. Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 89-93). This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.

    This thesis presents a method of extracting deep, robust representations of teleICU clinical data using Transformer networks, inspired by recent machine learning literature in language modeling. The utility of these representations is evaluated on various outcome prediction tasks, in which they outperformed linear and neural baselines. Also examined are the probability distributions of various patient characteristics across the learned patient representation space, where corresponding high-level spatial structure suggests potential for use as a similarity metric, alone or in combination with other patient similarity metrics. Finally, the code for the models developed is publicly provided as a starting point for further research.