95 research outputs found

A Coreset-based, Tempered Variational Posterior for Accurate and Scalable Stochastic Gaussian Process Inference

Author: Elhadad Noémie
Ketenci Mert
Perotte Adler
Urteaga Iñigo
Publication date: 02/11/2023

We present a novel stochastic variational Gaussian process ($\mathcal{GP}$) inference method, based on a posterior over a learnable set of weighted pseudo input-output points (coresets). Instead of a free-form variational family, the proposed coreset-based, variational tempered family for $\mathcal{GP}$s (CVTGP) is defined in terms of the $\mathcal{GP}$ prior and the data-likelihood; hence, accommodating the modeling inductive biases. We derive CVTGP's lower bound for the log-marginal likelihood via marginalization of the proposed posterior over latent $\mathcal{GP}$ coreset variables, and show it is amenable to stochastic optimization. CVTGP reduces the learnable parameter size to $\mathcal{O}(M)$, enjoys numerical stability, and maintains $\mathcal{O}(M^3)$ time- and $\mathcal{O}(M^2)$ space-complexity, by leveraging a coreset-based tempered posterior that, in turn, provides sparse and explainable representations of the data. Results on simulated and real-world regression problems with Gaussian observation noise validate that CVTGP provides better evidence lower-bound estimates and predictive root mean squared error than alternative stochastic $\mathcal{GP}$ inference methods.

arXiv.org e-Print Archive

Characterizing non-heroin opioid overdoses using electronic health records.

Author: Averitt Amelia J
Perotte Adler J
Slovis B. H.
Tariq Abdul A
Vawdrey David K
Publication venue: Jefferson Digital Commons
Publication date: 26/11/2019

Introduction: The opioid epidemic is a modern public health emergency. Common interventions to alleviate the opioid epidemic aim to discourage excessive prescription of opioids. However, these methods often take place over large municipal areas (state-level) and may fail to address the diversity that exists within each opioid case (individual-level). An intervention to combat the opioid epidemic that takes place at the individual-level would be preferable. Methods: This research leverages computational tools and methods to characterize the opioid epidemic at the individual-level using the electronic health record data from a large, academic medical center. To better understand the characteristics of patients with opioid use disorder (OUD) we leveraged a self-controlled analysis to compare the healthcare encounters before and after an individual's first overdose event recorded within the data. We further contrast these patients with matched, non-OUD controls to demonstrate the unique qualities of the OUD cohort. Results: Our research confirms that the rate of opioid overdoses in our hospital significantly increased between 2006 and 2015 (P < 0.001), at an average rate of 9% per year. We further found that the period just prior to the first overdose is marked by conditions of pain or malignancy, which may suggest that overdose stems from pharmaceutical opioids prescribed for these conditions. Conclusions: Informatics-based methodologies, like those presented here, may play a role in better understanding those individuals who suffer from opioid dependency and overdose, and may lead to future research and interventions that could successfully prevent morbidity and mortality associated with this epidemic.

Jefferson Digital Commons

Who are the Users of Speed Regulation Assistance? Comparing Driver Characteristics of Casual and Intensive System Users

Author: A. Perotte (578533)
D. J. Albers (293830)
E. Tabak (578532)
George Hripcsak (242191)
Noémie Elhadad (522968)
Publication venue: Iowa Research Online
Publication date: 18/06/2013

Speed regulation assistance can contribute to road safety provided that drivers use the systems on a regular basis. With the objective to gain knowledge about drivers who use Cruise Control and the Speed Limiter, a comparison of the characteristics of casual and intensive users was performed with survey data. The results show that gender and annual mileage play a role for the usage frequency of Cruise Control, whereas the usage frequency of the Speed Limiter depends on age. Consistent effects of the car use for business matters and the use of other in-vehicle technologies were found on the usage frequency of both systems. The predominant motive to reduce speeding found for both systems corresponds with the objective of speed regulation assistance as a safety measure. It was complemented with a comfort benefit perceived by Cruise Control users.

Iowa Research Online

FigShare

Preserving Differential Privacy in Convolutional Deep Belief Networks

Author: A Armato
A Bandura
A Gottlieb
A Ortiz
A Perotte
C Dwork
Dejing Dou
E Choi
EW Jamoom
G Arfken
G Hinton
GE Hinton
H Li
HY Xiong
J Ma
J Wu
J Zhang
M Hay
M Helmstaedter
M Roumia
M Vlcek
MKK Leung
N Phan
NhatHai Phan
R Fang
R Miotto
S Bach
S Hochreiter
SM Plis
T Lee
TJ Rivlin
W Rudin
Xintao Wu
Y Bengio
Y LeCun
Y Lecun
Publication date: 01/01/2017

The remarkable development of deep learning in the medicine and healthcare domain presents obvious privacy issues, when deep neural networks are built on users' personal and highly sensitive data, e.g., clinical records, user profiles, biomedical images, etc. However, only a few scientific studies on preserving privacy in deep learning have been conducted. In this paper, we focus on developing a private convolutional deep belief network (pCDBN), which essentially is a convolutional deep belief network (CDBN) under differential privacy. Our main idea of enforcing epsilon-differential privacy is to leverage the functional mechanism to perturb the energy-based objective functions of traditional CDBNs, rather than their results. One key contribution of this work is that we propose the use of Chebyshev expansion to derive the approximate polynomial representation of objective functions. Our theoretical analysis shows that we can further derive the sensitivity and error bounds of the approximate polynomial representation. As a result, preserving differential privacy in CDBNs is feasible. We applied our model in a health social network, i.e., YesiWell data, and in a handwriting digit dataset, i.e., MNIST data, for human behavior prediction, human behavior classification, and handwriting digit recognition tasks. Theoretical analysis and rigorous experimental evaluations show that the pCDBN is highly effective. It significantly outperforms existing solutions.

arXiv.org e-Print Archive

ScholarWorks@UARK

Crossref

UARK (University of Arkansas)

Predicting Multiple ICD-10 Codes from Brazilian-Portuguese Clinical Notes

Author: A Perotte
AEW Johnson
F Duarte
G Salton
J Huang
M Li
M Oleynik
M Subotin
P Bojanowski
PB Jensen
SVS Pakhomov
Publication venue: Springer Science and Business Media LLC
Publication date: 29/07/2020

ICD coding from electronic clinical records is a manual, time-consuming and expensive process. Code assignment is, however, an important task for billing purposes and database organization. While many works have studied the problem of automated ICD coding from free text using machine learning techniques, most use records in the English language, especially from the MIMIC-III public dataset. This work presents results for a dataset with Brazilian Portuguese clinical notes. We develop and optimize a Logistic Regression model, a Convolutional Neural Network (CNN), a Gated Recurrent Unit Neural Network and a CNN with Attention (CNN-Att) for prediction of diagnosis ICD codes. We also report our results for the MIMIC-III dataset, which outperform previous work among models of the same families, as well as the state of the art. Compared to MIMIC-III, the Brazilian Portuguese dataset contains far fewer words per document, when only discharge summaries are used. We experiment concatenating additional documents available in this dataset, achieving a great boost in performance. The CNN-Att model achieves the best results on both datasets, with micro-averaged F1 score of 0.537 on MIMIC-III and 0.485 on our dataset with additional documents. Comment: Accepted at BRACIS 202

arXiv.org e-Print Archive

Crossref