Beyond Volume: The Impact of Complex Healthcare Data on the Machine Learning Pipeline
From medical charts to national census, healthcare has traditionally operated
under a paper-based paradigm. However, the past decade has marked a long and
arduous transformation bringing healthcare into the digital age. From
electronic health records, to digitized imaging and laboratory reports, to
public health datasets, healthcare now generates an enormous amount of
digital information. Such a wealth of data presents an exciting opportunity for
integrated machine learning solutions to address problems across multiple
facets of healthcare practice and administration. Unfortunately, the ability to
derive accurate and informative insights requires more than the ability to
execute machine learning models. Rather, a deeper understanding of the data on
which the models are run is imperative for their success. While a significant
effort has been undertaken to develop models able to process the volume of data
obtained during the analysis of millions of digitized patient records, it is
important to remember that volume represents only one aspect of the data. In
fact, drawing on data from an increasingly diverse set of sources, healthcare
data presents an incredibly complex set of attributes that must be accounted
for throughout the machine learning pipeline. This chapter focuses on
highlighting such challenges, and is broken down into three distinct
components, each representing a phase of the pipeline. We begin with attributes
of the data accounted for during preprocessing, then move to considerations
during model building, and end with challenges to the interpretation of model
output. For each component, we present a discussion around data as it relates
to the healthcare domain and offer insight into the challenges each may impose
on the efficiency of machine learning techniques.
Comment: Healthcare Informatics, Machine Learning, Knowledge Discovery: 20
Pages, 1 Figure
Electron nuclear double resonance study of photostimulated luminescence active centers in CsBr:Eu2+ medical imaging plates
CsBr:Eu2+ needle image plates exhibit an electron paramagnetic resonance (EPR) spectrum at room temperature (RT) whose intensity is correlated with the photostimulated luminescence sensitivity of the plate. This EPR spectrum shows a strong temperature dependence: at RT it is due to a single Eu2+ (S = 7/2) center with axial symmetry, whereas at T < 35 K the spectra can only be explained by assuming that two distinct centers are present, a minority axial center and a majority center with nearly extreme rhombic symmetry. In this paper these low-temperature centers are studied with electron nuclear double resonance (ENDOR) spectroscopy, which reveals the presence of H-1 nuclei close to the central Eu2+ ions in the centers. Analysis of the angular dependence of the ENDOR spectra allows models to be proposed for these centers, providing an explanation for the observed difference in intensity between the spectral components and for their temperature dependence.
Learning Tasks for Multitask Learning: Heterogenous Patient Populations in the ICU
Machine learning approaches have been effective in predicting adverse
outcomes in different clinical settings. These models are often developed and
evaluated on datasets with heterogeneous patient populations. However, good
predictive performance on the aggregate population does not imply good
performance for specific groups.
In this work, we present a two-step framework to 1) learn relevant patient
subgroups, and 2) predict an outcome for separate patient populations in a
multi-task framework, where each population is a separate task. We demonstrate
how to discover relevant groups in an unsupervised way with a
sequence-to-sequence autoencoder. We show that using these groups in a
multi-task framework leads to better predictive performance for in-hospital
mortality both across groups and overall. We also highlight the need for more
granular evaluation of performance when dealing with heterogeneous populations.
Comment: KDD 201
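The abstract's point that good aggregate performance does not imply good performance for specific groups can be illustrated with a small sketch. This is not the authors' code or evaluation protocol: the function names `auroc` and `auroc_by_group`, the choice of AUROC as the metric, and the toy labels, scores, and group assignments are all assumptions made up for illustration.

```python
from collections import defaultdict


def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random
    negative (ties count as 0.5). Returns None if a class is missing."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def auroc_by_group(labels, scores, groups):
    """Compute AUROC separately for each subgroup label in `groups`."""
    buckets = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: auroc(ys, ss) for g, (ys, ss) in buckets.items()}


# Hypothetical toy data: the model ranks group "A" perfectly
# and group "B" exactly backwards.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.1, 0.8, 0.2, 0.3, 0.6, 0.4, 0.7]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall = auroc(labels, scores)
per_group = auroc_by_group(labels, scores, groups)
print(overall)    # 0.75
print(per_group)  # {'A': 1.0, 'B': 0.0}
```

On this toy data the aggregate AUROC of 0.75 looks acceptable, while the per-group breakdown shows that one subgroup is ranked entirely wrong, which is the kind of gap a granular evaluation is meant to expose.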