Learning the Probability of Activation in the Presence of Latent Spreaders
When an infection spreads in a community, an individual's probability of
becoming infected depends on both her susceptibility and exposure to the
contagion through contact with others. While one often has knowledge regarding
an individual's susceptibility, in many cases, whether or not an individual's
contacts are contagious is unknown. We study the problem of predicting if an
individual will adopt a contagion in the presence of multiple modes of
infection (exposure/susceptibility) and latent neighbor influence. We present a
generative probabilistic model and a variational inference method to learn the
parameters of our model. Through a series of experiments on synthetic data, we
measure the ability of the proposed model to identify latent spreaders, and
predict the risk of infection. Applied to a real dataset of 20,000 hospital
patients, we demonstrate the utility of our model in predicting the onset of a
healthcare associated infection using patient room-sharing and nurse-sharing
networks. Our model outperforms existing benchmarks and provides actionable
insights for the design and implementation of targeted interventions to curb
the spread of infection. Comment: To appear in AAA1-1
Fairness and robustness in anti-causal prediction
Robustness to distribution shift and fairness have independently emerged as
two important desiderata required of modern machine learning models. While
these two desiderata seem related, the connection between them is often unclear
in practice. Here, we discuss these connections through a causal lens, focusing
on anti-causal prediction tasks, where the input to a classifier (e.g., an
image) is assumed to be generated as a function of the target label and the
protected attribute. By taking this perspective, we draw explicit connections
between a common fairness criterion - separation - and a common notion of
robustness - risk invariance. These connections provide new motivation for
applying the separation criterion in anti-causal settings, and inform old
discussions regarding fairness-performance tradeoffs. In addition, our findings
suggest that robustness-motivated approaches can be used to enforce separation,
and that they often work better in practice than methods designed to directly
enforce separation. Using a medical dataset, we empirically validate our
findings on the task of detecting pneumonia from X-rays, in a setting where
differences in prevalence across sex groups motivate a fairness mitigation.
Our findings highlight the importance of considering causal structure when
choosing and enforcing fairness criteria.
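For a binary classifier, the separation criterion named in the abstract above amounts to equal true-positive and false-positive rates across groups. The following is a minimal sketch of how the gap could be measured, not the authors' code; the function name `separation_gaps` and the toy data are made up for illustration, and every group is assumed to contain both labels.

```python
from collections import defaultdict

def separation_gaps(y_true, y_pred, group):
    """Return (TPR gap, FPR gap) across groups for binary labels and predictions.

    Hypothetical helper: assumes every group has at least one positive
    and one negative example, otherwise a division by zero occurs.
    """
    # counts[g][label] = [number of examples, number predicted positive]
    counts = defaultdict(lambda: {0: [0, 0], 1: [0, 0]})
    for y, p, g in zip(y_true, y_pred, group):
        counts[g][y][0] += 1
        counts[g][y][1] += p
    tpr = [c[1][1] / c[1][0] for c in counts.values()]
    fpr = [c[0][1] / c[0][0] for c in counts.values()]
    return max(tpr) - min(tpr), max(fpr) - min(fpr)

# Toy, made-up data: two groups of four examples each.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]

tpr_gap, fpr_gap = separation_gaps(y_true, y_pred, group)
print(tpr_gap, fpr_gap)
```

Both gaps are zero when separation holds exactly; larger values indicate that the predictions depend on group membership even after conditioning on the label.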
Estimation of Bounds on Potential Outcomes For Decision Making
Estimation of individual treatment effects is commonly used as the basis for
contextual decision making in fields such as healthcare, education, and
economics. However, it is often sufficient for the decision maker to have
estimates of upper and lower bounds on the potential outcomes of decision
alternatives to assess risks and benefits. We show that, in such cases, we can
improve sample efficiency by estimating simple functions that bound these
outcomes instead of estimating their conditional expectations, which may be
complex and hard to estimate. Our analysis highlights a trade-off between the
complexity of the learning task and the confidence with which the learned
bounds hold. Guided by these findings, we develop an algorithm for learning
upper and lower bounds on potential outcomes which optimize an objective
function defined by the decision maker, subject to the probability that bounds
are violated being small. Using a clinical dataset and a well-known causality
benchmark, we demonstrate that our algorithm outperforms baselines, providing
tighter, more reliable bounds.
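The abstract above describes a trade-off between how tight learned bounds are and how often they are violated. The sketch below is not the authors' algorithm; it only illustrates how that trade-off could be scored on held-out data. The function name `evaluate_bounds` and the numbers are hypothetical, and `y` is assumed to be the observed outcome under the decision actually taken.

```python
def evaluate_bounds(y, lower, upper):
    """Return (violation rate, mean width) for per-example bounds.

    Hypothetical scoring helper: a violation is an observed outcome
    falling outside its [lower, upper] interval.
    """
    n = len(y)
    violations = sum(
        1 for yi, lo, hi in zip(y, lower, upper) if yi < lo or yi > hi
    )
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / n
    return violations / n, mean_width

# Made-up outcomes and candidate bounds for four individuals.
y = [1.0, 2.5, 0.5, 3.0]
lower = [0.0, 2.0, 1.0, 2.0]
upper = [2.0, 3.0, 2.0, 4.0]

rate, width = evaluate_bounds(y, lower, upper)
print(rate, width)
```

Narrower intervals lower the mean width but tend to raise the violation rate, which is the tension the abstract refers to.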
Association Between the Medicare Hospice Benefit and Health Care Utilization and Costs for Patients With Poor-Prognosis Cancer
Importance: More patients with cancer use hospice currently than ever before, but there are indications that care intensity outside of hospice is increasing, and length of hospice stay decreasing. Uncertainties regarding how hospice affects health care utilization and costs have hampered efforts to promote it.
Objective: To compare utilization and costs of health care for patients with poor-prognosis cancers enrolled in hospice vs similar patients without hospice care.
Design, Setting, and Participants: Matched cohort study of patients in hospice and nonhospice care using a nationally representative 20% sample of Medicare fee-for-service beneficiaries who died in 2011. Patients with poor-prognosis cancers (eg, brain, pancreatic, metastatic malignancies) enrolled in hospice before death were matched to similar patients who died without hospice care.
Exposures: Period between hospice enrollment and death for hospice beneficiaries, and the equivalent period of nonhospice care before death for matched nonhospice patients.
Main Outcomes and Measures: Health care utilization including hospitalizations and procedures, place of death, cost trajectories before and after hospice start, and cumulative costs, all during the last year of life.
Results: Among 86 851 patients with poor-prognosis cancers, median time from first poor-prognosis diagnosis to death was 13 months (interquartile range [IQR], 3-34), and 51 924 patients (60%) entered hospice before death. Matching yielded a cohort balanced on age, sex, region, time from poor-prognosis diagnosis to death, and baseline care utilization, with 18 165 patients in the hospice group and 18 165 in the nonhospice group. After matching, 11% of nonhospice and 1% of hospice beneficiaries who had cancer-directed therapy after exposure were excluded. Median hospice duration was 11 days. After exposure, nonhospice beneficiaries had significantly more hospitalizations (65% [95% CI, 64%-66%], vs hospice with 42% [95% CI, 42%-43%]; risk ratio, 1.5 [95% CI, 1.5-1.6]), intensive care (36% [95% CI, 35%-37%], vs hospice with 15% [95% CI, 14%-15%]; risk ratio, 2.4 [95% CI, 2.3-2.5]), and invasive procedures (51% [95% CI, 50%-52%], vs hospice with 27% [95% CI, 26%-27%]; risk ratio, 1.9 [95% CI, 1.9-2.0]), largely for acute conditions not directly related to cancer; and 74% (95% CI, 74%-75%) of nonhospice beneficiaries died in hospitals and nursing facilities compared with 14% (95% CI, 14%-15%) of hospice beneficiaries. Costs for hospice and nonhospice beneficiaries were not significantly different at baseline, but diverged after hospice start. Total costs over the last year of life were $70 543-$72 490 for nonhospice and $62 082-$63 557 for hospice, a statistically significant difference of $7560-$9835.
Conclusions and Relevance: In this sample of Medicare fee-for-service beneficiaries with poor-prognosis cancer, those receiving hospice care vs not (control) had significantly lower rates of hospitalization, intensive care unit admission, and invasive procedures at the end of life, along with significantly lower total costs during the last year of life.
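The risk ratios reported in the Results can be checked against the rounded percentages given there. This is only an arithmetic illustration: the published ratios were presumably computed from unrounded counts, so agreement to one decimal place is a consistency check rather than a re-analysis. The dictionary name `reported` is an assumption for this sketch.

```python
# (nonhospice %, hospice %) as reported in the Results above.
reported = {
    "hospitalization": (65, 42),
    "intensive care": (36, 15),
    "invasive procedures": (51, 27),
}

# Risk ratio = event rate in the nonhospice group / event rate in the hospice group.
risk_ratios = {
    outcome: round(nonhospice / hospice, 1)
    for outcome, (nonhospice, hospice) in reported.items()
}

for outcome, rr in risk_ratios.items():
    print(f"{outcome}: {rr}")
```

The three values match the risk ratios of 1.5, 2.4, and 1.9 stated in the abstract.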
Emergency care in 59 low- and middle-income countries: a systematic review
Abstract Objective: To conduct a systematic review of emergency care in low- and middle-income countries (LMICs). Methods: We searched PubMed, CINAHL and World Health Organization (WHO) databases for reports describing facility-based emergency care and obtained unpublished data from a network of clinicians and researchers. We screened articles for inclusion based on their titles and abstracts in English or French. We extracted data on patient outcomes and demographics as well as facility and provider characteristics. Analyses were restricted to reports published from 1990 onwards. Findings: We identified 195 reports concerning 192 facilities in 59 countries. Most were academically affiliated hospitals in urban areas. The median mortality within emergency departments was 1.8% (interquartile range, IQR: 0.2–5.1%). Mortality was relatively high in paediatric facilities (median: 4.8%; IQR: 2.3–8.4%) and in sub-Saharan Africa (median: 3.4%; IQR: 0.5–6.3%). The median number of patients was 30 000 per year (IQR: 10 296–60 000), most of whom were young (median age: 35 years; IQR: 6.9–41.0) and male (median: 55.7%; IQR: 50.0–59.2%). Most facilities were staffed either by physicians-in-training or by physicians whose level of training was unspecified. Very few of these providers had specialist training in emergency care. Conclusion: Available data on emergency care in LMICs indicate high patient loads and mortality, particularly in sub-Saharan Africa, where a substantial proportion of all deaths may occur in emergency departments. The combination of high volume and the urgency of treatment makes emergency care an important area of focus for interventions aimed at reducing mortality in these settings.
Machine learning and causality: Building efficient, and reliable models for decision-making
We explore relationships between machine learning (ML) and causal inference. We focus on improvements in each by borrowing ideas from one another.
ML has been successfully applied to many problems, but the lack of strong theoretical guarantees has led to many unexpected failures. Models that perform well on the training distribution tend to break down when applied to different distributions; small perturbations can “fool” the trained model and drastically change its predictions; arbitrary choices in the training algorithm lead to vastly different models; and so forth. On the other hand, while there has been tremendous progress in developing causal inference methods with strong theoretical guarantees, existing methods typically do not apply in practice since they assume an abundance of data. Working at the intersection of ML and causal inference, we directly address the lack of robustness in ML, and improve the statistical efficiency of causal inference techniques.
The motivation behind the work presented in this thesis is to improve methods for building predictive and causal models that are used to guide decision making. Throughout, we focus mostly on decision making in the healthcare context. On the ML for causality side, we use ML tools and analysis techniques to develop statistically efficient causal models that can guide clinicians when choosing between two treatments. On the causality for ML side, we study how knowledge of the causal mechanisms that generate observed data can be used to efficiently regularize predictive models without introducing biases. In a clinical context, we show how causal knowledge can be used to build robust and accurate models to predict the spread of contagious infections. In a non-clinical setting, we study how to use causal knowledge to train models that are robust to distribution shifts in the context of image classification. Ph.D.
A Distillation Approach to Data Efficient Individual Treatment Effect Estimation
The potential for using machine learning algorithms as a tool for suggesting optimal interventions has fueled significant interest in developing methods for estimating heterogeneous or individual treatment effects (ITEs) from observational data. While several methods for estimating ITEs have been recently suggested, these methods assume no constraints on the availability of data at the time of deployment or test time. This assumption is unrealistic in settings where data acquisition is a significant part of the analysis pipeline, meaning data about a test case has to be collected in order to predict the ITE. In this work, we present Data Efficient Individual Treatment Effect Estimation (DEITEE), a method which exploits the idea that adjusting for confounding, and hence collecting information about confounders, is not necessary at test time. DEITEE allows the development of rich models that exploit all variables at train time but identifies a minimal set of variables required to estimate the ITE at test time. Using 77 semi-synthetic datasets with varying data generating processes, we show that DEITEE achieves significant reductions in the number of variables required at test time with little to no loss in accuracy. Using real data, we demonstrate the utility of our approach in helping soon-to-be mothers make planning and lifestyle decisions that will impact newborn health.
Learning the probability of activation in the presence of latent spreaders
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017. Cataloged from PDF version of thesis. Includes bibliographical references (pages 71-74). When an infection spreads among members of a community, an individual's probability of becoming infected depends on both his susceptibility to the infection and exposure to the disease through contact with others. While one often has knowledge regarding an individual's susceptibility, in many cases, whether or not an individual's contacts are contagious and spreading the infection is unknown or latent. We propose a new generative model in which we model the neighbors' spreader states and the individuals' exposure states as latent variables. Combined with an individual's characteristics, we estimate the risk of infection as a function of both exposure and susceptibility. We propose a variational inference algorithm to learn the model parameters. Through a series of experiments on simulated data, we measure the ability of the proposed model to identify latent spreaders, estimate exposure as a function of one's spreading neighbors, and predict the risk of infection. Our work can be helpful in both identifying potential asymptomatic carriers of infections, and in identifying characteristics that are associated with an increased likelihood of being an undiagnosed source of contagion. by Maggie Makar. S.M.
Learning the probability of activation in the presence of latent spreaders
When an infection spreads in a community, an individual's probability of becoming infected depends on both her susceptibility and exposure to the contagion through contact with others. While one often has knowledge regarding an individual's susceptibility, in many cases, whether or not an individual's contacts are contagious is unknown. We study the problem of predicting if an individual will adopt a contagion in the presence of multiple modes of infection (exposure/susceptibility) and latent neighbor influence. We present a generative probabilistic model and a variational inference method to learn the parameters of our model. Through a series of experiments on synthetic data, we measure the ability of the proposed model to identify latent spreaders, and predict the risk of infection. Applied to a real dataset of 20,000 hospital patients, we demonstrate the utility of our model in predicting the onset of a healthcare associated infection using patient room-sharing and nurse-sharing networks. Our model outperforms existing benchmarks and provides actionable insights for the design and implementation of targeted interventions to curb the spread of infection. NSF (Award IIS-1553146); NIAID of the NIH (Grant U01AI124255); NIH (Award P50-0267666-0002