24 research outputs found
Robust Estimation of High-Dimensional Mean Regression
Data subject to heavy-tailed errors are commonly encountered in various
scientific fields, especially in the modern era with explosion of massive data.
To address this problem, procedures based on quantile regression and Least
Absolute Deviation (LAD) regression have been developed in recent years.
These methods essentially estimate the conditional median (or quantile)
function. They can be very different from the conditional mean function when
distributions are asymmetric and heteroscedastic. How can we efficiently
estimate the mean regression function in the ultra-high dimensional setting
when only the second moment exists? To solve this problem, we propose a
penalized Huber loss with diverging parameter to reduce biases created by the
traditional Huber loss. Such a penalized robust approximate quadratic
(RA-quadratic) loss will be called RA-Lasso. In the ultra-high dimensional
setting, where the dimensionality can grow exponentially with the sample size,
our results reveal that the RA-Lasso estimator is consistent, converging at
the optimal rate attained in the light-tail situation. We further
study the computational convergence of RA-Lasso and show that the composite
gradient descent algorithm indeed produces a solution that admits the same
optimal rate after sufficient iterations. As a byproduct, we also establish the
concentration inequality for estimating the population mean when there exists
only the second moment. We compare RA-Lasso with other regularized robust
estimators based on quantile regression and LAD regression. Extensive
simulation studies demonstrate the satisfactory finite-sample performance of
RA-Lasso.
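The abstract above combines a Huber-type loss with an L1 penalty and solves it by composite gradient descent. A minimal sketch of that idea follows; it is an illustration under stated assumptions, not the authors' implementation. It uses the standard Huber loss with a user-chosen threshold `tau` (the abstract's diverging parameter would correspond to letting this threshold grow with the sample size), a fixed step size, and a plain proximal-gradient loop. The function names `soft_threshold` and `huber_lasso` are hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def huber_lasso(X, y, lam, tau, n_iter=500):
    """Minimize (1/n) * sum_i huber_tau(y_i - x_i' beta) + lam * ||beta||_1.

    Assumption: the standard Huber loss, 0.5*r^2 for |r| <= tau and
    tau*|r| - 0.5*tau^2 otherwise. Its derivative in r is clip(r, -tau, tau).
    """
    n, p = X.shape
    # The smooth part has a gradient that is Lipschitz with constant
    # at most ||X||_2^2 / n, so 1/L is a safe fixed step size.
    step = n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        resid = y - X @ beta
        # Gradient of the averaged Huber loss with respect to beta.
        grad = -X.T @ np.clip(resid, -tau, tau) / n
        # Gradient step on the smooth part, then prox step on the penalty.
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta
```

Calling `huber_lasso(X, y, lam=0.2, tau=3.0)` on an `n`-by-`p` design matrix returns a sparse coefficient vector. A larger `tau` makes the loss closer to least squares (less bias, less robustness), while a smaller `tau` moves it toward LAD.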
Estimation of high dimensional mean regression in the absence of symmetry and light tail assumptions
Data subject to heavy-tailed errors are commonly encountered in various scientific fields. To address this problem, procedures based on quantile regression and Least Absolute Deviation (LAD) regression have been developed in recent years. These methods essentially estimate the conditional median (or quantile) function. They can be very different from the conditional mean function, especially when distributions are asymmetric and heteroscedastic. How can we efficiently estimate the mean regression function in the ultra-high dimensional setting when only the second moment exists? To solve this problem, we propose a penalized Huber loss with diverging parameter to reduce biases created by the traditional Huber loss. Such a penalized robust approximate quadratic (RA-quadratic) loss will be called RA-Lasso. In the ultra-high dimensional setting, where the dimensionality can grow exponentially with the sample size, our results reveal that the RA-Lasso estimator is consistent, converging at the optimal rate attained in the light-tail situation. We further study the computational convergence of RA-Lasso and show that the composite gradient descent algorithm indeed produces a solution that admits the same optimal rate after sufficient iterations. As a byproduct, we also establish the concentration inequality for estimating the population mean when there exists only the second moment. We compare RA-Lasso with other regularized robust estimators based on quantile regression and LAD regression. Extensive simulation studies demonstrate the satisfactory finite-sample performance of RA-Lasso.
Relapse or reinfection: Classification of malaria infection using transition likelihoods
In patients with Plasmodium vivax malaria treated with effective blood-stage therapy, recurrent illness may occur due to relapse from latent liver-stage infection or reinfection from a new mosquito bite. Classification of the recurrent infection as either relapse or reinfection is critical when evaluating the efficacy of an anti-relapse treatment. Although one can classify the outcome by whether a genetic variant is shared between baseline and recurrence genotypes, few methods have been proposed that use both shared and non-shared variants to improve classification accuracy. In this paper, we develop a novel classification criterion that utilizes transition likelihoods to distinguish relapse from reinfection. When tested in extensive simulation experiments with known outcomes, our classifier has superior operating characteristics. A real data set from 78 Cambodian P. vivax malaria patients was analyzed to demonstrate the practical use of our proposed method.
Should All Patients With Pulmonary Hypertension Undergoing Non-Cardiac Surgery Be Managed by Cardiothoracic Fellowship-Trained Anesthesiologists?
Objectives: To identify differences in practice patterns and outcomes related to the induction of general anesthesia for patients with pulmonary hypertension (PH) performed by anesthesiologists who have completed a cardiothoracic fellowship (CTA group) vs those who have not (non-CTA group). Design: Retrospective study with propensity score matching. Setting: Operating room. Participants: All adult patients with PH undergoing general anesthesia requiring intubation at a single academic center over 5 years. Interventions: Patient baseline characteristics, peri-induction management variables, post-induction mean arterial pressure (MAP), and other outcomes were compared between CTA and non-CTA groups. Methods and main results: Following propensity score matching, 402 patients were included in the final model, 100 in the CTA group and 302 in the non-CTA group. Also following matching, only cases of mild to moderate PH without right ventricular dysfunction remained in the analysis. Matched groups were overall statistically similar with respect to baseline characteristics; however, there was a greater incidence of higher ASA class (P = .025) and cardiology and thoracic procedures (P < .001) being managed by the CTA group. No statistical differences were identified in practice patterns or outcomes related to the induction of anesthesia between groups, except for longer hospital length of stay in the CTA group (P = .008). Conclusions: These results provide early evidence suggesting that the induction of general anesthesia in patients with non-severe PH can be comparably managed by anesthesiologists with or without a cardiothoracic fellowship. However, these findings should be confirmed in a prospective study.
The Association of Health Literacy and Blood Pressure Reduction in a Cohort of Patients with Hypertension: The Heart Healthy Lenoir Trial
Lower health literacy is associated with poorer health outcomes. Few interventions poised to mitigate the impact of health literacy in hypertensive patients have been published. We tested whether a multilevel quality improvement intervention could improve systolic blood pressure (SBP) more in patients with low vs. higher health literacy.
The Association of Health Literacy and Blood Pressure Reduction in a Cohort of Patients with Hypertension: The Heart Healthy Lenoir Trial
OBJECTIVE: Lower health literacy is associated with poorer health outcomes. Few interventions poised to mitigate the impact of health literacy in hypertensive patients have been published. We tested whether a multilevel quality improvement intervention could improve systolic blood pressure (SBP) more in patients with low vs. higher health literacy. METHODS: We conducted a non-randomized prospective cohort trial of 525 patients referred with uncontrolled hypertension. Stakeholder-informed and health-literacy-sensitive strategies were implemented at the practice and patient level. Outcomes were assessed at 0, 6, 12, 18 and 24 months. RESULTS: At 12 months, the low and higher health literacy groups had statistically significant decreases in mean SBP (6.6 and 5.3 mm Hg, respectively), but the between-group difference was not significant (Δ 1.3 mm Hg, P = .067). At 24 months, the reductions in the low and higher health literacy groups were 8.1 and 4.6 mm Hg, respectively; again, the between-group difference was not significant (Δ 3.5 mm Hg, P = .25). CONCLUSIONS/PRACTICE IMPLICATIONS: A health-literacy-sensitive multilevel intervention may equally lower SBP in patients with low and higher health literacy. Practical health-literacy-appropriate tools and methods can be implemented in primary care settings using a quality improvement approach.
Embracing the Blessing of Dimensionality in Factor Models
Factor modeling is an essential tool for exploring intrinsic dependence structures among high-dimensional random variables. Much progress has been made in estimating the covariance matrix from a high-dimensional factor model. However, the blessing of dimensionality has not yet been fully embraced in the literature: much of the available data are often ignored in constructing covariance matrix estimates. If our goal is to accurately estimate the covariance matrix of a set of targeted variables, should we employ additional data, beyond the variables of interest, in the estimation? In this article, we provide sufficient conditions for an affirmative answer, and further quantify the gain in terms of Fisher information and convergence rate. In fact, even an oracle-like result (as if all the factors were known) can be achieved when a sufficiently large number of variables is used. The idea of using as much data as possible brings computational challenges. A divide-and-conquer algorithm is thus proposed to alleviate the computational burden, and is shown not to sacrifice any statistical accuracy in comparison with a pooled analysis. Simulation studies further confirm our advocacy for the use of full data, and demonstrate the effectiveness of the above algorithm. Our proposal is applied to a microarray data example that shows the empirical benefits of using more data. Supplementary materials for this article are available online.