Robust high-dimensional data analysis using a weight shrinkage rule
In high-dimensional settings, a penalized least squares approach may lose efficiency in both estimation and variable selection when outliers or heteroscedasticity are present. In this thesis, we propose a novel approach to robust high-dimensional data analysis in a penalized weighted least squares framework. The main idea is to relate the irregularity of each observation to a weight vector and obtain the outlying status data-adaptively using a weight shrinkage rule. By applying L1-type regularization to both the coefficients and the weight vector, the proposed method performs simultaneous variable selection and outlier detection efficiently. This procedure yields estimators with potentially strong robustness and non-asymptotic consistency. We provide a unified link between the weight shrinkage rule and robust M-estimation in general settings. We also establish non-asymptotic oracle inequalities for the joint estimation of the regression coefficients and weight vectors. These theoretical results allow the number of variables to far exceed the sample size. The performance of the proposed estimator is demonstrated in both simulation studies and real examples.
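To make the idea of penalizing coefficients and observation weights together more concrete, here is a minimal sketch. It is not the thesis's estimator: the objective, the single-coefficient model y ≈ b·x, the function names, the tuning values, and the data are all assumptions chosen for illustration. The assumed objective is sum_i w_i^2 (y_i - b x_i)^2 + lam_coef·|b| + lam_weight·sum_i (1 - w_i), with each w_i constrained to (0, 1], minimized by alternating closed-form updates.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_weight_shrinkage(x, y, lam_coef, lam_weight, n_iter=50):
    """Alternating minimization of an ASSUMED objective (illustration only):

        sum_i w_i^2 (y_i - b x_i)^2 + lam_coef * |b| + lam_weight * sum_i (1 - w_i)

    with 0 < w_i <= 1. Returns the slope b and the list of weights w.
    """
    w = [1.0] * len(x)  # start by trusting every observation fully
    b = 0.0
    for _ in range(n_iter):
        # Coefficient step: weighted lasso in one dimension (closed form).
        a = sum(wi * wi * xi * xi for wi, xi in zip(w, x))
        c = sum(wi * wi * xi * yi for wi, xi, yi in zip(w, x, y))
        b = soft_threshold(c, lam_coef / 2.0) / a if a > 0 else 0.0

        # Weight step: each w_i minimizes w^2 r_i^2 + lam_weight * (1 - w)
        # on (0, 1], so large residuals shrink the weight below 1.
        w = []
        for xi, yi in zip(x, y):
            r2 = (yi - b * xi) ** 2
            w.append(1.0 if 2.0 * r2 <= lam_weight else lam_weight / (2.0 * r2))
    return b, w


# Hypothetical data: y = 2x, except the last response is a gross outlier.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_demo = [2.0, 4.0, 6.0, 8.0, 10.0, 40.0]
b_hat, w_hat = fit_weight_shrinkage(x_demo, y_demo, lam_coef=0.1, lam_weight=2.0)
print("slope:", b_hat)
print("weights:", w_hat)
```

Under this assumed penalty, an observation whose weight ends below 1 is flagged as outlying. Profiling out w_i leaves a loss that is quadratic for small residuals and bounded for large ones, which is the flavor of connection to M-estimation that the abstract mentions. The alternating scheme is nonconvex, so in general the result can depend on the starting weights.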
Robust penalized regression for complex high-dimensional data
Robust high-dimensional data analysis has become an important and challenging task in complex Big Data analysis because of high dimensionality and data contamination. One of the most popular procedures is robust penalized regression. In this dissertation, we address three typical robust ultra-high-dimensional regression problems via penalized regression approaches. The first problem concerns the linear model in the presence of outliers, handling outlier detection, variable selection, and parameter estimation simultaneously. The second problem concerns robust high-dimensional mean regression in irregular settings such as data contamination, data asymmetry, and heteroscedasticity. The third problem concerns robust bi-level variable selection for the linear regression model with grouping structures in the covariates. In Chapter 1, we introduce the background and challenges through overviews of penalized least squares methods and robust regression techniques. In Chapter 2, we propose a novel approach in a penalized weighted least squares framework to perform simultaneous variable selection and outlier detection. We provide a unified link between the proposed framework and robust M-estimation in general settings. We also establish non-asymptotic oracle inequalities for the joint estimation of the regression coefficients and weight vectors. In Chapter 3, we establish a framework of robust estimators in high-dimensional regression models using Penalized Robust Approximated quadratic M estimation (PRAM). This framework allows general settings, such as random errors that lack symmetry and homogeneity or covariates that are not sub-Gaussian. Theoretically, we show that, in the ultra-high-dimensional setting and under certain mild conditions, the PRAM estimator has local estimation consistency at the minimax rate enjoyed by the LS-Lasso and possesses the local oracle property.
In Chapter 4, we extend the study in Chapter 3 to robust high-dimensional data analysis with structured sparsity. In particular, we propose a framework of high-dimensional M-estimators for bi-level variable selection. This framework encourages bi-level sparsity through a computationally efficient two-stage procedure. It produces strongly robust parameter estimators if certain nonconvex redescending loss functions are applied. In theory, we provide sufficient conditions under which the proposed two-stage penalized M-estimator possesses both local estimation consistency and bi-level variable selection consistency, if a certain nonconvex penalty function is used at the group level. The performance of the proposed estimators is demonstrated in both simulation studies and real examples. In Chapter 5, we provide some discussion and directions for future work.
The optimal design of the dual-purpose test
Traditional test development focused on a single purpose of the test, either ranking test-takers or providing diagnostic profiles for them. Embedding both the ranking and diagnostic purposes in one assessment instrument would be a great advance in test functionality and utility. Our understanding of how such a dual-purpose test should be optimally designed and analyzed, however, lagged behind the growing need for it in practice. Potential psychometric challenges related to dual-purpose testing were not fully addressed in the literature. The present study provided a systematic comparison of various plausible design and analysis paradigms for the dual-purpose test under conditions with varying test length and dimensionality of true abilities. Results suggested that, in order to obtain an accurate and reliable total score and subscores, the test should be designed with multidimensionality and at least 10 items per domain and analyzed using the multidimensional IRT model. Specifically, the unidimensional dual-purpose test was able to produce reliable and accurate, but not diagnostically meaningful, scores. Subscores obtained from an essentially unidimensional test either failed to provide added value over the total score according to the PRMSE criterion or were homogeneous with each other according to disattenuated correlations. The idiosyncratic multidimensional design was able to yield accurate, reliable, and diagnostically useful scores, but the validity of the diagnostic subscores was questionable because their correlations disagreed with the true correlational structure. Consequently, even though the subscores were identified as distinct from the total score according to the PRMSE criterion, they were still nearly identical to each other according to the disattenuated correlations.
On the other hand, the principled multidimensional design showed slightly lower accuracy and reliability in scores due to the principled "simple structure" of the test design, but this sacrifice of accuracy and reliability ensured the interpretability and validity of the diagnostic subscores, whose empirical correlational structure approximated the true structure. Furthermore, with respect to calibration methods, unidimensional calibration failed to distinguish the subscores and thus failed to give them useful diagnostic information, even though the subscores sometimes appeared more accurate and reliable than those obtained with the other two calibrations. The confirmatory multidimensional calibration and the separate unidimensional calibration delivered very comparable results. Finally, alternative scoring methods were found to be either inappropriate or to offer insignificant improvements over the raw scores.
Large-area epitaxial growth of curvature-stabilized ABC trilayer graphene.
The properties of van der Waals (vdW) materials often vary dramatically with the atomic stacking order between layers, but this order can be difficult to control. Trilayer graphene (TLG) stacks in either a semimetallic ABA configuration or a semiconducting ABC configuration with a gate-tunable band gap, but the latter has only been produced by exfoliation. Here we present a chemical vapor deposition approach to TLG growth that yields a greatly enhanced fraction and size of ABC domains. The key insight is that substrate curvature can stabilize ABC domains. Controllable ABC yields of ~59% were achieved by tailoring substrate curvature levels. ABC fractions remained high after transfer to device substrates, as confirmed by transport measurements revealing the expected tunable ABC band gap. Substrate topography engineering provides a path to large-scale synthesis of epitaxial ABC-TLG and other vdW materials.
Synthetic Lethality of Chk1 Inhibition Combined with p53 and/or p21 Loss During a DNA Damage Response in Normal and Tumor Cells
Cell cycle checkpoints ensure genome integrity and are frequently compromised in human cancers. A therapeutic strategy being explored takes advantage of checkpoint defects in p53-deficient tumors in order to sensitize them to DNA-damaging agents by eliminating Chk1-mediated checkpoint responses. Using mouse models, we demonstrated that p21 is a key determinant of how cells respond to the combination of DNA damage and Chk1 inhibition (combination therapy) in normal cells as well as in tumors. Loss of p21 sensitized normal cells to the combination therapy much more than did p53 loss and the enhanced lethality was partially blocked by CDK inhibition. In addition, basal pools of p21 (p53 independent) provided p53 null cells with protection from the combination therapy. Our results uncover a novel p53-independent function for p21 in protecting cells from the lethal effects of DNA damage followed by Chk1 inhibition. As p21 levels are low in a significant fraction of colorectal tumors, they are predicted to be particularly sensitive to the combination therapy. Results reported in this study support this prediction
Viridot: An automated virus plaque (immunofocus) counter for the measurement of serological neutralizing responses with application to dengue virus.
The gold-standard method for quantifying neutralizing antibody responses to many viruses, including dengue virus (DENV), is the plaque reduction neutralization test (PRNT, also called the immunofocus reduction neutralization test). The PRNT conducted on 96-well plates is high-throughput and requires a smaller volume of antiserum than on 6- or 24-well plates, but manual plaque counting is challenging and existing automated plaque counters are expensive or difficult to optimize. We have developed Viridot (Viridot package), a program for R with a user interface in shiny, that counts viral plaques of a variety of phenotypes, estimates neutralizing antibody titers, and performs other calculations of use to virologists. The Viridot plaque counter includes an automatic parameter identification mode (misses <10 plaques/well for 87% of diverse DENV strains [n = 1521]) and a mode that allows the user to fine-tune the parameters used for counting plaques. We compared standardized manual and Viridot plaque counting methods applied to the same wells by two analysts and found that Viridot plaque counts were as similar to the same analyst's manual count (Lin's concordance correlation coefficient, ρc = 0.99 [95% confidence interval: 0.99-1.00]) as manual counts were between analysts (ρc = 0.99 [95% CI: 0.98-0.99]). The average ratio of neutralizing antibody titers based on manually counted plaques to those based on Viridot-counted plaques was 1.05 (95% CI: 0.98-1.14), similar to the average ratio of antibody titers based on manual plaque counts by the two analysts (1.06 [95% CI: 0.84-1.34]). Across diverse DENV and Zika virus (ZIKV) strains (n = 14), manual and Viridot plaque counts were mostly consistent (range of ρc = 0.74 to 1.00) and the average ratio of antibody titers based on manually counted and Viridot-counted plaques was close to 1 (0.94 [0.86-1.02]).
Thus, Viridot can be used for plaque counting and neutralizing antibody titer estimation of diverse DENV strains and potentially other viruses on 96-well plates as well as for formalization of plaque-counting rules for standardization across experiments and analysts
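The agreement statistic reported above, Lin's concordance correlation coefficient, can be computed directly from paired counts. The sketch below is not taken from Viridot (which is an R package); the function name and the example counts are hypothetical. It uses the standard definition ρc = 2·s_xy / (s_x² + s_y² + (mean_x − mean_y)²) with population (divide-by-n) moments.

```python
from statistics import fmean


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    Uses population (divide-by-n) variances and covariance. Unlike Pearson's
    correlation, the result is penalized when the two raters differ
    systematically in location or scale.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both series are constant and equal")
    return 2.0 * cov_xy / denom


# Hypothetical plaque counts for the same wells from two counting methods.
manual_counts = [12, 30, 45, 8, 27]
automated_counts = [11, 32, 44, 9, 25]
print(lins_ccc(manual_counts, automated_counts))
```

Because the squared difference in means appears in the denominator, two methods whose counts are perfectly correlated but offset by a constant still score below 1, which is why this coefficient is preferred to Pearson's r for agreement between counters.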
The caspase-6–p62 axis modulates p62 droplets based autophagy in a dominant-negative manner
SQSTM1/p62, as a major autophagy receptor, forms droplets that are critical for cargo recognition, nucleation, and clearance. p62 droplets also function as liquid assembly platforms to allow the formation of autophagosomes at their surfaces. It is unknown how p62-droplet formation is regulated under physiological or pathological conditions. Here, we report that p62-droplet formation is selectively blocked by inflammatory toxicity, which induces cleavage of p62 by caspase-6 at a novel cleavage site D256, a conserved site across human, mouse, rat, and zebrafish. The N-terminal cleavage product is relatively stable, whereas the C-terminal product appears undetectable. Using a variety of cellular models, we show that the p62 N-terminal caspase-6 cleavage product (p62-N) plays a dominant-negative role to block p62-droplet formation. In vitro p62 phase separation assays confirm this observation. Dominant-negative regulation of p62-droplet formation by caspase-6 cleavage attenuates p62 droplets dependent autophagosome formation. Our study suggests a novel pathway to modulate autophagy through the caspase-6–p62 axis under certain stress stimuli.
Multiple models and experiments underscore large uncertainty in soil carbon dynamics
Soils contain more carbon than plants or the atmosphere, and sensitivities of soil organic carbon (SOC) stocks to changing climate and plant productivity are a major uncertainty in global carbon cycle projections. Despite a consensus that microbial degradation and mineral stabilization processes control SOC cycling, no systematic synthesis of long-term warming and litter addition experiments has been used to test process-based microbe-mineral SOC models. We explored SOC responses to warming and increased carbon inputs using a synthesis of 147 field manipulation experiments and five SOC models with different representations of microbial and mineral processes. Model projections diverged but encompassed a similar range of variability as the experimental results. Experimental measurements were insufficient to eliminate or validate individual model outcomes. While all models projected that CO2 efflux would increase and SOC stocks would decline under warming, nearly one-third of experiments observed decreases in CO2 flux and nearly half of experiments observed increases in SOC stocks under warming. Long-term measurements of C inputs to soil and their changes under warming are needed to reconcile modeled and observed patterns. Measurements separating the responses of mineral-protected and unprotected SOC fractions in manipulation experiments are needed to address key uncertainties in microbial degradation and mineral stabilization mechanisms. Integrating models with experimental design will allow targeting of these uncertainties and help to reconcile divergence among models to produce more confident projections of SOC responses to global changes.
Ehrlichia chaffeensis Transcriptome in Mammalian and Arthropod Hosts Reveals Differential Gene Expression and Post Transcriptional Regulation
BACKGROUND: Human monocytotropic ehrlichiosis is an emerging life-threatening zoonosis caused by the obligately intracellular bacterium Ehrlichia chaffeensis. E. chaffeensis is transmitted by the lone star tick, Amblyomma americanum, and replicates in mononuclear phagocytes in mammalian hosts. Differences in the E. chaffeensis transcriptome in mammalian and arthropod hosts are unknown. Thus, we determined host-specific E. chaffeensis gene expression in a human monocyte cell line (THP-1) and in Amblyomma and Ixodes tick cell lines (AAE2 and ISE6) using a whole-genome microarray. METHODOLOGY/PRINCIPAL FINDINGS: The majority (∼80%) of E. chaffeensis genes were expressed during infection in human and tick cells. Few differences were observed in E. chaffeensis gene expression between the vector Amblyomma and non-vector Ixodes tick cells, but extensive host-specific and differential gene expression profiles were detected between human and tick cells, including higher transcriptional activity in tick cells and identification of gene subsets that were differentially expressed in the two hosts. Differentially and host-specifically expressed ehrlichial genes encoded major immunoreactive tandem repeat proteins (TRP), the outer membrane protein (OMP-1) family, and hypothetical proteins that were 30-80 amino acids in length. Consistent with previous observations, high expression of p28 and OMP-1B genes was detected in human and tick cells, respectively. Notably, E. chaffeensis genes encoding TRP32 and TRP47 were highly upregulated in the human monocytes and expressed as proteins; however, although TRP transcripts were expressed in tick cells, the proteins were not detected in whole cell lysates, demonstrating that TRP expression was post-transcriptionally regulated. CONCLUSIONS/SIGNIFICANCE: Ehrlichia gene expression is highly active in tick cells, and differential gene expression among a wide variety of host-pathogen associated genes occurs.
Furthermore, we demonstrate that genes associated with host-pathogen interactions are differentially expressed and regulated by post-transcriptional mechanisms.
The joint influence of marital status, interpregnancy interval, and neighborhood on small for gestational age birth: a retrospective cohort study
BACKGROUND: Interpregnancy interval (IPI), marital status, and neighborhood are independently associated with birth outcomes. The joint contribution of these exposures has not been evaluated. We tested for effect modification between IPI and marriage, controlling for neighborhood. METHODS: We analyzed a cohort of 98,330 live births in Montréal, Canada from 1997–2001 to assess IPI and marital status in relation to small for gestational age (SGA) birth. Births were categorized as subsequent-born with short (<12 months), intermediate (12–35 months), or long (36+ months) IPI, or as firstborn. The data had a 2-level hierarchical structure, with births nested in 49 neighborhoods. We used multilevel logistic regression to obtain adjusted effect estimates. RESULTS: Marital status modified the association between IPI and SGA birth. Being unmarried relative to married was associated with SGA birth for all IPI categories, particularly for subsequent births with short (odds ratio [OR] 1.60, 95% confidence interval [CI] 1.31–1.95) and intermediate (OR 1.48, 95% CI 1.26–1.74) IPIs. Subsequent births had a lower likelihood of SGA birth than firstborns. Intermediate IPIs were more protective for married (OR 0.50, 95% CI 0.47–0.54) than unmarried mothers (OR 0.65, 95% CI 0.56–0.76). CONCLUSION: Being unmarried increases the likelihood of SGA birth as the IPI shortens, and the protective effect of intermediate IPIs is reduced in unmarried mothers. Marital status should be considered in recommending particular IPIs as an intervention to improve birth outcomes.