
    Robust high-dimensional data analysis using a weight shrinkage rule

    In high-dimensional settings, a penalized least squares approach may lose efficiency in both estimation and variable selection when outliers or heteroscedasticity are present. In this thesis, we propose a novel approach to robust high-dimensional data analysis within a penalized weighted least squares framework. The main idea is to relate the irregularity of each observation to a weight vector and to determine the outlying status of each observation data-adaptively through a weight shrinkage rule. By applying L1-type regularization to both the coefficients and the weight vector, the proposed method performs simultaneous variable selection and outlier detection efficiently, yielding estimators with potentially strong robustness and non-asymptotic consistency. We provide a unified link between the weight shrinkage rule and robust M-estimation in general settings, and we establish non-asymptotic oracle inequalities for the joint estimation of the regression coefficients and weight vectors. These theoretical results allow the number of variables to far exceed the sample size. The performance of the proposed estimator is demonstrated in both simulation studies and real examples.
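
    As a hedged illustration of the kind of objective involved (not the thesis's exact parameterization, which attaches a weight to each observation rather than a mean shift), a closely related formulation from the robust penalized regression literature augments the linear model with a case-specific shift vector and applies L1 penalties to both the coefficients and the shifts:

        \min_{\beta \in \mathbb{R}^{p},\, \gamma \in \mathbb{R}^{n}} \; \frac{1}{2n}\,\lVert y - X\beta - \gamma \rVert_2^{2} + \lambda_{1}\lVert \beta \rVert_{1} + \lambda_{2}\lVert \gamma \rVert_{1}

    Soft-thresholding sets \gamma_i \neq 0 only when observation i has a large residual, flagging it as an outlier, and profiling \gamma out of the objective leaves a Huber-type M-estimation problem in \beta; this is the flavor of the link between shrinkage rules and robust M-estimation referred to above.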

    Robust penalized regression for complex high-dimensional data

    Robust high-dimensional data analysis has become an important and challenging task in complex Big Data analysis because of high dimensionality and data contamination. One of the most popular procedures is robust penalized regression. In this dissertation, we address three typical robust ultra-high-dimensional regression problems via penalized regression approaches. The first concerns the linear model in the presence of outliers, dealing with outlier detection, variable selection, and parameter estimation simultaneously. The second concerns robust high-dimensional mean regression under irregular settings such as data contamination, data asymmetry, and heteroscedasticity. The third concerns robust bi-level variable selection for linear regression models with grouping structures in the covariates. In Chapter 1, we introduce the background and challenges through overviews of penalized least squares methods and robust regression techniques. In Chapter 2, we propose a novel approach in a penalized weighted least squares framework to perform simultaneous variable selection and outlier detection. We provide a unified link between the proposed framework and robust M-estimation in general settings, and we establish non-asymptotic oracle inequalities for the joint estimation of the regression coefficients and weight vectors. In Chapter 3, we establish a framework of robust estimators for high-dimensional regression models using Penalized Robust Approximated quadratic M-estimation (PRAM). This framework allows general settings in which the random errors lack symmetry and homogeneity or the covariates are not sub-Gaussian. Theoretically, we show that, in the ultra-high-dimensional setting and under certain mild conditions, the PRAM estimator achieves local estimation consistency at the minimax rate enjoyed by the LS-Lasso and possesses the local oracle property. In Chapter 4, we extend the study in Chapter 3 to robust high-dimensional data analysis with structured sparsity. In particular, we propose a framework of high-dimensional M-estimators for bi-level variable selection. This framework encourages bi-level sparsity through a computationally efficient two-stage procedure and produces strongly robust parameter estimators when nonconvex redescending loss functions are applied. In theory, we provide sufficient conditions under which the proposed two-stage penalized M-estimator achieves simultaneous local estimation consistency and bi-level variable selection consistency when a certain nonconvex penalty function is used at the group level. The performance of the proposed estimators is demonstrated in both simulation studies and real examples. In Chapter 5, we provide some discussion and outline future work.
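
    As a rough, hedged template for the robust estimators studied in Chapters 3 and 4 (the dissertation's exact losses and penalties may differ, and its bi-level estimator is computed in two stages rather than from a single joint objective), a penalized M-estimator targeting bi-level sparsity can be written as

        \min_{\beta} \; \frac{1}{n}\sum_{i=1}^{n} \ell\big(y_i - x_i^{\top}\beta\big) + \sum_{g=1}^{G} p_{\lambda_{1}}\big(\lVert \beta_{(g)} \rVert_{2}\big) + \lambda_{2}\lVert \beta \rVert_{1}

    where \ell is a robust, possibly redescending, loss (e.g. Huber or Tukey's biweight), p_{\lambda_1} is a possibly nonconvex group-level penalty (e.g. SCAD or MCP) acting on the coefficients \beta_{(g)} of group g, and the L1 term induces sparsity within groups; the PRAM name suggests the robust loss is handled through a quadratic approximation.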

    The optimal design of the dual-purpose test

    Traditional test development has focused on a single purpose, either ranking test-takers or providing diagnostic profiles for them. Embedding both the ranking and diagnostic purposes in one assessment instrument would greatly advance test functionality and utility. Our understanding of how such a dual-purpose test should be optimally designed and analyzed, however, has lagged behind the growing need for it in practice, and the potential psychometric challenges of dual-purpose testing have not been fully addressed in the literature. The present study provides a systematic comparison of plausible design and analysis paradigms for the dual-purpose test under conditions with varying test length and dimensionality of true abilities. Results suggest that, to obtain accurate and reliable total scores and subscores, the test should be designed to be multidimensional with at least 10 items per domain and analyzed with a multidimensional IRT model. Specifically, the unidimensional dual-purpose test produced reliable and accurate but not diagnostically meaningful scores: subscores obtained from an essentially unidimensional test either failed to add value to the total score according to the PRMSE criterion or were homogeneous with one another according to disattenuated correlations. The idiosyncratic multidimensional design yielded accurate, reliable, and diagnostically useful scores, but the validity of the diagnostic subscores was questionable because their correlations disagreed with the true correlational structure; consequently, even though the subscores were identified as distinct from the total score by the PRMSE criterion, they remained nearly identical to one another according to the disattenuated correlations. The principled multidimensional design, in contrast, showed slightly lower score accuracy and reliability because of its principled "simple structure", but this sacrifice ensured the interpretability and validity of the diagnostic subscores, whose empirical correlational structure approximated the true structure. With respect to calibration methods, unidimensional calibration failed to distinguish subscores, and thus failed to give them useful diagnostic information, even though the subscores sometimes appeared more accurate and reliable than those obtained with the other two calibrations; the confirmatory multidimensional calibration and the separate unidimensional calibration delivered very comparable results. Finally, alternative scoring methods were found to be either inappropriate or to offer only insignificant improvements over the raw scores.
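
    The disattenuation criterion used above is Spearman's classical correction for attenuation: the observed correlation between two subscores divided by the square root of the product of their reliabilities, with values near 1 indicating that the subscores are essentially indistinguishable. A minimal sketch, with simulated scores and reliability values that are purely illustrative and not taken from the study:

        import numpy as np

        def disattenuated_correlation(x, y, rel_x, rel_y):
            # Spearman's correction for attenuation: the observed correlation
            # rescaled by the measurement reliability of each subscore.
            r_xy = np.corrcoef(x, y)[0, 1]
            return r_xy / np.sqrt(rel_x * rel_y)

        # Two noisy subscores driven by the same latent ability (illustrative only)
        rng = np.random.default_rng(0)
        theta = rng.normal(size=500)
        sub1 = theta + rng.normal(scale=0.6, size=500)
        sub2 = theta + rng.normal(scale=0.6, size=500)
        # With reliabilities of about 0.74, the disattenuated correlation is
        # close to 1, i.e. the two subscores carry no distinct diagnostic value.
        print(disattenuated_correlation(sub1, sub2, rel_x=0.74, rel_y=0.74))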

    Synthetic Lethality of Chk1 Inhibition Combined with p53 and/or p21 Loss During a DNA Damage Response in Normal and Tumor Cells

    Cell cycle checkpoints ensure genome integrity and are frequently compromised in human cancers. A therapeutic strategy being explored takes advantage of checkpoint defects in p53-deficient tumors in order to sensitize them to DNA-damaging agents by eliminating Chk1-mediated checkpoint responses. Using mouse models, we demonstrated that p21 is a key determinant of how cells respond to the combination of DNA damage and Chk1 inhibition (combination therapy) in normal cells as well as in tumors. Loss of p21 sensitized normal cells to the combination therapy much more than did p53 loss, and the enhanced lethality was partially blocked by CDK inhibition. In addition, basal (p53-independent) pools of p21 provided p53-null cells with protection from the combination therapy. Our results uncover a novel p53-independent function for p21 in protecting cells from the lethal effects of DNA damage followed by Chk1 inhibition. As p21 levels are low in a significant fraction of colorectal tumors, such tumors are predicted to be particularly sensitive to the combination therapy. Results reported in this study support this prediction.

    Viridot: An automated virus plaque (immunofocus) counter for the measurement of serological neutralizing responses with application to dengue virus.

    The gold-standard method for quantifying neutralizing antibody responses to many viruses, including dengue virus (DENV), is the plaque reduction neutralization test (PRNT, also called the immunofocus reduction neutralization test). The PRNT conducted on 96-well plates is high-throughput and requires a smaller volume of antiserum than on 6- or 24-well plates, but manual plaque counting is challenging and existing automated plaque counters are expensive or difficult to optimize. We have developed Viridot, an R package with a user interface built in Shiny, that counts viral plaques of a variety of phenotypes, estimates neutralizing antibody titers, and performs other calculations of use to virologists. The Viridot plaque counter includes an automatic parameter identification mode (misses <10 plaques/well for 87% of diverse DENV strains [n = 1521]) and a mode that allows the user to fine-tune the parameters used for counting plaques. We compared standardized manual and Viridot plaque counting methods applied to the same wells by two analysts and found that Viridot plaque counts were as similar to the same analyst's manual count (Lin's concordance correlation coefficient, ρc = 0.99 [95% confidence interval: 0.99-1.00]) as manual counts were between analysts (ρc = 0.99 [95% CI: 0.98-0.99]). The average ratio of neutralizing antibody titers based on manually counted plaques to those based on Viridot-counted plaques was 1.05 (95% CI: 0.98-1.14), similar to the average ratio of antibody titers based on manual plaque counts by the two analysts (1.06 [95% CI: 0.84-1.34]). Across diverse DENV and ZIKV strains (n = 14), manual and Viridot plaque counts were mostly consistent (range of ρc = 0.74 to 1.00) and the average ratio of antibody titers based on manual and Viridot-counted plaques was close to 1 (0.94 [0.86-1.02]). Thus, Viridot can be used for plaque counting and neutralizing antibody titer estimation for diverse DENV strains, and potentially other viruses, on 96-well plates, as well as for formalizing plaque-counting rules to standardize across experiments and analysts.
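
    The agreement statistic quoted throughout, ρc, is Lin's concordance correlation coefficient, which penalizes both poor correlation and systematic bias between two measurements of the same quantity. A minimal sketch of the computation; the example counts are invented, not data from the paper:

        import numpy as np

        def lins_ccc(x, y):
            # Lin's concordance correlation coefficient:
            # 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)
            x = np.asarray(x, dtype=float)
            y = np.asarray(y, dtype=float)
            sxy = np.mean((x - x.mean()) * (y - y.mean()))
            return 2 * sxy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

        # Hypothetical manual vs. automated plaque counts for the same eight wells
        manual = np.array([52, 48, 33, 21, 12, 7, 3, 1])
        automated = np.array([50, 47, 34, 20, 13, 6, 3, 1])
        print(round(lins_ccc(manual, automated), 3))  # approaches 1 as counts agree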

    The caspase-6–p62 axis modulates p62 droplets based autophagy in a dominant-negative manner

    SQSTM1/p62, as a major autophagy receptor, forms droplets that are critical for cargo recognition, nucleation, and clearance. p62 droplets also function as liquid assembly platforms to allow the formation of autophagosomes at their surfaces. It is unknown how p62-droplet formation is regulated under physiological or pathological conditions. Here, we report that p62-droplet formation is selectively blocked by inflammatory toxicity, which induces cleavage of p62 by caspase-6 at a novel cleavage site, D256, a site conserved across human, mouse, rat, and zebrafish. The N-terminal cleavage product is relatively stable, whereas the C-terminal product appears undetectable. Using a variety of cellular models, we show that the p62 N-terminal caspase-6 cleavage product (p62-N) plays a dominant-negative role to block p62-droplet formation. In vitro p62 phase separation assays confirm this observation. Dominant-negative regulation of p62-droplet formation by caspase-6 cleavage attenuates p62-droplet-dependent autophagosome formation. Our study suggests a novel pathway to modulate autophagy through the caspase-6–p62 axis under certain stress stimuli.

    Ehrlichia chaffeensis Transcriptome in Mammalian and Arthropod Hosts Reveals Differential Gene Expression and Post Transcriptional Regulation

    BACKGROUND: Human monocytotropic ehrlichiosis is an emerging, life-threatening zoonosis caused by the obligately intracellular bacterium Ehrlichia chaffeensis. E. chaffeensis is transmitted by the lone star tick, Amblyomma americanum, and replicates in mononuclear phagocytes in mammalian hosts. Differences in the E. chaffeensis transcriptome between mammalian and arthropod hosts are unknown. Thus, we determined host-specific E. chaffeensis gene expression in a human monocyte cell line (THP-1) and in Amblyomma and Ixodes tick cell lines (AAE2 and ISE6) using a whole-genome microarray. METHODOLOGY/PRINCIPAL FINDINGS: The majority (∼80%) of E. chaffeensis genes were expressed during infection in human and tick cells. Few differences were observed in E. chaffeensis gene expression between the vector Amblyomma and non-vector Ixodes tick cells, but extensive host-specific and differential gene expression profiles were detected between human and tick cells, including higher transcriptional activity in tick cells and gene subsets that were differentially expressed in the two hosts. Differentially and host-specifically expressed ehrlichial genes encoded major immunoreactive tandem repeat proteins (TRP), the outer membrane protein (OMP-1) family, and hypothetical proteins 30-80 amino acids in length. Consistent with previous observations, high expression of the p28 and OMP-1B genes was detected in human and tick cells, respectively. Notably, E. chaffeensis genes encoding TRP32 and TRP47 were highly upregulated in human monocytes and expressed as proteins; however, although TRP transcripts were expressed in tick cells, the proteins were not detected in whole-cell lysates, demonstrating that TRP expression is post-transcriptionally regulated. CONCLUSIONS/SIGNIFICANCE: Ehrlichia gene expression is highly active in tick cells, and differential expression occurs among a wide variety of genes associated with host-pathogen interactions. Furthermore, we demonstrate that genes associated with host-pathogen interactions are differentially expressed and regulated by post-transcriptional mechanisms.

    The joint influence of marital status, interpregnancy interval, and neighborhood on small for gestational age birth: a retrospective cohort study

    BACKGROUND: Interpregnancy interval (IPI), marital status, and neighborhood are independently associated with birth outcomes. The joint contribution of these exposures has not been evaluated. We tested for effect modification between IPI and marriage, controlling for neighborhood. METHODS: We analyzed a cohort of 98,330 live births in Montréal, Canada from 1997–2001 to assess IPI and marital status in relation to small for gestational age (SGA) birth. Births were categorized as subsequent-born with short (<12 months), intermediate (12–35 months), or long (36+ months) IPI, or as firstborn. The data had a 2-level hierarchical structure, with births nested in 49 neighborhoods. We used multilevel logistic regression to obtain adjusted effect estimates. RESULTS: Marital status modified the association between IPI and SGA birth. Being unmarried relative to married was associated with SGA birth for all IPI categories, particularly for subsequent births with short (odds ratio [OR] 1.60, 95% confidence interval [CI] 1.31–1.95) and intermediate (OR 1.48, 95% CI 1.26–1.74) IPIs. Subsequent births had a lower likelihood of SGA birth than firstborns. Intermediate IPIs were more protective for married (OR 0.50, 95% CI 0.47–0.54) than unmarried mothers (OR 0.65, 95% CI 0.56–0.76). CONCLUSION: Being unmarried increases the likelihood of SGA birth as the IPI shortens, and the protective effect of intermediate IPIs is reduced in unmarried mothers. Marital status should be considered in recommending particular IPIs as an intervention to improve birth outcomes.
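
    A schematic of the 2-level model described above (IPI entered as categories in the actual analysis, with other covariates omitted here): for birth i in neighborhood j,

        \operatorname{logit} \Pr(\mathrm{SGA}_{ij} = 1) = \beta_{0} + \beta_{1}\,\mathrm{IPI}_{ij} + \beta_{2}\,\mathrm{unmarried}_{ij} + \beta_{3}\,(\mathrm{IPI}_{ij} \times \mathrm{unmarried}_{ij}) + u_{j}, \qquad u_{j} \sim N(0, \sigma_{u}^{2})

    where u_j is a random intercept for each of the 49 neighborhoods and a nonzero interaction coefficient \beta_3 corresponds to the reported effect modification between IPI and marital status; exponentiating the fixed-effect coefficients gives adjusted odds ratios.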