93 research outputs found

    Unsupervised Bayesian linear unmixing of gene expression microarrays

    Background: This paper introduces a new constrained model and the corresponding algorithm, called unsupervised Bayesian linear unmixing (uBLU), to identify biological signatures from high-dimensional assays such as gene expression microarrays. The basis for uBLU is a Bayesian model in which the data samples are represented as an additive mixture of random positive gene signatures, called factors, with random positive mixing coefficients, called factor scores, that specify the relative contribution of each signature to a specific sample. The distinguishing feature of the proposed method is that uBLU constrains the factor loadings to be non-negative and the factor scores to be probability distributions over the factors. Furthermore, it also provides estimates of the number of factors. A Gibbs sampling strategy is adopted to generate random samples according to the posterior distribution of the factors, factor scores, and number of factors. These samples are then used to estimate all the unknown parameters. Results: First, the proposed uBLU method is applied to several simulated datasets with known ground truth and compared with previous factor decomposition methods, such as principal component analysis (PCA), non-negative matrix factorization (NMF), Bayesian factor regression modeling (BFRM), and the gradient-based algorithm for general matrix factorization (GB-GMF). Second, we illustrate the application of uBLU on a real time-evolving gene expression dataset from a recent viral challenge study in which individuals were inoculated with influenza A/H3N2/Wisconsin. We show that the uBLU method significantly outperforms the other methods on the simulated and real datasets considered here. Conclusions: The results obtained on synthetic and real data illustrate the accuracy of the proposed uBLU method when compared to other factor decomposition methods from the literature (PCA, NMF, BFRM, and GB-GMF). 
The uBLU method identifies an inflammatory component closely associated with clinical symptom scores collected during the study. Using a constrained model allows recovery of all the inflammatory genes in a single factor.
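
The constrained mixture described in this abstract can be illustrated with a small forward simulation. This is a minimal sketch, not the uBLU algorithm itself: it only generates data with non-negative factors and factor scores that sum to one per sample, and it does not implement the Gibbs sampler or the estimation of the number of factors. The gamma, Dirichlet, and Gaussian choices, the function name `simulate_mixture`, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def simulate_mixture(n_genes=200, n_samples=30, n_factors=3, noise_sd=0.05, seed=0):
    """Simulate data = factors @ scores + noise under the stated constraints."""
    rng = np.random.default_rng(seed)
    # Non-negative gene signatures (assumed gamma draws), one column per factor.
    factors = rng.gamma(shape=2.0, scale=1.0, size=(n_genes, n_factors))
    # Factor scores: each sample's column is a probability distribution over factors.
    scores = rng.dirichlet(np.ones(n_factors), size=n_samples).T
    # Additive Gaussian noise (an assumption, not taken from the abstract).
    noise = rng.normal(0.0, noise_sd, size=(n_genes, n_samples))
    return factors @ scores + noise, factors, scores

data, factors, scores = simulate_mixture()
```

Each column of `scores` can be read as the relative contribution of each signature to one sample, which is the interpretation the abstract gives to the factor scores.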

    Gene expression profiles link respiratory viral infection, platelet response to aspirin, and acute myocardial infarction

    Background: Influenza infection is associated with myocardial infarction (MI), suggesting that respiratory viral infection may induce biologic pathways that contribute to MI. We tested the hypotheses that 1) a validated blood gene expression signature of respiratory viral infection (viral GES) was associated with MI and 2) respiratory viral exposure changes levels of a validated platelet gene expression signature (platelet GES) of platelet function in response to aspirin that is associated with MI. Methods: A previously defined viral GES was projected into blood RNA data from 594 patients undergoing elective cardiac catheterization, used to classify patients as having evidence of viral infection or not, and tested for association with acute MI using logistic regression. A previously defined platelet GES was projected into blood RNA data from 81 healthy subjects before and after exposure to four respiratory viruses: Respiratory Syncytial Virus (RSV) (n=20), Human Rhinovirus (HRV) (n=20), Influenza A virus subtype H1N1 (H1N1) (n=24), and Influenza A virus subtype H3N2 (H3N2) (n=17). We tested for the change in platelet GES with viral exposure using linear mixed-effects regression and by symptom status. Results: In the catheterization cohort, 32 patients had evidence of viral infection based upon the viral GES, of which 25% (8/32) had MI versus 12.2% (69/567) among those without evidence of viral infection (OR 2.3; CI [1.03-5.5], p=0.04). In the infection cohorts, only H1N1 exposure increased platelet GES over time (time course p-value = 1e-04). Conclusions: A viral GES of non-specific respiratory viral infection was associated with acute MI; 18% of the top 49 genes in the viral GES are involved with hemostasis and/or platelet aggregation. Separately, H1N1 exposure, but not exposure to other respiratory viruses, increased a platelet GES previously shown to be associated with MI. 
Together, these results highlight specific genes and pathways that link viral infection, platelet activation, and MI, especially in the case of H1N1 influenza infection.
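
As a rough check on the association reported above, the counts printed in the abstract (8/32 and 69/567) can be turned into a crude odds ratio with a Wald-type confidence interval. This is a sketch under stated assumptions: it uses the counts exactly as printed, the helper name `crude_odds_ratio` is hypothetical, and the abstract's own estimate came from logistic regression whose covariates, if any, are not stated.

```python
import math

def crude_odds_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total, z=1.96):
    """Unadjusted odds ratio with a Wald (log-scale) confidence interval."""
    a = exposed_cases                      # viral GES positive, MI
    b = exposed_total - exposed_cases      # viral GES positive, no MI
    c = unexposed_cases                    # viral GES negative, MI
    d = unexposed_total - unexposed_cases  # viral GES negative, no MI
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

odds_ratio, ci_lower, ci_upper = crude_odds_ratio(8, 32, 69, 567)
```

The crude estimate comes out at about 2.4, close to but not identical with the reported OR of 2.3, and the interval is similar to the reported one.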

    A host transcriptional signature for presymptomatic detection of infection in humans exposed to influenza H1N1 or H3N2.

    There is great potential for host-based gene expression analysis to impact the early diagnosis of infectious diseases. In particular, the influenza pandemic of 2009 highlighted the challenges and limitations of traditional pathogen-based testing for suspected upper respiratory viral infection. We inoculated human volunteers with either influenza A (A/Brisbane/59/2007 (H1N1) or A/Wisconsin/67/2005 (H3N2)), and assayed the peripheral blood transcriptome every 8 hours for 7 days. Of 41 inoculated volunteers, 18 (44%) developed symptomatic infection. Using unbiased sparse latent factor regression analysis, we generated a gene signature (or factor) for symptomatic influenza capable of detecting 94% of infected cases. This gene signature is detectable as early as 29 hours post-exposure and achieves maximal accuracy on average 43 hours (p = 0.003, H1N1) and 38 hours (p-value = 0.005, H3N2) before peak clinical symptoms. In order to test the relevance of these findings in naturally acquired disease, a composite influenza A signature built from these challenge studies was applied to Emergency Department patients where it discriminates between swine-origin influenza A/H1N1 (2009) infected and non-infected individuals with 92% accuracy. The host genomic response to Influenza infection is robust and may provide the means for detection before typical clinical symptoms are apparent

    Expanding the Understanding of Biases in Development of Clinical-Grade Molecular Signatures: A Case Study in Acute Respiratory Viral Infections

    The promise of modern personalized medicine is to use molecular and clinical information to better diagnose, manage, and treat disease on an individual patient basis. These functions are predominantly enabled by molecular signatures, which are computational models for predicting phenotypes and other responses of interest from high-throughput assay data. Data analytics is a central component of molecular signature development and can jeopardize the entire process if conducted incorrectly. While exploratory data analysis may tolerate suboptimal protocols, clinical-grade molecular signatures are subject to vastly stricter requirements. Closing the gap between standards for exploratory versus clinically successful molecular signatures entails a thorough understanding of possible biases in the data analysis phase and developing strategies to avoid them. Using a recently introduced data-analytic protocol as a case study, we provide an in-depth examination of the poorly studied biases of data-analytic protocols related to signature multiplicity, biomarker redundancy, data preprocessing, and validation of signature reproducibility. The methodology and results presented in this work are aimed at expanding the understanding of these data-analytic biases that affect development of clinically robust molecular signatures. Several recommendations follow from the current study. First, all molecular signatures of a phenotype should be extracted to the extent possible, in order to provide comprehensive and accurate grounds for understanding disease pathogenesis. Second, redundant genes should generally be removed from final signatures to facilitate reproducibility and decrease manufacturing costs. Third, data preprocessing procedures should be designed so as not to bias biomarker selection. Finally, molecular signatures developed and applied on different phenotypes and populations of patients should be treated with great caution.

    Using gene expression profiles from peripheral blood to identify asymptomatic responses to acute respiratory viral infections

    Background: A recent study reported that gene expression profiles from peripheral blood samples of healthy subjects prior to viral inoculation were indistinguishable from profiles of subjects who received viral challenge but remained asymptomatic and uninfected. If true, this implies that the host immune response does not have a molecular signature. Given the high sensitivity of microarray technology, we were intrigued by this result and hypothesized that it was an artifact of data analysis. Findings: Using acute respiratory viral challenge microarray data, we developed a molecular signature that for the first time allowed for an accurate differentiation between uninfected subjects prior to viral inoculation and subjects who remained asymptomatic after the viral challenge. Conclusions: Our findings suggest that molecular signatures can be used to characterize immune responses to viruses and may improve our understanding of susceptibility to viral infection, with possible implications for vaccine development.

    Systemic inflammatory response syndrome in adult patients with nosocomial bloodstream infections due to enterococci

    BACKGROUND: Enterococci are the third leading cause of nosocomial bloodstream infection (BSI). Vancomycin-resistant enterococci (VRE) are common and pose treatment challenges; however, questions remain about VRE's pathogenicity and its direct clinical impact. This study analyzed the inflammatory response of enterococcal BSI, contrasting infections from vancomycin-resistant and vancomycin-susceptible (VSE) isolates. METHODS: We performed a historical cohort study on 50 adults with enterococcal BSI to evaluate the associated systemic inflammatory response syndrome (SIRS) and mortality. We examined SIRS scores 2 days prior through 14 days after the first positive blood culture. Vancomycin-resistant (n = 17) and vancomycin-susceptible infections (n = 33) were compared. Variables significant in univariate analysis were entered into a logistic regression model to determine the effect on mortality. RESULTS: 60% of BSI were caused by E. faecalis and 34% by E. faecium. 34% of the isolates were vancomycin resistant. Mean APACHE II (A2) score on the day of BSI was 16. Appropriate antimicrobials were begun within 24 hours in 52%. Septic shock occurred in 62% and severe sepsis in an additional 18%. Incidence of organ failure was as follows: respiratory 42%, renal 48%, hematologic 44%, hepatic 26%. Crude mortality was 48%. Progression to septic shock was associated with death (OR 14.9, p < .001). There was no difference in A2 scores on days -2, -1 and 0 between the VRE and VSE groups. Maximal SIR (severe sepsis, septic shock or death) was seen on day 2 for VSE BSI vs. day 8 for VRE. No significant difference was noted in the incidence of organ failure, 7-day or overall mortality between the two groups. Univariate analysis revealed that A2 > 18 at BSI onset, and respiratory, cardiovascular, renal, hematologic and hepatic failure were associated with death, but time to appropriate therapy > 24 hours, age, and infection due to VRE were not. 
Multivariate analysis revealed that hematologic (OR 8.4, p = .025) and cardiovascular failure (OR 7.5, p = .032) independently predicted death. CONCLUSION: In patients with enterococcal BSI, (1) the incidence of septic shock and organ failure is high, (2) patients with VRE BSI are not more acutely ill prior to infection than those with VSE BSI, and (3) the development of hematologic or cardiovascular failure independently predicts death.

    Integrating Factor Analysis and a Transgenic Mouse Model to Reveal a Peripheral Blood Predictor of Breast Tumors

    Background: Transgenic mouse tumor models have the advantage of facilitating controlled in vivo oncogenic perturbations in a common genetic background. This provides an idealized context for generating transcriptome-based diagnostic models while minimizing the inherent noisiness of high-throughput technologies. However, the question remains whether models developed in such a setting are suitable prototypes for useful human diagnostics. We show that latent factor modeling of the peripheral blood transcriptome in a mouse model of breast cancer provides the basis for using computational methods to link a mouse model to a prototype human diagnostic based on a common underlying biological response to the presence of a tumor. Methods: We used gene expression data from mouse peripheral blood cell (PBC) samples to identify significantly differentially expressed genes using supervised classification and sparse ANOVA. We employed these transcriptome data as the starting point for developing a breast tumor predictor from human peripheral blood mononuclear cells (PBMCs) by using a factor modeling approach. Results: The predictor distinguished breast cancer patients from healthy individuals in a cohort of patients independent from that used to build the factors and train the model, with 89% sensitivity, 100% specificity and an area under the curve (AUC) of 0.97, using Youden's J-statistic to objectively select the model's classification threshold. Both permutation testing of the model and evaluating the model strategy by swapping the training and validation sets highlight its stability. Conclusions: We describe a human breast tumor predictor based on the gene expression of mouse PBCs. This strategy overcomes many of the limitations of earlier studies by using the model system to reduce noise and identify transcripts associated with the presence of a breast tumor over other potentially confounding factors. 
Our results serve as a proof-of-concept for using an animal model to develop a blood-based diagnostic and establish an experimental framework for identifying predictors of solid tumors, not only in the context of breast cancer, but also in other types of cancer.
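
The threshold selection mentioned in this abstract relies on Youden's J-statistic, defined as sensitivity + specificity - 1. The following is a minimal sketch of that selection rule on toy data. The function name `youden_threshold`, the example scores, and the convention of calling a sample positive when its score is at or above the threshold are assumptions for illustration, not details taken from the study.

```python
import numpy as np

def youden_threshold(scores, labels):
    """Return the score threshold that maximizes J = sensitivity + specificity - 1."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_threshold, best_j = None, -np.inf
    for t in np.unique(scores):               # candidate thresholds: observed scores
        predicted = scores >= t                # positive if score is at or above t
        sensitivity = np.mean(predicted[labels])
        specificity = np.mean(~predicted[~labels])
        j = sensitivity + specificity - 1.0
        if j > best_j:                         # keep the first threshold with the largest J
            best_threshold, best_j = t, j
    return best_threshold, best_j

# Toy, made-up predictor scores: three healthy (0) and three tumor (1) samples.
example_scores = [0.1, 0.2, 0.3, 0.4, 0.8, 0.9]
example_labels = [0, 0, 0, 1, 1, 1]
threshold, j_stat = youden_threshold(example_scores, example_labels)
```

On this perfectly separable toy set the rule picks the lowest tumor score as the threshold, which gives J = 1.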

    Systemic Signature of the Lung Response to Respiratory Syncytial Virus Infection

    Respiratory Syncytial Virus (RSV) is a frequent cause of severe bronchiolitis in children. To improve our understanding of systemic host responses to RSV, we compared BALB/c mouse gene expression responses at days 1, 2, and 5 during primary RSV infection in lung, bronchial lymph nodes, and blood. We identified a set of 53 interferon-associated and innate immunity genes that give correlated responses in all three murine tissues. Additionally, we identified blood gene signatures that are indicative of acute infection, secondary immune response, and vaccine-enhanced disease, respectively. Eosinophil-associated ribonucleases were characteristic of the vaccine-enhanced disease blood signature. These results indicate that it may be possible to distinguish protective and unfavorable patient lung responses via blood diagnostics.