66 research outputs found

    Studies on the relationships between oligonucleotide probe properties and hybridization signal intensities

    Microarray technology is a commonly used tool in biomedical research for assessing global gene expression, surveying DNA sequence variations, and studying alternative gene splicing. Given the wide range of applications of this technology, a comprehensive understanding of its underlying mechanisms is important. The focus of this work is on the contributions of microarray probe properties (probe secondary structure, ΔGss; probe-target binding energy, ΔG; and probe-target mismatch) to the signal intensity. The benefits of incorporating or ignoring these properties in microarray probe design and selection, as well as in microarray data preprocessing and analysis, are reported. Four related studies are described in this thesis. In the first, probe secondary structure was found to account for up to 3% of all variation on Affymetrix microarrays. In the second, a dinucleotide affinity model was developed and found to enhance the detection of differentially expressed genes when implemented as a background correction procedure in GeneChip preprocessing algorithms. This model is consistent with physical models of binding affinity of the probe-target pair, which depends on the nearest-neighbor stacking interactions in addition to base-pairing. The remaining studies demonstrate the importance of incorporating biophysical factors in both the design and the analysis of microarrays: ‘percent bound’, predicted by equilibrium models of hybridization, is a useful factor in predicting and assessing the behavior of long oligonucleotide probes. However, a universal probe-property-independent three-parameter Langmuir model has also been tested, and this simple model has been shown to be as effective as, or more effective than, complex, computationally expensive models developed for microarray target concentration estimation.
The simple, platform-independent model can equal or even outperform models that explicitly incorporate probe properties, such as the model incorporating probe percent bound developed in Chapter Three. This suggests that with a “spiked-in” concentration series targeting as few as 5-10 genes, reliable estimation of target concentration can be achieved for the entire microarray.

    Background correction using dinucleotide affinities improves the performance of GCRMA

    BACKGROUND: High-density short oligonucleotide microarrays are a primary research tool for assessing global gene expression. Background noise on microarrays comprises a significant portion of the measured raw data, which can have serious implications for the interpretation of the generated data if not estimated correctly. RESULTS: We introduce an approach to calculate probe affinity based on sequence composition, incorporating nearest-neighbor (NN) information. Our model uses position-specific dinucleotide information, instead of the original single nucleotide approach, and adds up to 10% to the total variance explained (R^2) when compared to the previously published model. We demonstrate that correcting for background noise using this approach enhances the performance of the GCRMA preprocessing algorithm when applied to control datasets, especially for detecting low intensity targets. CONCLUSION: Modifying the previously published position-dependent affinity model to incorporate dinucleotide information significantly improves the performance of the model. The dinucleotide affinity model enhances the detection of differentially expressed genes when implemented as a background correction procedure in GeneChip preprocessing algorithms. This is conceptually consistent with physical models of binding affinity, which depend on the nearest-neighbor stacking interactions in addition to base-pairing.
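
As a rough illustration of what "position-specific dinucleotide information" could look like in practice, the sketch below scores a probe by summing one weight per (position, dinucleotide) pair. The function names, the example probe, and the weight values are hypothetical and are not taken from the paper; in a real model the weights would be estimated from data.

```python
def dinucleotide_features(probe):
    """Return (position, dinucleotide) pairs for every adjacent base pair.

    A 25-mer probe yields 24 such pairs.
    """
    probe = probe.upper()
    if set(probe) - set("ACGT"):
        raise ValueError("probe must contain only A, C, G, T")
    return [(i, probe[i:i + 2]) for i in range(len(probe) - 1)]


def affinity(probe, weights):
    """Sum the position-specific dinucleotide weights for a probe.

    `weights` maps (position, dinucleotide) to a numeric contribution;
    pairs missing from the mapping contribute 0.
    """
    return sum(weights.get(feature, 0.0) for feature in dinucleotide_features(probe))


# Hypothetical 25-mer and toy weights, for illustration only.
probe = "ACGTACGTACGTACGTACGTACGTA"
toy_weights = {(0, "AC"): 0.5, (1, "CG"): -0.2, (5, "CG"): 0.3}
score = affinity(probe, toy_weights)
```

Keying the weights on both position and dinucleotide is what separates this from a single-nucleotide model, where each weight would depend on one base and its position only.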

    MedZIM: Mediation analysis for Zero-Inflated Mediators with applications to microbiome data

    The human microbiome can contribute to the pathogenesis of many complex diseases such as cancer and Alzheimer's disease by mediating disease-leading causal pathways. However, standard mediation analysis is not adequate in the context of microbiome data due to the excessive number of zero values in the data. Zero-valued sequencing reads, commonly observed in microbiome studies, arise for technical and/or biological reasons. Mediation analysis approaches for analyzing zero-inflated mediators are still lacking largely because of challenges raised by the zero-inflated data structure: (a) disentangling the mediation effect induced by the point mass at zero; and (b) identifying the observed zero-valued data points that are actually not zero (i.e., false zeros). We develop a novel mediation analysis method under the potential-outcomes framework to fill this gap. We show that the mediation effect of the microbiome can be decomposed into two components that are inherent to the two-part nature of zero-inflated distributions. The first component corresponds to the mediation effect attributable to a unit-change over the positive relative abundance and the second component corresponds to the mediation effect attributable to discrete binary change of the mediator from zero to a non-zero state. With probabilistic models to account for observing zeros, we also address the challenge with false zeros. A comprehensive simulation study and the applications in two real microbiome studies demonstrate that our approach outperforms existing mediation analysis approaches. Comment: Corresponding: Zhigang L
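
To make the zero-inflated data structure concrete, here is a minimal simulation sketch. It is not the MedZIM estimator. It assumes a mediator that is exactly zero with some probability (a structural zero) and otherwise follows a Beta distribution, and it treats positive values below a detection limit as observed zeros (false zeros). The distribution, the parameter values, and all names are assumptions chosen for illustration.

```python
import random


def simulate_mediator(n, p_structural_zero, detection_limit, seed=0):
    """Simulate true and observed relative abundances for n samples."""
    rng = random.Random(seed)
    true_values, observed = [], []
    for _ in range(n):
        if rng.random() < p_structural_zero:
            value = 0.0  # point mass at zero: taxon truly absent
        else:
            value = rng.betavariate(1.0, 20.0)  # positive relative abundance
        true_values.append(value)
        # Positive values below the detection limit are recorded as zero.
        observed.append(value if value >= detection_limit else 0.0)
    return true_values, observed


true_values, observed = simulate_mediator(5000, 0.3, 0.005)
structural_zeros = sum(v == 0.0 for v in true_values)
observed_zeros = sum(v == 0.0 for v in observed)
false_zeros = observed_zeros - structural_zeros
```

In this toy setup the observed zero count always includes the structural zeros plus any false zeros, which is why an analysis that treats every observed zero as a true absence would overstate the point mass at zero.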

    Structural Insights into Endobiotic Reactivation by Human Gut Microbiome-Encoded Sulfatases

    Phase II drug metabolism inactivates xenobiotics and endobiotics through the addition of either a glucuronic acid or sulfate moiety prior to excretion, often via the gastrointestinal tract. While the human gut microbial β-glucuronidase enzymes that reactivate glucuronide conjugates in the intestines are becoming well characterized and even controlled by targeted inhibitors, the sulfatases encoded by the human gut microbiome have not been comprehensively examined. Gut microbial sulfatases are poised to reactivate xenobiotics and endobiotics, which are then capable of undergoing enterohepatic recirculation or exerting local effects on the gut epithelium. Here, using protein structure-guided methods, we identify 728 distinct microbiome-encoded sulfatase proteins from the 4.8 million unique proteins present in the Human Microbiome Project Stool Sample database and 1766 gut microbial sulfatases from the 9.9 million sequences in the Integrated Gene Catalogue. We purify a representative set of these sulfatases, elucidate crystal structures, and pinpoint unique structural motifs essential to endobiotic sulfate processing. Gut microbial sulfatases differentially process sulfated forms of the neurotransmitters serotonin and dopamine, and the hormones melatonin, estrone, dehydroepiandrosterone, and thyroxine in a manner dependent both on variabilities in active site architecture and on markedly distinct oligomeric states. Taken together, these data provide initial insights into the structural and functional diversity of gut microbial sulfatases, providing a path toward defining the roles these enzymes play in health and disease

    Stochastic changes over time and not founder effects drive cage effects in microbial community assembly in a mouse model

    Maternal transmission and cage effects are powerful confounding factors in microbiome studies. To assess the consequences of cage microenvironment on the mouse gut microbiome, two groups of germ-free (GF) wild-type (WT) mice, one gavaged with a microbiota harvested from adult WT mice and another allowed to acquire the microbiome from the cage microenvironment, were monitored using Illumina 16S rRNA sequencing over a period of 8 weeks. Our results revealed that cage effects in WT mice moved from GF to specific pathogen free (SPF) conditions take several weeks to develop and are not eliminated by the initial gavage treatment. Initial gavage influenced, but did not eliminate, a successional pattern in which Proteobacteria became less abundant over time. An analysis in which 16S rRNA sequences are mapped to the closest sequenced whole genome suggests that the functional potential of microbial genomes changes significantly over time, shifting from an emphasis on pathogenesis and motility early in community assembly to metabolic processes at later time points. Functionally, mice allowed to naturally acquire a microbial community from their cage, but not mice gavaged with a common biome, exhibit a cage effect in Dextran Sulfate Sodium-induced inflammation. Our results argue that while there are long-term effects of the founding community, these effects are mitigated by cage microenvironment and successional community assembly over time, which must both be explicitly considered in the interpretation of microbiome mouse experiments.

    Microbial genomic analysis reveals the essential role of inflammation in bacteria-induced colorectal cancer

    Enterobacteria, especially Escherichia coli, are abundant in patients with inflammatory bowel disease or colorectal cancer (CRC). However, it is unclear whether cancer is promoted by inflammation-induced expansion of E. coli and/or changes in expression of specific microbial genes. Here we use longitudinal (2, 12 and 20 weeks) 16S rRNA sequencing of luminal microbiota from ex-germ free mice to show that inflamed Il10−/− mice maintain a higher abundance of Enterobacteriaceae than healthy wild-type mice. Experiments with mono-colonized Il10−/− mice reveal that host inflammation is necessary for E. coli cancer-promoting activity. RNA-sequence analysis indicates significant changes in the E. coli gene catalogue in Il10−/− mice, with changes mostly driven by adaptation to the intestinal environment. Expression of specific genes present in the tumor-promoting E. coli pks island is modulated by inflammation/CRC development. Thus, progression of inflammation in Il10−/− mice supports Enterobacteriaceae and alters a small subset of microbial genes important for tumor development.

    VSL#3 probiotic modifies mucosal microbial composition but does not reduce colitis-associated colorectal cancer

    Although probiotics have shown success in preventing the development of experimental colitis-associated colorectal cancer (CRC), beneficial effects of interventional treatment are relatively unknown. Here we show that interventional treatment with VSL#3 probiotic alters the luminal and mucosally-adherent microbiota, but does not protect against inflammation or tumorigenesis in the azoxymethane (AOM)/Il10−/− mouse model of colitis-associated CRC. VSL#3 (10^9 CFU/animal/day) significantly enhanced tumor penetrance, multiplicity, histologic dysplasia scores, and adenocarcinoma invasion relative to VSL#3-untreated mice. Illumina 16S sequencing demonstrated that VSL#3 significantly decreased (16-fold) the abundance of a bacterial taxon assigned to genus Clostridium in the mucosally-adherent microbiota. Mediation analysis by linear models suggested that this taxon was a contributing factor to increased tumorigenesis in VSL#3-fed mice. We conclude that VSL#3 interventional therapy can alter microbial community composition and enhance tumorigenesis in the AOM/Il10−/− model.

    Accurate Estimates of Microarray Target Concentration from a Simple Sequence-Independent Langmuir Model

    Background: Microarray technology is a commonly used tool for assessing global gene expression. Many models for estimation of target concentration based on observed microarray signal have been proposed, but, in general, these models have been complex and platform-dependent. Principal Findings: We introduce a universal Langmuir model for estimation of absolute target concentration from microarray experiments. We find that this sequence-independent model, characterized by only three free parameters, yields excellent predictions for four microarray platforms, including Affymetrix, Agilent, Illumina and a custom-printed microarray. The model also accurately predicts concentration for the MAQC data sets. This approach significantly reduces the computational complexity of quantitative target concentration estimates. Conclusions: Using a simple form of the Langmuir isotherm model, with a minimum of parameters and assumptions, and without explicit modeling of individual probe properties, we were able to recover absolute transcript concentrations with high R^2 on four different array platforms. The results obtained here suggest that with a “spiked-in” concentration serie
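
The abstract does not state the model's functional form, so the sketch below assumes a common three-parameter Langmuir isotherm: signal = background + amplitude * c / (c + k_half). It fits the three parameters to a spike-in series and then inverts the curve to estimate concentration from signal. All names and the spike-in values are hypothetical, and the grid-search fit is a simple stand-in, not the authors' fitting procedure.

```python
def langmuir(conc, background, amplitude, k_half):
    """Signal predicted for target concentration `conc` (assumed form)."""
    return background + amplitude * conc / (conc + k_half)


def invert_langmuir(signal, background, amplitude, k_half):
    """Concentration implied by `signal`; valid only below saturation."""
    net = signal - background
    if not 0 <= net < amplitude:
        raise ValueError("signal outside the invertible range of the model")
    return k_half * net / (amplitude - net)


def fit_langmuir(concs, signals, k_grid):
    """Grid-search k_half; for each candidate, solve background and
    amplitude by ordinary least squares (the model is linear in both)."""
    best = None
    n = len(concs)
    for k in k_grid:
        x = [c / (c + k) for c in concs]
        mean_x, mean_y = sum(x) / n, sum(signals) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, signals))
        amplitude = sxy / sxx
        background = mean_y - amplitude * mean_x
        sse = sum((background + amplitude * xi - yi) ** 2
                  for xi, yi in zip(x, signals))
        if best is None or sse < best[0]:
            best = (sse, background, amplitude, k)
    return best[1:]


# Hypothetical noiseless spike-in series generated from known parameters.
spike_concs = [0.25, 0.5, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
true_params = (100.0, 20000.0, 50.0)  # background, amplitude, k_half
spike_signals = [langmuir(c, *true_params) for c in spike_concs]

k_grid = [10 ** (e / 20) for e in range(-20, 81)]  # log-spaced, 0.1 to 10^4
bg, amp, k = fit_langmuir(spike_concs, spike_signals, k_grid)
estimate = invert_langmuir(langmuir(10.0, *true_params), bg, amp, k)
```

Because the same three parameters are applied to every probe, a calibration fitted on a small spike-in series can be used to convert any probe's signal to a concentration estimate, provided the signal lies between the fitted background and saturation.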

    SmartSet Virtual Studio Solution: Validation Phase Test Results

    The vision of the SmartSet project is to develop a low-cost virtual studio solution that, despite costing one-tenth as much as comparable solutions on the market, will match the quality of the high-cost solutions currently used by larger broadcast media companies, while offering simple and limited functionality. The project will increase the competitiveness of the European creative industries, particularly in the broadcast media sector. The SmartSet project objectives include mapping and prioritising the user requirements for the virtual studio solution to be developed. This report is based on the user consultation process with the end users and stakeholders of the SmartSet project to determine the functionality requirements for product development and integration. The research set out to detail a range of user requirements which will feed into the virtual studio specification.