19 research outputs found

    The statistics of identifying differentially expressed genes in Expresso and TM4: a comparison

    BACKGROUND: Analysis of DNA microarray data takes as input spot intensity measurements from scanner software and returns differential expression of genes between two conditions, together with a statistical significance assessment. This process typically consists of two steps: data normalization and identification of differentially expressed genes through statistical analysis. The Expresso microarray experiment management system implements these steps with a two-stage, log-linear ANOVA mixed model technique, tailored to individual experimental designs. The complement of tools in TM4, on the other hand, is based on a number of preset design choices that limit its flexibility. In the TM4 microarray analysis suite, normalization, filter, and analysis methods form an analysis pipeline. TM4 computes integrated intensity values (IIV) from the average intensities and spot pixel counts returned by the scanner software as input to its normalization steps. By contrast, Expresso can use either IIV data or median intensity values (MIV). Here, we compare Expresso and TM4 analysis of two experiments and assess the results against qRT-PCR data. RESULTS: The Expresso analysis using MIV data consistently identifies more genes as differentially expressed, when compared to Expresso analysis with IIV data. The typical TM4 normalization and filtering pipeline corrects systematic intensity-specific bias on a per microarray basis. Subsequent statistical analysis with Expresso or a TM4 t-test can effectively identify differentially expressed genes. The best agreement with qRT-PCR data is obtained through the use of Expresso analysis and MIV data. CONCLUSION: The results of this research are of practical value to biologists who analyze microarray data sets. The TM4 normalization and filtering pipeline corrects microarray-specific systematic bias and complements the normalization stage in Expresso analysis. 
The results of Expresso using MIV data have the best agreement with qRT-PCR results. In one experiment, MIV is a better choice than IIV as input to data normalization and statistical analysis methods, as it yields a greater number of statistically significant differentially expressed genes; TM4 does not support the choice of MIV input data. Overall, the more flexible and extensive statistical models of Expresso achieve more accurate analytical results, when judged by the yardstick of qRT-PCR data, in the context of an experimental design of modest complexity.
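The difference between the two input types compared above can be illustrated with a minimal sketch. It assumes that an integrated intensity value is the mean spot intensity multiplied by the spot pixel count, which is a common reading of the abstract's description but is not confirmed by it; the function names and the example numbers are hypothetical.

```python
import numpy as np

def integrated_intensity(mean_intensity, pixel_count):
    """Assumed IIV definition: mean spot intensity times number of spot pixels."""
    return np.asarray(mean_intensity, dtype=float) * np.asarray(pixel_count, dtype=float)

def log2_ratio(channel_a, channel_b):
    """Log2 expression ratio between two channels (or two conditions)."""
    return np.log2(np.asarray(channel_a, dtype=float) / np.asarray(channel_b, dtype=float))

# Hypothetical scanner output for a single spot in two channels.
iiv_a = integrated_intensity(mean_intensity=1000.0, pixel_count=50)
iiv_b = integrated_intensity(mean_intensity=500.0, pixel_count=50)
ratio_from_iiv = log2_ratio(iiv_a, iiv_b)

# Median intensity values (MIV) are used directly, without the pixel count.
ratio_from_miv = log2_ratio(900.0, 450.0)
```

Under this assumption, IIV depends on the segmented spot area as well as on brightness, whereas MIV does not. This is only one possible reason the two inputs could lead to different sets of significant genes.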

    Resampling-based tests of functional categories in gene expression studies

    DNA microarrays allow researchers to measure the coexpression of thousands of genes, and are commonly used to identify changes in expression either across experimental conditions or in association with some clinical outcome. With increasing availability of gene annotation, researchers have begun to ask global questions of functional genomics that explore the interactions of genes in cellular processes and signaling pathways. A common hypothesis test for gene categories is constructed as a post hoc analysis performed once a list of significant genes is identified, using classically derived tests for 2x2 contingency tables. We note several drawbacks to this approach including the violation of an independence assumption by the correlation in expression that exists among genes. To test gene categories in a more appropriate manner, we propose a flexible, permutation-based framework, termed SAFE (for Significance Analysis of Function and Expression). SAFE is a two-stage approach, whereby gene-specific statistics are calculated for the association between expression and the response of interest and then a global statistic is used to detect a shift within a gene category to more extreme associations. Significance is assessed by repeatedly permuting whole arrays whereby the correlation between all genes is held constant and accounted for. This permutation scheme also preserves the relatedness of categories containing overlapping genes, such that error rate estimates can be readily obtained for multiple dependent tests. Through a detailed survey of gene category tests and simulations based on real microarray data, we demonstrate how SAFE generates appropriate Type I error rates as compared to other methods. Under a more rigorously defined null hypothesis, permutation-based tests of gene categories are shown to be conservative by inducing a special case with a maximum variance for the test statistic.
A bootstrap-based approach to hypothesis testing is incorporated into the SAFE framework providing better coverage and improved power under a defined class of alternatives. Lastly, we extend the SAFE framework to consider gene categories in a probabilistic manner. This allows for a hypothesis test of co-regulation, using models of transcription factor binding sites to score for the presence of motifs in the upstream regions of genes.
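The two-stage, array-permuting scheme described in this abstract can be sketched roughly as follows. This is an illustration of the general idea only, not the authors' SAFE implementation: the Welch-type local statistic, the mean-rank global statistic, and the function name `safe_permutation_test` are assumptions chosen for brevity.

```python
import numpy as np

def safe_permutation_test(expr, labels, in_category, n_perm=1000, seed=0):
    """Permutation test of one gene category.

    expr        : genes x arrays expression matrix
    labels      : 0/1 group label per array (the response of interest)
    in_category : boolean mask per gene marking category membership
    Returns (observed global statistic, permutation p-value).
    """
    rng = np.random.default_rng(seed)
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(labels)
    in_category = np.asarray(in_category, dtype=bool)

    def local_stats(lab):
        # Stage 1: gene-specific association, here an absolute Welch t-like statistic.
        a, b = expr[:, lab == 1], expr[:, lab == 0]
        se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1]
                     + b.var(axis=1, ddof=1) / b.shape[1])
        return np.abs(a.mean(axis=1) - b.mean(axis=1)) / se

    def global_stat(local):
        # Stage 2: mean rank of category genes; large values indicate a shift
        # of the category toward more extreme associations.
        ranks = local.argsort().argsort() + 1
        return ranks[in_category].mean()

    observed = global_stat(local_stats(labels))
    exceed = 0
    for _ in range(n_perm):
        # Shuffling the labels reassigns whole arrays to groups, so each
        # array's expression vector (and the gene-gene correlation) stays intact.
        permuted = rng.permutation(labels)
        if global_stat(local_stats(permuted)) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)
```

Because whole arrays are reassigned rather than individual genes, the same permutations could be reused across overlapping categories, which is what allows error rates to be estimated for multiple dependent tests.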

    Whole-Genome Analysis of the SHORT-ROOT Developmental Pathway in Arabidopsis

    Stem cell function during organogenesis is a key issue in developmental biology. The transcription factor SHORT-ROOT (SHR) is a critical component in a developmental pathway regulating both the specification of the root stem cell niche and the differentiation potential of a subset of stem cells in the Arabidopsis root. To obtain a comprehensive view of the SHR pathway, we used a statistical method called meta-analysis to combine the results of several microarray experiments measuring the changes in global expression profiles after modulating SHR activity. Meta-analysis was first used to identify the direct targets of SHR by combining results from an inducible form of SHR driven by its endogenous promoter, ectopic expression, followed by cell sorting and comparisons of mutant to wild-type roots. Eight putative direct targets of SHR were identified, all with expression patterns encompassing subsets of the native SHR expression domain. Further evidence for direct regulation by SHR came from binding of SHR in vivo to the promoter regions of four of the eight putative targets. A new role for SHR in the vascular cylinder was predicted from the expression pattern of several direct targets and confirmed with independent markers. The meta-analysis approach was then used to perform a global survey of the SHR indirect targets. Our analysis suggests that the SHR pathway regulates root development not only through a large transcription regulatory network but also through hormonal pathways and signaling pathways using receptor-like kinases. Taken together, our results not only identify the first nodes in the SHR pathway and a new function for SHR in the development of the vascular tissue but also reveal the global architecture of this developmental pathway.

    Probe Level Analysis of Affymetrix Microarray Data

    The analysis of Affymetrix GeneChip® data is a complex, multistep process. Most often, methods condense the multiple probe level intensities into single probeset level measures (such as Robust Multi-chip Average (RMA), dChip and Microarray Suite version 5.0 (MAS5)), which are then followed by application of statistical tests to determine which genes are differentially expressed. An alternative approach is a probe-level analysis, which tests for differential expression directly using the probe-level data. Probe-level models offer the potential advantage of more accurately capturing sources of variation in microarray experiments. However, this has not been thoroughly investigated, since current research efforts have largely focused on the development of improved expression summary methods. This research project will review current approaches to analysis of probe-level data and discuss extensions of two examples, the S-Score and the Random Variance Model (RVM). The S-Score is a probe-level algorithm based on an error model in which the detected signal is proportional to the probe pair signal for highly expressed genes, but approaches a background level (rather than 0) for genes with low levels of expression. Initial results with the S-Score have been promising, but the method has been limited to two-chip comparisons. This project presents extensions to the S-Score that permit comparisons of multiple chips and borrowing of information across probes to increase statistical power. The RVM is a probeset-level algorithm that models the variance of the probeset intensities as a random sample from a common distribution to borrow information across genes. This project presents extensions to the RVM for probe-level data, using multivariate statistical theory to model the covariance among probes in a probeset. Both of these methods show the advantages of probe-level, rather than probeset-level, analysis in detecting differential gene expression for Affymetrix GeneChip data.
Future research will focus on refining the probe-level models of both the S-Score and RVM algorithms to increase the sensitivity and specificity of microarray experiments.

    Characterization of the rat Atg16l1 gene and its role in autophagy and disease

    Crohn's disease (CD) is a potentially life-threatening inflammatory condition of the gastrointestinal tract affecting approximately 1.6 million Americans. Previous studies confirm the important role of genetics in inflammatory bowel disease (IBD). More than 160 genetic alleles have been linked to CD, one of which lies in the autophagy-related 16-like 1 (ATG16L1) gene. In humans, a threonine to alanine amino acid variant at position 300 (T300A) of the evolutionarily conserved ATG16L1 protein is correlated with increased predisposition to CD. Using CRISPR-Cas9 technology, our laboratory developed the first reported genetically modified rat model of CD by inserting the T300A variant into the rat genome. An additional rat model with a knock-out mutation of the Atg16l1 gene was also developed to perform loss-of-function analyses. This dissertation research characterizes the wild type (WT) and T300A susceptibility variant Atg16l1 genes in the rat and examines the mechanistic function of rat Atg16l1 in autophagy. Prior to this research, the rat Atg16l1 gene had two known and one predicted splice variants; however, no further characterization of this gene had been done in wild type animals. Through collection and amplification of DNA from select rat tissues, we confirmed four splice variants and revealed that they exist in different combinations depending on the tissue. In addition, in vitro work revealed all splice variants could produce protein. Additional phenotypic characterization found that, like non-diseased intestinal tissue from human CD patients, Paneth cells exhibited abnormal granulation patterns. From this study we were able to determine that the T300A rat model faithfully recapitulates pre-disease signs seen in humans. However, in order to address the usefulness of the model, we began to explore methods to incite CD signs.
Very few studies evaluating the effect of known environmental triggers of CD on specific genetic susceptibility variants have been performed. I developed two exposure studies, one acute using a high-dose nonsteroidal anti-inflammatory drug (NSAID) and one chronic using either low-dose NSAID or ad libitum Western diet formulated rodent feed. These studies confirmed that rats heterozygous (HET) for the T300A variant are more susceptible to NSAID toxicity than WT littermates, and our model does express mild histologic changes comparable to CD lesions in human CD patients as compared to WT littermates when exposed to low-dose NSAID or Western diet. These studies support the T300A rat model as a valuable tool for both acute and chronic environmental studies of IBD. In addition to animal model studies, we also performed in vitro work to evaluate the effect of each WT rat splice variant on autophagy. By transfecting HEK293 cells with one of each of the four WT rat variants, we have begun to understand how each Atg16l1 variant affects autophagic flux. This information is crucial to understanding the underlying mechanism of autophagy and how autophagy functions in different tissues of the body. This research sets the foundation for using the Atg16l1 T300A rat model in CD research and will help elucidate the role of Atg16l1 in autophagy. A better understanding of the T300A variant and the influence of Atg16l1 on autophagy will facilitate the identification of future therapeutic targets for CD and other autophagy-related diseases. Includes bibliographical references (pages 112-128).

    Regulators of growth plate maturation

    Estrogen is known to play an important role in longitudinal bone growth and growth plate maturation, but the mechanism by which estrogens exert their effect is not fully understood. In this thesis this role is further explored. Chapter 1 contains a general introduction to longitudinal bone growth and the regulation of the growth plate in respect to relevant topics further studied in this thesis. Estrogen can act through a genomic or a nongenomic pathway. Both pathways are explored in rats at the onset of maturation in chapter 2. Estrogen stimulates VEGF expression in uterus and bone, which is an important growth factor for chondrocyte differentiation and chondrocyte survival in the growth plate. In chapter 3 the effect of estrogen on VEGF expression in the growth plate was studied in the rat and human growth plate. Another effect of estrogen is that it accelerates growth plate senescence. Senescence is one of the postulated intrinsic mechanisms by which the growth plate matures and finally fuses. In chapter 4 we investigated senescence in relation to proliferation, by investigating a cell cycle inhibitor p27Kip1. In animal models, catch-up growth is suggested to be caused by delayed growth plate senescence. In chapter 5 this hypothesis was further tested in humans. With puberty estrogen levels increase, the growth plate matures and at the end growth ceases with epiphyseal fusion through mechanisms not yet completely understood. In order to further explore growth plate maturation we subjected two growth plate tissues of the same patient, but with one year and one pubertal Tanner stage in between, to microarray analyses. Gene expression patterns and transcription factor binding sites in relation to pubertal maturation were studied in a longitudinal study within this single patient in chapter 6. In addition, we collected extra prepubertal and pubertal growth plate tissues and studied these samples with microarray techniques as well in chapter 7.
In chapter 8 the process of epiphyseal fusion and apoptosis was studied in human growth plates. Animal models are frequently used but not fully representative for the human growth plate. Therefore we investigated a promising human in vitro model with multipotent mesenchymal stem cells (MSCs) that can differentiate into chondrocytes. MSCs can be isolated from various tissues. In chapter 9 we investigated the chondrogenic potential of MSCs from different origins and in chapter 10 we compared this model with the epiphyseal growth plate by analyzing gene expression patterns and pathways with microarray analyses. Chapter 11 contains general conclusions and a discussion regarding the results. Department of Pediatrics (Afdeling kindergeneeskunde), Jurriaanse stichting, NVCB, Ipsen, Novo Nordisk, Greiner Bio-One, Ferring BV, Eli Lilly, Pfizer, Nutricia.

    The dopaminergic network and genetic susceptibility to schizophrenia

    Background: Schizophrenia is a disabling illness with unknown pathogenesis. Estimates of heritability suggest a substantial genetic contribution; however, genetic studies to date have been equivocal. Uncovering liability loci may therefore require analyses of functionally related genes. Rooted in this assumption, this dissertation describes a series of studies investigating a genetic epidemiological foundation for the commonly cited hypothesis suggesting dopaminergic dysfunction in schizophrenia pathogenesis, i.e. the 'dopamine hypothesis'. Studies: The initial study investigated DRD3 and identified novel associations across the gene. The second study considered a larger network of dopaminergic genes in two independent Caucasian samples, detecting replicated associations and epistatic interactions. This study proposed a risk model for schizophrenia centered on the dopamine transporter. Study #3 investigated a dopamine precursor, phenylalanine hydroxylase, in four independent samples, identifying a single SNP (rs1522305) that was significantly replicated in two samples. Study #4 was motivated by the hypothesis of a shared genetic etiology for schizophrenia and bipolar disorder. This study comprehensively evaluated the dopaminergic network, selecting 431 'tag' SNPs from 40 genes among large schizophrenia and bipolar cohorts contrasted with adult controls. Across all genes 60% of nominally significant schizophrenia risk factors were also associated with bipolar disorder. The results supported DRD3 variations as risk factors for both disorders, confirmed several previously reported associations, and proposed new targets for future research. Conclusion: These results suggest dopaminergic gene variations could play an etiological role in the pathogenesis of schizophrenia and possibly bipolar I disorder. Additional replicate studies are warranted. Public Health Significance: Schizophrenia (SZ) is devastating.
When the Global Burden of Disease study calculated disability adjusted life years, weighted for the severity of disability, they determined that active psychosis seen in schizophrenia produces disability equal to that of quadriplegia. Schizophrenia has been estimated to be among the top ten causes of disability worldwide. As schizophrenia is common (roughly 1% point prevalence worldwide), the economic burden to society is substantial. Pathogenesis is unknown and treatment is palliative. Therefore understanding the genetic etiology could facilitate development of promising therapeutics.

    Consumer risk perception with regards to food products
