962 research outputs found

    Differential expression analysis for sequence count data

    *Motivation:* High-throughput nucleotide sequencing provides quantitative readouts in assays for RNA expression (RNA-Seq), protein-DNA binding (ChIP-Seq) or cell counting (barcode sequencing). Statistical inference of differential signal in such data requires estimation of their variability throughout the dynamic range. When the number of replicates is small, error modelling is needed to achieve statistical power.

*Results:* We propose an error model that uses the negative binomial distribution, with variance and mean linked by local regression, to model the null distribution of the count data. The method controls type-I error and provides good detection power. 

*Availability:* A free open-source R software package, _DESeq_, is available from the Bioconductor project and from http://www-huber.embl.de/users/anders/DESeq
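To illustrate why a negative binomial error model matters for count data, the sketch below compares its variance with the Poisson variance at several mean values. It is not the DESeq implementation: the function name `nb_variance` and the constant dispersion `alpha` are assumptions made for illustration, whereas the abstract describes linking variance and mean by local regression.

```python
def nb_variance(mean, alpha):
    """Variance of a negative binomial count with the given mean.

    `alpha` is a hypothetical constant dispersion (an assumption of this
    sketch); alpha = 0 recovers the Poisson case, where variance == mean.
    """
    if mean < 0 or alpha < 0:
        raise ValueError("mean and alpha must be non-negative")
    return mean + alpha * mean ** 2


if __name__ == "__main__":
    # The variance-to-mean ratio grows with the mean when alpha > 0,
    # which a Poisson model (ratio fixed at 1) cannot capture.
    for mu in (10.0, 100.0, 1000.0):
        ratio = nb_variance(mu, alpha=0.1) / mu
        print(f"mean={mu:g}  variance/mean={ratio:g}")
```

With a positive dispersion, highly expressed genes show far more variance than a Poisson model would predict, which is why the variability has to be estimated across the whole dynamic range.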

    Multi-parametric flow cytometric and genetic investigation of the peripheral B cell compartment in human type 1 diabetes.

    The appearance of circulating islet-specific autoantibodies before disease diagnosis is a hallmark of human type 1 diabetes (T1D), and suggests a role for B cells in the pathogenesis of the disease. Alterations in the peripheral B cell compartment have been reported in T1D patients; however, to date, such studies have produced conflicting results and have been limited by sample size. In this study, we have performed a detailed characterization of the B cell compartment in T1D patients (n = 45) and healthy controls (n = 46), and assessed the secretion of the anti-inflammatory cytokine interleukin (IL)-10 in purified B cells from the same donors. Overall, we found no evidence for a profound alteration of the B cell compartment or in the production of IL-10 in peripheral blood of T1D patients. We also investigated age-related changes in peripheral B cell subsets and confirmed the sharp decrease with age of transitional CD19+ CD27- CD24hi CD38hi B cells, a subset that has recently been ascribed a putative regulatory function. Genetic analysis of the B cell compartment revealed evidence for association of the IL2-IL21 T1D locus with IL-10 production by both memory B cells (P = 6.4 × 10^-4) and islet-specific CD4+ T cells (P = 2.9 × 10^-3). In contrast to previous reports, we found no evidence for an alteration of the B cell compartment in healthy individuals homozygous for the non-synonymous PTPN22 Trp620 T1D risk allele (rs2476601; Arg620Trp). The IL2-IL21 association we have identified, if confirmed, suggests a novel role for B cells in T1D pathogenesis through the production of IL-10, and reinforces the importance of IL-10 production by autoreactive CD4+ T cells

    Mulcom: a multiple comparison statistical test for microarray data in Bioconductor

    Many microarray experiments compare a common control group with several "test" groups, as in the case, for example, of a time-course experiment where time zero serves as a common reference point. The MulCom package described here implements Dunnett's t-test, which was specifically developed to handle multiple comparisons against a common reference, in a version tailored for genomic data analysis that we named the MulCom (Multiple Comparisons) test. The implementation includes two test parameters, namely the t value and an optional minimal fold-change value, m, with automated, permutation-based estimation of the False Discovery Rate (FDR) for parameter combinations of choice. The package permits automated optimization of the test parameters to obtain the maximum number of significant genes at a given FDR value. In this vignette we present the rationale, implementation and usage of the MulCom package, plus a practical application on a time-course microarra
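The comparison against a common control can be sketched as a Dunnett-style statistic with a fold-change offset. This is an illustrative reading of the abstract, not the MulCom package's code: the function names, the way `m` is subtracted from the absolute mean difference, and the toy data are all assumptions, and the permutation-based FDR step is omitted.

```python
from math import sqrt
from statistics import mean


def pooled_mse(groups):
    """Within-group mean squared error pooled over all groups."""
    ss = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df = sum(len(g) for g in groups) - len(groups)
    return ss / df


def dunnett_like_stat(control, test, mse, m=0.0):
    """Dunnett-style statistic for one test group versus the control.

    `m` is a hypothetical minimal fold-change offset (log scale) that is
    subtracted from the absolute mean difference before scaling.
    """
    diff = abs(mean(test) - mean(control)) - m
    se = sqrt(mse * (1.0 / len(test) + 1.0 / len(control)))
    return diff / se


# Toy log-expression values for one gene: a control and two test groups.
control = [1.0, 2.0, 3.0]
test_a = [4.0, 5.0, 6.0]
test_b = [1.5, 2.5, 3.5]
mse = pooled_mse([control, test_a, test_b])
stat_a = dunnett_like_stat(control, test_a, mse, m=1.0)
stat_b = dunnett_like_stat(control, test_b, mse, m=1.0)
```

A gene would then be called significant in a given comparison when its statistic exceeds the chosen threshold t. Raising `m` penalises small mean differences even when they are estimated precisely.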

    Identifying differential exon splicing using linear models and correlation coefficients

    Background: With the availability of the Affymetrix exon arrays, a number of tools have been developed to enable the analysis. These, however, can be expensive or have several pre-installation requirements. This led us to develop an analysis workflow for analysing differential splicing using freely available software packages that are already widely used for gene expression analysis. The workflow uses the packages in the standard installation of R and Bioconductor (BiocLite) to identify differential splicing. We use the splice index method with the LIMMA framework. The main drawback with this approach is that it relies on accurate estimates of gene expression from the probe-level data. Methods such as RMA and PLIER may misestimate when a large proportion of exons are spliced. We therefore present the novel concept of a gene correlation coefficient calculated using only the probeset expression pattern within a gene. We show that genes with lower correlation coefficients are likely to be differentially spliced. Results: The LIMMA approach was used to identify several tissue-specific transcripts and splicing events that are supported by previous experimental studies. Filtering the data is necessary, particularly removing exons and genes that are not expressed in all samples and cross-hybridising probesets, in order to reduce the false positive rate. The LIMMA approach ranked genes containing single or few differentially spliced exons much higher than genes containing several differentially spliced exons. On the other hand, we found the gene correlation coefficient approach better for identifying genes with a large number of differentially spliced exons. Conclusion: We show that LIMMA can be used to identify differential exon splicing from Affymetrix exon array data.
Though further work would be necessary to develop the use of correlation coefficients into a complete analysis approach, the preliminary results demonstrate their usefulness for identifying differentially spliced genes. The two approaches are complementary, as they can potentially identify different subsets of genes (single/few spliced exons vs. large transcript structure differences)
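The splice index mentioned in the abstract is commonly defined as exon-level signal relative to gene-level signal, which on a log2 scale is a simple difference. The sketch below uses that common definition. The function names and toy values are assumptions, and it does not reproduce the authors' LIMMA workflow or their gene correlation coefficient.

```python
from statistics import mean


def splice_index(exon_log2, gene_log2):
    """Per-sample splice index: log2 exon signal minus log2 gene signal."""
    if len(exon_log2) != len(gene_log2):
        raise ValueError("exon and gene vectors must have the same length")
    return [e - g for e, g in zip(exon_log2, gene_log2)]


def splice_index_difference(si_a, si_b):
    """Difference in mean splice index between two sample groups."""
    return mean(si_a) - mean(si_b)


# Toy example: the gene is equally expressed in both tissues, but the exon
# signal is lower in tissue B, suggesting the exon is skipped there.
si_a = splice_index([8.0, 8.5, 7.5], [9.0, 9.5, 8.5])
si_b = splice_index([6.0, 6.5, 5.5], [9.0, 9.5, 8.5])
delta = splice_index_difference(si_a, si_b)
```

Because the index depends on the gene-level estimate, any error in that estimate propagates into every exon of the gene, which is the drawback the abstract points out.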

    A powerful method for detecting differentially expressed genes from GeneChip arrays that does not require replicates

    BACKGROUND: Studies of differential expression that use Affymetrix GeneChip arrays are often carried out with a limited number of replicates. Reasons for this include financial considerations and limits on the available amount of RNA for sample preparation. In addition, failed hybridizations are not uncommon leading to a further reduction in the number of replicates available for analysis. Most existing methods for studying differential expression rely on the availability of replicates and the demand for alternative methods that require few or no replicates is high. RESULTS: We describe a statistical procedure for performing differential expression analysis without replicates. The procedure relies on a Bayesian integrated approach (BGX) to the analysis of Affymetrix GeneChips. The BGX method estimates a posterior distribution of expression for each gene and condition, from a simultaneous consideration of the available probe intensities representing the gene in a condition. Importantly, posterior distributions of expression are obtained regardless of the number of replicates available. We exploit these posterior distributions to create ranked gene lists that take into account the estimated expression difference as well as its associated uncertainty. We estimate the proportion of non-differentially expressed genes empirically, allowing an informed choice of cut-off for the ranked gene list, adapting an approach proposed by Efron. We assess the performance of the method, and compare it to those of other methods, on publicly available spike-in data sets, as well as in a proper biological setting. CONCLUSION: The method presented is a powerful tool for extracting information on differential expression from GeneChip expression studies with limited or no replicates

    Phage inducible islands in the gram-positive cocci

    The SaPIs are a cohesive subfamily of extremely common phage-inducible chromosomal islands (PICIs) that reside quiescently at specific att sites in the staphylococcal chromosome and are induced by helper phages to excise and replicate. They are usually packaged in small capsids composed of phage virion proteins, giving rise to very high transfer frequencies, which they enhance by interfering with helper phage reproduction. As the SaPIs represent a highly successful biological strategy, with many natural Staphylococcus aureus strains containing two or more, we assumed that similar elements would be widespread in the Gram-positive cocci. On the basis of resemblance to the paradigmatic SaPI genome, we have readily identified large cohesive families of similar elements in the lactococci and pneumococci/streptococci plus a few such elements in Enterococcus faecalis. Based on extensive ortholog analyses, we found that the PICI elements in the four different genera all represent distinct but parallel lineages, suggesting that they represent convergent evolution towards a highly successful lifestyle. We have characterized in depth the enterococcal element, EfCIV583, and have shown that it very closely resembles the SaPIs in functionality as well as in genome organization, setting the stage for expansion of the study of elements of this type. In summary, our findings greatly broaden the PICI family to include elements from at least three genera of cocci

    A structurally distinct TGF-β mimic from an intestinal helminth parasite potently induces regulatory T cells

    Helminth parasites defy immune exclusion through sophisticated evasion mechanisms, including activation of host immunosuppressive regulatory T (Treg) cells. The mouse parasite Heligmosomoides polygyrus can expand the host Treg population by secreting products that activate TGF-β signalling, but the identity of the active molecule is unknown. Here we identify an H. polygyrus TGF-β mimic (Hp-TGM) that replicates the biological and functional properties of TGF-β, including binding to mammalian TGF-β receptors and inducing mouse and human Foxp3+ Treg cells. Hp-TGM has no homology with mammalian TGF-β or other members of the TGF-β family, but is a member of the complement control protein superfamily. Thus, our data indicate that through convergent evolution, the parasite has acquired a protein with cytokine-like function that is able to exploit an endogenous pathway of immunoregulation in the host

    Transcriptome analyses of mouse and human mammary cell subpopulations reveal multiple conserved genes and pathways

    INTRODUCTION: Molecular characterization of the normal epithelial cell types that reside in the mammary gland is an important step toward understanding pathways that regulate self-renewal, lineage commitment, and differentiation along the hierarchy. Here we determined the gene expression signatures of four distinct subpopulations isolated from the mouse mammary gland. The epithelial cell signatures were used to interrogate mouse models of mammary tumorigenesis and to compare with their normal human counterpart subsets to identify conserved genes and networks. METHODS: RNA was prepared from freshly sorted mouse mammary cell subpopulations (mammary stem cell (MaSC)-enriched, committed luminal progenitor, mature luminal and stromal cell) and used for gene expression profiling analysis on the Illumina platform. Gene signatures were derived and compared with those previously reported for the analogous normal human mammary cell subpopulations. The mouse and human epithelial subset signatures were then subjected to Ingenuity Pathway Analysis (IPA) to identify conserved pathways. RESULTS: The four mouse mammary cell subpopulations exhibited distinct gene signatures. Comparison of these signatures with the molecular profiles of different mouse models of mammary tumorigenesis revealed that tumors arising in MMTV-Wnt-1 and p53-/- mice were enriched for MaSC-subset genes, whereas the gene profiles of MMTV-Neu and MMTV-PyMT tumors were most concordant with the luminal progenitor cell signature. Comparison of the mouse mammary epithelial cell signatures with their human counterparts revealed substantial conservation of genes, whereas IPA highlighted a number of conserved pathways in the three epithelial subsets. CONCLUSIONS: The conservation of genes and pathways across species further validates the use of the mouse as a model to study mammary gland development and highlights pathways that are likely to govern cell-fate decisions and differentiation. 
It is noteworthy that many of the conserved genes in the MaSC population have been considered as epithelial-mesenchymal transition (EMT) signature genes. Therefore, the expression of these genes in tumor cells may reflect basal epithelial cell characteristics and not necessarily cells that have undergone an EMT. Comparative analyses of normal mouse epithelial subsets with murine tumor models have implicated distinct cell types in contributing to tumorigenesis in the different models

    Assessment and optimisation of normalisation methods for dual-colour antibody microarrays

    Background: Recent advances in antibody microarray technology have made it possible to measure the expression of hundreds of proteins simultaneously in a competitive dual-colour approach similar to dual-colour gene expression microarrays. Thus, the established normalisation methods for gene expression microarrays, e.g. loess regression, can in principle be applied to protein microarrays. However, the typical assumptions of such normalisation methods might be violated due to a bias in the selection of the proteins to be measured. Due to high costs and limited availability of high quality antibodies, the current arrays usually focus on a high proportion of regulated targets. Housekeeping features could be used to circumvent this problem, but they are typically underrepresented on protein arrays. Therefore, it might be beneficial to select invariant features among the features already represented on available arrays for normalisation by a dedicated selection algorithm. Results: We compare the performance of several normalisation methods that have been established for dual-colour gene expression microarrays. The focus is on an invariant selection algorithm, for which effective improvements are proposed. In a simulation study the performances of the different normalisation methods are compared with respect to their impact on the ability to correctly detect differentially expressed features. Furthermore, we apply the different normalisation methods to a pancreatic cancer data set to assess the impact on the classification power. Conclusions: The simulation study and the data application demonstrate the superior performance of the improved invariant selection algorithms in comparison to other normalisation methods, especially in situations where the assumptions of the usual global loess normalisation are violated.
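A minimal sketch of normalising against invariant features: compute the usual M (log ratio) and A (average log intensity) values for the two channels, then centre M on the median of features assumed to be invariant. This is a deliberate simplification: the study concerns loess-based methods and a dedicated invariant selection algorithm, neither of which is implemented here, and the function names, the median shift and the hand-picked invariant indices are assumptions.

```python
from math import log2
from statistics import median


def ma_values(red, green):
    """Return (M, A): log2 ratios and mean log2 intensities per feature."""
    m = [log2(r) - log2(g) for r, g in zip(red, green)]
    a = [(log2(r) + log2(g)) / 2 for r, g in zip(red, green)]
    return m, a


def center_on_invariant(m, invariant_idx):
    """Shift all M values so the invariant features have median zero."""
    shift = median(m[i] for i in invariant_idx)
    return [x - shift for x in m]


# Toy intensities for three features; features 1 and 2 are assumed invariant.
red = [8.0, 4.0, 16.0]
green = [2.0, 2.0, 8.0]
m, a = ma_values(red, green)
m_norm = center_on_invariant(m, invariant_idx=[1, 2])
```

Centring on invariant features rather than on all features avoids assuming that most targets are unregulated, an assumption the abstract says may not hold for antibody arrays.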

    Novel associations for hypothyroidism include known autoimmune risk loci

    Hypothyroidism is the most common thyroid disorder, affecting about 5% of the general population. Here we present the first large genome-wide association study of hypothyroidism, in 2,564 cases and 24,448 controls from the customer base of 23andMe, Inc., a personal genetics company. We identify four genome-wide significant associations, two of which are well known to be involved with a large spectrum of autoimmune diseases: rs6679677 near _PTPN22_ and rs3184504 in _SH2B3_ (p-values 3.5e-13 and 3.0e-11, respectively). We also report associations with rs4915077 near _VAV3_ (p-value 8.3e-11), another gene involved in immune function, and rs965513 near _FOXE1_ (p-value 3.1e-14). Of these, the association with _PTPN22_ confirms a recent small candidate gene study, and _FOXE1_ was previously known to be associated with thyroid-stimulating hormone (TSH) levels. Although _SH2B3_ has been previously linked with a number of autoimmune diseases, this is the first report of its association with thyroid disease. The _VAV3_ association is novel. These results suggest heterogeneity in the genetic etiology of hypothyroidism, implicating genes involved in both autoimmune disorders and thyroid function. Using a genetic risk profile score based on the top association from each of the four genome-wide significant regions in our study, the relative risk between the highest and lowest deciles of genetic risk is 2.1