13 research outputs found

    Correspondence regarding "Effect of active smoking on the human bronchial epithelium transcriptome"

    Background: In the work of Chari et al. entitled "Effect of active smoking on the human bronchial epithelium transcriptome" the authors use SAGE to identify candidate gene expression changes in bronchial brushings from never, former, and current smokers. These gene expression changes are categorized into those that are reversible or irreversible upon smoking cessation. A subset of these identified genes is validated on an independent cohort using RT-PCR. The authors conclude that their results support the notion of gene expression changes in the lungs of smokers which persist even after an individual has quit. Results: This correspondence raises questions about the validity of the approach used by the authors to analyze their data. The majority of the reported results suffer deficiencies due to the methods used. The most fundamental of these are explained in detail: biases introduced during data processing, lack of correction for multiple testing, and an incorrect use of clustering for gene discovery. A randomly generated "null" dataset is used to show the consequences of these shortcomings. Conclusion: Most of Chari et al.'s findings are consistent with what would be expected by chance alone. Although there is clear evidence of reversible changes in gene expression, the majority of those identified appear to be false positives. However, contrary to the authors' claims, no irreversible changes were identified. There is a broad consensus that genetic change due to smoking persists once an individual has quit smoking; unfortunately, this study lacks sufficient scientific rigour to support or refute this hypothesis or identify any specific candidate genes. The pitfalls of large-scale analysis, as exemplified here, may not be unique to Chari et al.
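The multiple-testing point in this abstract can be illustrated with a small sketch. It is not the analysis from the correspondence: the null p-values are simulated here as independent uniform draws, and the Benjamini-Hochberg procedure is used only as one common correction (the abstract does not say which correction the authors had in mind). The function name `benjamini_hochberg` and all parameters are illustrative.

```python
import random


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the indices of hypotheses rejected at FDR level alpha."""
    m = len(pvalues)
    # Rank hypotheses from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        # Remember the largest rank whose p-value passes its threshold.
        if pvalues[i] <= alpha * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


# Assumption: a "null" dataset of 10,000 tests with no true effects,
# so every p-value is an independent uniform draw on [0, 1).
rng = random.Random(0)
null_pvalues = [rng.random() for _ in range(10000)]

# Without correction, roughly 5% of null tests pass p < 0.05 by chance.
naive_hits = sum(p < 0.05 for p in null_pvalues)
# With correction, far fewer (typically none) are declared significant.
bh_hits = len(benjamini_hochberg(null_pvalues))

print(naive_hits, bh_hits)
```

On data with no real signal, the uncorrected threshold still yields hundreds of apparent "discoveries", which is the kind of outcome a null dataset is meant to expose.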

    A knowledge discovery object model API for Java

    BACKGROUND: Biological data resources have become heterogeneous and derive from multiple sources. This introduces challenges in the management and utilization of this data in software development. Although efforts are underway to create a standard format for the transmission and storage of biological data, this objective has yet to be fully realized. RESULTS: This work describes an application programming interface (API) that provides a framework for developing an effective biological knowledge ontology for Java-based software projects. The API provides a robust framework for the data acquisition and management needs of an ontology implementation. In addition, the API contains classes to assist in creating GUIs to represent this data visually. CONCLUSIONS: The Knowledge Discovery Object Model (KDOM) API is particularly useful for medium to large applications, or for a number of smaller software projects with common characteristics or objectives. KDOM can be coupled effectively with other biologically relevant APIs and classes. Source code, libraries, documentation and examples are available at

    DiscoverySpace: an interactive data analysis application

    DiscoverySpace is a graphical application for bioinformatics data analysis. Users can seamlessly traverse references between biological databases and draw together annotations in an intuitive tabular interface. Datasets can be compared using a suite of novel tools to aid in the identification of significant patterns. DiscoverySpace is of broad utility and its particular strength is in the analysis of serial analysis of gene expression (SAGE) data. The application is freely available online

    Statistical analysis and significance testing of serial analysis of gene expression data using a Poisson mixture model

    Background: Serial analysis of gene expression (SAGE) is used to obtain quantitative snapshots of the transcriptome. These profiles are count-based and are assumed to follow a Binomial or Poisson distribution. However, tag counts observed across multiple libraries (for example, one or more groups of biological replicates) have additional variance that cannot be accommodated by this assumption alone. Several models have been proposed to account for this effect, all of which utilize a continuous prior distribution to explain the excess variance. Here, a Poisson mixture model, which assumes excess variability arises from sampling a mixture of distinct components, is proposed and the merits of this model are discussed and evaluated. Results: The goodness of fit of the Poisson mixture model on 15 sets of biological SAGE replicates is compared to the previously proposed hierarchical gamma-Poisson (negative binomial) model, and a substantial improvement is seen. In further support of the mixture model, there is observed: 1) an increase in the number of mixture components needed to fit the expression of tags representing more than one transcript; and 2) a tendency for components to cluster libraries into the same groups. A confidence score is presented that can identify tags that are differentially expressed between groups of SAGE libraries. Several examples where this test outperforms those previously proposed are highlighted. Conclusion: The Poisson mixture model performs well as a) a method to represent SAGE data from biological replicates, and b) a basis to assign significance when testing for differential expression between multiple groups of replicates. Code for the R statistical software package is included to assist investigators in applying this model to their own data.
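As a rough illustration of the idea of a finite Poisson mixture, the sketch below fits a K-component mixture to the counts of a single tag across libraries using expectation-maximisation. It is not the authors' R code: it assumes all libraries have the same size (the published model would need to account for library size), the initialisation is an arbitrary choice, and the names `poisson_pmf` and `fit_poisson_mixture` are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of count k at rate lam, computed in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def fit_poisson_mixture(counts, n_components=2, n_iter=200):
    """EM fit of a Poisson mixture to raw counts (equal library sizes assumed)."""
    lo, hi = min(counts), max(counts)
    # Initialise rates evenly across the observed range, kept strictly positive.
    rates = [max(lo + (hi - lo) * (j + 0.5) / n_components, 1e-3)
             for j in range(n_components)]
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observed count.
        resp = []
        for k in counts:
            p = [w * poisson_pmf(k, r) for w, r in zip(weights, rates)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([x / total for x in p])
        # M-step: update mixing weights and component rates.
        for j in range(n_components):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(counts)
            if nj > 0:
                weighted_sum = sum(r[j] * k for r, k in zip(resp, counts))
                rates[j] = max(weighted_sum / nj, 1e-3)
    return weights, rates


# Hypothetical tag counts from two groups of six libraries each.
counts = [2, 3, 1, 2, 3, 2, 20, 22, 19, 21, 18, 20]
weights, rates = fit_poisson_mixture(counts)
print(weights, rates)
```

When the fitted components separate the libraries into the same groups as the experimental design, as in this toy input, that is the behaviour the abstract describes as supporting the mixture model.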

    Whole transcriptome analysis reveals differential gene expression profile reflecting macrophage polarization in response to influenza A H5N1 virus infection

    Background: Avian influenza A H5N1 virus can cause lethal disease in humans. The virus can trigger severe pneumonia and lead to acute respiratory distress syndrome. Data from clinical, in vitro and in vivo studies suggest that virus-induced cytokine dysregulation could be a contributory factor to the pathogenesis of human H5N1 disease. However, the precise mechanisms by which H5N1 infection elicits the unique host response are still not well understood. Methods: To obtain a better understanding of the molecular events at the earliest time points, we used RNA-Seq to quantify and compare the host mRNA and miRNA transcriptomes induced by the highly pathogenic influenza A H5N1 (A/Vietnam/3212/04) or low virulent H1N1 (A/Hong Kong/54/98) viruses in human monocyte-derived macrophages at 1-, 3-, and 6-h post infection. Results: Our data reveal that two macrophage populations corresponding to M1 (classically activated) and M2 (alternatively activated) macrophage subtypes respond distinctly to H5N1 virus infection when compared to H1N1 virus or mock infection, a distinction that could not be made from previous microarray studies. When this confounding variable is considered in our statistical model, a clear set of dysregulated genes and pathways emerges specifically in H5N1 virus-infected macrophages at 6-h post infection, which was not found with H1N1 virus infection. Furthermore, altered expression of genes in these pathways, which have been previously implicated in viral host response, occurs specifically in the M1 subtype. We observe a significant up-regulation of genes in the RIG-I-like receptor signaling pathway. In particular, interferons and interferon-stimulated genes are broadly affected. The negative regulators of interferon signaling, the suppressors of cytokine signaling, SOCS-1 and SOCS-3, were found to be markedly up-regulated in the initial round of H5N1 virus replication. Elevated levels of these suppressors could lead to the eventual suppression of cellular antiviral genes, contributing to the pathophysiology of H5N1 virus infection. Conclusions: Our study provides important mechanistic insights into the understanding of H5N1 viral pathogenesis and the multi-faceted host immune responses. The dysregulated genes could be potential candidates as therapeutic targets for treating H5N1 disease.

    Large-scale production of SAGE libraries from microdissected tissues, flow-sorted cells, and cell lines

    We describe the details of a serial analysis of gene expression (SAGE) library construction and analysis platform that has enabled the generation of >298 high-quality SAGE libraries and >30 million SAGE tags primarily from sub-microgram amounts of total RNA purified from samples acquired by microdissection. Several RNA isolation methods were used to handle the diversity of samples processed, and various measures were applied to minimize ditag PCR carryover contamination. Modifications in the SAGE protocol resulted in improved cloning and DNA sequencing efficiencies. Bioinformatic measures to automatically assess DNA sequencing results were implemented to analyze the integrity of ditag structure, linker or cross-species ditag contamination, and yield of high-quality tags per sequence read. Our analysis of singleton tag errors resulted in a method for correcting such errors to statistically determine tag accuracy. From the libraries generated, we produced an essentially complete mapping of reliable 21-base-pair tags to the mouse reference genome sequence for a meta-library of ∼5 million tags. Our analyses led us to reject the commonly held notion that duplicate ditags are artifacts. Rather than the usual practice of discarding such tags, we conclude that they should be retained to avoid introducing bias into the results and thereby maintain the quantitative nature of the data, which is a major theoretical advantage of SAGE as a tool for global transcriptional profiling

    Enhancer hijacking activates GFI1 family oncogenes in medulloblastoma

    Medulloblastoma is a highly malignant paediatric brain tumour currently treated with a combination of surgery, radiation and chemotherapy, posing a considerable burden of toxicity to the developing child. Genomics has illuminated the extensive intertumoral heterogeneity of medulloblastoma, identifying four distinct molecular subgroups. Group 3 and group 4 subgroup medulloblastomas account for most paediatric cases; yet, oncogenic drivers for these subtypes remain largely unidentified. Here we describe a series of prevalent, highly disparate genomic structural variants, restricted to groups 3 and 4, resulting in specific and mutually exclusive activation of the growth factor independent 1 family proto-oncogenes, GFI1 and GFI1B. Somatic structural variants juxtapose GFI1 or GFI1B coding sequences proximal to active enhancer elements, including super-enhancers, instigating oncogenic activity. Our results, supported by evidence from mouse models, identify GFI1 and GFI1B as prominent medulloblastoma oncogenes and implicate 'enhancer hijacking' as an efficient mechanism driving oncogene activation in a childhood cancer