428 research outputs found

    Bias correction and Bayesian analysis of aggregate counts in SAGE libraries

    BACKGROUND: Tag-based techniques, such as SAGE, are commonly used to sample the mRNA pool of an organism's transcriptome. Incomplete digestion during the tag formation process may allow multiple tags to be generated from a given mRNA transcript. The probability of forming a tag varies with its relative location. As a result, the observed tag counts represent a biased sample of the actual transcript pool. In SAGE this bias can be avoided by ignoring all but the 3'-most tag, but doing so discards a large fraction of the observed data. Taking this bias into account should allow more of the available data to be used, leading to increased statistical power. RESULTS: Three new hierarchical models, which directly embed a model for the variation in tag formation probability, are proposed and their associated Bayesian inference algorithms are developed. These models may be applied to libraries at both the tag and aggregate level. Simulation experiments and analysis of real data are used to contrast the accuracy of the various methods. The consequences of tag formation bias are discussed in the context of testing differential expression. A description is given of how these algorithms can be applied in that context. CONCLUSIONS: Several Bayesian inference algorithms that account for tag formation effects are compared with the DPB algorithm, providing clear evidence of superior performance. The accuracy of inferences when using a particular non-informative prior is found to depend on the expression level of a given gene. The multivariate nature of the approach easily allows both univariate and joint tests of differential expression. Calculations demonstrate the potential for false positive and negative findings due to variation in tag formation probabilities across samples when testing for differential expression.
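    The bias described in this abstract can be illustrated with a small sketch. It assumes a simple model that the abstract does not spell out: each anchoring-enzyme site on a transcript is cut independently with the same probability, and the tag comes from the 3'-most site that was cut. The function names and the parameters `p_cut` and `n_sites` are hypothetical, and the paper's hierarchical models may differ.

```python
def tag_site_probabilities(p_cut, n_sites):
    """Probability that the tag originates at each site, counted from the 3' end.

    Assumed model (not taken from the paper): every site is cut independently
    with probability p_cut, and the tag forms at the 3'-most cut site.
    """
    if not 0.0 <= p_cut <= 1.0:
        raise ValueError("p_cut must lie in [0, 1]")
    # Site i yields the tag when it is cut and the i sites 3' of it are not.
    return [p_cut * (1.0 - p_cut) ** i for i in range(n_sites)]


def fraction_discarded(p_cut, n_sites):
    """Share of formed tags that is lost when only the 3'-most tag is kept."""
    probs = tag_site_probabilities(p_cut, n_sites)
    total = sum(probs)  # probability that any tag forms at all
    if total == 0.0:
        return 0.0
    return 1.0 - probs[0] / total


if __name__ == "__main__":
    for p in (0.5, 0.8, 0.95):
        print(p, round(fraction_discarded(p, n_sites=3), 3))
```

    Under this assumed model, the lower the cutting probability, the larger the share of tags that comes from upstream sites, which is the data a 3'-most-only analysis would throw away.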

    ProbFAST: Probabilistic Functional Analysis System Tool

    BACKGROUND: The post-genomic era has brought new challenges regarding the understanding of the organization and function of the human genome. Many of these challenges are centered on the meaning of differential gene regulation under distinct biological conditions and can be addressed by analyzing the Multiple Differential Expression (MDE) of genes associated with normal and abnormal biological processes. Currently, MDE analyses are limited to the usual methods of differential expression, which were initially designed for paired analysis. RESULTS: We proposed a web platform named ProbFAST for MDE analysis, which uses Bayesian inference to identify key genes that are intuitively prioritized by means of probabilities. A simulated study revealed that our method performs better than other approaches, and by applying it to public expression data we demonstrated its flexibility in obtaining relevant genes biologically associated with normal and abnormal biological processes. CONCLUSIONS: ProbFAST is a freely accessible web-based application that enables MDE analysis on a global scale. It offers an efficient methodological approach for MDE analysis of a set of genes that are turned on and off related to functional information during the evolution of a tumor or tissue differentiation. The ProbFAST server can be accessed at http://gdm.fmrp.usp.br/probfast.

    Microarrays for global expression constructed with a low redundancy set of 27,500 sequenced cDNAs representing an array of developmental stages and physiological conditions of the soybean plant

    BACKGROUND: Microarrays are an important tool with which to examine coordinated gene expression. Soybean (Glycine max) is one of the most economically valuable crop species in the world food supply. In order to accelerate both gene discovery and hypothesis-driven research in soybean, global expression resources needed to be developed. The applications of microarrays for determining patterns of expression in different tissues or during conditional treatments by dual labeling of the mRNAs are unlimited. In addition, discovery of the molecular basis of traits through examination of naturally occurring variation in hundreds of mutant lines could be enhanced by the construction and use of soybean cDNA microarrays. RESULTS: We report the construction and analysis of a low redundancy 'unigene' set of 27,513 clones that represent a variety of soybean cDNA libraries made from a wide array of source tissue and organ systems, developmental stages, and stress or pathogen-challenged plants. The set was assembled from the 5' sequence data of the cDNA clones using cluster analysis programs. The selected clones were then physically reracked and sequenced at the 3' end. In order to increase gene discovery from immature cotyledon libraries that contain abundant mRNAs representing storage protein gene families, we utilized a high density filter normalization approach to preferentially select more weakly expressed cDNAs. All 27,513 cDNA inserts were amplified by polymerase chain reaction. The amplified products, along with some repetitively spotted control or 'choice' clones, were used to produce three 9,728-element microarrays that have been used to examine tissue specific gene expression and global expression in mutant isolines. CONCLUSIONS: Global expression studies will be greatly aided by the availability of the sequence-validated and low redundancy cDNA sets described in this report. These cDNAs and ESTs represent a wide array of developmental stages and physiological conditions of the soybean plant. We also demonstrate that the quality of the data from the soybean cDNA microarrays is sufficiently reliable to examine isogenic lines that differ with respect to a mutant phenotype and thereby to define a small list of candidate genes potentially encoding or modulated by the mutant phenotype.

    CTdatabase: a knowledge-base of high-throughput and curated data on cancer-testis antigens

    The potency of the immune response has still to be harnessed effectively to combat human cancers. However, the discovery of T-cell targets in melanomas and other tumors has raised the possibility that cancer vaccines can be used to induce a therapeutically effective immune response against cancer. The targets, cancer-testis (CT) antigens, are immunogenic proteins preferentially expressed in normal gametogenic tissues and different histological types of tumors. Therapeutic cancer vaccines directed against CT antigens are currently in late-stage clinical trials testing whether they can delay or prevent recurrence of lung cancer and melanoma following surgical removal of primary tumors. CT antigens constitute a large, but ill-defined, family of proteins that exhibit a remarkably restricted expression. Currently, there is a considerable amount of information about these proteins, but the data are scattered through the literature and in several bioinformatic databases. The database presented here, CTdatabase (http://www.cta.lncc.br), unifies this knowledge to facilitate both the mining of the existing deluge of data, and the identification of proteins alleged to be CT antigens, but that do not have their characteristic restricted expression pattern. CTdatabase is more than a repository of CT antigen data, since all the available information was carefully curated and annotated with most data being specifically processed for CT antigens and stored locally. Starting from a compilation of known CT antigens, CTdatabase provides basic information including gene names and aliases, RefSeq accession numbers, genomic location, known splicing variants, gene duplications and additional family members. Gene expression at the mRNA level in normal and tumor tissues has been collated from publicly available data obtained by several different technologies. Manually curated data related to mRNA and protein expression, and antigen-specific immune responses in cancer patients are also available, together with links to PubMed for relevant CT antigen articles.

    High-Throughput SuperSAGE for Digital Gene Expression Analysis of Multiple Samples Using Next Generation Sequencing

    We established a protocol of the SuperSAGE technology combined with next-generation sequencing, coined “High-Throughput (HT-) SuperSAGE”. SuperSAGE is a method of digital gene expression profiling that allows isolation of 26-bp tag fragments from expressed transcripts. In the present protocol, index (barcode) sequences are employed to discriminate tags from different samples. Such barcodes allow researchers to analyze digital tags from transcriptomes of many samples in a single sequencing run by simply pooling the libraries. Here, we demonstrated that HT-SuperSAGE provided highly sensitive, reproducible and accurate digital gene expression data. By increasing throughput for analysis in HT-SuperSAGE, various applications are foreseen and several examples are provided in the present study, including analyses of laser-microdissected cells, biological replicates and tag extraction using different anchoring enzymes.
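    The barcode-based pooling described in this abstract amounts to demultiplexing reads by their index sequence and counting tags per sample. The sketch below is illustrative only: it assumes each read is an index sequence of fixed length followed directly by the 26-bp tag, which is a simplification of the real read layout, and the function name, the 4-base barcodes and the sample names are hypothetical.

```python
from collections import Counter, defaultdict

TAG_LENGTH = 26  # tag size stated in the abstract


def demultiplex(reads, barcode_to_sample, barcode_length):
    """Count 26-bp tags per sample from pooled reads.

    Assumed read layout (hypothetical): <barcode><26-bp tag>[trailing bases].
    Reads with an unknown barcode or a truncated tag are skipped.
    """
    counts = defaultdict(Counter)
    for read in reads:
        barcode = read[:barcode_length]
        tag = read[barcode_length:barcode_length + TAG_LENGTH]
        sample = barcode_to_sample.get(barcode)
        if sample is None or len(tag) < TAG_LENGTH:
            continue
        counts[sample][tag] += 1
    # Convert to plain dictionaries for easy inspection.
    return {sample: dict(tags) for sample, tags in counts.items()}


if __name__ == "__main__":
    tag_a = "A" * TAG_LENGTH
    tag_c = "C" * TAG_LENGTH
    pooled = ["ACGT" + tag_a, "ACGT" + tag_a, "TTTT" + tag_c, "GGGG" + tag_a]
    samples = {"ACGT": "sample_1", "TTTT": "sample_2"}
    print(demultiplex(pooled, samples, barcode_length=4))
```

    Exact matching on the barcode keeps the sketch short; a practical pipeline would probably also need to handle sequencing errors in the index.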

    Suppression subtractive hybridization coupled with microarray analysis to examine differential expression of genes in virus infected cells

    High throughput detection of differential expression of genes is an efficient means of identifying genes and pathways that may play a role in biological systems under certain experimental conditions. There exist a variety of approaches that could be used to identify groups of genes that change in expression in response to a particular stimulus or environment. We here describe the application of suppression subtractive hybridization (SSH) coupled with cDNA microarray analysis for isolation and identification of chicken transcripts that change in expression on infection of host cells with a paramyxovirus. SSH was used for initial isolation of differentially expressed transcripts, a large-scale validation of which was accomplished by microarray analysis. The data reveal a large group of regulated genes constituting many biochemical pathways that could serve as targets for future investigations to explore their role in paramyxovirus pathogenesis. The detailed methods described herein could be useful and adaptable to any biological system for studying changes in gene expression.

    Circulating tumor DNA guided adjuvant chemotherapy in stage II colon cancer (MEDOCC-CrEATE): study protocol for a trial within a cohort study

    BACKGROUND: Accurate detection of patients with minimal residual disease (MRD) after surgery for stage II colon cancer (CC) remains an urgent unmet clinical need to improve selection of patients who might benefit from adjuvant chemotherapy (ACT). Presence of circulating tumor DNA (ctDNA) is indicative of MRD and has high predictive value for recurrent disease. The MEDOCC-CrEATE trial investigates how many stage II CC patients with detectable ctDNA after surgery will accept ACT and whether ACT reduces the risk of recurrence in these patients. METHODS/DESIGN: MEDOCC-CrEATE follows the 'trial within cohorts' (TwiCs) design. Patients with colorectal cancer (CRC) are included in the Prospective Dutch ColoRectal Cancer cohort (PLCRC) and give informed consent for collection of clinical data, tissue and blood samples, and consent for future randomization. MEDOCC-CrEATE is a subcohort within PLCRC consisting of 1320 stage II CC patients without indication for ACT according to current guidelines, who are randomized 1:1 into an experimental and a control arm. In the experimental arm, post-surgery blood samples and tissue are analyzed for tissue-informed detection of plasma ctDNA, using the PGDx elio™ platform. Patients with detectable ctDNA will be offered ACT consisting of 8 cycles of capecitabine plus oxaliplatin, while patients without detectable ctDNA and patients in the control group will receive standard follow-up according to guidelines. The primary endpoint is the proportion of patients receiving ACT when ctDNA is detectable after resection. The main secondary outcome is 2-year recurrence rate (RR), but secondary outcomes also include 5-year RR, disease free survival, overall survival, time to recurrence, quality of life and cost-effectiveness. Data will be analyzed by intention to treat. DISCUSSION: The MEDOCC-CrEATE trial will provide insight into the willingness of stage II CC patients to be treated with ACT guided by ctDNA biomarker testing and whether ACT will prevent recurrences in a high-risk population. Use of the TwiCs design provides the opportunity to randomize patients before ctDNA measurement, avoiding ethical dilemmas of ctDNA status disclosure in the control group. TRIAL REGISTRATION: Netherlands Trial Register: NL6281/NTR6455. Registered 18 May 2017, https://www.trialregister.nl/trial/6281
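    The 1:1 allocation and the primary endpoint mentioned in this abstract can be sketched as follows. This is only an illustration: the abstract does not describe the actual randomization procedure (for example any stratification or blocking), so a plain seeded shuffle is assumed here, and the function names and patient identifiers are hypothetical.

```python
import random


def randomize_one_to_one(patient_ids, seed):
    """Split a cohort into two equal arms by a seeded shuffle.

    Assumption: simple unstratified randomization; the trial's real
    procedure is not given in the abstract.
    """
    ids = list(patient_ids)
    if len(ids) % 2 != 0:
        raise ValueError("an even cohort size is needed for an exact 1:1 split")
    rng = random.Random(seed)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {"experimental": ids[:half], "control": ids[half:]}


def proportion_receiving_act(n_received_act, n_ctdna_detectable):
    """Primary endpoint: share of ctDNA-detectable patients who receive ACT."""
    if n_ctdna_detectable <= 0:
        raise ValueError("no patients with detectable ctDNA")
    return n_received_act / n_ctdna_detectable


if __name__ == "__main__":
    arms = randomize_one_to_one(range(1320), seed=2017)
    print({arm: len(ids) for arm, ids in arms.items()})
```

    With the planned 1320 patients, an exact 1:1 split gives 660 patients per arm; the counts passed to `proportion_receiving_act` would come from the experimental arm only, since ctDNA is not measured in the control arm.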