7,647 research outputs found

    Tiling microarray analysis of rice chromosome 10 to identify the transcriptome and relate its expression to chromosomal architecture

    BACKGROUND: Sequencing and annotation of the genome of rice (Oryza sativa) have generated gene models in numbers that top all other fully sequenced species, with many lacking recognizable sequence homology to known genes. Experimental evaluation of these gene models and identification of new models will facilitate rice genome annotation and the application of this knowledge to other more complex cereal genomes. RESULTS: We report here an analysis of the chromosome 10 transcriptome of the two major rice subspecies, japonica and indica, using oligonucleotide tiling microarrays. This analysis detected expression of approximately three-quarters of the gene models without previous experimental evidence in both subspecies. Cloning and sequence analysis of the previously unsupported models suggests that the predicted gene structure of nearly half of those models needs improvement. Coupled with comparative gene model mapping, the tiling microarray analysis identified 549 new models for the japonica chromosome, representing an 18% increase in the annotated protein-coding capacity. Furthermore, an asymmetric distribution of genome elements along the chromosome was found that coincides with the cytological definition of the heterochromatin and euchromatin domains. The heterochromatin domain appears to associate with distinct chromosome-level transcriptional activities under normal and stress conditions. CONCLUSION: These results demonstrated the utility of genome tiling microarrays in evaluating annotated rice gene models and in identifying novel transcriptional units. The tiling microarray analysis further revealed a chromosome-wide transcription pattern that suggests a role for transposable element-enriched heterochromatin in shaping global transcription in response to environmental changes in rice.

    Array-CGH and breast cancer

    The introduction of comparative genomic hybridization (CGH) in 1992 opened new avenues in genomic investigation; in particular, it advanced analysis of solid tumours, including breast cancer, because it obviated the need to culture cells before their chromosomes could be analyzed. The current generation of CGH analysis uses ordered arrays of genomic DNA sequences and is therefore referred to as array-CGH or matrix-CGH. It was introduced in 1998, and further increased the potential of CGH to provide insight into the fundamental processes of chromosomal instability and cancer. This review provides a critical evaluation of the data published on array-CGH and breast cancer, and discusses some of its expected future value and developments

    Genome-wide identification of specific oligonucleotides using artificial neural network and computational genomic analysis

    BACKGROUND: Genome-wide identification of specific oligonucleotides (oligos) is a computationally-intensive task and is a requirement for designing microarray probes, primers, and siRNAs. An artificial neural network (ANN) is a machine learning technique that can effectively process complex and high noise data. Here, ANNs are applied to process the unique subsequence distribution for prediction of specific oligos. RESULTS: We present a novel and efficient algorithm, named the integration of ANN and BLAST (IAB) algorithm, to identify specific oligos. We establish the unique marker database for human and rat gene index databases using the hash table algorithm. We then create the input vectors, via the unique marker database, to train and test the ANN. The trained ANN predicted the specific oligos with high efficiency, and these oligos were subsequently verified by BLAST. To improve the prediction performance, the ANN over-fitting issue was avoided by early stopping with the best observed error and a k-fold validation was also applied. The performance of the IAB algorithm was about 5.2, 7.1, and 6.7 times faster than the BLAST search without ANN for experimental results of 70-mer, 50-mer, and 25-mer specific oligos, respectively. In addition, the results of polymerase chain reactions showed that the primers predicted by the IAB algorithm could specifically amplify the corresponding genes. The IAB algorithm has been integrated into a previously published comprehensive web server to support microarray analysis and genome-wide iterative enrichment analysis, through which users can identify a group of desired genes and then discover the specific oligos of these genes. CONCLUSION: The IAB algorithm has been developed to construct SpecificDB, a web server that provides a specific and valid oligo database of the probe, siRNA, and primer design for the human genome. We also demonstrate the ability of the IAB algorithm to predict specific oligos through polymerase chain reaction experiments. SpecificDB provides comprehensive information and a user-friendly interface.

    Evaluation of the similarity of gene expression data estimated with SAGE and Affymetrix GeneChips

    BACKGROUND: Serial Analysis of Gene Expression (SAGE) and microarrays have found widespread application, but much ambiguity exists regarding the evaluation of these technologies. Cross-platform utilization of gene expression data from the SAGE and microarray technologies could reduce the need for duplicate experiments and facilitate a more extensive exchange of data within the research community. This requires a measure for the correspondence of the different gene expression platforms. To date, a number of cross-platform evaluations (including a few studies using SAGE and Affymetrix GeneChips) have been conducted, showing a variable, but overall low, concordance. This study evaluates these overall measures and introduces the between-ratio difference as a concordance measure per gene. RESULTS: In this study, gene expression measurements of Unigene clusters represented by both Affymetrix GeneChips HG-U133A and SAGE were compared using two independent RNA samples. After matching of the data sets, the final comparison contains a small data set of 1094 unique Unigene clusters, which is unbiased with respect to expression level. Different overall correlation approaches, such as Up/Down classification, contingency tables and correlation coefficients, were used to compare both platforms. In addition, we introduce a novel approach to compare two platforms based on the calculation of differences between expression ratios observed in each platform for each individual transcript. This approach results in a concordance measure per gene (with statistical probability value), as opposed to the commonly used overall concordance measures between platforms. CONCLUSION: We can conclude that intra-platform correlations are generally good, but that overall agreement between the two platforms is modest. This might be due to the binomially distributed sampling variation in SAGE tag counts, SAGE annotation errors and the intensity variation between probe sets of a single gene in Affymetrix GeneChips. We cannot identify or advise which platform performs better, since both have their advantages and disadvantages. Therefore, it is strongly recommended to perform follow-up studies of interesting genes using additional techniques. The newly introduced between-ratio difference is a filtering-independent measure for between-platform concordance. Moreover, the between-ratio difference per gene can be used to detect transcripts with similar regulation on both platforms.

    Features of mammalian microRNA promoters emerge from polymerase II chromatin immunoprecipitation data

    Background: MicroRNAs (miRNAs) are short, non-coding RNA regulators of protein coding genes. miRNAs play a very important role in diverse biological processes and various diseases. Many algorithms are able to predict miRNA genes and their targets, but their transcription regulation is still under investigation. It is generally believed that intragenic miRNAs (located in introns or exons of protein coding genes) are co-transcribed with their host genes and that most intergenic miRNAs are transcribed from their own RNA polymerase II (Pol II) promoter. However, the length of the primary transcripts and the promoter organization are currently unknown. Methodology: We performed Pol II chromatin immunoprecipitation (ChIP)-chip using a custom array surrounding regions of known miRNA genes. To identify the true core transcription start sites of the miRNA genes we developed a new tool (CPPP). We showed that miRNA genes can be transcribed from promoters located several kilobases away and that their promoters share the same general features as those of protein coding genes. Finally, we found evidence that as many as 26% of the intragenic miRNAs may be transcribed from their own unique promoters. Conclusion: miRNA promoters have similar features to those of protein coding genes, but miRNA transcript organization is more complex. © 2009 Corcoran et al.

    Covalent DNA Modifications in Phage and Bacterial Dynamics

    The microorganisms on and in the human body play a significant role in health and disease; however, little is known about how the interactions between these complex communities affect our wellbeing. This study examines how bacteria and phage interact through bacterial nucleases that restrict infection, such as restriction enzymes and CRISPR systems, and the covalent DNA modifications that neutralize them. Multiple targeted nucleases equip bacteria with an innate immune response against phage, and CRISPR systems provide an adaptive immune response. I report three main studies. 1) To study the human gut microbiome and virome (comprised predominantly of phage), we collected fecal samples from a healthy individual over four years. From the fecal samples, total bacterial DNA and DNA from purified virus-like particles (VLPs) were sequenced using Illumina and Pacific Biosciences single-molecule real-time (SMRT) sequencing to yield information about genome sequences and covalent modifications. Using computational methods, we identified seven bacterial contigs and one phage contig with CRISPR arrays targeting phage contigs. This suggests that both bacteria and phage use CRISPR systems to compete with other phage. 2) Covalent DNA modifications are known to block the nuclease activity of restriction enzymes; however, it was unknown whether they can block the nuclease activity of CRISPR systems. To address this, we tested whether the CRISPR-Cas9 system could target wild type T4 phage and two T4 mutants. Wild type T4 modifies all its cytosines to glycosylated hydroxymethylcytosine (glc-HMC), and the two mutant T4 phage contain either hydroxymethylcytosine (HMC) or unmodified cytosines (C). These tests confirmed that glc-HMC and HMC in high concentrations can block CRISPR-Cas9. 3) To explore interactions between bacteria and phage further, we used covalent DNA modification data to link bacteria and phage pairs from the human gut microbiome, based on the idea that phage and bacterial DNAs in the same cell have been exposed to the same DNA modifying enzymes and thus share modification patterns. Overall, 443 modified motifs were shared between phage and bacteria, suggesting many possible phage-host pairs. In our data, 73% of phage genomes and 56% of bacterial genomes contained motifs that were completely modified, highlighting how ubiquitous and important DNA modifications are. These data allowed us to begin to specify the extent and types of interactions between phage and bacteria in longitudinal data. This work explores the complex interactions between bacteria and phage, a crucial step in understanding how these organisms contribute to human health and disease.