15 research outputs found

    Gene finding in the chicken genome

    BACKGROUND: Despite the continuous production of genome sequence for a number of organisms, reliable, comprehensive, and cost effective gene prediction remains problematic. This is particularly true for genomes for which there is not a large collection of known gene sequences, such as the recently published chicken genome. We used the chicken sequence to test comparative and homology-based gene-finding methods followed by experimental validation as an effective genome annotation method. RESULTS: We performed experimental evaluation by RT-PCR of three different computational gene finders, Ensembl, SGP2 and TWINSCAN, applied to the chicken genome. A Venn diagram was computed and each component of it was evaluated. The results showed that de novo comparative methods can identify up to about 700 chicken genes with no previous evidence of expression, and can correctly extend about 40% of homology-based predictions at the 5' end. CONCLUSIONS: De novo comparative gene prediction followed by experimental verification is effective at enhancing the annotation of the newly sequenced genomes provided by standard homology-based methods

    ASGS: an alternative splicing graph web service

    Alternative transcript diversity manifests itself as a prime cause of complexity in higher eukaryotes. The Alternative Splicing Graph Server (ASGS) is a web service facilitating the systematic study of alternatively spliced genes of higher eukaryotes by generating splicing graphs for the compact visual representation of transcript diversity from a single gene. Taking a set of transcripts in General Feature Format as input, ASGS identifies distinct reference and variable exons, generates a transcript splicing graph, an exon summary, a splicing events classification and a single line graph to facilitate experimental analysis. This freely available web service can be accessed at
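As a rough illustration of the idea in the ASGS abstract above, the sketch below builds a splicing graph from a few GFF-style exon lines. It is not the ASGS implementation. The sample records, the `transcript_id "..."` attribute layout, the function names, and the rule that a "reference" exon is one shared by every transcript are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Hypothetical input: two transcripts of one gene, tab-separated GFF-style lines.
GFF = (
    'chr1\tdemo\texon\t100\t200\t.\t+\t.\ttranscript_id "T1";\n'
    'chr1\tdemo\texon\t300\t400\t.\t+\t.\ttranscript_id "T1";\n'
    'chr1\tdemo\texon\t500\t600\t.\t+\t.\ttranscript_id "T1";\n'
    'chr1\tdemo\texon\t100\t200\t.\t+\t.\ttranscript_id "T2";\n'
    'chr1\tdemo\texon\t500\t600\t.\t+\t.\ttranscript_id "T2";\n'
)


def read_exons(gff_text):
    """Return {transcript_id: sorted list of (start, end) exons}."""
    transcripts = defaultdict(list)
    for line in gff_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', cols[8])
        if match:
            transcripts[match.group(1)].append((int(cols[3]), int(cols[4])))
    return {name: sorted(exons) for name, exons in transcripts.items()}


def splicing_graph(transcripts):
    """Edges connect exons that are adjacent in at least one transcript."""
    edges = set()
    for exons in transcripts.values():
        edges.update(zip(exons, exons[1:]))
    return edges


def classify_exons(transcripts):
    """Assumed rule: exons in every transcript are reference, the rest variable."""
    counts = defaultdict(int)
    for exons in transcripts.values():
        for exon in set(exons):
            counts[exon] += 1
    total = len(transcripts)
    reference = sorted(e for e, c in counts.items() if c == total)
    variable = sorted(e for e, c in counts.items() if c < total)
    return reference, variable


transcripts = read_exons(GFF)
edges = splicing_graph(transcripts)
reference, variable = classify_exons(transcripts)
```

In this toy input the middle exon is skipped in one transcript, so the graph contains an edge that bypasses it; that kind of bypass edge is what a splicing graph makes visible at a glance.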

    AgBase: a unified resource for functional analysis in agriculture

    Analysis of functional genomics (transcriptomics and proteomics) datasets is hindered in agricultural species because agricultural genome sequences have relatively poor structural and functional annotation. To facilitate systems biology in these species we have established the curated, web-accessible, public resource ‘AgBase’. We have improved the structural annotation of agriculturally important genomes by experimentally confirming the in vivo expression of electronically predicted proteins and by proteogenomic mapping. Proteogenomic data are available from the AgBase proteogenomics link. We contribute Gene Ontology (GO) annotations and we provide a two-tier system of GO annotations for users. The ‘GO Consortium’ gene association file contains the most rigorous GO annotations based solely on experimental data. The ‘Community’ gene association file contains GO annotations based on expert community knowledge (annotations based directly on author statements and submitted annotations from the community) and annotations for predicted proteins. We have developed two tools for proteomics analysis and these are freely available on request. A suite of tools for analyzing functional genomics datasets using the GO is available online at the AgBase site. We encourage and publicly acknowledge GO annotations from researchers and provide an online mechanism for agricultural researchers to submit requests for GO annotations

    Using several pair-wise informant sequences for de novo prediction of alternatively spliced transcripts

    BACKGROUND: As part of the ENCODE Genome Annotation Assessment Project (EGASP), we developed the MARS extension to the Twinscan algorithm. MARS is designed to find human alternatively spliced transcripts that are conserved in only one or a limited number of extant species. MARS is able to use an arbitrary number of informant sequences and predicts a number of alternative transcripts at each gene locus. RESULTS: MARS uses the mouse, rat, dog, opossum, chicken, and frog genome sequences as pairwise informant sources for Twinscan and combines the resulting transcript predictions into genes based on coding (CDS) region overlap. Based on the EGASP assessment, MARS is one of the more accurate dual-genome prediction programs. Compared to the GENCODE annotation, we find that predictive sensitivity increases, while specificity decreases, as more informant species are used. MARS correctly predicts alternatively spliced transcripts for 11 of the 236 multi-exon GENCODE genes that are alternatively spliced in the coding region of their transcripts. For these genes a total of 24 correct transcripts are predicted. CONCLUSION: The MARS algorithm is able to predict alternatively spliced transcripts without the use of expressed sequence information, although the number of loci in which multiple predicted transcripts match multiple alternatively spliced transcripts in the GENCODE annotation is relatively small
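The step described above, combining per-informant transcript predictions into genes based on CDS overlap, can be sketched as a simple clustering problem. This is a minimal illustration and not the MARS code. The transcript names, the coordinates, and the choice of single-linkage grouping (any overlap joins two transcripts into one gene) are assumptions, and the sketch ignores chromosome and strand.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # inclusive (start, end) of one CDS segment


def cds_overlap(a: List[Interval], b: List[Interval]) -> bool:
    """True if any CDS segment of one transcript overlaps any segment of the other."""
    return any(s1 <= e2 and s2 <= e1 for s1, e1 in a for s2, e2 in b)


def group_by_cds_overlap(transcripts: Dict[str, List[Interval]]) -> List[List[str]]:
    """Cluster transcripts into genes; overlap is applied transitively (union-find)."""
    parent = {name: name for name in transcripts}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    names = sorted(transcripts)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cds_overlap(transcripts[a], transcripts[b]):
                parent[find(a)] = find(b)

    genes: Dict[str, List[str]] = {}
    for name in names:
        genes.setdefault(find(name), []).append(name)
    return sorted(genes.values())


# Hypothetical predictions, one per informant genome.
predictions = {
    "mouse_t1": [(100, 200), (300, 400)],
    "rat_t1": [(150, 200), (300, 350)],
    "frog_t1": [(390, 450)],
    "dog_t1": [(1000, 1100)],
}
genes = group_by_cds_overlap(predictions)
```

Here `frog_t1` overlaps only `mouse_t1`, yet it ends up in the same gene as `rat_t1` because grouping is transitive. A stricter rule (for example, requiring overlap with every member) would be an equally plausible reading of the abstract.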

    AgBase: a functional genomics resource for agriculture

    BACKGROUND: Many agricultural species and their pathogens have sequenced genomes and more are in progress. Agricultural species provide food, fiber, xenotransplant tissues, biopharmaceuticals and biomedical models. Moreover, many agricultural microorganisms are agents of human zoonoses. However, systems biology from functional genomics data is hindered in agricultural species because agricultural genome sequences have relatively poor structural and functional annotation and agricultural research communities are smaller with limited funding compared to many model organism communities. DESCRIPTION: To facilitate systems biology in these traditionally agricultural species we have established "AgBase", a curated, web-accessible, public resource for structural and functional annotation of agricultural genomes. The AgBase database includes a suite of computational tools to use GO annotations. We use standardized nomenclature following the Human Genome Organization Gene Nomenclature guidelines and are currently functionally annotating chicken, cow and sheep gene products using the Gene Ontology (GO). The computational tools we have developed accept and batch process data derived from different public databases (with different accession codes), return all existing GO annotations, provide a list of products without GO annotation, identify potential orthologs, model functional genomics data using GO and assist proteomics analysis of ESTs and EST assemblies. Our journal database helps prevent redundant manual GO curation. We encourage and publicly acknowledge GO annotations from researchers and provide a service for researchers interested in GO and analysis of functional genomics data. CONCLUSION: The AgBase database is the first database dedicated to functional genomics and systems biology analysis for agriculturally important species and their pathogens. We use experimental data to improve structural annotation of genomes and to functionally characterize gene products. AgBase is also directly relevant for researchers in fields as diverse as agricultural production, cancer biology, biopharmaceuticals, human health and evolutionary biology. Moreover, the experimental methods and bioinformatics tools we provide are widely applicable to many other species including model organisms

    Re-Annotation Is an Essential Step in Systems Biology Modeling of Functional Genomics Data

    One motivation of systems biology research is to understand gene functions and interactions from functional genomics data such as that derived from microarrays. Up-to-date structural and functional annotations of genes are an essential foundation of systems biology modeling. We propose that the first essential step in any systems biology modeling of functional genomics data, especially for species with recently sequenced genomes, is gene structural and functional re-annotation. To demonstrate the impact of such re-annotation, we structurally and functionally re-annotated a microarray developed, and previously used, as a tool for disease research. We quantified the impact of this re-annotation on the array based on the total numbers of structural- and functional-annotations, the Gene Annotation Quality (GAQ) score, and canonical pathway coverage. We next quantified the impact of re-annotation on systems biology modeling using a previously published experiment that used this microarray. We show that re-annotation improves the quantity and quality of structural- and functional-annotations, allows a more comprehensive Gene Ontology based modeling, and improves pathway coverage for both the whole array and a differentially expressed mRNA subset. Our results also demonstrate that re-annotation can result in a different knowledge outcome derived from previous published research findings. We propose that, because of this, re-annotation should be considered to be an essential first step for deriving value from functional genomics data

    A first-generation microsatellite-based genetic linkage map of the Siberian jay (Perisoreus infaustus): insights into avian genome evolution

    BACKGROUND: Genomic resources for the majority of free-living vertebrates of ecological and evolutionary importance are scarce. Therefore, linkage maps with high-density genome coverage are needed for progress in genomics of wild species. The Siberian jay (Perisoreus infaustus; Corvidae) is a passerine bird that has been the subject of extensive research in ecology and evolutionary biology. Knowledge of its genome structure and organization is required to advance our understanding of the genetic basis of ecologically important traits in this species, as well as to provide insights into avian genome evolution. RESULTS: We describe the first genetic linkage map of Siberian jay constructed using 117 microsatellites and a mapping pedigree of 349 animals representing five families from a natural population breeding in western Finland from the years 1975 to 2006. Markers were resolved into nine autosomal and a Z-chromosome-specific linkage group, 10 markers remaining unlinked. The best-position map with the most likely positions of all significantly linked loci had a total sex-average size of 862.8 cM, with an average interval distance of 9.69 cM. The female map covered 988.4 cM, whereas the male map covered only 774 cM. The Z-chromosome linkage group comprised six markers, three pseudoautosomal and three sex-specific loci, and spanned 10.6 cM in females and 48.9 cM in males. Eighty-one of the mapped loci could be ordered on a framework map with odds of >1000:1 covering a total size of 809.6 cM in females and 694.2 cM in males. Significant sex-specific distortions towards reduced male recombination rates were revealed in the entire best-position map as well as within two autosomal linkage groups. Comparative mapping between Siberian jay and chicken anchored 22 homologous loci on 6 different linkage groups corresponding to chicken chromosomes Gga1, 2, 3, 4, 5, and Z. Several cases of intra-chromosomal rearrangements within the autosomes and three cases of inter-chromosomal rearrangement between the Siberian jay autosomal linkage groups (LG1, LG2 and LG3) and the chicken sex chromosome GgaZ were observed, suggesting a conserved synteny, but changes in marker order, within autosomes during about 100 million years of avian evolution. CONCLUSION: The constructed linkage map represents a valuable resource for intraspecific genomics of Siberian jay, as well as for avian comparative genomic studies. Apart from providing novel insights into sex-specific recombination rates and patterns, the described maps, from a previously genomically uncharacterized superfamily (Corvidae) of passerine birds, provide new insights into avian genome evolution. In combination with high-resolution data on quantitative trait variability from the study population, they also provide a foundation for QTL-mapping studies

    A General Definition and Nomenclature for Alternative Splicing Events

    Understanding the molecular mechanisms responsible for the regulation of the transcriptome present in eukaryotic cells is one of the most challenging tasks in the postgenomic era. In this regard, alternative splicing (AS) is a key phenomenon contributing to the production of different mature transcripts from the same primary RNA sequence. As a plethora of different transcript forms is available in databases, a first step to uncover the biology that drives AS is to identify the different types of reflected splicing variation. In this work, we present a general definition of the AS event along with a notation system that involves the relative positions of the splice sites. This nomenclature univocally and dynamically assigns a specific “AS code” to every possible pattern of splicing variation. On the basis of this definition and the corresponding codes, we have developed a computational tool (AStalavista) that automatically characterizes the complete landscape of AS events in a given transcript annotation of a genome, thus providing a platform to investigate the transcriptome diversity across genes, chromosomes, and species. Our analysis reveals that a substantial part—in human more than a quarter—of the observed splicing variations are ignored in common classification pipelines. We have used AStalavista to investigate and to compare the AS landscape of different reference annotation sets in human and in other metazoan species and found that proportions of AS events change substantially depending on the annotation protocol, species-specific attributes, and coding constraints acting on the transcripts. The AStalavista system therefore provides a general framework to conduct specific studies investigating the occurrence, impact, and regulation of AS