118 research outputs found
An integrated computational pipeline and database to support whole-genome sequence annotation
We describe here our experience in annotating the Drosophila melanogaster genome sequence, in the course of which we developed several new open-source software tools and a database schema to support large-scale genome annotation. We have developed these into an integrated and reusable software system for whole-genome annotation. The key contributions to overall annotation quality are the marshalling of high-quality sequences for alignments and the design of a system with an adaptable, expandable, and flexible architecture
Apollo: a sequence annotation editor
The well-established inaccuracy of purely computational methods for annotating genome sequences necessitates an interactive tool to allow biological experts to refine these approximations by viewing and independently evaluating the data supporting each annotation. Apollo was developed to meet this need, enabling curators to inspect genome annotations closely and edit them. FlyBase biologists successfully used Apollo to annotate the Drosophila melanogaster genome and it is increasingly being used as a starting point for the development of customized annotation editing tools for other genome projects
Sequencing wild and cultivated cassava and related species reveals extensive interspecific hybridization and genetic diversity
Cassava (Manihot esculenta) provides calories and nutrition for more than half a billion people. It was domesticated by native Amazonian peoples through cultivation of the wild progenitor M. esculenta ssp. flabellifolia and is now grown in tropical regions worldwide. Here we provide a high-quality genome assembly for cassava with improved contiguity, linkage, and completeness; almost 97% of genes are anchored to chromosomes. We find that paleotetraploidy in cassava is shared with the related rubber tree Hevea, providing a resource for comparative studies. We also sequence a global collection of 58 Manihot accessions, including cultivated and wild cassava accessions and related species such as Ceará or India rubber (M. glaziovii), and genotype 268 African cassava varieties. We find widespread interspecific admixture, and detect the genetic signature of past cassava breeding programs. As a clonally propagated crop, cassava is especially vulnerable to pathogens and abiotic stresses. This genomic resource will inform future genome-enabled breeding efforts to improve this staple crop
Annotation of the Drosophila melanogaster euchromatic genome: a systematic review
BACKGROUND: The recent completion of the Drosophila melanogaster genomic sequence to high quality and the availability of a greatly expanded set of Drosophila cDNA sequences, aligning to 78% of the predicted euchromatic genes, afforded FlyBase the opportunity to significantly improve genomic annotations. We made the annotation process more rigorous by inspecting each gene visually, utilizing a comprehensive set of curation rules, requiring traceable evidence for each gene model, and comparing each predicted peptide to SWISS-PROT and TrEMBL sequences. RESULTS: Although the number of predicted protein-coding genes in Drosophila remains essentially unchanged, the revised annotation significantly improves gene models, resulting in structural changes to 85% of the transcripts and 45% of the predicted proteins. We annotated transposable elements and non-protein-coding RNAs as new features, and extended the annotation of untranslated (UTR) sequences and alternative transcripts to include more than 70% and 20% of genes, respectively. Finally, cDNA sequence provided evidence for dicistronic transcripts, neighboring genes with overlapping UTRs on the same DNA sequence strand, alternatively spliced genes that encode distinct, non-overlapping peptides, and numerous nested genes. CONCLUSIONS: Identification of so many unusual gene models not only suggests that some mechanisms for gene regulation are more prevalent than previously believed, but also underscores the complex challenges of eukaryotic gene prediction. At present, experimental data and human curation remain essential to generate high-quality genome annotations
Inovações tecnológicas nas edificações: papéis diferenciados para construtores e fornecedores [Technological innovations in buildings: differentiated roles for builders and suppliers]
The Chlamydomonas Genome Reveals the Evolution of Key Animal and Plant Functions
Chlamydomonas reinhardtii is a unicellular green alga whose lineage diverged from land plants over 1 billion years ago. It is a model system for studying chloroplast-based photosynthesis, as well as the structure, assembly, and function of eukaryotic flagella (cilia), which were inherited from the common ancestor of plants and animals, but lost in land plants. We sequenced the ~120-megabase nuclear genome of Chlamydomonas and performed comparative phylogenomic analyses, identifying genes encoding uncharacterized proteins that are likely associated with the function and biogenesis of chloroplasts or eukaryotic flagella. Analyses of the Chlamydomonas genome advance our understanding of the ancestral eukaryotic cell, reveal previously unknown genes associated with photosynthetic and flagellar functions, and establish links between ciliopathy and the composition and function of flagella
MiR-10 Represses HoxB1a and HoxB3a in Zebrafish
BACKGROUND: The Hox genes are involved in patterning the anterior-posterior axis. In addition to the protein coding Hox genes, the miR-10, miR-196 and miR-615 families of microRNA genes are conserved within the vertebrate Hox clusters. The members of the miR-10 family are located at positions associated with Hox-4 paralogues. No function is yet known for this microRNA family but the genomic positions of its members suggest a role in anterior-posterior patterning. METHODOLOGY/PRINCIPAL FINDINGS: Using sensor constructs, overexpression and morpholino knockdown, we show in Zebrafish that miR-10 targets HoxB1a and HoxB3a and synergizes with HoxB4 in the repression of these target genes. Overexpression of miR-10 also induces specific phenotypes related to the loss of function of these targets. HoxB1a and HoxB3a have a dominant hindbrain expression domain anterior to that of miR-10 but overlap in a weaker expression domain in the spinal cord. In this latter domain, miR-10 knockdown results in upregulation of the target genes. In the case of a HoxB3a splice variant that includes miR-10c within its primary transcript, we show that the microRNA acts in an autoregulatory fashion. CONCLUSIONS/SIGNIFICANCE: We find that miR-10 acts to repress HoxB1a and HoxB3a within the spinal cord and show that this repression works cooperatively with HoxB4. As with the previously described interactions between miR-196 and HoxA7 and Hox-8 paralogues, the target genes are located in close proximity to the microRNA. We present a model in which we postulate a link between the clustering of Hox genes and post-transcriptional gene regulation. We speculate that the high density of transcription units and enhancers within the Hox clusters places constraints on the precision of the transcriptional control that can be achieved within these clusters and requires the involvement of post-transcriptional gene silencing to define functional domains of genes appropriately
The mir-51 Family of microRNAs Functions in Diverse Regulatory Pathways in Caenorhabditis elegans
The mir-51 family of microRNAs (miRNAs) in C. elegans are part of the deeply conserved miR-99/100 family. While loss of all six family members (mir-51-56) in C. elegans results in embryonic lethality, loss of individual mir-51 family members results in a suppression of retarded developmental timing defects associated with the loss of alg-1. The mechanism of this suppression of developmental timing defects is unknown. To address this, we characterized the function of the mir-51 family in the developmental timing pathway. We performed genetic analysis and determined that mir-51 family members regulate the developmental timing pathway in the L2 stage upstream of hbl-1. Loss of the mir-51 family member, mir-52, suppressed retarded developmental timing defects associated with the loss of let-7 family members and lin-46. Enhancement of precocious defects was observed for mutations in lin-14, hbl-1, and mir-48(ve33), but not later acting developmental timing genes. Interestingly, mir-51 family members showed genetic interactions with additional miRNA-regulated pathways, which are regulated by the let-7 and mir-35 family miRNAs, lsy-6, miR-240/786, and miR-1. Loss of mir-52 likely does not suppress miRNA-regulated pathways through an increase in miRNA biogenesis or miRNA activity. We found no increase in the levels of four mature miRNAs, let-7, miR-58, miR-62 or miR-244, in mir-52 or mir-52/53/54/55/56 mutant worms. In addition, we observed no increase in the activity of ectopic lsy-6 in the repression of a downstream target in uterine cells in worms that lack mir-52. We propose that the mir-51 family functions broadly through the regulation of multiple targets, which have not yet been identified, in diverse regulatory pathways in C. elegans
Ago2 Immunoprecipitation Identifies Predicted MicroRNAs in Human Embryonic Stem Cells and Neural Precursors
MicroRNAs are required for maintenance of pluripotency as well as differentiation, but since more microRNAs have been computationally predicted in the genome than have been found, there are likely to be undiscovered microRNAs expressed early in stem cell differentiation. SOLiD ultra-deep sequencing identified >10^7 unique small RNAs from human embryonic stem cells (hESC) and neural-restricted precursors that were fit to a model of microRNA biogenesis to computationally predict 818 new microRNA genes. These predicted genomic loci are associated with chromatin patterns of modified histones that are predictive of regulated gene expression. 146 of the predicted microRNAs were enriched in Ago2-containing complexes along with 609 known microRNAs, demonstrating association with a functional RISC complex. This Ago2 IP-selected subset was consistently expressed in four independent hESC lines and exhibited complex patterns of regulation over development similar to previously known microRNAs, including pluripotency-specific expression in both hESC and iPS cells. More than 30% of the Ago2 IP-enriched predicted microRNAs are new members of existing families since they share seed sequences with known microRNAs. Extending the classic definition of microRNAs, this large number of new microRNA genes, the majority of which are less conserved than their canonical counterparts, likely represents evolutionarily recent regulators of early differentiation. The enrichment in Ago2-containing complexes, the presence of chromatin marks indicative of regulated gene expression, and differential expression over development all support the identification of 146 new microRNAs active during early hESC differentiation