214 research outputs found

    PlantRNA, a database for tRNAs of photosynthetic eukaryotes.

    The PlantRNA database (http://plantrna.ibmp.cnrs.fr/) compiles transfer RNA (tRNA) gene sequences retrieved from fully annotated plant nuclear, plastidial and mitochondrial genomes. The set of annotated tRNA gene sequences has been manually curated for maximum quality and confidence. The novelty of this database resides in the inclusion of biological information relevant to the function of all the tRNAs entered in the library. This includes 5'- and 3'-flanking sequences, A and B box sequences, region of transcription initiation and poly(T) transcription termination stretches, tRNA intron sequences, aminoacyl-tRNA synthetases and enzymes responsible for tRNA maturation and modification. Finally, data on mitochondrial import of nuclear-encoded tRNAs as well as the bibliome for the respective tRNAs and tRNA-binding proteins are also included. The current annotation concerns complete genomes from 11 organisms: five flowering plants (Arabidopsis thaliana, Oryza sativa, Populus trichocarpa, Medicago truncatula and Brachypodium distachyon), a moss (Physcomitrella patens), two green algae (Chlamydomonas reinhardtii and Ostreococcus tauri), one glaucophyte (Cyanophora paradoxa), one brown alga (Ectocarpus siliculosus) and a pennate diatom (Phaeodactylum tricornutum). The database will be regularly updated and extended with new plant genome annotations so as to provide extensive information on tRNA biology to the research community.

    The mitochondrial genome of Sinentomon erythranum (Arthropoda: Hexapoda: Protura): an example of highly divergent evolution

    BACKGROUND: The phylogenetic position of the Protura, traditionally considered the most basal hexapod group, is disputed because it has many unique morphological characters compared with other hexapods. Although mitochondrial genome information has been used extensively in phylogenetic studies, such information is not available for the Protura. This has impeded phylogenetic studies on this taxon, as well as studies of the evolution of the arthropod mitochondrial genome. RESULTS: In this study, the mitochondrial genome of Sinentomon erythranum was sequenced, the first from a proturan species to be reported. The genome contains a number of special features that differ from those of other hexapods and arthropods. As a very small arthropod mitochondrial genome, its 14,491 nucleotides encode 37 typical mitochondrial genes. Compared with other metazoan mtDNA, it has the most biased nucleotide composition with T = 52.4%, an extreme and reversed AT-skew of -0.351 and a GC-skew of 0.350. Two tandemly repeated regions occur in the A+T-rich region, and both could form stable stem-loop structures. Eighteen of the 22 tRNAs are greatly reduced in size with truncated secondary structures. The gene order is novel among available arthropod mitochondrial genomes. Rearrangements have involved not only small tRNA genes, but also PCGs (protein-coding genes) and ribosomal RNA genes. A large block of genes has experienced inversion and another nearby block has been reshuffled, which can be explained by the tandem duplication and random loss model. The most remarkable finding is that trnL2(UUR) is not located between cox1 and cox2 as observed in most hexapod and crustacean groups, but is between rrnL and nad1 as in the ancestral arthropod ground pattern. The "cox1-cox2" pattern was further confirmed in three more representative proturan species. The phylogenetic analyses based on the amino acid sequences of 13 mitochondrial PCGs suggest that S. erythranum does not group with other hexapod groups. CONCLUSIONS: The mitochondrial genome of S. erythranum shows many features that differ from other hexapod and arthropod mitochondrial genomes. It underwent highly divergent evolution. The "cox1-cox2" pattern probably represents the ancestral state for all proturan mitogenomes, and suggests a long evolutionary history for the Protura.
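
The AT-skew and GC-skew values quoted in this abstract are compositional statistics. The abstract does not define them, so the sketch below assumes the conventional definitions, AT-skew = (A - T)/(A + T) and GC-skew = (G - C)/(G + C), computed on a single strand. The function name `nucleotide_skews` and the toy sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter


def nucleotide_skews(seq: str) -> dict:
    """Return AT-skew and GC-skew for one DNA strand.

    Assumes the conventional definitions (not stated in the abstract):
    AT-skew = (A - T) / (A + T), GC-skew = (G - C) / (G + C).
    Ambiguous bases such as N are ignored.
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    # Guard against division by zero when a base pair class is absent.
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return {"at_skew": at_skew, "gc_skew": gc_skew}


# Toy sequence, for illustration only: more T than A gives a negative AT-skew.
skews = nucleotide_skews("AATTTTGGGC")
```

Under these definitions a negative AT-skew means the strand carries more T than A, which is the sense in which the abstract calls the skew "reversed".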

    Phylogenetic, Genomic and Morphological Investigations of Three Lance Nematode Species (Hoplolaimus spp.)

    Lance nematodes (Hoplolaimus spp.) are migratory ecto-endo plant parasites. They are found across a wide range of the world, feed on the roots of a diversity of monocotyledonous and dicotyledonous plants, and have caused great agricultural damage. Because more taxonomic knowledge and molecular references are needed for lance nematode phylogeny and population studies, four chapters of research on three lance nematode species are presented here: (1) A new species, Hoplolaimus smokyensis n. sp., was discovered in a mixed forest sample of maple (Acer sp.), hemlock (Tsuga sp.) and silverbell (Halesia carolina) from the Great Smoky Mountains National Park. It is characterized by a lateral field with four incisures, an excretory pore posterior to the hemizonid, esophageal glands with three nuclei, phasmids anterior and posterior to the vulva, and the absence of an epiptygma. Phylogenetic analyses based on ribosomal and mitochondrial gene sequences also suggest that H. smokyensis n. sp. is an independent lineage distinct from all other reported Hoplolaimus species. (2) Additional morphological characteristics of Hoplolaimus columbus are described. Photos of its esophageal gland cell nuclei, an H. columbus male and abnormal female tails are presented. (3) The first complete de novo assembly of the mitochondrial genome of Hoplolaimus columbus, obtained using Whole Genome Amplification and the Illumina MiSeq technique, is reported as a circularized DNA of 25,228 bp. The annotation results using two genetic codes were diagnosed and compared. Phylogenetic relationships, gene content and gene order arrangement of 92 nematode taxa, including H. columbus, were analyzed. (4) The phylogenetic informativeness of mitochondrial genes in the phylum Nematoda is analyzed with two quantitative methods using mitochondrial genomes of 93 nematode species, including H. columbus and H. galeatus. Results from both methods agree and indicate that the nad5 and nad4 genes are more informative than other candidates. Traditional markers like the cox1 and cytb genes are moderately informative. The nad4l and nad3 genes are the least informative of the protein-coding genes. Results also indicate that phylogenetic informativeness is independent of the sequence length of a phylogenetic marker. A concatenated-gene marker could provide better phylogenetic informativeness if the selected genes are themselves highly informative.

    Deep learning methods for mining genomic sequence patterns

    With the growing availability of large-scale genomic datasets and advanced computational techniques, more and more data-driven computational methods have been developed to analyze genomic data and help solve incompletely understood biological problems. Among them, deep learning methods have been proposed to automatically learn and recognize the functional activity of DNA sequences from genomics data. Techniques for efficiently mining genomic sequence patterns will help to improve our understanding of gene regulation, and thus accelerate our progress toward using personal genomes in medicine. This dissertation focuses on the development of deep learning methods for mining genomic sequences. First, we compare the performance of deep learning models and traditional machine learning methods in recognizing various genomic sequence patterns. Through extensive experiments on both simulated data and real genomic sequence data, we demonstrate that an appropriate deep learning model can generally be built to recognize various genomic sequence patterns successfully. Next, we develop deep learning methods to help solve two specific biological problems: (1) inference of the polyadenylation code and (2) tRNA gene detection and functional prediction. Polyadenylation is a pervasive mechanism used by eukaryotes for regulating mRNA transcription, localization, and translation efficiency. Polyadenylation signals in plants are particularly noisy and challenging to decipher. A deep convolutional neural network approach, DeepPolyA, is proposed to predict poly(A) sites from Arabidopsis thaliana genomic sequences. It employs various deep neural network architectures and demonstrates its superiority in comparison with competing methods, including classical machine learning algorithms and several popular deep learning models. Transfer RNAs (tRNAs) represent a highly complex class of genes and play a central role in protein translation. tRNAscan-SE remains the de facto tool for identifying tRNA genes encoded in genomes. Despite its popularity and success, tRNAscan-SE is still not powerful enough to separate tRNAs from pseudo-tRNAs, and a significant number of false positives can be output as a result. To address this issue, tRNA-DL, a hybrid of a convolutional neural network and a recurrent neural network, is proposed. It is shown that the proposed method can help to substantially reduce the false positive rate of the state-of-the-art tRNA prediction tool tRNAscan-SE. Coupled with tRNAscan-SE, tRNA-DL can serve as a useful complementary tool for tRNA annotation. Taken together, the experiments and applications demonstrate the superiority of deep learning in automatic feature generation for characterizing genomic sequence patterns.

    The DNA Sequence From A Cloned 15 Kilobase Fragment Of The Chlamydomonas Acidophila Mitochondrial Genome And RNA Transcript Production In Response To Cadmium

    Chlamydomonas acidophila is a unicellular green alga of the order Chlamydomonadales. Our research efforts followed two lines: (1) characterization of the C. acidophila mitochondrial genome (mtDNA) and (2) elucidation of any molecular events responsible for C. acidophila's heavy metal tolerance. The mitochondrial genomes of the protists have been underrepresented in the sequence databases. Among the protists, the algal genus Chlamydomonas shows a reduced mtDNA content with a highly rearranged gene structure. It was decided to sequence C. acidophila's mtDNA to further elucidate the evolutionary paths among the Chlamydomonads and add to the protist sequence database. A 15 kb fragment of C. acidophila's mtDNA was cloned and sequenced. The genes identified included apocytochrome b; partial sequences of subunits 2 and 5 and a complete subunit 1 of the NADH dehydrogenase complex; subunit 1 of the cytochrome oxidase complex; discontinuous and scrambled large and small subunit ribosomal RNA; and four tRNAs whose anticodons specify tryptophan, glutamine, and 2 methionines (one of which appears to be a pseudogene). The mtDNA of C. acidophila, therefore, probably has the reduced gene coding capacity common among the Chlamydomonadales. In fact the basic gene order is colinear with that of C. eugametos. However, C. acidophila appears to have two distinctive features: (1) the reduced size of intergenic spacers, and (2) non-synonymous insertion of a number of group I introns within the partial sequence. These differences suggest a recent divergence between C. acidophila and C. eugametos, and place them very close phylogenetically. It was also noticed that C. acidophila exhibits a higher tolerance for cadmium than do other Chlamydomonas species. Cadmium is a potent environmental toxin and carcinogen that is accumulating in the environment through anthropogenic and natural means. Knowledge of the characteristics of metal-tolerant species has yielded valuable insights into the nature of cadmium tolerance, and may one day aid in the safe disposal of this metal. In an attempt to understand the role of mtDNA during cadmium exposure, a 5 kb Hind III fragment of mtDNA was cloned into a pGem vector (pJB2). That fragment was hybridized to Northern blots of cadmium-challenged C. acidophila cells, and a transcript of ∼300 bp in size was shown to increase during cadmium challenge. Restriction studies and DNA sequencing have revealed that the transcript was produced from a 1500 bp region and appears to be rRNA.

    ChlamyCyc: an integrative systems biology database and web-portal for Chlamydomonas reinhardtii

    BACKGROUND: The unicellular green alga Chlamydomonas reinhardtii is an important eukaryotic model organism for the study of photosynthesis and plant growth. In the era of modern high-throughput technologies there is an imperative need to integrate large-scale data sets from high-throughput experimental techniques using computational methods and database resources to provide comprehensive information about the molecular and cellular organization of a single organism. RESULTS: In the framework of the German Systems Biology initiative GoFORSYS, a pathway database and web-portal for Chlamydomonas (ChlamyCyc) was established, which currently features about 250 metabolic pathways with associated genes, enzymes, and compound information. ChlamyCyc was assembled using an integrative approach combining the recently published genome sequence, bioinformatics methods, and experimental data from metabolomics and proteomics experiments. We analyzed and integrated a combination of primary and secondary database resources, such as existing genome annotations from JGI, EST collections, orthology information, and MapMan classification. CONCLUSION: ChlamyCyc provides a curated and integrated systems biology repository that will enable and assist in systematic studies of fundamental cellular processes in Chlamydomonas. The ChlamyCyc database and web-portal is freely available under http://chlamycyc.mpimp-golm.mpg.de

    Nuclear mitochondrial DNA sequences in the rabbit genome

    Numtogenesis is observable in mammalian genomes and results in the integration of mitochondrial segments into the nuclear genome (numts). To identify numts in rabbit, we aligned the mitochondrial and nuclear genomes. An alignment significance threshold was calculated and individual characteristics of numts were analysed. We found 153 numts in the nuclear genome. The GC content of numts was significantly lower than the GC content of their genomic flanking regions or of the genome itself. The frequency of three mammalian-wide interspersed repeats was increased in the proximity of numts. The decreased GC content around numts strengthens the theory that supposes a link between DNA structural instability and numt integration.
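
The GC-content comparison described in this abstract can be illustrated with a minimal sketch. It assumes GC content is the fraction of G and C among unambiguous bases. The function name `gc_content` and the two toy sequences are hypothetical, and this is not the authors' pipeline or their significance test.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous (A, C, G, T) bases."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        # No unambiguous bases, e.g. a run of N: report 0.0 rather than fail.
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


# Toy sequences, for illustration only (not rabbit data).
numt = "ATTATAGCATTA"
flank = "GCGCATGCCGTA"
numt_is_gc_poorer = gc_content(numt) < gc_content(flank)
```

A real analysis would compare the distribution of values across all numts and their flanking regions with a statistical test; the abstract does not say which test was used.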

    Mass Spectrometry: An Ideal Method For RNA Modification Analysis

    Currently there is no good way to measure and find the exact location of multiple RNA modifications. Existing technology can effectively find single varieties of modifications, but cannot identify co-occurrence. As the field of proteomics has shown, mass spectrometry is a powerful and versatile technique assessing broad ranges of chemical modifications in the context of the cellular environment. In this project I used our expertise in proteomics to build a mass spectrometry based platform for identifying RNA modifications. I have since set up both software and analytical platforms querying RNA modifications, and used this platform to survey human tRNA samples and identify what modifications there are, and where they occur

    The Chloroplast Genome of the Green Alga Schizomeris leibleinii (Chlorophyceae) Provides Evidence for Bidirectional DNA Replication from a Single Origin in the Chaetophorales

    In the Chlorophyceae, the chloroplast genome is extraordinarily fluid in architecture and displays unique features relative to other groups of green algae. For the Chaetophorales, 1 of the 5 major lineages of the Chlorophyceae, it has been shown that the distinctive architecture of the 223,902-bp genome of Stigeoclonium helveticum is consistent with bidirectional DNA replication from a single origin. Here, we report the 182,759-bp chloroplast genome sequence of Schizomeris leibleinii, a member of the earliest diverging lineage of the Chaetophorales. Like its Stigeoclonium homolog, the Schizomeris genome lacks a large inverted repeat encoding the rRNA operon and displays a striking bias in coding regions that is associated with a bias in base composition along each strand. Our results support the notion that these two chaetophoralean genomes replicate bidirectionally from a putative origin located in the vicinity of the small subunit ribosomal RNA gene. Their shared structural characteristics were most probably inherited from the common ancestor of all chaetophoralean algae. Short dispersed repeats account for most of the 41-kb size variation between the Schizomeris and Stigeoclonium genomes, and there is no indication that homologous recombination between these repeated elements led to the observed gene rearrangements. A comparison of the extent of variation sustained by the Stigeoclonium and Schizomeris chloroplast DNAs (cpDNAs) with that observed for the cpDNAs of the chlamydomonadalean Chlamydomonas and Volvox suggests that gene rearrangements as well as changes in the abundance of intergenic and intron sequences occurred at a slower pace in the Chaetophorales than in the Chlamydomonadales

    Family-level sampling of mitochondrial genomes in Coleoptera: compositional heterogeneity and phylogenetics

    Mitochondrial genomes are readily sequenced with recent technology and thus evolutionary lineages can be sampled more densely. This permits better phylogenetic estimates and assessment of potential biases resulting from heterogeneity in nucleotide composition and rate of change. We gathered 245 mitochondrial sequences for the Coleoptera representing all 4 suborders, 15 superfamilies of Polyphaga, and altogether 97 families, including 159 newly sequenced full or partial mitogenomes. Compositional heterogeneity greatly affected 3rd codon positions, and to a lesser extent the 1st and 2nd positions, even after RY coding. Heterogeneity also affected the encoded protein sequence, in particular in the nad2, nad4, nad5 and nad6 genes. Credible tree topologies were obtained with the nhPhyML (‘non-homogeneous’) algorithm implementing a model for branch-specific equilibrium frequencies. Likelihood searches using RAxML were improved by data partitioning by gene and codon position. Finally, the PhyloBayes software, which allows different substitution processes for amino acid replacement at various sites, produced a tree that best matched known higher-level taxa and defined basal relationships in Coleoptera. After rooting with Neuropterida outgroups, suborder relationships were resolved as (Polyphaga (Myxophaga (Archostemata + Adephaga))). The infraorder relationships in Polyphaga were (Scirtiformia (Elateriformia (Staphyliniformia + Scarabaeiformia (Bostrichiformia (Cucujiformia)))). Polyphagan superfamilies were recovered as monophyla except Staphylinoidea (paraphyletic for Scarabaeiformia) and Cucujoidea, which can no longer be considered a valid taxon. The study shows that, whilst compositional heterogeneity is not universal, it cannot be eliminated for some mitochondrial genes, but dense taxon sampling and the use of appropriate Bayesian analyses can still produce robust phylogenetic trees