95 research outputs found

    A novel bioinformatics tool for phylogenetic classification of genomic sequence fragments derived from mixed genomes of uncultured environmental microbes

    A Self-Organizing Map (SOM) is an effective tool for clustering and visualizing high-dimensional complex data on a two-dimensional map. We modified the conventional SOM for genome informatics, making the learning process and resulting map independent of the order of data input, and developed a novel bioinformatics tool for phylogenetic classification of sequence fragments obtained from pooled genome samples of microorganisms in environmental samples. The tool allows visualization of microbial diversity and the relative abundance of microorganisms on a map. First, we constructed SOMs of tri- and tetranucleotide frequencies from a total of 3.3 Gb of sequences derived from 113 prokaryotic and 13 eukaryotic genomes for which complete genome sequences are available. The SOMs classified the 330,000 10-kb sequences from these genomes mainly according to species, without being given any information on the species. Importantly, classification was possible without orthologous sequence sets and was thus useful for studies of novel sequences from poorly characterized species, such as those living only under extreme conditions, which have attracted wide scientific and industrial attention. Using the SOM method, sequences that were derived from a single genome but cloned independently in a metagenome library could be reassociated in silico. The usefulness of SOMs in metagenome studies is also discussed.
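    The input to such a map is a vector of oligonucleotide frequencies per sequence fragment. The following is a minimal sketch of how those vectors could be computed, not the authors' implementation. The function names `kmer_frequencies` and `split_into_fragments` are hypothetical, and the sketch assumes non-overlapping fragments, counts on one strand only, and skips any window containing a character other than A, C, G, or T.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(fragment: str, k: int = 4) -> dict[str, float]:
    """Return the relative frequency of every DNA k-mer in one fragment."""
    fragment = fragment.upper()
    # Count every overlapping window of length k.
    counts = Counter(fragment[i:i + k] for i in range(len(fragment) - k + 1))
    # Keep only k-mers made of A, C, G, T (windows with N etc. are ignored).
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return {m: 0.0 for m in kmers}
    return {m: counts[m] / total for m in kmers}


def split_into_fragments(sequence: str, size: int = 10_000) -> list[str]:
    """Cut a sequence into non-overlapping fragments, dropping a short tail."""
    return [sequence[i:i + size] for i in range(0, len(sequence) - size + 1, size)]
```

    With `k=4` each fragment becomes a 256-dimensional vector, and with `k=3` a 64-dimensional one; these vectors are what a SOM would then cluster.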

    Characterization of Genetic Signal Sequences with Batch-Learning SOM

    Kohonen's SOM, an unsupervised clustering algorithm, is an effective tool for clustering and visualizing high-dimensional complex data on a single map. We previously modified the conventional SOM for genome informatics on the basis of Batch Learning SOM (BL-SOM), making the learning process and resulting map independent of the order of data input. We generated BL-SOMs for tetra- and pentanucleotide frequencies in 300,000 10-kb sequences from 13 eukaryotes for which almost complete genomic sequences are available. BL-SOM recognized species-specific characteristics of oligonucleotide frequencies in most 10-kb sequences, permitting species-specific classification of sequences without any information regarding the species. We next constructed BL-SOMs with tetra- and pentanucleotide frequencies in 37,086 full-length mouse cDNA sequences. With BL-SOM we also analyzed occurrence patterns of the oligonucleotides that are thought to be involved in transcriptional regulation in the human genome.
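    The order-independence mentioned above follows from updating the map only after every input has been assigned to a node. The following is a generic batch SOM epoch written as a sketch under stated assumptions; it is not the BL-SOM update rule from the paper, whose details are not given here. The function name `batch_som_epoch`, the square (Chebyshev) neighbourhood, and the caller-supplied initial weights are all assumptions made for illustration.

```python
def batch_som_epoch(data, weights, grid, radius):
    """Run one batch-learning epoch and return the updated node weights.

    data    : list of input vectors (lists of floats)
    weights : list of node weight vectors, one per map node
    grid    : list of (row, col) map coordinates, one per node
    radius  : nodes within this Chebyshev grid distance of a winner share its inputs
    """
    dim = len(weights[0])
    sums = [[0.0] * dim for _ in weights]
    counts = [0] * len(weights)

    for x in data:
        # Best-matching unit: the node whose weight vector is closest to x.
        bmu = min(
            range(len(weights)),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, weights[j])),
        )
        # Accumulate x into every node in the winner's neighbourhood.
        for j, (r, c) in enumerate(grid):
            if max(abs(r - grid[bmu][0]), abs(c - grid[bmu][1])) <= radius:
                counts[j] += 1
                for d in range(dim):
                    sums[j][d] += x[d]

    # Weights are replaced only after all inputs have been seen, so the
    # result does not depend on the order in which the inputs arrive.
    return [
        [s / counts[j] for s in sums[j]] if counts[j] else list(weights[j])
        for j in range(len(weights))
    ]
```

    In an online SOM, by contrast, the weights move after each input, so presenting the same data in a different order can yield a different map.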

    Chapter 2: Approaches from Genetics. 1. War, Peace, and the Life Sciences

    Comparative genomics of Glandirana rugosa using unsupervised AI reveals a high CG frequency

    The Japanese wrinkled frog (Glandirana rugosa) is unique in having both XX-XY and ZZ-ZW types of sex chromosomes within the species. Genome sequencing and comparative genomics with other frogs should be important for understanding the mechanisms of sex chromosome turnover within one species or during a short period. In this study, we analyzed the newly sequenced genome of G. rugosa using a batch-learning self-organizing map, an unsupervised artificial intelligence method, for oligonucleotide compositions. To clarify the genome characteristics of G. rugosa, we compared its short oligonucleotide compositions in all 1-Mb genomic fragments with those of six other frog species (Pyxicephalus adspersus, Rhinella marina, Spea multiplicata, Leptobrachium leishanense, Xenopus laevis, and Xenopus tropicalis). In G. rugosa, we found Mb-scale repeat sequences with high identity to the W chromosome of the African bullfrog (P. adspersus). Our study concluded that G. rugosa has unique genome characteristics with a high CG frequency, and its genome is assumed to heterochromatinize a large portion of itself via methylation of CG.

    Bioinformatics of Microbial Genomes in Adaptation to Extreme Environments

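    The 6th Symposium on Polar Science, interdisciplinary session: [IB2] Analysis of global environmental change and construction of Earth-life system science. Thursday, November 19, The Institute of Statistical Mathematics, Seminar Room 1 (D305)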

    tRNADB-CE: tRNA gene database curated manually by experts

    We constructed a new large-scale database of tRNA genes by analyzing 534 complete genomes of prokaryotes, 394 draft genomes in the WGS (Whole Genome Shotgun) division of DDBJ/EMBL/GenBank, and approximately 6.2 million DNA fragment sequences obtained from metagenomic analyses. This exhaustive search for tRNA genes was performed by running three computer programs to enhance the completeness and accuracy of the prediction. The three programs disagreed on the assignment for ∼4% of all tRNA gene candidates obtained from the prokaryote genomes analyzed. The discordant cases were manually checked by experts in the tRNA experimental field. In total, 144,061 tRNA genes were registered in the database ‘tRNADB-CE’, more than four times the number of genes previously reported by the database from analyses of complete genomes with the tRNAscan-SE program. tRNADB-CE allows browsing of sequence information, cloverleaf structures, and results of similarity searches among all tRNA genes. For each of the complete genomes, the number of tRNA genes for individual anticodons, the codon usage frequency in all protein genes, and the position of individual tRNA genes in the genome can be browsed. tRNADB-CE can be accessed freely at http://trna.nagahama-i-bio.ac.jp.

    8-oxoguanine causes spontaneous de novo germline mutations in mice.

    Spontaneous germline mutations generate genetic diversity in populations of sexually reproducing organisms, and are thus regarded as a driving force of evolution. However, their cause and mechanism remain unclear. 8-oxoguanine (8-oxoG) is a candidate molecule that causes germline mutations, because it makes DNA more prone to mutation and is constantly generated by reactive oxygen species in vivo. We show here that endogenous 8-oxoG caused de novo spontaneous and heritable G to T mutations in mice, which occurred at different stages in the germ cell lineage and were distributed throughout the chromosomes. Using exome analyses covering 40.9 Mb of mouse transcribed regions, we found increased frequencies of G to T mutations at a rate of 2 × 10⁻⁷ mutations/base/generation in offspring of Mth1/Ogg1/Mutyh triple knockout (TOY-KO) mice, which accumulate 8-oxoG in the nuclear DNA of gonadal cells. The roles of MTH1, OGG1, and MUTYH are specific for the prevention of 8-oxoG-induced mutation, and 99% of the mutations observed in TOY-KO mice were G to T transversions caused by 8-oxoG; therefore, we concluded that 8-oxoG is a causative molecule for spontaneous and inheritable mutations of the germ lineage cells.