258 research outputs found

    DDBJ dealing with mass data produced by the second generation sequencer

    DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) collected and released 2 368 110 entries or 1 415 106 598 bases in the period from July 2007 to June 2008. The releases in this period include genome-scale data of Bombyx mori, Oryzias latipes, Drosophila and Lotus japonicus. In addition, from this year we collected and released trace archive data in collaboration with the National Center for Biotechnology Information (NCBI). The first release contains those of O. latipes and bacterial metagenomes in the human gut. To cope with the current progress of sequencing technology, we also accepted and released more than 100 million short reads of parasitic protozoa and their hosts that were produced using a Solexa sequencer.

    Type 2 Diabetes Mellitus and its comorbidity, Alzheimer's disease: Identifying critical microRNA using machine learning

    MicroRNAs (miRNAs) are critical regulators of gene expression in healthy and diseased states, and numerous studies have established their tremendous potential as a tool for improving the diagnosis of Type 2 Diabetes Mellitus (T2D) and its comorbidities. In this regard, we computationally identify novel top-ranked hub miRNAs that might be involved in T2D. We accomplish this via two strategies: 1) by ranking miRNAs based on the number of T2D differentially expressed genes (DEGs) they target, and 2) using only the common DEGs between T2D and its comorbidity, Alzheimer's disease (AD) to predict and rank miRNA. Then classifier models are built using the DEGs targeted by each miRNA as features. Here, we show the T2D DEGs targeted by hsa-mir-1-3p, hsa-mir-16-5p, hsa-mir-124-3p, hsa-mir-34a-5p, hsa-let-7b-5p, hsa-mir-155-5p, hsa-mir-107, hsa-mir-27a-3p, hsa-mir-129-2-3p, and hsa-mir-146a-5p are capable of distinguishing T2D samples from the controls, which serves as a measure of confidence in the miRNAs' potential role in T2D progression. Moreover, for the second strategy, we show other critical miRNAs can be made apparent through the disease's comorbidities, and in this case, overall, the hsa-mir-103a-3p models work well for all the datasets, especially in T2D, while the hsa-mir-124-3p models achieved the best scores for the AD datasets. To the best of our knowledge, this is the first study that used predicted miRNAs to determine the features that can separate the diseased samples (T2D or AD) from the normal ones, instead of using conventional non-biology-based feature selection methods.
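The first ranking strategy described in this abstract (ordering miRNAs by the number of DEGs they target) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the miRNA labels and the gene names below are hypothetical placeholders, and the real study would draw targets from a miRNA target database.

```python
def rank_mirnas_by_deg_targets(mirna_targets, degs):
    """Rank miRNAs by how many of the given DEGs they target.

    mirna_targets: dict mapping a miRNA name to an iterable of target genes.
    degs: iterable of differentially expressed genes.
    Returns (miRNA, count) pairs, highest count first; ties broken by name.
    """
    deg_set = set(degs)
    counts = {
        mirna: len(deg_set & set(targets))
        for mirna, targets in mirna_targets.items()
    }
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy data, for illustration only.
targets = {
    "miR-A": ["GENE1", "GENE2", "GENE3"],
    "miR-B": ["GENE2", "GENE4"],
    "miR-C": ["GENE5"],
}
degs = ["GENE2", "GENE4", "GENE5"]

ranking = rank_mirnas_by_deg_targets(targets, degs)
for mirna, n in ranking:
    print(mirna, n)
```

For the second strategy, the same function could be called with only the DEGs shared between the two diseases, assuming those have been computed beforehand.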

    Mining biosynthetic gene clusters in Virgibacillus genomes

    BACKGROUND: Biosynthetic gene clusters produce a wide range of metabolites with activities that are of interest to the pharmaceutical industry. Specific interest is shown towards those metabolites that exhibit antimicrobial activities against multidrug-resistant bacteria that have become a global health threat. Genera of the phylum Firmicutes are frequently identified as sources of such metabolites, but the biosynthetic potential of its Virgibacillus genus is not known. Here, we used comparative genomic analysis to determine whether Virgibacillus strains isolated from the Red Sea mangrove mud in Rabigh Harbor Lagoon, Saudi Arabia, may be an attractive source of such novel antimicrobial agents. RESULTS: A comparative genomics analysis based on Virgibacillus dokdonensis Bac330, Virgibacillus sp. Bac332 and Virgibacillus halodenitrificans Bac324 (isolated from the Red Sea) and six other previously reported Virgibacillus strains was performed. Orthology analysis was used to determine the core genomes as well as the accessory genome of the nine Virgibacillus strains. The analysis shows that the Red Sea strain Virgibacillus sp. Bac332 has the highest number of unique genes and genomic islands compared to other genomes included in this study. Focusing on biosynthetic gene clusters, we show how marine isolates, including those from the Red Sea, are more enriched with nonribosomal peptides compared to the other Virgibacillus species. We also found that most nonribosomal peptide synthases identified in the Virgibacillus strains are part of genomic regions that are potentially horizontally transferred. CONCLUSIONS: The Red Sea Virgibacillus strains have a large number of biosynthetic genes in clusters that are not assigned to known products, indicating significant potential for the discovery of novel bioactive compounds. Also, having more modular synthetase units suggests that these strains are good candidates for experimental characterization of previously identified bioactive compounds as well. Future efforts will be directed towards establishing the properties of the potentially novel compounds encoded by the Red Sea specific trans-AT PKS/NRPS cluster and the type III PKS/NRPS cluster.

    Assessing constancy of substitution rates in viruses over evolutionary time

    Background: Phylogenetic analyses reveal probable patterns of divergence of present day organisms from common ancestors. The points of divergence of lineages can be dated if a corresponding historical or fossil record exists. For many species, in particular viruses, such records are rare. Recently, Bayesian phylogenetic analysis using sequences from closely related organisms isolated at different times has been used to calibrate divergences. Phylogenetic analyses depend on the assumption that the average substitution rates that can be calculated from the data apply throughout the course of evolution. Results: The present study tests this crucial assumption by charting the kinds of substitutions observed between pairs of sequences with different levels of total substitutions. Datasets of aligned sequences, both viral and non-viral, were assembled. For each pair of sequences in an aligned set, the distribution of nucleotide interchanges and the total number of changes were calculated. Data were binned according to total numbers of changes and plotted. The accumulation of the six possible interchange types in retroelements as a function of distance followed closely the expected hyperbolic relationship. For other datasets, however, significant deviations from this relationship were noted. A rapid initial accumulation of transition interchanges was frequent among the datasets and anomalous changes occurred at specific divergence levels. Conclusions: The accumulation profiles suggested that substantial changes in frequencies of types of substitutions occur over the course of evolution and that such changes should be considered in evaluating and dating viral phylogenies.
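The pairwise tally described in this abstract (counting the six unordered nucleotide interchange types and the total number of changes for each sequence pair, then binning by total) can be sketched as follows. This is an illustrative assumption of how such a count might be done, not the authors' code; the function names, the bin width and the toy alignment are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def interchange_counts(seq_a, seq_b):
    """Count unordered nucleotide interchanges between two aligned sequences.

    Positions with gaps or ambiguous characters are skipped. Keys are the
    six possible pairs: AC, AG, AT, CG, CT, GT.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a != b and a in "ACGT" and b in "ACGT":
            counts["".join(sorted((a, b)))] += 1
    return counts


def binned_profiles(alignment, bin_width=2):
    """Sum interchange counts over all sequence pairs, grouped by total changes."""
    bins = defaultdict(Counter)
    for seq_a, seq_b in combinations(alignment, 2):
        counts = interchange_counts(seq_a, seq_b)
        total = sum(counts.values())
        bins[total // bin_width].update(counts)
    return dict(bins)


# Hypothetical toy alignment, for illustration only.
alignment = ["ACGTAC", "GCGCAT", "ACGTAA"]
pair = interchange_counts(alignment[0], alignment[1])
profiles = binned_profiles(alignment)
print(dict(pair))
print(profiles)
```

Here AG and CT are the transition interchanges; the other four keys are transversions.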

    Reptile enamel matrix proteins: Selection, divergence, and functional constraint

    The three major enamel matrix proteins (EMPs): amelogenin (AMEL), ameloblastin (AMBN), and enamelin (ENAM), are intrinsically linked to tooth development in tetrapods. However, reptiles and mammals exhibit significant differences in dental patterning and development, potentially affecting how EMPs evolve in each group. In most reptiles, teeth are replaced continuously throughout life, while mammals have reduced replacement to only one or two generations. Reptiles also form structurally simple, aprismatic enamel while mammalian enamel is composed of highly organized hydroxyapatite prisms. These differences, combined with reported low sequence homology in reptiles, led us to predict that reptiles may experience lower selection pressure on their EMPs as compared with mammals. However, we found that like mammals, reptile EMPs are under moderate purifying selection, with some differences evident between AMEL, AMBN, and ENAM. We also demonstrate that sequence homology in reptile EMPs is closely associated with divergence times, with more recently diverged lineages exhibiting high homology, along with strong phylogenetic signal. Lastly, despite sequence divergence, none of the reptile species in our study exhibited mutations consistent with diseases that cause degeneration of enamel (e.g. amelogenesis imperfecta). Despite short tooth retention time and simplicity in enamel structure, reptile EMPs still exhibit purifying selection required to form durable enamel.

    We calculated the percent identity between amino acid sequences of ameloblastin from various reptile groups. Crocodilians exhibit the highest sequence identity, while identity across squamates was substantially lower. Upon closer examination of the individual squamate clades, however, we found that identity values are actually much higher in snakes, with much of the variation existing between the various lizard infraorders.

    HIGHLIGHTS
    Reptile enamel matrix proteins are under moderate purifying selection despite polyphyodonty and simple enamel structure.
    Sequence identity in reptile enamel matrix proteins exhibits correlation with divergence times in spite of differences in substitution rates.
    Reptile amelogenin operates under a distinct selection regime compared with ameloblastin and enamelin.

    Peer Reviewed
    https://deepblue.lib.umich.edu/bitstream/2027.42/150577/1/jezb22857.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/150577/2/jezb22857-sup-0001-Supplementary_file.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/150577/3/jezb22857-sup-0007-Supplementary_file_S8-DAMBE-Saturation.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/150577/4/jezb22857-sup-0002-Supplementary_file_S1-SpeciesTable.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/150577/5/jezb22857-sup-0003-Supplementary_file_S2_Alignments.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/150577/6/jezb22857-sup-0008-Supplementary_File_S9.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/150577/7/jezb22857-sup-0005-Supplementary_file_S6.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/150577/8/jezb22857_am.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/150577/9/jezb22857-sup-0009-Supplementary_file_Reptiles.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/150577/10/jezb22857-sup-0006-Supplementary_file_S7-DIVERGE.pd

    Repetition as the Essence of Life on this Earth: Music and Genes

    While it is believed that life on this earth started as long ago as a few billion or more years ago, a number of true innovations in evolution appears to have been rather dis

    VarySysDB: a human genetic polymorphism database based on all H-InvDB transcripts

    Creation of a vast variety of proteins is accomplished by genetic variation and a variety of alternative splicing transcripts. Currently, however, the abundant available data on genetic variation and the transcriptome are stored independently and in a dispersed fashion. In order to provide a research resource regarding the effects of human genetic polymorphism on various transcripts, we developed VarySysDB, a genetic polymorphism database based on 187 156 extensively annotated matured mRNA transcripts from 36 073 loci provided by H-InvDB. VarySysDB offers information encompassing published human genetic polymorphisms for each of these transcripts separately. This allows comparisons of effects derived from a polymorphism on different transcripts. The published information we analyzed includes single nucleotide polymorphisms and deletion–insertion polymorphisms from dbSNP, copy number variations from Database of Genomic Variants, short tandem repeats and single amino acid repeats from H-InvDB and linkage disequilibrium regions from D-HaploDB. The information can be searched and retrieved by features, functions and effects of polymorphisms, as well as by keywords. VarySysDB combines two kinds of viewers, GBrowse and Sequence View, to facilitate understanding of the positional relationship among polymorphisms, genome, transcripts, loci and functional domains. We expect that VarySysDB will yield useful information on polymorphisms affecting gene expression and phenotypes. VarySysDB is available at http://h-invitational.jp/varygene/

    Comparative genomics study reveals Red Sea Bacillus with characteristics associated with potential microbial cell factories (MCFs)

    © 2019, The Author(s). Recent advancements in the use of microbial cells for scalable production of industrial enzymes encourage exploring new environments for efficient microbial cell factories (MCFs). Here, through a comparison study, ten newly sequenced Bacillus species, isolated from the Rabigh Harbor Lagoon on the Red Sea shoreline, were evaluated for their potential use as MCFs. Phylogenetic analysis of 40 representative genomes with phylogenetic relevance, including the ten Red Sea species, showed that the Red Sea species come from several colonization events and are not the result of a single colonization followed by speciation. Moreover, clustering reactions in reconstructed metabolic networks of these Bacillus species revealed that three metabolic clades do not fit the phylogenetic tree, a sign of convergent evolution of the metabolism of these species in response to special environmental adaptation. We further showed that Red Sea strains Bacillus paralicheniformis (Bac48) and B. halosaccharovorans (Bac94) had twice as many secreted proteins as the model strain B. subtilis 168. Also, Bac94 was enriched with genes associated with the Tat and Sec protein secretion systems, and Bac48 has a hybrid PKS/NRPS cluster that is part of a horizontally transferred genomic region. These properties collectively hint towards the potential use of Red Sea Bacillus as efficient protein-secreting microbial hosts, and suggest that this characteristic of these strains may be a consequence of the unique ecological features of the isolation environment.