238 research outputs found

    ProTISA: a comprehensive resource for translation initiation site annotation in prokaryotic genomes

    Correct annotation of translation initiation site (TIS) is essential for both experiments and bioinformatics studies of prokaryotic translation initiation mechanism as well as understanding of gene regulation and gene structure. Here we describe a comprehensive database ProTISA, which collects TIS confirmed through a variety of available evidences for prokaryotic genomes, including Swiss-Prot experiments record, literature, conserved domain hits and sequence alignment between orthologous genes. Moreover, by combining the predictions from our recently developed TIS post-processor, ProTISA provides a refined annotation for the public database RefSeq. Furthermore, the database annotates the potential regulatory signals associated with translation initiation at the TIS upstream region. As of July 2007, ProTISA includes 440 microbial genomes with more than 390 000 confirmed TISs. The database is available at http://mech.ctb.pku.edu.cn/protis

    Comparative analysis of an experimental subcellular protein localization assay and in silico prediction methods

    The subcellular localization of a protein can provide important information about its function within the cell. As eukaryotic cells and particularly mammalian cells are characterized by a high degree of compartmentalization, most protein activities can be assigned to particular cellular compartments. The categorization of proteins by their subcellular localization is therefore one of the essential goals of the functional annotation of the human genome. We previously performed a subcellular localization screen of 52 proteins encoded on human chromosome 21. In the current study, we compared the experimental localization data to the in silico results generated by nine leading software packages with different prediction resolutions. The comparison revealed striking differences between the programs in the accuracy of their subcellular protein localization predictions. Our results strongly suggest that the recently developed predictors utilizing multiple prediction methods tend to provide significantly better performance over purely sequence-based or homology-based predictions

    Ergatis: a web interface and scalable software system for bioinformatics workflows

    Motivation: The growth of sequence data has been accompanied by an increasing need to analyze data on distributed computer clusters. The use of these systems for routine analysis requires scalable and robust software for data management of large datasets. Software is also needed to simplify data management and make large-scale bioinformatics analysis accessible and reproducible to a wide class of target users

    On evil and computational creativity

    This paper touches upon the philosophical concept of evil in the context of creativity in general, and computational creativity in particular. In this work, dark creativity is introduced and linked to two important pre-requisites of creativity (i.e. freedom and constraints). A hybrid computational system is then presented; it includes one swarm intelligence algorithm, Stochastic Diffusion Search – mimicking the foraging behaviour of one species of ant, Leptothorax acervorum – and one physiological mechanism – imitating the behaviour of Human Immunodeficiency Virus. The aim is to outline an integration strategy deploying the search capabilities of the swarm intelligence algorithm and the destructive power of the digital virus. The swarm intelligence algorithm determines the colour attribute of the dynamic areas of interest within the input image, and the digital virus modifies the state of the input image, creating the projection of ‘evil’ over time (evil is used here as excessive use of underlying freedom). The paper concludes by exploring the significance of sensorimotor couplings and the impact of intentionality and genuine understanding of computational systems in the light of the philosophical concept of weak and strong computational creativity
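The abstract above names Stochastic Diffusion Search (SDS) as its swarm intelligence component. As a rough illustration of how SDS alternates a cheap "test" phase with a "diffusion" phase, here is a minimal sketch on the classic toy task of locating a substring. The task is swapped in for illustration: the paper applies SDS to colour attributes of image regions, not text. The function name, parameters, and default values are all assumptions of this sketch and are not taken from the paper.

```python
import random

def stochastic_diffusion_search(text, pattern, n_agents=50, iterations=100, seed=0):
    """Locate `pattern` in `text` with a basic Stochastic Diffusion Search.

    Illustrative sketch only: names and defaults are hypothetical.
    """
    rng = random.Random(seed)
    positions = len(text) - len(pattern) + 1  # valid start offsets (hypotheses)
    hypotheses = [rng.randrange(positions) for _ in range(n_agents)]
    active = [False] * n_agents

    for _ in range(iterations):
        # Test phase: each agent checks a single randomly chosen character,
        # a partial evaluation of its hypothesis rather than a full match.
        for i, h in enumerate(hypotheses):
            k = rng.randrange(len(pattern))
            active[i] = text[h + k] == pattern[k]

        # Diffusion phase: each inactive agent polls one random agent.
        for i in range(n_agents):
            if not active[i]:
                j = rng.randrange(n_agents)
                if active[j]:
                    hypotheses[i] = hypotheses[j]  # copy a promising hypothesis
                else:
                    hypotheses[i] = rng.randrange(positions)  # explore anew

    # The largest cluster of agents marks the best-supported hypothesis.
    return max(set(hypotheses), key=hypotheses.count)
```

Calling `stochastic_diffusion_search("x" * 20 + "hello" + "x" * 12, "hello")` should let the agents cluster on the offset where the pattern starts, because agents holding a fully matching hypothesis never fail the test phase and so keep recruiting others.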

    Single-nucleotide resolution analysis of the transcriptome structure of Clostridium beijerinckii NCIMB 8052 using RNA-Seq

    Background: Clostridium beijerinckii is an important solvent producing microorganism. The genome of C. beijerinckii NCIMB 8052 has recently been sequenced. Although transcriptome structure is important in order to reveal the functional and regulatory architecture of the genome, the physical structure of transcriptome for this strain, such as the operon linkages and transcript boundaries are not well understood. Results: In this study, we conducted a single-nucleotide resolution analysis of the C. beijerinckii NCIMB 8052 transcriptome using high-throughput RNA-Seq technology. We identified the transcription start sites and operon structure throughout the genome. We confirmed the structure of important gene operons involved in metabolic pathways for acid and solvent production in C. beijerinckii 8052, including pta-ack, ptb-buk, hbd-etfA-etfB-crt (bcs) and ald-ctfA-ctfB-adc (sol) operons; we also defined important operons related to chemotaxis/motility, transcriptional regulation, stress response and fatty acids biosynthesis along with others. We discovered 20 previously non-annotated regions with significant transcriptional activities and 15 genes whose translation start codons were likely mis-annotated. As a consequence, the accuracy of existing genome annotation was significantly enhanced. Furthermore, we identified 78 putative silent genes and 177 putative housekeeping genes based on normalized transcription measurement with the sequence data. We also observed that more than 30% of pseudogenes had significant transcriptional activities during the fermentation process. Strong correlations exist between the expression values derived from RNA-Seq analysis and microarray data or qRT-PCR results. Conclusions: Transcriptome structural profiling in this research provided important supplemental information on the accuracy of genome annotation, and revealed additional gene functions and regulation in C. beijerinckii.

    Gene prediction in metagenomic fragments: A large scale machine learning approach

    Background: Metagenomics is an approach to the characterization of microbial genomes via the direct isolation of genomic sequences from the environment without prior cultivation. The amount of metagenomic sequence data is growing fast while computational methods for metagenome analysis are still in their infancy. In contrast to genomic sequences of single species, which can usually be assembled and analyzed by many available methods, a large proportion of metagenome data remains as unassembled anonymous sequencing reads. One of the aims of all metagenomic sequencing projects is the identification of novel genes. Short length, for example, Sanger sequencing yields on average 700 bp fragments, and unknown phylogenetic origin of most fragments require approaches to gene prediction that are different from the currently available methods for genomes of single species. In particular, the large size of metagenomic samples requires fast and accurate methods with small numbers of false positive predictions. Results: We introduce a novel gene prediction algorithm for metagenomic fragments based on a two-stage machine learning approach. In the first stage, we use linear discriminants for monocodon usage, dicodon usage and translation initiation sites to extract features from DNA sequences. In the second stage, an artificial neural network combines these features with open reading frame length and fragment GC-content to compute the probability that this open reading frame encodes a protein. This probability is used for the classification and scoring of gene candidates. With large scale training, our method provides fast single fragment predictions with good sensitivity and specificity on artificially fragmented genomic DNA. Additionally, this method is able to predict translation initiation sites accurately and distinguishes complete from incomplete genes with high reliability. Conclusion: Large scale machine learning methods are well-suited for gene prediction in metagenomic DNA fragments. In particular, the combination of linear discriminants and neural networks is promising and should be considered for integration into metagenomic analysis pipelines. The data sets can be downloaded from the URL provided (see Availability and requirements section).
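The second stage described in the abstract above combines per-ORF features into a coding probability. The following is a minimal sketch of that idea under strong simplifying assumptions: it computes only two of the named features (open reading frame length and fragment GC-content), scans the forward strand only, and uses a single logistic unit as a stand-in for the trained neural network. The function names, the weights, and the bias are hypothetical placeholders, not values or code from the paper, and the linear-discriminant features for codon usage and translation initiation sites are omitted.

```python
import math

STOP_CODONS = {"TAA", "TAG", "TGA"}

def gc_content(seq):
    """Fraction of G and C bases in a non-empty fragment."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def longest_orf(seq):
    """Length in nucleotides of the longest forward-strand ATG..stop ORF."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)  # include the stop codon
                start = None
    return best

def coding_probability(features, weights, bias):
    """Single logistic unit: sigmoid of the weighted feature sum.

    A stand-in for the paper's neural network; weights are hypothetical.
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))
```

A caller would build a feature vector such as `[longest_orf(frag), gc_content(frag)]` and pass it to `coding_probability` together with weights fitted on labelled training fragments. In a real pipeline the ORF scan would also need to cover the reverse strand and incomplete ORFs that run off the fragment ends.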

    Origin of Saxitoxin Biosynthetic Genes in Cyanobacteria

    BACKGROUND: Paralytic shellfish poisoning (PSP) is a potentially fatal syndrome associated with the consumption of shellfish that have accumulated saxitoxin (STX). STX is produced by microscopic marine dinoflagellate algae. Little is known about the origin and spread of saxitoxin genes in these under-studied eukaryotes. Fortuitously, some freshwater cyanobacteria also produce STX, providing an ideal model for studying its biosynthesis. Here we focus on saxitoxin-producing cyanobacteria and their non-toxic sisters to elucidate the origin of genes involved in the putative STX biosynthetic pathway. METHODOLOGY/PRINCIPAL FINDINGS: We generated a draft genome assembly of the saxitoxin-producing (STX+) cyanobacterium Anabaena circinalis ACBU02 and searched for 26 candidate saxitoxin-genes (named sxtA to sxtZ) that were recently identified in the toxic strain Cylindrospermopsis raciborskii T3. We also generated a draft assembly of the non-toxic (STX-) sister Anabaena circinalis ACFR02 to aid the identification of saxitoxin-specific genes. Comparative phylogenomic analyses revealed that nine putative STX genes were horizontally transferred from non-cyanobacterial sources, whereas one key gene (sxtA) originated in STX+ cyanobacteria via two independent horizontal transfers followed by fusion. In total, of the 26 candidate saxitoxin-genes, 13 are of cyanobacterial provenance and are monophyletic among the STX+ taxa, four are shared amongst STX+ and STX- cyanobacteria, and the remaining nine genes are specific to STX+ cyanobacteria. CONCLUSIONS/SIGNIFICANCE: Our results provide evidence that the assembly of STX genes in ACBU02 involved multiple HGT events from different sources followed presumably by coordination of the expression of foreign and native genes in the common ancestor of STX+ cyanobacteria. The ability to produce saxitoxin was subsequently lost multiple independent times resulting in a nested relationship of STX+ and STX- strains among Anabaena circinalis strains

    Voronoi Tessellation Captures Very Early Clustering of Single Primary Cells as Induced by Interactions in Nascent Biofilms

    Biofilms dominate microbial life in numerous aquatic ecosystems, and in engineered and medical systems, as well. The formation of biofilms is initiated by single primary cells colonizing surfaces from the bulk liquid. The next steps from primary cells towards the first cell clusters as the initial step of biofilm formation remain relatively poorly studied. Clonal growth and random migration of primary cells are traditionally considered as the dominant processes leading to organized microcolonies in laboratory grown monocultures. Using Voronoi tessellation, we show that the spatial distribution of primary cells colonizing initially sterile surfaces from natural streamwater community deviates from uniform randomness already during the very early colonisation. The deviation from uniform randomness increased with colonisation — despite the absence of cell reproduction — and was even more pronounced when the flow of water above biofilms was multidirectional and shear stress elevated. We propose a simple mechanistic model that captures interactions, such as cell-to-cell signalling or chemical surface conditioning, to simulate the observed distribution patterns. Model predictions match empirical observations reasonably well, highlighting the role of biotic interactions even already during very early biofilm formation despite few and distant cells. The transition from single primary cells to clustering accelerated by biotic interactions rather than by reproduction may be particularly advantageous in harsh environments — the rule rather than the exception outside the laboratory

    Genome Sequence of a Mesophilic Hydrogenotrophic Methanogen Methanocella paludicola, the First Cultivated Representative of the Order Methanocellales

    We report complete genome sequence of a mesophilic hydrogenotrophic methanogen Methanocella paludicola, the first cultured representative of the order Methanocellales once recognized as an uncultured key archaeal group for methane emission in rice fields. The genome sequence of M. paludicola consists of a single circular chromosome of 2,957,635 bp containing 3004 protein-coding sequences (CDS). Genes for most of the functions known in the methanogenic archaea were identified, e.g. a full complement of hydrogenases and methanogenesis enzymes. The mixotrophic growth of M. paludicola was clarified by the genomic characterization and re-examined by the subsequent growth experiments. Comparative genome analysis with the previously reported genome sequence of RC-IMRE50, which was metagenomically reconstructed, demonstrated that about 70% of M. paludicola CDSs were genetically related with RC-IMRE50 CDSs. These CDSs included the genes involved in hydrogenotrophic methane production, incomplete TCA cycle, assimilatory sulfate reduction and so on. However, the genetic components for the carbon and nitrogen fixation and antioxidant system were different between the two Methanocellales genomes. The difference is likely associated with the physiological variability between M. paludicola and RC-IMRE50, further suggesting the genomic and physiological diversity of the Methanocellales methanogens. Comparative genome analysis among the previously determined methanogen genomes points to the genome-wide relatedness of the Methanocellales methanogens to the orders Methanosarcinales and Methanomicrobiales methanogens in terms of the genetic repertoire. Meanwhile, the unique evolutionary history of the Methanocellales methanogens is also traced in an aspect by the comparative genome analysis among the methanogens

    Coral disease outbreak at the remote Flower Garden Banks, Gulf of Mexico

    East and West Flower Garden Bank (FGB) are part of Flower Garden Banks National Marine Sanctuary (FGBNMS) in the northwest Gulf of Mexico. This geographically-isolated reef system contains extensive coral communities with the highest coral cover (>50%) in the continental United States due, in part, to their remoteness and depth, and have historically exhibited low incidence of coral disease and bleaching despite ocean warming. Yet in late August 2022, disease-like lesions on seven coral species were reported during routine monitoring surveys on East and West FGB (2.1–2.6% prevalence). A series of rapid response cruises were conducted in September and October 2022 focused on 1) characterizing signs and epidemiological aspects of the disease across FGB and within long-term monitoring sites, 2) treating affected coral colonies with Base 2B plus amoxicillin, and 3) collecting baseline images through photostations and photomosaics. Marginal and/or multi-focal lesions and tissue loss were observed, often associated with substantial fish and invertebrate predation, affecting the dominant coral species Pseudodiploria strigosa (7–8% lesion prevalence), Colpophyllia natans (11–18%), and Orbicella spp. (1%). Characterizing this disease event during its early epidemic phase at East and West FGB provides a critical opportunity to observe how coral disease functions in a relatively healthy coral ecosystem versus on reefs chronically affected by various stressors (e.g., Caribbean reefs adjacent to urban centers). Insights into the etiology, spread, and impacts of the disease can ultimately inform efforts to mitigate its effects on coral communities