    A new computational method for the detection of horizontal gene transfer events

    In recent years, the increase in the amount of available genomic data has made it easier to appreciate the extent to which organisms increase their genetic diversity through horizontally transferred genetic material. Such transfers have the potential to give rise to extremely dynamic genomes in which a significant proportion of the coding DNA has been contributed by external sources. Because of the impact of these horizontal transfers on the ecological and pathogenic character of the recipient organisms, methods are continuously sought that can computationally determine which of the genes of a given genome are products of transfer events. In this paper, we introduce and discuss a novel computational method for identifying horizontal transfers that relies on a gene's nucleotide composition and obviates the need for knowledge of codon boundaries. In addition to being applicable to individual genes, the method can be easily extended to the case of clusters of horizontally transferred genes. With the help of an extensive and carefully designed set of experiments on 123 archaeal and bacterial genomes, we demonstrate that the new method exhibits significant improvement in sensitivity when compared to previously published approaches. In fact, it achieves an average relative improvement across genomes of between 11 and 41% compared to the Codon Adaptation Index method in distinguishing native from foreign genes. Our method's horizontal gene transfer predictions for 123 microbial genomes are available online at
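To illustrate the general idea of composition-based detection described in this abstract, the following is a minimal sketch, not the authors' published method. It assumes a simple scoring scheme: each gene's overlapping k-mer frequencies are compared with those of the whole genome using the Kullback-Leibler divergence, and genes with the highest divergence are flagged as compositionally atypical. The function names (`kmer_profile`, `atypicality`, `rank_genes`), the choice of k = 3, and the pseudocount are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import product
from math import log


def kmer_profile(seq, k=3, pseudocount=1.0):
    """Relative frequency of every overlapping k-mer over A/C/G/T.

    Windows are read at every offset, so no reading frame
    (codon boundary) has to be known. The pseudocount keeps
    every frequency strictly positive.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers) + pseudocount * len(kmers)
    return {m: (counts[m] + pseudocount) / total for m in kmers}


def atypicality(gene, genome_profile, k=3):
    """Kullback-Leibler divergence of the gene's profile from the genome's."""
    gene_profile = kmer_profile(gene, k)
    return sum(p * log(p / genome_profile[m]) for m, p in gene_profile.items())


def rank_genes(genes, genome, k=3):
    """Return (name, score) pairs, most atypical gene first."""
    background = kmer_profile(genome, k)
    scores = {name: atypicality(seq, background, k) for name, seq in genes.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `rank_genes({"geneA": "...", "geneB": "..."}, genome_sequence)` returns the genes ordered by how far their composition departs from the genomic background. A real analysis would also need a principled threshold for calling a gene foreign, which this sketch does not attempt.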

    A sensitive, support-vector-machine method for the detection of horizontal gene transfers in viral, archaeal and bacterial genomes

    In earlier work, we introduced and discussed a generalized computational framework for identifying horizontal transfers. This framework relied on a gene's nucleotide composition, obviated the need for knowledge of codon boundaries and database searches, and was shown to perform very well across a wide range of archaeal and bacterial genomes when compared with previously published approaches, such as Codon Adaptation Index and C + G content. Nonetheless, two considerations remained outstanding: we wanted to further increase the sensitivity of detecting horizontal transfers and also to be able to apply the method to increasingly smaller genomes. In the discussion that follows, we present such a method, Wn-SVM, and show that it exhibits a very significant improvement in sensitivity compared with earlier approaches. Wn-SVM uses a one-class support-vector machine and can learn using rather small training sets. This property makes Wn-SVM particularly suitable for studying small-size genomes, similar to those of viruses, as well as the typically larger archaeal and bacterial genomes. We show experimentally that the new method results in a superior performance across a wide range of organisms and that it improves even upon our own earlier method by an average of 10% across all examined genomes. As a small-genome case study, we analyze the genome of the human cytomegalovirus and demonstrate that Wn-SVM correctly identifies regions that are known to be conserved and prototypical of all beta-herpesvirinae, regions that are known to have been acquired horizontally from the human host and, finally, regions that had not up to now been suspected to be horizontally transferred. Atypical region predictions for many eukaryotic viruses, including the α-, β- and γ-herpesvirinae, and 123 archaeal and bacterial genomes, have been made available online at

    Analysis of Multipath Routing—Part I: The Effect on the Packet Delivery Ratio

    OMPdb: a database of β-barrel outer membrane proteins from Gram-negative bacteria

    We describe here OMPdb, which is currently the most complete and comprehensive collection of integral β-barrel outer membrane proteins from Gram-negative bacteria. The database currently contains 69 354 proteins, which are classified into 85 families, based mainly on structural and functional criteria. Although OMPdb follows the annotation scheme of Pfam, many of the families included in the database were not previously described or annotated in other publicly available databases. There are also cross-references to other databases, references to the literature and annotation for sequence features, such as transmembrane segments and signal peptides. Furthermore, via the web interface, the user can not only browse the available data but also submit advanced text searches and run BLAST queries against the database protein sequences, or domain searches against the collection of profile Hidden Markov Models that represent each family’s domain organization. The database is freely accessible for academic users at http://bioinformatics.biol.uoa.gr/OMPdb and we expect it to be useful for genome-wide analyses and comparative genomics, as well as for providing training and test sets for predictive algorithms regarding transmembrane β-barrels.

    DeepSig: Deep learning improves signal peptide detection in proteins

    Motivation: The identification of signal peptides in protein sequences is an important step toward protein localization and function characterization. Results: Here, we present DeepSig, an improved approach for signal peptide detection and cleavage-site prediction based on deep learning methods. Comparative benchmarks performed on an updated independent dataset of proteins show that DeepSig is the current best performing method, scoring better than other available state-of-the-art approaches on both signal peptide detection and precise cleavage-site identification. Availability and implementation: DeepSig is available as both standalone program and web server at https://deepsig.biocomp.unibo.it. All datasets used in this study can be obtained from the same website

    Local heuristic for the refinement of multi-path routing in wireless mesh networks

    We consider wireless mesh networks and the problem of routing end-to-end traffic over multiple paths for the same origin-destination pair with minimal interference. We introduce a heuristic for path determination with two distinguishing characteristics. First, it works by refining an extant set of paths, determined previously by a single- or multi-path routing algorithm. Second, it is totally local, in the sense that it can be run by each of the origins on information that is available no farther than the node's immediate neighborhood. We have conducted extensive computational experiments with the new heuristic, using AODV and OLSR, as well as their multi-path variants, as underlying routing methods. For two different CSMA settings (as implemented by 802.11) and one TDMA setting running a path-oriented link scheduling algorithm, we have demonstrated that the new heuristic is capable of improving the average throughput network-wide. When working from the paths generated by the multi-path routing algorithms, the heuristic is also capable of providing a more evenly distributed traffic pattern.

    OMiR: Identification of associations between OMIM diseases and microRNAs

    A large number of loci for genetic diseases have been mapped on the human genome, and a group of hereditary diseases among them have thus far resisted cloning. It is conceivable that such "unclonable" diseases are not linked to abnormalities of protein coding genes (PCGs), but of non-coding RNAs (ncRNAs). We developed a novel approach termed OMiR (OMIM and miRNAs), to test whether microRNAs (miRNAs) exhibit any associations with mapped genetic diseases not yet associated with a PCG. We found that "orphan" genetic disease loci were proximal to miRNA loci more frequently than to loci for which the responsible protein coding gene is known, thus suggesting that miRNAs might be the elusive culprits. Our findings indicate that inclusion of miRNAs among the candidate genes to be considered could assist geneticists in their hunt for disease genes, particularly in the case of rare diseases.

    PDBTM: Protein Data Bank of transmembrane proteins after 8 years

    The PDBTM database (available at http://pdbtm.enzim.hu), the first comprehensive and up-to-date transmembrane protein selection of the Protein Data Bank, was launched in 2004. The database was created and has been continuously updated by the TMDET algorithm, which is able to distinguish between transmembrane and non-transmembrane proteins using their 3D atomic coordinates only. The TMDET algorithm can also locate the spatial positions of transmembrane proteins in the lipid bilayer. During the last 8 years, not only has the PDBTM database grown steadily from ~400 to 1700 entries, but new structural elements have also been identified, in addition to the well-known α-helical bundle and β-barrel structures. Numerous ‘exotic’ transmembrane protein structures have been solved since the first release, which has made it necessary to define these new structural elements, such as membrane loops or interfacial helices, in the database. This article reports the new features of the PDBTM database that have been added since its first release, and our current efforts to keep the database up-to-date and easy to use so that it may continue to serve as a fundamental resource for the scientific community.