155 research outputs found

    Oligonucleotide Frequencies of Barcoding Loci Can Discriminate Species across Kingdoms

    Background: DNA barcoding refers to the use of short DNA sequences for rapid identification of species. Genetic distance or character attributes of a particular barcode locus discriminate the species. We report an efficient approach to analyze short sequence data for discrimination between species. Methodology and Principal Findings: A new approach, Oligonucleotide Frequency Range (OFR) of barcode loci for species discrimination is proposed. OFR of the loci that discriminates between species was characteristic of a species, i.e., the maxima and minima within a species did not overlap with that of other species. We compared the species resolution ability of different barcode loci using p-distance, Euclidean distance of oligonucleotide frequencies, nucleotide-character based approach and OFR method. The species resolution by OFR was either higher or comparable to the other methods. A short fragment of 126 bp of internal transcribed spacer region in ribosomal RNA gene was sufficient to discriminate a majority of the species using OFR. Conclusions/Significance: Oligonucleotide frequency range of a barcode locus can discriminate between species. Ability to discriminate species using very short DNA fragments may have wider applications in forensic and conservation studies
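
The range idea in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the choice of dinucleotides (`k=2`), the function names and the toy sequences are all assumptions, and the paper's exact definition of OFR may differ. For each species the sketch records the minimum and maximum frequency of every oligonucleotide, then reports the oligonucleotides whose ranges do not overlap between two species.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=2):
    """Relative frequency of every A/C/G/T k-mer in one sequence."""
    seq = seq.upper()
    total = len(seq) - k + 1
    if total <= 0:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(total))
    # Windows containing other symbols (e.g. N) count toward `total`
    # but are not reported as k-mers.
    kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: counts.get(kmer, 0) / total for kmer in kmers}


def frequency_range(seqs, k=2):
    """Per-k-mer (min, max) frequency over all sequences of one species."""
    profiles = [kmer_frequencies(s, k) for s in seqs]
    return {
        kmer: (min(p[kmer] for p in profiles), max(p[kmer] for p in profiles))
        for kmer in profiles[0]
    }


def discriminating_kmers(range_a, range_b):
    """k-mers whose frequency ranges in the two species do not overlap."""
    return [
        kmer
        for kmer in range_a
        if range_a[kmer][1] < range_b[kmer][0]
        or range_b[kmer][1] < range_a[kmer][0]
    ]


# Toy sequences, invented for illustration only.
species_a = ["AAAAAAAT", "AAAAAATT"]
species_b = ["GCGCGCGC", "GCGCGCGA"]
separating = discriminating_kmers(
    frequency_range(species_a), frequency_range(species_b)
)
```

In this sketch two species count as separable as soon as one oligonucleotide has disjoint ranges; a real analysis would need many sequences per species for the ranges to be meaningful.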

    Stratification of co-evolving genomic groups using ranked phylogenetic profiles

    Background: Previous methods of detecting the taxonomic origins of arbitrary sequence collections, with a significant impact to genome analysis and in particular metagenomics, have primarily focused on compositional features of genomes. The evolutionary patterns of phylogenetic distribution of genes or proteins, represented by phylogenetic profiles, provide an alternative approach for the detection of taxonomic origins, but typically suffer from low accuracy. Herein, we present rank-BLAST, a novel approach for the assignment of protein sequences into genomic groups of the same taxonomic origin, based on the ranking order of phylogenetic profiles of target genes or proteins across the reference database. Results: The rank-BLAST approach is validated by computing the phylogenetic profiles of all sequences for five distinct microbial species of varying degrees of phylogenetic proximity, against a reference database of 243 fully sequenced genomes. The approach - a combination of sequence searches, statistical estimation and clustering - analyses the degree of sequence divergence between sets of protein sequences and allows the classification of protein sequences according to the species of origin with high accuracy, allowing taxonomic classification of 64% of the proteins studied. In most cases, a main cluster is detected, representing the corresponding species. Secondary, functionally distinct and species-specific clusters exhibit different patterns of phylogenetic distribution, thus flagging gene groups of interest. Detailed analyses of such cases are provided as examples. Conclusion: Our results indicate that the rank-BLAST approach can capture the taxonomic origins of sequence collections in an accurate and efficient manner. The approach can be useful both for the analysis of genome evolution and the detection of species groups in metagenomics samples.

    Revisiting detrended fluctuation analysis

    Half a century ago Hurst introduced Rescaled Range (R/S) Analysis to study fluctuations in time series. Thousands of works have investigated or applied the original methodology and similar techniques, with Detrended Fluctuation Analysis becoming preferred due to its purported ability to mitigate nonstationarities. We show Detrended Fluctuation Analysis introduces artifacts for nonlinear trends, in contrast to common expectation, and demonstrate that the empirically observed curvature induced is a serious finite-size effect which will always be present. Explicit detrending followed by measurement of the diffusional spread of a signal's associated random walk is preferable, a surprising conclusion given that Detrended Fluctuation Analysis was crafted specifically to replace this approach. The implications are simple yet sweeping: there is no compelling reason to apply Detrended Fluctuation Analysis as it 1) introduces uncontrolled bias; 2) is computationally more expensive than the unbiased estimator; and 3) cannot provide generic or useful protection against nonstationarities.
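
For readers unfamiliar with the method under critique, the following is a minimal sketch of conventional first-order Detrended Fluctuation Analysis, assuming NumPy is available. It is the textbook procedure, not the alternative estimator the authors recommend, and the function names, window sizes and test signal are illustrative assumptions.

```python
import numpy as np


def dfa_fluctuation(x, window, order=1):
    """Root-mean-square residual of the integrated signal after a
    polynomial of the given order is removed from each window."""
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())           # the associated random walk
    n_windows = len(profile) // window
    segments = profile[: n_windows * window].reshape(n_windows, window)
    t = np.arange(window)
    # Fit all windows at once: one column of coefficients per window.
    coeffs = np.polyfit(t, segments.T, order)
    trends = np.vander(t, order + 1) @ coeffs   # shape (window, n_windows)
    residuals = segments.T - trends
    return np.sqrt(np.mean(residuals ** 2))


def dfa_exponent(x, windows, order=1):
    """Slope of log F(n) against log n over the given window sizes."""
    fluct = [dfa_fluctuation(x, w, order) for w in windows]
    slope, _intercept = np.polyfit(np.log(windows), np.log(fluct), 1)
    return slope


# Uncorrelated noise should give an exponent near 0.5.
rng = np.random.default_rng(0)
alpha = dfa_exponent(rng.standard_normal(4096), [16, 32, 64, 128, 256])
```

The per-window polynomial fit is the step the abstract identifies as the source of bias when the underlying trend is nonlinear.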

    Minimal Functional Sites Allow a Classification of Zinc Sites in Proteins

    Zinc is indispensable to all forms of life as it is an essential component of many different proteins involved in a wide range of biological processes. Not differently from other metals, zinc in proteins can play different roles that depend on the features of the metal-binding site. In this work, we describe zinc sites in proteins with known structure by means of three-dimensional templates that can be automatically extracted from PDB files and consist of the protein structure around the metal, including the zinc ligands and the residues in close spatial proximity to the ligands. This definition is devised to intrinsically capture the features of the local protein environment that can affect metal function, and corresponds to what we call a minimal functional site (MFS). We used MFSs to classify all zinc sites whose structures are available in the PDB and combined this classification with functional annotation as available in the literature. We classified 77% of zinc sites into ten clusters, each grouping zinc sites with structures that are highly similar, and an additional 16% into seven pseudo-clusters, each grouping zinc sites with structures that are only broadly similar. Sites where zinc plays a structural role are predominant in eight clusters and in two pseudo-clusters, while sites where zinc plays a catalytic role are predominant in two clusters and in five pseudo-clusters. We also analyzed the amino acid composition of the coordination sphere of zinc as a function of its role in the protein, highlighting trends and exceptions. In a period when the number of known zinc proteins is expected to grow further with the increasing awareness of the cellular mechanisms of zinc homeostasis, this classification represents a valuable basis for structure-function studies of zinc proteins, with broad applications in biochemistry, molecular pharmacology and de novo protein design

    First passage events in biological systems with non-exponential inter-event times

    It is often possible to model the dynamics of biological systems as a series of discrete transitions between a finite set of observable states (or compartments). When the residence times in each state, or inter-event times more generally, are exponentially distributed, then one can write a set of ordinary differential equations, which accurately describe the evolution of mean quantities. Non-exponential inter-event times can also be experimentally observed, but are more difficult to analyse mathematically. In this paper, we focus on the computation of first passage events and their probabilities in biological systems with non-exponential inter-event times. We show, with three case studies from Molecular Immunology, Virology and Epidemiology, that significant errors are introduced when drawing conclusions based on the assumption that inter-event times are exponentially distributed. Our approach allows these errors to be avoided with the use of phase-type distributions that approximate arbitrarily distributed inter-event times
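
A small worked example shows how much the exponential assumption can matter. The setup is a hypothetical one, not one of the paper's case studies: two events A and B compete, B has an exponential waiting time with rate `rate_b`, and A has an Erlang waiting time (one of the simplest phase-type distributions) with a fixed mean and a chosen number of stages. The probability that A fires first is the Laplace transform of A's waiting-time density evaluated at `rate_b`, which for the Erlang case has the closed form used below.

```python
def prob_a_before_b(mean_a, rate_b, stages):
    """P(A occurs before B) when A is Erlang with `stages` exponential
    stages and overall mean `mean_a`, and B is exponential with `rate_b`.

    With stages == 1 this reduces to the familiar a / (a + b) result
    for two competing exponential clocks.
    """
    stage_rate = stages / mean_a  # each stage is exponential with this rate
    return (stage_rate / (stage_rate + rate_b)) ** stages


# Same mean waiting time for A, different numbers of stages.
p_exponential = prob_a_before_b(mean_a=1.0, rate_b=1.0, stages=1)
p_erlang4 = prob_a_before_b(mean_a=1.0, rate_b=1.0, stages=4)
```

With these illustrative values the exponential model gives 0.5, while the four-stage Erlang model with the same mean gives (4/5)^4 = 0.4096, so keeping the mean and changing only the shape of the waiting-time distribution already shifts the first-passage probability.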

    Amino Acid Residues Contributing to Function of the Heteromeric Insect Olfactory Receptor Complex

    Olfactory receptors (Ors) convert chemical signals (the binding of odors and pheromones) to electrical signals through the depolarization of olfactory sensory neurons. Vertebrate Ors are G-protein-coupled receptors, stimulated by odors to produce intracellular second messengers that gate ion channels. Insect Ors are a heteromultimeric complex of unknown stoichiometry of two seven transmembrane domain proteins with no sequence similarity to and the opposite membrane topology of G-protein-coupled receptors. The functional insect Or comprises an odor- or pheromone-specific Or subunit and the Orco co-receptor, which is highly conserved in all insect species. The insect Or-Orco complex has been proposed to function as a novel type of ligand-gated nonselective cation channel possibly modulated by G-proteins. However, the Or-Orco proteins lack homology to any known family of ion channel and lack known functional domains. Therefore, the mechanisms by which odors activate the Or-Orco complex and how ions permeate this complex remain unknown. To begin to address the relationship between Or-Orco structure and function, we performed site-directed mutagenesis of all 83 conserved Glu, Asp, or Tyr residues in the silkmoth BmOr-1-Orco pheromone receptor complex and measured functional properties of mutant channels expressed in Xenopus oocytes. 13 of 83 mutations in BmOr-1 and BmOrco altered the reversal potential and rectification index of the BmOr-1-Orco complex. Three of the 13 amino acids (D299 and E356 in BmOr-1 and Y464 in BmOrco) altered both current-voltage relationships and K+ selectivity. We introduced the homologous Orco Y464 residue into Drosophila Orco in vivo, and observed variable effects on spontaneous and evoked action potentials in olfactory neurons that depended on the particular Or-Orco complex examined. Our results provide evidence that a subset of conserved Glu, Asp and Tyr residues in both subunits are essential for channel activity of the heteromeric insect Or-Orco complex.

    Pairwise statistical significance of local sequence alignment using multiple parameter sets and empirical justification of parameter set change penalty

    Background: Accurate estimation of statistical significance of a pairwise alignment is an important problem in sequence comparison. Recently, a comparative study of pairwise statistical significance with database statistical significance was conducted. In this paper, we extend the earlier work on pairwise statistical significance by incorporating with it the use of multiple parameter sets. Results: Results for a knowledge discovery application of homology detection reveal that using multiple parameter sets for pairwise statistical significance estimates gives better coverage than using a single parameter set, at least at some error levels. Further, the results of pairwise statistical significance using multiple parameter sets are shown to be significantly better than database statistical significance estimates reported by BLAST and PSI-BLAST, and comparable and at times significantly better than SSEARCH. Using non-zero parameter set change penalty values gives better performance than zero penalty. Conclusion: The fact that the homology detection performance does not degrade when using multiple parameter sets is strong evidence for the validity of the assumption that the alignment score distribution follows an extreme value distribution even when using multiple parameter sets. Parameter set change penalty is a useful parameter for alignment using multiple parameter sets. Pairwise statistical significance using multiple parameter sets can be effectively used to determine the relatedness of a (or a few) pair(s) of sequences without performing a time-consuming database search.
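
The extreme value assumption mentioned in the conclusion translates into a one-line P-value calculation. The sketch below uses the widely used Karlin-Altschul parameterisation, in which the expected number of chance alignments scoring at least `score` is `K * m * n * exp(-lam * score)`. This is an assumption for illustration: the paper may fit and parameterise the distribution differently, and the numeric values of `K` and `lam` here are placeholders, not estimates from the paper.

```python
import math


def evd_pvalue(score, m, n, K, lam):
    """P(S >= score) for a local alignment score under an extreme value
    (Gumbel) distribution, with sequence lengths m and n."""
    expected_hits = K * m * n * math.exp(-lam * score)
    # 1 - exp(-E), computed with expm1 to stay accurate when E is tiny.
    return -math.expm1(-expected_hits)


# Placeholder parameters; in practice K and lam are estimated for the
# specific sequence pair and scoring scheme.
p_moderate = evd_pvalue(score=40, m=200, n=300, K=0.041, lam=0.267)
p_strong = evd_pvalue(score=80, m=200, n=300, K=0.041, lam=0.267)
```

Because the P-value decays roughly exponentially in the score, small errors in the fitted `lam` have a large effect on significance, which is why the validity of the distributional assumption under multiple parameter sets matters.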

    The use of genomic signature distance between bacteriophages and their hosts displays evolutionary relationships and phage growth cycle determination

    Background: Bacteriophage classification is mainly based on morphological traits and genome characteristics combined with host information and in some cases on phage growth lifestyle. A lack of molecular tools can impede more precise studies on phylogenetic relationships or even a taxonomic classification. The use of methods to analyze genome sequences without the requirement for homology has allowed advances in classification. Results: Here, we proposed to use genome sequence signature to characterize bacteriophages and to compare them to their host genome signature in order to obtain host-phage relationships and information on their lifestyle. We analyze the host-phage relationships in the four most representative groups of Caudoviridae, the dsDNA group of phages. We demonstrate that the use of phage genomic signature and its comparison with that of the host allows a grouping of phages and is also able to predict the host-phage relationships (lytic vs. temperate). Conclusions: We can thus condense, in relatively simple figures, this phage information dispersed over many publications.

    Structure of the pentameric ligand-gated ion channel ELIC cocrystallized with its competitive antagonist acetylcholine

    ELIC, the pentameric ligand-gated ion channel from Erwinia chrysanthemi, is a prototype for Cys-loop receptors. Here we show that acetylcholine is a competitive antagonist for ELIC. We determine the acetylcholine–ELIC cocrystal structure to a 2.9-Å resolution and find that acetylcholine binding to an aromatic cage at the subunit interface induces a significant contraction of loop C and other structural rearrangements in the extracellular domain. The side chain of the pore-lining residue F247 reorients and the pore size consequently enlarges, but the channel remains closed. We attribute the inability of acetylcholine to activate ELIC primarily to weak cation-π and electrostatic interactions in the pocket, because an acetylcholine derivative with a simple quaternary-to-tertiary ammonium substitution activates the channel. This study presents a compelling case for understanding the structural underpinning of the functional relationship between agonism and competitive antagonism in the Cys-loop receptors, providing a new framework for developing novel therapeutic drugs

    Adaptive Contact Networks Change Effective Disease Infectiousness and Dynamics

    Human societies are organized in complex webs that are constantly reshaped by a social dynamic which is influenced by the information individuals have about others. Similarly, epidemic spreading may be affected by local information that makes individuals aware of the health status of their social contacts, allowing them to avoid contact with those infected and to remain in touch with the healthy. Here we study disease dynamics in finite populations in which infection occurs along the links of a dynamical contact network whose reshaping may be biased based on each individual's health status. We adopt some of the most widely used epidemiological models, investigating the impact of the reshaping of the contact network on the disease dynamics. We derive analytical results in the limit where network reshaping occurs much faster than disease spreading and demonstrate numerically that this limit extends to a much wider range of time scales than one might anticipate. Specifically, we show that from a population-level description, disease propagation in a quickly adapting network can be formulated equivalently as disease spreading on a well-mixed population but with a rescaled infectiousness. We find that for all models studied here – SI, SIS and SIR – the effective infectiousness of a disease depends on the population size, the number of infected in the population, and the capacity of healthy individuals to sever contacts with the infected. Importantly, we indicate how the use of available information hinders disease progression, either by reducing the average time required to eradicate a disease (in case recovery is possible), or by increasing the average time needed for a disease to spread to the entire population (in case recovery or immunity is impossible)