49 research outputs found

    Relationships between predicted moonlighting proteins, human diseases, and comorbidities from a network perspective

    Moonlighting proteins are a subset of multifunctional proteins characterized by their multiple, independent, and unrelated biological functions. We recently set up a large-scale identification of moonlighting proteins using a protein-protein interaction (PPI) network approach. We established that 3% of the current human interactome is composed of predicted moonlighting proteins. We found that disease-related genes are over-represented among those candidates. Here, by comparing moonlighting candidates to non-candidates as groups, we further show that (i) they are significantly involved in more than one disease, (ii) they contribute to complex rather than monogenic diseases, (iii) the diseases in which they are involved are phenotypically different according to their annotations, and finally (iv) they are enriched for disease pairs showing statistically significant comorbidity patterns based on Medicare records. Altogether, our results suggest that some observed comorbidities between phenotypically different diseases could be due to a shared protein involved in unrelated biological processes.

    SECISaln, a web-based tool for the creation of structure-based alignments of eukaryotic SECIS elements

    Summary: Selenoproteins contain the 21st amino acid selenocysteine, which is encoded by an in-frame UGA codon, usually read as a stop. In eukaryotes, its co-translational recoding requires the presence of an RNA stem–loop structure, the SECIS element, in the 3′ untranslated region (UTR) of selenoprotein mRNAs. Despite little sequence conservation, SECIS elements share the same overall secondary structure. Until recently, the lack of a significantly high number of selenoprotein mRNA sequences hampered the identification of other potential sequence conservation. In this work, the web-based tool SECISaln provides for the first time an extensive structure-based sequence alignment of SECIS elements resulting from the well-defined secondary structure of the SECIS RNA and the increased size of the eukaryotic selenoproteome. We have used SECISaln to improve our knowledge of SECIS secondary structure and to discover novel, conserved nucleotide positions, and we believe it will be a useful tool for the selenoprotein and RNA scientific communities.

    Multifunctional proteins revealed by overlapping clustering in protein interaction network

    Motivation: Multifunctional proteins perform several functions. They are expected to interact specifically with distinct sets of partners, simultaneously or not, depending on the function performed. Current graph clustering methods usually allow a protein to belong to only one cluster, therefore impeding a realistic assignment of multifunctional proteins to clusters.

    A comparison of random sequence reads versus 16S rDNA sequences for estimating the biodiversity of a metagenomic library

    The construction of metagenomic libraries has permitted the study of microorganisms resistant to isolation, and the analysis of 16S rDNA sequences has been used for over two decades to examine bacterial biodiversity. Here, we show that the analysis of random sequence reads (RSRs) instead of 16S is a suitable shortcut to estimate the biodiversity of a bacterial community from metagenomic libraries. We generated 10 010 RSRs from a metagenomic library of microorganisms found in human faecal samples. We then searched them using the program BLASTN against a prokaryotic sequence database to assign a taxon to each RSR. The results were compared with those obtained by screening and analysing the clones containing 16S rDNA sequences in the whole library. We found that the biodiversity observed by RSR analysis is consistent with that obtained by 16S rDNA. We also show that RSRs are suitable to compare the biodiversity between different metagenomic libraries. RSRs can thus provide a good estimate of the biodiversity of a metagenomic library and, as an alternative to 16S, this approach is both faster and cheaper.

    Relaxation of Selective Constraints Causes Independent Selenoprotein Extinction in Insect Genomes

    BACKGROUND: Selenoproteins are a diverse family of proteins notable for the presence of the 21st amino acid, selenocysteine. Until very recently, all metazoan genomes investigated encoded selenoproteins, and these proteins had therefore been believed to be essential for animal life. Challenging this assumption, recent comparative analyses of insect genomes have revealed that some insect genomes appear to have lost selenoprotein genes. METHODOLOGY/PRINCIPAL FINDINGS: In this paper we investigate in detail the fate of selenoproteins, and that of selenoprotein factors, in all available arthropod genomes. We use a variety of in silico comparative genomics approaches to look for known selenoprotein genes and factors involved in selenoprotein biosynthesis. We have found that five insect species have completely lost the ability to encode selenoproteins and that selenoprotein loss in these species, although so far confined to the Endopterygota infraclass, cannot be attributed to a single evolutionary event, but rather to multiple, independent events. Loss of selenoproteins and selenoprotein factors is usually coupled to the deletion of the entire no-longer-functional genomic region, rather than to sequence degradation and consequent pseudogenisation. Such dynamics of gene extinction are consistent with the high rate of genome rearrangements observed in Drosophila. We have also found that, while many selenoprotein factors are concomitantly lost with the selenoproteins, others are present and conserved in all investigated genomes, irrespective of whether they code for selenoproteins or not, suggesting that they are involved in additional, non-selenoprotein related functions. CONCLUSIONS/SIGNIFICANCE: Selenoproteins have been independently lost in several insect species, possibly as a consequence of the relaxation in insects of the selective constraints acting across metazoans to maintain selenoproteins. The dispensability of selenoproteins in insects may be related to the fundamental differences in antioxidant defense between these animals and other metazoans. The work described here is funded by grants from the Spanish Ministry of Education and Science and from the BioSapiens European Network of Excellence to RG. CEC is recipient of a pre-doctoral fellowship from the Spanish Ministry of Education and Science.

    A next-generation sequencing method for overcoming the multiple gene copy problem in polyploid phylogenetics, applied to Poa grasses

    Background: Polyploidy is important from a phylogenetic perspective because of its immense past impact on evolution and its potential future impact on diversification, survival and adaptation, especially in plants. Molecular population genetics studies of polyploid organisms have been difficult because of problems in sequencing multiple-copy nuclear genes using Sanger sequencing. This paper describes a method for sequencing a barcoded mixture of targeted gene regions using next-generation sequencing methods to overcome these problems. Results: Using 64 3-bp barcodes, we successfully sequenced three chloroplast and two nuclear gene regions (each of which contained two gene copies with up to two alleles per individual) in a total of 60 individuals across 11 species of Australian Poa grasses. This method had high replicability, a low sequencing error rate (after appropriate quality control) and a low rate of missing data. Eighty-eight percent of the 320 gene/individual combinations produced sequence reads, and >80% of individuals produced sufficient reads to detect all four possible nuclear alleles of the homeologous nuclear loci with 95% probability. We applied this method to a group of sympatric Australian alpine Poa species, which we discovered to share an allopolyploid ancestor with a group of American Poa species. All markers revealed extensive allele sharing among the Australian species and so we recommend that the current taxonomy be re-examined. We also detected hypermutation in the trnH-psbA marker, suggesting it should not be used as a land plant barcode region. Some markers indicated differentiation between Tasmanian and mainland samples. Significant positive spatial genetic structure was detected at <100 km with chloroplast but not nuclear markers, which may be a result of restricted seed flow and long-distance pollen flow in this wind-pollinated group. Conclusions: Our results demonstrate that 454 sequencing of barcoded amplicon mixtures can be used to reliably sample all alleles of homeologous loci in polyploid species and successfully investigate phylogenetic relationships among species, as well as to investigate phylogeographic hypotheses. This next-generation sequencing method is more affordable than and at least as reliable as bacterial cloning. It could be applied to any experiment involving sequencing of amplicon mixtures.

    The natural history of, and risk factors for, progressive Chronic Kidney Disease (CKD): the Renal Impairment in Secondary care (RIISC) study; rationale and protocol

    Capturing the sounds of an urban greenspace

    Acoustic data can be a source of important information about events and the environment in modern cities. To date, much of the focus has been on monitoring noise pollution, but the urban soundscape contains a rich variety of signals about both human and natural phenomena. We describe the CitySounds project, which has installed enclosed sensor kits at several locations across a heavily used urban greenspace in the city of Edinburgh. The acoustic monitoring components regularly capture short clips in real time of both ultrasonic and audible noises, for example encompassing bats, birds and other wildlife, traffic, and humans. The sounds are complemented by other sensor data, such as temperature and relative humidity. To ensure privacy and compliance with relevant legislation, robust methods render completely unintelligible any traces of voice or conversation that may incidentally be overheard by the sensors. We have adopted a variety of methods to encourage community engagement with the audio data and to communicate the richness of urban soundscapes to a general audience.

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.