148 research outputs found

    MSOAR 2.0: Incorporating tandem duplications into ortholog assignment based on genome rearrangement

    Background: Ortholog assignment is a critical and fundamental problem in comparative genomics, since orthologs are considered to be functional counterparts in different species and can be used to infer molecular functions of one species from those of other species. MSOAR is a recently developed high-throughput system for assigning one-to-one orthologs between closely related species on a genome scale. It attempts to reconstruct the evolutionary history of input genomes in terms of genome rearrangement and gene duplication events. It assumes that a gene duplication event inserts a duplicated gene into the genome of interest at a random location (i.e., the random duplication model). However, in practice, biologists believe that genes are often duplicated by tandem duplications, where a duplicated gene is located next to the original copy (i.e., the tandem duplication model). Results: In this paper, we develop MSOAR 2.0, an improved system for one-to-one ortholog assignment. For a pair of input genomes, the system first focuses on the tandemly duplicated genes of each genome and tries to identify among them those that were duplicated after the speciation (i.e., the so-called inparalogs), using a simple phylogenetic tree reconciliation method. For each such set of tandemly duplicated inparalogs, all but one gene is deleted from the genome concerned (because the deleted genes cannot appear in any one-to-one ortholog pair), and MSOAR is then invoked. Using both simulated and real data experiments, we show that MSOAR 2.0 achieves better sensitivity and specificity than MSOAR. In comparison with the well-known genome-scale ortholog assignment tool InParanoid, the Ensembl ortholog database, and the orthology information extracted from the well-known whole-genome multiple alignment program MultiZ, MSOAR 2.0 shows the highest sensitivity. Although the specificity of MSOAR 2.0 is slightly worse than that of InParanoid in the real data experiments, it is better than that of InParanoid in the simulation tests. Conclusions: Our preliminary experimental results demonstrate that MSOAR 2.0 is a highly accurate tool for one-to-one ortholog assignment between closely related genomes. The software is available to the public for free and included as online supplementary material.
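
The preprocessing step described in this abstract (keep one gene from each set of tandemly duplicated inparalogs before running MSOAR) can be illustrated with a minimal sketch. This is not the MSOAR 2.0 code: the input format, the function name `collapse_tandem_inparalogs`, and the choice to keep the first gene of each run are assumptions made only for illustration, and the inparalog groups are taken as given rather than derived by tree reconciliation.

```python
from itertools import groupby


def collapse_tandem_inparalogs(genome, inparalog_group):
    """Keep one gene from each run of adjacent genes in the same inparalog group.

    genome: list of gene IDs in chromosomal order (hypothetical format).
    inparalog_group: dict mapping gene ID -> group label, for genes already
        judged to be duplicated after speciation; other genes are absent.
    """
    # Genes without a group get a unique key so they are never merged.
    keyed = [(inparalog_group.get(g, ("single", i)), g)
             for i, g in enumerate(genome)]
    collapsed = []
    for _, run in groupby(keyed, key=lambda kg: kg[0]):
        # Assumption: retain the first gene of each tandem run.
        collapsed.append(next(run)[1])
    return collapsed


genome = ["a1", "b1", "b2", "b3", "c1", "b4"]
groups = {"b1": "B", "b2": "B", "b3": "B", "b4": "B"}
reduced = collapse_tandem_inparalogs(genome, groups)
```

In this toy input, `b1`, `b2` and `b3` are adjacent and collapse to one gene, while `b4` belongs to the same group but is not adjacent to the others, so it is kept.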

    The Tetraodon nigroviridis reference transcriptome: Developmental transition, length retention and microsynteny of long non-coding RNAs in a compact vertebrate genome

    Pufferfish such as fugu and tetraodon carry the smallest genomes among all vertebrates and are ideal for studying genome evolution. However, comparative genomics using these species is hindered by the poor annotation of their genomes. We performed RNA sequencing during key stages of the maternal to zygotic transition of Tetraodon nigroviridis and report its first developmental transcriptome. We assembled 61,033 transcripts (23,837 loci) representing 80% of the annotated gene models and 3816 novel coding transcripts from 2667 loci. We demonstrate the similarities of gene expression profiles between pufferfish and zebrafish during the maternal to zygotic transition and annotate 1120 long non-coding RNAs (lncRNAs), many of which are differentially expressed during development. The promoters for 60% of the assembled transcripts were validated by CAGE-seq. Despite the extreme compaction of the tetraodon genome and the dramatic loss of transposons, the length of lncRNA exons remains comparable to that of other vertebrates, and a small set of lncRNAs appears enriched for transposable elements, suggesting a selective pressure acting on lncRNA length and composition. Finally, a set of lncRNAs are microsyntenic between teleost and vertebrates, which indicates potential regulatory interactions between lncRNAs and their flanking coding genes. Our work provides a fundamental molecular resource for vertebrate comparative genomics and embryogenesis studies.

    Dual Mechanism for the Translation of Subgenomic mRNA from Sindbis Virus in Infected and Uninfected Cells

    Infection of BHK cells by Sindbis virus (SV) gives rise to a profound inhibition of cellular protein synthesis, whereas translation of the viral subgenomic mRNA that encodes viral structural proteins continues for hours. To gain further knowledge on the mechanism by which this subgenomic mRNA is translated, the requirements for some initiation factors (eIFs) and for the presence of the initiator AUG were examined both in infected and in uninfected cells. To this end, BHK cells were transfected with different SV replicons or with in vitro made SV subgenomic mRNAs after inactivation of some eIFs. Specifically, eIF4G was cleaved by expression of the poliovirus 2A protease (2Apro) and the alpha subunit of eIF2 was inactivated by phosphorylation induced by arsenite treatment. Moreover, the cellular location of these and other translation components was analyzed in infected BHK cells by confocal microscopy. Cleavage of eIF4G by poliovirus 2Apro does not hamper translation of subgenomic mRNA in SV infected cells, but bisection of this factor blocks subgenomic mRNA translation in uninfected cells or in cell-free systems. SV infection induces phosphorylation of eIF2α, a process that is increased by arsenite treatment. Under these conditions, translation of subgenomic mRNA occurs to almost the same extent as controls in the infected cells but is drastically inhibited in uninfected cells. Notably, the correct initiation site on the subgenomic mRNA is still partially recognized when the initiation codon AUG is modified to other codons only in infected cells. Finally, immunolocalization of different eIFs reveals that eIF2α and eIF4G are excluded from the foci where viral RNA replication occurs, while eIF3, eEF2 and ribosomes concentrate in these regions. These findings support the notion that canonical initiation takes place when the subgenomic mRNA is translated out of the infection context, while initiation can occur without some eIFs and even at non-AUG codons in infected cells.

    Bio::Homology::InterologWalk - A Perl module to build putative protein-protein interaction networks through interolog mapping

    Background: Protein-protein interaction (PPI) data are widely used to generate network models that aim to describe the relationships between proteins in biological systems. The fidelity and completeness of such networks are primarily limited by the paucity of protein interaction information and by the restriction of most of these data to just a few widely studied experimental organisms. In order to extend the utility of existing PPIs, computational methods can be used that exploit functional conservation between orthologous proteins across taxa to predict putative PPIs or 'interologs'. To date most interolog prediction efforts have been restricted to specific biological domains with fixed underlying data sources and there are no software tools available that provide a generalised framework for 'on-the-fly' interolog prediction. Results: We introduce Bio::Homology::InterologWalk, a Perl module to retrieve, prioritise and visualise putative protein-protein interactions through an orthology-walk method. The module uses orthology and experimental interaction data to generate putative PPIs and optionally collates meta-data into an Interaction Prioritisation Index that can be used to help prioritise interologs for further analysis. We show the application of our interolog prediction method to the genomic interactome of the fruit fly, Drosophila melanogaster. We analyse the resulting interaction networks and show that the method proposes new interactome members and interactions that are candidates for future experimental investigation. Conclusions: Our interolog prediction tool employs the Ensembl Perl API and PSICQUIC enabled protein interaction data sources to generate up to date interologs 'on-the-fly'. This represents a significant advance on previous methods for interolog prediction as it allows the use of the latest orthology and protein interaction data for all of the genomes in Ensembl. The module outputs simple text files, making it easy to customise the results by post-processing, allowing the putative PPI datasets to be easily integrated into existing analysis workflows. The Bio::Homology::InterologWalk module, sample scripts and full documentation are freely available from the Comprehensive Perl Archive Network (CPAN) under the GNU Public license.
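
The general idea of interolog mapping mentioned in this abstract (transfer an experimentally observed interaction between two proteins to their orthologs in another species) can be sketched with toy data. The module itself is written in Perl and queries Ensembl and PSICQUIC services. The Python function below, its name `interolog_walk`, and its in-memory dictionaries are hypothetical stand-ins for illustration only and do not reflect the module's API.

```python
def interolog_walk(query_genes, orthologs, interactions):
    """Predict putative interactions among query-species genes.

    query_genes: iterable of gene IDs in the species of interest.
    orthologs: dict mapping a query gene -> set of orthologs in a
        reference species (toy stand-in for an orthology database).
    interactions: set of frozensets, each holding the one or two
        reference-species genes of an experimentally observed interaction.
    """
    # Reverse map: reference gene -> query genes that are its orthologs.
    back = {}
    for q in query_genes:
        for r in orthologs.get(q, ()):
            back.setdefault(r, set()).add(q)

    predicted = set()
    for pair in interactions:
        members = tuple(pair)
        # A self-interaction is stored as a one-element frozenset.
        a, b = members[0], members[-1]
        for qa in back.get(a, ()):
            for qb in back.get(b, ()):
                predicted.add(frozenset((qa, qb)))
    return predicted


orthologs = {"fly1": {"hs1"}, "fly2": {"hs2"}, "fly3": {"hs3"}}
interactions = {frozenset(("hs1", "hs2"))}
putative = interolog_walk(["fly1", "fly2", "fly3"], orthologs, interactions)
```

Here the observed `hs1`/`hs2` interaction yields one putative `fly1`/`fly2` interaction, and `fly3` receives none because its ortholog has no recorded partner. No prioritisation score is computed in this sketch.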

    Diversification and Molecular Evolution of ATOH8, a Gene Encoding a bHLH Transcription Factor

    ATOH8 is a bHLH domain transcription factor implicated in the development of the nervous system, kidney, pancreas, retina and muscle. In the present study, we collected sequences of ATOH8 orthologues from 18 vertebrate species and 24 invertebrate species. The reconstruction of ATOH8 phylogeny and sequence analysis showed that this gene underwent notable divergences during evolution. For the vertebrate species investigated, we analyzed the gene structure and regulatory elements of ATOH8. We found that the bHLH domain of vertebrate ATOH8 was highly conserved. Mammals retained some specific amino acids in contrast to the non-mammalian orthologues. Mammals also developed another potential isoform, verified by a human expressed sequence tag (EST). Comparative genomic analyses of the regulatory elements revealed a replacement of the ancestral TATA box by CpG islands in the eutherian mammals and an evolutionary tendency for TATA box reduction in vertebrates in general. We furthermore identified the region of the effective promoter of human ATOH8 that could drive the expression of an EGFP reporter in the chicken embryo. In the opossum, both the coding region and regulatory elements of ATOH8 have some special features, such as the unique extended C-terminus encoded by the third exon and the absence of both CpG islands and TATA elements in the regulatory region. Our gene mapping data showed that in humans, ATOH8 is located on a chromosome that is a fusion product of two orthologous chromosomes in non-human primates. This unique chromosomal environment of human ATOH8 probably subjects its expression to regulation at the chromosomal level. We deduce that the great interspecific differences found in both the ATOH8 gene sequence and its regulatory elements might be significant for the fine regulation of its spatiotemporal expression and the roles of ATOH8, thus orchestrating its function in different tissues and organisms.

    Identification of Novel Human Damage Response Proteins Targeted through Yeast Orthology

    Studies in Saccharomyces cerevisiae show that many proteins influence cellular survival upon exposure to DNA damaging agents. We hypothesized that human orthologs of these S. cerevisiae proteins would also be required for cellular survival after treatment with DNA damaging agents. For this purpose, human homologs of S. cerevisiae proteins were identified and mapped onto the human protein-protein interaction network. The resulting human network was highly modular, and a series of selection rules were implemented to identify 45 candidates for human toxicity-modulating proteins. The corresponding transcripts were targeted by RNA interference in human cells. The cell lines with depleted target expression were challenged with three DNA damaging agents: the alkylating agents MMS and 4-NQO, and the oxidizing agent t-BuOOH. A comparison of the survival revealed that the majority (74%) of proteins conferred either sensitivity or resistance. The identified human toxicity-modulating proteins represent a variety of biological functions: autophagy, chromatin modifications, RNA and protein metabolism, and telomere maintenance. Further studies revealed that MMS-induced autophagy increases the survival of cells treated with DNA damaging agents. In summary, we show that damage recovery proteins in humans can be identified through homology to S. cerevisiae and that many of the same pathways are represented among the toxicity modulators.

    Comparative genomics of the major parasitic worms

    Parasitic nematodes (roundworms) and platyhelminths (flatworms) cause debilitating chronic infections of humans and animals, decimate crop production and are a major impediment to socioeconomic development. Here we report a broad comparative study of 81 genomes of parasitic and non-parasitic worms. We have identified gene family births and hundreds of expanded gene families at key nodes in the phylogeny that are relevant to parasitism. Examples include gene families that modulate host immune responses, enable parasite migration through host tissues or allow the parasite to feed. We reveal extensive lineage-specific differences in core metabolism and protein families historically targeted for drug development. From an in silico screen, we have identified and prioritized new potential drug targets and compounds for testing. This comparative genomics resource provides a much-needed boost for the research community to understand and combat parasitic worms.

    Test of lepton universality in $b \rightarrow s \ell^+ \ell^-$ decays

    The first simultaneous test of muon-electron universality using $B^{+}\rightarrow K^{+}\ell^{+}\ell^{-}$ and $B^{0}\rightarrow K^{*0}\ell^{+}\ell^{-}$ decays is performed, in two ranges of the dilepton invariant-mass squared, $q^{2}$. The analysis uses beauty mesons produced in proton-proton collisions collected with the LHCb detector between 2011 and 2018, corresponding to an integrated luminosity of 9 $\mathrm{fb}^{-1}$. Each of the four lepton universality measurements reported is either the first in the given $q^{2}$ interval or supersedes previous LHCb measurements. The results are compatible with the predictions of the Standard Model. Comment: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2022-046.html (LHCb public pages).