
    LAMPA, LArge Multidomain Protein Annotator, and its application to RNA virus polyproteins

    Motivation: To facilitate accurate estimation of the statistical significance of sequence similarity in profile-profile searches, queries should ideally correspond to protein domains. For multidomain proteins, using domains as queries depends on delineation of domain borders, which may be unknown. Thus, whole proteins are commonly used as queries, which complicates establishing homology for similarities close to cutoff levels of statistical significance. Results: In this article, we describe an iterative approach, called LAMPA, LArge Multidomain Protein Annotator, that resolves the above conundrum by gradual expansion of hit coverage of multidomain proteins through re-evaluating statistical significance of hit similarity using ever smaller queries defined at each iteration. LAMPA employs TMHMM and HHsearch for recognition of transmembrane regions and homology, respectively. We used the Pfam database for annotating 2985 multidomain proteins (polyproteins) composed of >1000 amino acid residues, which dominate proteomes of RNA viruses. Under strict cutoffs, LAMPA outperformed HHsearch-mediated runs using intact polyproteins as queries by three measures: number of and coverage by identified homologous regions, and number of hit Pfam profiles. Compared to HHsearch, LAMPA identified 507 extra homologous regions in 14.4% of polyproteins. This Pfam-based annotation of RNA virus polyproteins by LAMPA was also superior to RefSeq expert annotation by two measures, region number and annotated length, for 69.3% of RNA virus polyprotein entries. We rationalized the obtained results based on dependencies of HHsearch hit statistical significance for local alignment similarity score on lengths and diversities of query-target pairs in computational experiments.

    Practical application of bioinformatics by the multidisciplinary VIZIER consortium

    This review focuses on bioinformatics technologies employed by the EU-sponsored multidisciplinary VIZIER consortium (Comparative Structural Genomics of Viral Enzymes Involved in Replication, FP6 Project: 2004-511960, active from 1 November 2004 to 30 April 2009) to achieve its goals. From the management of the information flow of the project, to bioinformatics-mediated selection of RNA viruses and prediction of protein targets, to the analysis of 3D protein structures and antiviral compounds, these technologies provided a communication framework and integrated solutions for steady and timely advancement of the project. RNA viruses form a large class of major pathogens that affect humans and domestic animals. Such RNA viruses as HIV, Influenza virus and Hepatitis C virus are of prime medical concern today, but the identities of viruses that will threaten the human population tomorrow are far from certain. To contain outbreaks of common or newly emerging infections, prototype drugs against viruses representing the Virus Universe must be developed. This concept was championed by the VIZIER project, which brought together experts in diverse fields to produce a concerted and sustained effort for identifying and validating targets for antivirus therapy in dozens of RNA virus lineages. (C) 2010 Elsevier B.V. All rights reserved. Molecular basis of virus replication, viral pathogenesis and antiviral strategies

    The species Severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2

    Preprint version available on bioRxiv (doi: 10.1101/2020.02.07.937862) http://hdl.handle.net/10261/212994 The present outbreak of a coronavirus-associated acute respiratory disease called coronavirus disease 19 (COVID-19) is the third documented spillover of an animal coronavirus to humans in only two decades that has resulted in a major epidemic. The Coronaviridae Study Group (CSG) of the International Committee on Taxonomy of Viruses, which is responsible for developing the classification of viruses and taxon nomenclature of the family Coronaviridae, has assessed the placement of the human pathogen, tentatively named 2019-nCoV, within the Coronaviridae. Based on phylogeny, taxonomy and established practice, the CSG recognizes this virus as forming a sister clade to the prototype human and bat severe acute respiratory syndrome coronaviruses (SARS-CoVs) of the species Severe acute respiratory syndrome-related coronavirus, and designates it as SARS-CoV-2. In order to facilitate communication, the CSG proposes to use the following naming convention for individual isolates: SARS-CoV-2/host/location/isolate/date. While the full spectrum of clinical manifestations associated with SARS-CoV-2 infections in humans remains to be determined, the independent zoonotic transmission of SARS-CoV and SARS-CoV-2 highlights the need for studying viruses at the species level to complement research focused on individual pathogenic viruses of immediate significance. This will improve our understanding of virus–host interactions in an ever-changing environment and enhance our preparedness for future outbreaks. Work on DEmARC advancement and coronavirus and nidovirus taxonomies was supported by the EU Horizon 2020 EVAg 653316 project and the LUMC MoBiLe program (to A.E.G.), and on coronavirus and nidovirus taxonomies by a Mercator Fellowship by the Deutsche Forschungsgemeinschaft (to A.E.G.) in the context of the SFB1021 (A01 to J.Z.). Peer reviewed