Common low complexity regions for SARS-CoV-2 and human proteomes as potential multidirectional risk factor in vaccine development

Author: Gruca Aleksandra
Grynberg Marcin
Jarnot Patryk
Sarnowska E.A.
Sarnowski T.J.
Ziemska-Legiecka Joanna
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2021

Background: The rapid spread of COVID-19 demands an immediate response from the scientific community. Appropriate countermeasures require a thoughtful and educated choice of viral targets (epitopes). Several articles discuss such choices in the SARS-CoV-2 proteome; others focus on the phylogenetic traits and history of the Coronaviridae genome/proteome. However, none consider viral protein low complexity regions (LCRs). Recently we created the first methods that are able to compare such fragments. Results: We show that five LCRs in three proteins (nsp3, S and N) encoded by the SARS-CoV-2 genome are highly similar to regions from the human proteome. As many as 21 predicted T-cell epitopes and 27 predicted B-cell epitopes overlap with the five SARS-CoV-2 LCRs similar to human proteins. Interestingly, replication proteins encoded in the central part of the viral RNA are devoid of LCRs. Conclusions: Similarity of SARS-CoV-2 LCRs to human proteins may have implications for the ability of the virus to counteract immune defenses. Vaccines that target LCRs may be ineffective or, alternatively, lead to the development of autoimmune diseases. These findings are crucial to the selection of new epitopes for drugs or vaccines, which should omit such regions.

IBB PAS Repository

Insights from analyses of low complexity regions with canonical methods for protein sequence comparison

Author: Gruca Aleksandra
Grynberg Marcin
Jarnot Patryk
Ziemska-Legiecka Joanna
Publication venue: Oxford University Press (OUP)
Publication date: 01/08/2022

Low complexity regions are fragments of protein sequences composed of only a few types of amino acids. These regions frequently occur in proteins and can play an important role in their functions. However, scientists mainly focus on regions characterized by a high diversity of amino acid composition. Similarity between regions of protein sequences frequently reflects functional similarity between them. In this article, we discuss the strengths and weaknesses of similarity analysis of low complexity regions using BLAST, HHblits and CD-HIT. These methods are considered the gold standard in protein similarity analysis and were designed for comparing high complexity regions. However, we lack specialized methods for comparing low complexity regions. Therefore, we investigated the existing methods in order to understand how they can be applied to such regions. Our results are supported by an exploratory study and a discussion of the amino acid composition and biological roles of selected examples. We show that existing methods need improvements to search efficiently for similar low complexity regions. We suggest features that have to be redesigned specifically for comparing low complexity regions: scoring matrix, multiple sequence alignment, e-value, local alignment and clustering based on a set of representative sequences. The results of this analysis can be used either to improve existing methods or to create new methods for the similarity analysis of low complexity regions.

IBB PAS Repository

ZENODO

PubMed Central

Providing Molecular Characterization for Unexplained Adverse Drug Reactions: Podium Abstract

Author: Boland Miguel
Bousquet Cedric
Bresso Emmanuel
Calvier François-Élie
Coulet Adrien
Jarnot Patryk
Monnin Pierre
Smaïl-Tabbone Malika
Publication venue: HAL CCSD
Publication date: 26/07/2019

Podium Abstract at MedInfo 2019, Lyon, France. Mining large drug-oriented knowledge graphs enables the prediction of Adverse Drug Reactions (ADRs). Indeed, these graphs encompass knowledge elements about the molecular mechanisms of drugs (e.g. drug targets, Gene Ontology annotations, gene variations, pathways). However, only a few works have explored these graphs further in search of mechanistic explanations for this type of event. We assume that features documenting molecular mechanisms that take part in the prediction are particularly interesting, since they may provide novel knowledge about the mechanism that may underlie an ADR. We propose to explore PGxLOD, a knowledge graph built around drugs and the pharmacogenomic processes in which they are involved, through the lens of several ADR datasets, each focusing on a particular type of ADR. In particular, we propose to use features resulting from the exploration of PGxLOD in a prediction task in which the best predictive features will be considered as potential elements of explanation.

HAL-Inserm

INRIA a CCSD electronic archive server

HAL-Paris 13

PlaToLoCo: the first web meta-server for visualization and annotation of low complexity regions in proteins

Author: Andrade-Navarro Miguel A
Dobson Laszlo
Dosztányi Zsuzsanna
Gruca Aleksandra
Grynberg Marcin
Hancock John M
Jarnot Patryk
Merski Matthew
Mier Pablo
Necci Marco
Paladin Lisanna
Piovesan Damiano
Promponas Vasilis J
Tosatto Silvio C E
Ziemska-Legiecka Joanna
Publication venue: Oxford University Press (OUP)
Publication date: 18/05/2020

Low complexity regions (LCRs) in protein sequences are characterized by a less diverse amino acid composition compared to typically observed sequence diversity. Recent studies have shown that LCRs may co-occur with intrinsically disordered regions, are highly conserved in many organisms, and often play important roles in protein functions and in diseases. In previous decades, several methods have been developed to identify regions with LCRs or amino acid bias, but most of them are stand-alone applications, and currently there is no web-based tool that allows users to explore LCRs in protein sequences with additional functional annotations. We aim to fill this gap by providing PlaToLoCo (PLAtform of TOols for LOw COmplexity), a meta-server that integrates and collects the output of five different state-of-the-art tools for discovering LCRs and provides functional annotations such as domain detection, transmembrane segment prediction, and calculation of amino acid frequencies. In addition, the union or intersection of the results of the search on a query sequence can be obtained. By developing the PlaToLoCo meta-server, we provide the community with a fast and easily accessible tool for the analysis of LCRs with additional information included to aid the interpretation of the results. The PlaToLoCo platform is available at: http://platoloco.aei.polsl.pl/

IBB PAS Repository

Quantitative Conformational Analysis of Functionally Important Electrostatic Interactions in the Intrinsically Disordered Region of Delta Subunit of Bacterial RNA Polymerase

Author: Blackledge Martin
Dohnálek Jan
Gruca Aleksandra
Grynberg Marcin
Jarnot Patryk
Jaseňáková Zuzana
Jensen Malene Ringkjøbing
Koval' Tomáš
Krasny Libor
Kubáň Vojtěch
Padrta Petr
Srb Pavel
Vítovská Dragana
Zachrdla Milan
Ziemska-Legiecka Joanna
Šanderová Hana
Štégnerová Hana
Žídek Lukáš
Publication venue: American Chemical Society (ACS)
Publication date: 01/01/2019

Electrostatic interactions play important roles in the functional mechanisms exploited by intrinsically disordered proteins (IDPs). The atomic resolution description of long-range and local structural propensities that can both be crucial for the function of highly charged IDPs presents significant experimental challenges. Here, we investigate the conformational behavior of the δ subunit of RNA polymerase from Bacillus subtilis whose unfolded domain is highly charged, with 7 positively charged amino acids followed by 51 acidic amino acids. Using a specifically designed analytical strategy, we identify transient contacts between the two regions using a combination of NMR paramagnetic relaxation enhancements, residual dipolar couplings (RDCs), chemical shifts, and small-angle scattering. This strategy allows the resolution of long-range and local ensemble averaged structural contributions to the experimental RDCs, and reveals that the negatively charged segment folds back onto the positively charged strand, compacting the conformational sampling of the protein while remaining highly flexible in solution. Mutation of the positively charged region abrogates the long-range contact, leaving the disordered domain in an extended conformation, possibly due to local repulsion of like-charges along the chain. Remarkably, in vitro studies show that this mutation also has a significant effect on transcription activity, and results in diminished cell fitness of the mutated bacteria in vivo. This study highlights the importance of accurately describing electrostatic interactions for understanding the functional mechanisms of IDPs.

Hal - Université Grenoble Alpes

Tandem repeats lead to sequence assembly errors and impose multi-level challenges for genome and protein databases

Author: Andrade-Navarro Miguel A
Anisimova Maria
Bateman Alex
Gruca Aleksandra
Grynberg Marcin
Jakobsen Kjetill S
Jarnot Patryk
Kajava Andrey V
Linke Dirk
Mier Pablo
Promponas Vasilis J
Star Bastiaan
Tørresen Ole K
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2019

The widespread occurrence of repetitive stretches of DNA in genomes of organisms across the tree of life imposes fundamental challenges for sequencing, genome assembly, and automated annotation of genes and proteins. This multi-level problem can lead to errors in genome and protein databases that are often not recognized or acknowledged. As a consequence, end users working with sequences with repetitive regions are faced with ‘ready-to-use’ deposited data whose trustworthiness is difficult to determine, let alone to quantify. Here, we provide a review of the problems associated with tandem repeat sequences that originate from different stages during the sequencing-assembly-annotation-deposition workflow, and that may proliferate in public database repositories affecting all downstream analyses. As a case study, we provide examples of the Atlantic cod genome, whose sequencing and assembly were hindered by a particularly high prevalence of tandem repeats. We complement this case study with examples from other species, where mis-annotations and sequencing errors have propagated into protein databases. With this review, we aim to raise the awareness level within the community of database users, and alert scientists working in the underlying workflow of database creation that the data they omit or improperly assemble may well contain important biological information valuable to others.

IBB PAS Repository

Crossref

ZHAW digitalcollection

NORA - Norwegian Open Research Archives

Hal-Diderot