21 research outputs found

Delineation of functionally essential protein regions for 242 neurodevelopmental genes

Author: Brunger T
Brunklaus A
Campbell AJ
Daly MJ
Hoksza D
Iqbal S
Lal D
Macnee M
May P
Perez-Palma E
Publication date: 01/01/2023

Neurodevelopmental disorders (NDDs), including severe paediatric epilepsy, autism and intellectual disabilities, are heterogeneous conditions in which clinical genetic testing can often identify a pathogenic variant. For many of them, genetic therapies will be tested in clinical trials in this or the coming years. In contrast to first-generation symptomatic treatments, the new disease-modifying precision medicines require a genetic test-informed diagnosis before a patient can be enrolled in a clinical trial. However, even in 2022, most identified genetic variants in NDD genes are "variants of uncertain significance". To safely enrol patients in precision medicine clinical trials, it is important to increase our knowledge about which regions in NDD-associated proteins can "tolerate" missense variants and which ones are "essential" and will cause an NDD when mutated. In addition, knowledge about functionally indispensable regions in the 3D structure context of proteins can also provide insights into the molecular mechanisms of disease variants. We developed a novel consensus approach that overlays evolutionary and population-based genomic scores to identify 3D essential sites (Essential3D) on protein structures. After extensive benchmarking of AlphaFold-predicted and experimentally solved protein structures, we generated the currently largest expert-curated protein structure set for 242 NDDs and identified 14,377 Essential3D sites across 189 gene-disorder-associated proteins. We demonstrate that the consensus annotation of Essential3D sites improves prioritization of disease mutations over single annotations. The identified Essential3D sites were enriched for functional features such as intermembrane regions or active sites and revealed key inter-molecule interactions in protein complexes that were otherwise not annotated.
Using the currently largest autism, developmental disorders, and epilepsies exome sequencing studies, including > 360,000 NDD patients and population controls, we found that missense variants at Essential3D sites are 8-fold enriched in patients. In summary, we developed a comprehensive protein structure set for 242 NDDs and identified 14,377 Essential3D sites in these. All data are available at https://es-ndd.broadinstitute.org for interactive visual inspection to enhance variant interpretation and development of mechanistic hypotheses for 242 NDD genes. The provided resources will enhance clinical variant interpretation and in silico drug target development for NDD-associated genes and encoded proteins. Peer reviewed

Kölner UniversitätsPublikationsServer

Enlighten

Helsingin yliopiston digitaalinen arkisto

Open Repository and Bibliography - Luxembourg

MultiSETTER: web server for multiple RNA structure comparison

Author: AT Willingham
BS Schuwirth
C Kemena
C Neubauer
CW Wang
D Hoksza
Daniel Svozil
David Hoksza
DG Higgins
DH Mathews
DJ Klein
DK Hendrix
E Capriotti
EP Nawrocki
F Ferre
G He
H Berman
HM Berman
I Tinoco Jr
J Harms
JD Westbrook
MA Huynen
MG Seetin
MN Nguyen
N Saitou
O Dror
P Cech
P Hogeweg
Petr Čech
PW Rose
R Lorenz
RR Rahrig
S Gutmann
S Kirillova
SR Holbrook
TM Schmeing
WN Moss
XJ Lu
YC Liu
YF Chang
Publication venue: 'Springer Science and Business Media LLC'

Crossref

PDBe-KB: collaboratively defining the biological context of structural data

Author: Al-Lazikani B.
Andreini C.
Anyango S.
Armstrong D.
Barton G. J.
Bednar D.
Berka K.
Berrisford J.
Blundell T.
Brock K. P.
Carazo J. M.
Choudhary P.
Damborsky J.
David A.
Deshpande M.
Dey S.
Dunbrack R.
Fraternali F.
Gibson T.
Helmer-Citterich M.
Hoksza D.
Hopf T.
Jakubec D.
Kannan N.
Krivak R.
Kumar M.
Levy E. D.
London N.
Macias J. R.
Marks D. S.
Martens L.
McGowan S. A.
McGreig J. E.
Modi V.
Nadzirin N.
Nair S. S.
Orengo C.
Parra R. G.
Pepe G.
Piovesan D.
Pravda L.
Prilusky J.
Putignano V.
Radusky L. G.
Ramasamy P.
Rausch A. O.
Recio J. F.
Reuter N.
Rodriguez L. A.
Rollins N. J.
Rosato A.
Rubach P.
Serrano L.
Singh G.
Skoda P.
Sorzano C. O. S.
Srivatsan M. M.
Sternberg M.
Stourac J.
Sulkowska J. I.
Svobodova R.
Tanweer A.
Thornton J.
Tichshenko N.
Tosatto S. C. E.
Varadi M.
Velankar S.
Vranken W.
Wass M. N.
Xue D.
Zaidman D.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2022

The Protein Data Bank in Europe - Knowledge Base (PDBe-KB, https://pdbe-kb.org) is an open collaboration between world-leading specialist data resources contributing functional and biophysical annotations derived from or relevant to the Protein Data Bank (PDB). The goal of PDBe-KB is to place macromolecular structure data in their biological context by developing standardised data exchange formats and integrating functional annotations from the contributing partner resources into a knowledge graph that can provide valuable biological insights. Since we described PDBe-KB in 2019, there have been significant improvements in the variety of available annotation data sets and user functionality. Here, we provide an overview of the consortium, highlighting the addition of annotations such as predicted covalent binders, phosphorylation sites, effects of mutations on the protein structure and energetic local frustration. In addition, we describe a library of reusable web-based visualisation components and introduce new features such as a bulk download data service and a novel superposition service that generates clusters of superposed protein chains weekly for the whole PDB archive

ART

RNAcentral 2021: secondary structure integration, improved sequence search and new member databases

Author: Barshir R
Bateman A
Bouchard-Bourelle P
Bruford E
Cannone JJ
Chan PP
dos Santos G
Finn RD
Fishilevich S
Frankish A
Fromm B
Gorodkin J
Griffiths-Jones S
Gutell RR
Hatzigeorgiou AG
Hoksza D
Kalvari I
Karagkouni D
Karlowski WM
Kay S
Kramarz B
Lovering RC
Lowe TM
Lui LM
Ma L
Mani P
Marygold S
Mestdagh P
Mudge JM
Nawrocki EP
Panni S
Peterson KJ
Petrov A
Petrov AS
Porras P
Ramachandran S
Ribas CE
Scott M
Seal R
Seemann SE
Sweeney BA
Szymanski M
Volders P-J
Weinberg Z
Weng S
Zhang Z
Publication venue: 'Oxford University Press (OUP)'
Publication date: 08/01/2021

RNAcentral is a comprehensive database of non-coding RNA (ncRNA) sequences that provides a single access point to 44 RNA resources and >18 million ncRNA sequences from a wide range of organisms and RNA types. RNAcentral now also includes secondary (2D) structure information for >13 million sequences, making RNAcentral the world’s largest RNA 2D structure database. The 2D diagrams are displayed using R2DT, a new 2D structure visualization method that uses consistent, reproducible and recognizable layouts for related RNAs. The sequence similarity search has been updated with a faster interface featuring facets for filtering search results by RNA type, organism, source database or any keyword. This sequence search tool is available as a reusable web component, and has been integrated into several RNAcentral member databases, including Rfam, miRBase and snoDB. To allow for a more fine-grained assignment of RNA types and subtypes, all RNAcentral sequences have been annotated with Sequence Ontology terms. The RNAcentral database continues to grow and provide a central data resource for the RNA community. RNAcentral is freely available at https://rnacentral.org

UCL Discovery

Ghent University Academic Bibliography

Copenhagen University Research Information System

eScholarship - University of California

Apollo (Cambridge)

PDBe-KB: collaboratively defining the biological context of structural data

Author: Al-Lazikani B
Andreini C
Anyango S
Armstrong D
Barton GJ
Bednar D
Berka K
Berrisford J
Blundell T
Brock KP
Carazo JM
Choudhary P
Damborsky J
David A
Deshpande M
Dey S
Dunbrack R
Fraternali F
Gibson T
Helmer-Citterich M
Hoksza D
Hopf T
Jakubec D
Kannan N
Krivak R
Kumar M
Levy ED
London N
Macias JR
Marks DS
Martens L
McGowan SA
McGreig JE
Modi V
Nadzirin N
Nair SS
Orengo C
Parra RG
Pepe G
Piovesan D
Pravda L
Prilusky J
Putignano V
Radusky LG
Ramasamy P
Rausch AO
Recio JF
Reuter N
Rodriguez LA
Rollins NJ
Rosato A
Rubach P
Serrano L
Singh G
Skoda P
Sorzano COS
Srivatsan MM
Sternberg M
Stourac J
Sulkowska JI
Svobodova R
Tanweer A
Thornton J
Tichshenko N
Tosatto SCE
Varadi M
Velankar S
Vranken W
Wass MN
Xue D
Zaidman D
Publication venue: 'Oxford University Press (OUP)'
Publication date: 14/10/2021

The Protein Data Bank in Europe – Knowledge Base (PDBe-KB, https://pdbe-kb.org) is an open collaboration between world-leading specialist data resources contributing functional and biophysical annotations derived from or relevant to the Protein Data Bank (PDB). The goal of PDBe-KB is to place macromolecular structure data in their biological context by developing standardised data exchange formats and integrating functional annotations from the contributing partner resources into a knowledge graph that can provide valuable biological insights. Since we described PDBe-KB in 2019, there have been significant improvements in the variety of available annotation data sets and user functionality. Here, we provide an overview of the consortium, highlighting the addition of annotations such as predicted covalent binders, phosphorylation sites, effects of mutations on the protein structure and energetic local frustration. In addition, we describe a library of reusable web-based visualisation components and introduce new features such as a bulk download data service and a novel superposition service that generates clusters of superposed protein chains weekly for the whole PDB archive

Spiral - Imperial College Digital Repository

Metric Indexing of Protein Databases and Promising Approaches

Author: D. Hoksza

The most widely used biological databases today are nucleotide and protein sequence databases. These databases are crucial for determining the biological functions of living organisms from their DNA structure. The biological function of a protein can be inferred from its similarity to another protein of known function stored in a database, so the chance of finding the biological function of a given protein or DNA sequence grows with the size of the database. As a result, these databases grow exponentially, which in turn calls for sublinear search methods. The optimal solution is to align the query sequence with every sequence in the queried database. Because aligning two sequences is computationally expensive, fast heuristic methods (e.g. BLAST [Altschul et al., 1997]) are used, although they only approximate the optimal solution and give no bound on the resulting error. In this paper we try to use metric access methods (MAMs) for exact and approximate searching of protein databases. As experiments show, such a straightforward use of MAMs is not very suitable; we therefore also outline possible further directions for indexing protein sequences, based on what has been learned so far.
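
As a rough illustration of the idea behind metric access methods, the sketch below builds a simple pivot table and uses the triangle inequality to skip distance computations during a range query. It is not the indexing structure evaluated in the paper: the unit-cost edit distance, the `PivotIndex` class, and the toy sequences are all assumptions chosen only to keep the example small.

```python
from typing import Callable, List, Sequence, Tuple


def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance; it satisfies the metric axioms."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute
        prev = curr
    return prev[-1]


class PivotIndex:
    """Hypothetical pivot-table index: stores d(item, pivot) for every item."""

    def __init__(self, items: Sequence[str], pivots: Sequence[str],
                 dist: Callable[[str, str], int] = edit_distance) -> None:
        self.items = list(items)
        self.pivots = list(pivots)
        self.dist = dist
        # Precomputed once at build time, reused by every query.
        self.table = [[dist(x, p) for p in self.pivots] for x in self.items]

    def range_query(self, query: str, radius: int) -> Tuple[List[str], int]:
        """Return (items within radius, number of item distances computed)."""
        q_to_p = [self.dist(query, p) for p in self.pivots]
        hits: List[str] = []
        computed = 0
        for item, row in zip(self.items, self.table):
            # Triangle inequality: |d(q, p) - d(x, p)| <= d(q, x).
            # If the lower bound already exceeds the radius, skip the item.
            if any(abs(qp - xp) > radius for qp, xp in zip(q_to_p, row)):
                continue
            computed += 1
            if self.dist(query, item) <= radius:
                hits.append(item)
        return hits, computed


# Toy data (made up for illustration, not real database entries).
sequences = ["MKTAYIAK", "MKTAYIAR", "MKTAYLAK", "GGGGGGGGGGGG", "MK"]
index = PivotIndex(sequences, pivots=["MKTAYIAK"])
hits, computed = index.range_query("MKTAYIAK", radius=1)
```

The pruning is exact: an item is discarded only when the lower bound proves it lies outside the radius, so the result matches a brute-force scan. How much work is saved depends heavily on the distance distribution and the choice of pivots.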

CiteSeerX

Metric-space search in bioinformatics

Author: Daniel P. Miranker
Frank A. M.
Hoksza D.
Karakoç E.
Waterman M. S.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Delineation of functionally essential protein regions for 242 neurodevelopmental genes

Author: Brunger T
Brunklaus A
Campbell AJ
Daly MJ
Hoksza D
Iqbal S
Lal D
Macnee M
May P
Perez-Palma E
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/02/2023

Helsingin yliopiston digitaalinen arkisto