2,758 research outputs found
AI driven B-cell Immunotherapy Design
Antibodies, a prominent class of approved biologics, play a crucial role in
detecting foreign antigens. The effectiveness of antigen neutralisation and
elimination hinges upon the strength, sensitivity, and specificity of the
paratope-epitope interaction, which demands resource-intensive experimental
techniques for characterisation. In recent years, artificial intelligence and
machine learning methods have made significant strides, revolutionising the
prediction of protein structures and their complexes. The past decade has also
witnessed the evolution of computational approaches aiming to support
immunotherapy design. This review focuses on the progress of machine
learning-based tools and their frameworks in the domain of B-cell immunotherapy
design, encompassing linear and conformational epitope prediction, paratope
prediction, and antibody design. We mapped the most commonly used data sources,
evaluation metrics, and method availability and thoroughly assessed their
significance and limitations, discussing the main challenges ahead.
Structure-based drug discovery with deep learning
Artificial intelligence (AI) in the form of deep learning bears promise for
drug discovery and chemical biology, e.g., to predict protein
structure and molecular bioactivity, plan organic synthesis, and design
molecules de novo. While most of the deep learning efforts in drug
discovery have focused on ligand-based approaches, structure-based drug
discovery has the potential to tackle unsolved challenges, such as affinity
prediction for unexplored protein targets, binding-mechanism elucidation, and
the rationalization of related chemical kinetic properties. Advances in deep
learning methodologies and the availability of accurate predictions for protein
tertiary structure advocate for a renaissance in structure-based
approaches for drug discovery guided by AI. This review summarizes the most
prominent algorithmic concepts in structure-based deep learning for drug
discovery, and forecasts opportunities, applications, and challenges ahead.
NetTurnP – Neural Network Prediction of Beta-turns by Use of Evolutionary Information and Predicted Protein Sequence Features
β-turns are the most common type of non-repetitive structure and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP for prediction of two-class β-turns and of the individual β-turn types, using evolutionary information and predicted protein sequence features. It has been evaluated against the commonly used dataset BT426 and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore, NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not and individual β-turn types from the primary amino acid sequence. The individual β-turn types I, I', II, II', VIII, VIa1, VIa2, VIba and IV have been predicted based on classifications by PROMOTIF, and the two-class prediction of β-turn or not is a superset composed of all β-turn types. The performance is evaluated using a golden set of non-homologous sequences known as BT426. Our two-class prediction method achieves a performance of MCC=0.50, Qtotal=82.1%, sensitivity=75.6%, PPV=68.8% and AUC=0.864. We have compared our performance to eleven other prediction methods that obtain Matthews correlation coefficients in the range of 0.17-0.47. For the type-specific β-turn predictions, only types I and II can be predicted with reasonable Matthews correlation coefficients, where we obtain performance values of 0.36 and 0.31, respectively. CONCLUSION: The NetTurnP method has been implemented as a webserver, which is freely available at http://www.cbs.dtu.dk/services/NetTurnP/. NetTurnP is the only available webserver that allows submission of multiple sequences.
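For readers unfamiliar with the two-class measures quoted above (MCC, Qtotal, sensitivity, PPV), the following is a minimal sketch of how they are conventionally computed from a confusion matrix. The function name `two_class_metrics` and the counts in the usage lines are hypothetical illustrations; they are not taken from NetTurnP or from the BT426 dataset.

```python
import math


def two_class_metrics(tp, fp, tn, fn):
    """Standard two-class metrics from confusion-matrix counts.

    tp/fn: turn residues predicted as turn / as non-turn
    fp/tn: non-turn residues predicted as turn / as non-turn
    """
    total = tp + fp + tn + fn
    # Q_total: fraction of all residues classified correctly (accuracy).
    q_total = (tp + tn) / total
    # Sensitivity: fraction of true turn residues that were found.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # PPV: fraction of predicted turn residues that are correct.
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"q_total": q_total, "sensitivity": sensitivity, "ppv": ppv, "mcc": mcc}


# Hypothetical counts, for illustration only.
metrics = two_class_metrics(tp=60, fp=20, tn=100, fn=20)
print(metrics)
```

MCC is often preferred over Qtotal for problems like this one because the classes are unbalanced: a predictor that never predicts a turn can still reach a high Qtotal, while its MCC is 0.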
3D Convolutional Neural Networks for Identifying Protein Interfaces
Protein interaction is a fundamental part of nearly all biochemical processes, and
proteins have evolved specific surface regions for molecular recognition and interaction. These
regions differ from the remaining surface in amino acid composition,
geometry and chemical properties. Detecting protein interfaces can lead to a better
understanding of protein interactions, benefiting fields such as drug design
and metabolic engineering.
Most existing interface predictors use structured data, that is, clearly defined data
types usually obtained from data sets. However, proteins are very complex molecules,
and no single property can distinguish the interface from the rest of
the protein surface for all types of proteins. Deep learning is therefore an adequate
approach, able to capture features from unstructured data such as images, texts, sensor data
and volumes. Here, the aim was to identify interface regions in known protein spatial
structures together with their biochemical properties by exploring new applications of
3D convolutional neural networks.
To this end, several state-of-the-art convolutional neural network architectures were explored in order to find one that suits this problem and also performs
well. Other state-of-the-art machine learning predictors were also considered to
identify the best biochemical properties to be added as new channels.
Afterward, the interface predictions are compared with the ground truth, obtained by calculating the distances between atoms in the different chains of the protein
complexes.
Interaction between proteins is fundamental to all biological and biochemical processes. Proteins contain specific regions that enable molecular recognition and, consequently, interactions with other molecules. Typically,
these regions are structurally different from the rest of the molecule, being characterized
by different amino acids, chemical properties and geometry.
Detecting protein interfaces can be an asset for understanding
the interaction between proteins and, consequently, advantageous for the design of new
drugs (drug design) and for metabolic engineering.
Interface predictions mostly use structured data, that is,
well-defined data usually obtained from databases. However, proteins are
complex molecules, which makes it impossible to distinguish their interface, since there is no
single, specific property that applies to all of them. Deep learning is therefore a
fundamental tool because it uses features of unstructured data, such as
the spatial information of the protein, images, texts, sensor data or
volumes.
The main objective of this project is to identify interface regions from three-dimensional structures of known proteins together with the spatial distribution
of their properties, using convolutional neural networks. In this work, deep learning algorithms were studied to find the neural network best suited
to the problem we intend to solve with the best performance. Other prediction
algorithms were considered to identify the best biochemical properties
to be used as new input channels.
Next, the model's predictions were compared with the true interfaces, which
were obtained by calculating the distances between atoms in different chains of the same
complex.
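The ground-truth step described in this abstract (labelling the interface from inter-chain atom distances) can be sketched as follows. This is only an illustration under stated assumptions: the function name `interface_masks`, the 5.0 Å cutoff and the toy coordinates are hypothetical, and the abstract does not say which threshold or atom selection the author used.

```python
import numpy as np


def interface_masks(coords_a, coords_b, cutoff=5.0):
    """Flag atoms of two chains that lie within `cutoff` of the other chain.

    coords_a: (N, 3) array of atom coordinates for chain A (in angstroms)
    coords_b: (M, 3) array of atom coordinates for chain B (in angstroms)
    cutoff:   distance threshold; 5.0 is an assumed value, not from the source
    """
    # Pairwise differences via broadcasting: shape (N, M, 3).
    diff = coords_a[:, None, :] - coords_b[None, :, :]
    # Euclidean distance between every A atom and every B atom: shape (N, M).
    dist = np.linalg.norm(diff, axis=-1)
    close = dist <= cutoff
    # An atom is interfacial if it is close to at least one atom of the other chain.
    return close.any(axis=1), close.any(axis=0)


# Toy coordinates, for illustration only.
chain_a = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [20.0, 0.0, 0.0]])
chain_b = np.array([[6.0, 0.0, 0.0], [40.0, 0.0, 0.0]])
mask_a, mask_b = interface_masks(chain_a, chain_b)
print(mask_a, mask_b)
```

The full pairwise matrix is simple but uses O(N·M) memory; for large complexes a spatial index such as a k-d tree would be a more economical choice.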
Integration of Pre-trained Protein Language Models into Geometric Deep Learning Networks
Geometric deep learning has recently achieved great success in non-Euclidean
domains, and learning on 3D structures of large biomolecules is emerging as a
distinct research area. However, its efficacy is largely constrained due to the
limited quantity of structural data. Meanwhile, protein language models trained
on substantial 1D sequences have shown burgeoning capabilities with scale in a
broad range of applications. Several previous studies consider combining these
different protein modalities to promote the representation power of geometric
neural networks, but fail to present a comprehensive understanding of their
benefits. In this work, we integrate the knowledge learned by well-trained
protein language models into several state-of-the-art geometric networks and
evaluate a variety of protein representation learning benchmarks, including
protein-protein interface prediction, model quality assessment, protein-protein
rigid-body docking, and binding affinity prediction. Our findings show an
overall improvement of 20% over baselines. Strong evidence indicates that the
incorporation of protein language models' knowledge enhances geometric
networks' capacity by a significant margin and can be generalized to complex
tasks.
Homology modeling in the time of collective and artificial intelligence
Homology modeling is a method for building protein 3D structures from the protein primary sequence, using prior knowledge gained from structural similarities with other proteins. The homology modeling process proceeds in sequential steps: the sequence/structure alignment is optimized, then a backbone is built and, later, side chains are added. Once the low-homology loops are modeled, the whole 3D structure is optimized and validated. In the past three decades, a few collective and collaborative initiatives allowed for continuous progress in both homology and ab initio modeling. Critical Assessment of protein Structure Prediction (CASP) is a worldwide community experiment that has historically recorded the progress in this field. Folding@Home and Rosetta@Home are examples of crowd-sourcing initiatives where the community shares computational resources, whereas RosettaCommons is an example of an initiative where a community shares a codebase for the development of computational algorithms. Foldit is another initiative where participants compete with each other in a protein folding video game to predict 3D structure. In the past few years, deep machine learning on contact maps was introduced to the 3D structure prediction process, adding more information and increasing the accuracy of models significantly. In this review, we take the reader on a journey of exploration from the beginnings to the most recent turnabouts, which have revolutionized the field of homology modeling. Moreover, we discuss the new trends emerging in this rapidly growing field.