2,758 research outputs found

    AI driven B-cell Immunotherapy Design

    Antibodies, a prominent class of approved biologics, play a crucial role in detecting foreign antigens. The effectiveness of antigen neutralisation and elimination hinges upon the strength, sensitivity, and specificity of the paratope-epitope interaction, which demands resource-intensive experimental techniques for characterisation. In recent years, artificial intelligence and machine learning methods have made significant strides, revolutionising the prediction of protein structures and their complexes. The past decade has also witnessed the evolution of computational approaches aiming to support immunotherapy design. This review focuses on the progress of machine learning-based tools and their frameworks in the domain of B-cell immunotherapy design, encompassing linear and conformational epitope prediction, paratope prediction, and antibody design. We mapped the most commonly used data sources, evaluation metrics, and method availability, thoroughly assessed their significance and limitations, and discuss the main challenges ahead.

    Structure-based drug discovery with deep learning

    Artificial intelligence (AI) in the form of deep learning bears promise for drug discovery and chemical biology, e.g., to predict protein structure and molecular bioactivity, plan organic synthesis, and design molecules de novo. While most of the deep learning efforts in drug discovery have focused on ligand-based approaches, structure-based drug discovery has the potential to tackle unsolved challenges, such as affinity prediction for unexplored protein targets, binding-mechanism elucidation, and the rationalization of related chemical kinetic properties. Advances in deep learning methodologies and the availability of accurate predictions for protein tertiary structure advocate for a renaissance in structure-based approaches for drug discovery guided by AI. This review summarizes the most prominent algorithmic concepts in structure-based deep learning for drug discovery, and forecasts opportunities, applications, and challenges ahead.

    NetTurnP – Neural Network Prediction of Beta-turns by Use of Evolutionary Information and Predicted Protein Sequence Features

    β-turns are the most common type of non-repetitive structures, and constitute on average 25% of the amino acids in proteins. The formation of β-turns plays an important role in protein folding, protein stability and molecular recognition processes. In this work we present the neural network method NetTurnP, for prediction of two-class β-turns and prediction of the individual β-turn types, by use of evolutionary information and predicted protein sequence features. It has been evaluated against a commonly used dataset, BT426, and achieves a Matthews correlation coefficient of 0.50, which is the highest reported performance on a two-class prediction of β-turn and not-β-turn. Furthermore, NetTurnP shows improved performance on some of the specific β-turn types. In the present work, neural network methods have been trained to predict β-turn or not and individual β-turn types from the primary amino acid sequence. The individual β-turn types I, I', II, II', VIII, VIa1, VIa2, VIba and IV have been predicted based on classifications by PROMOTIF, and the two-class prediction of β-turn or not is a superset comprising all β-turn types. The performance is evaluated using a golden set of non-homologous sequences known as BT426. Our two-class prediction method achieves a performance of: MCC=0.50, Qtotal=82.1%, sensitivity=75.6%, PPV=68.8% and AUC=0.864. We have compared our performance to eleven other prediction methods that obtain Matthews correlation coefficients in the range of 0.17-0.47. For the type-specific β-turn predictions, only types I and II can be predicted with reasonable Matthews correlation coefficients, where we obtain performance values of 0.36 and 0.31, respectively. CONCLUSION: The NetTurnP method has been implemented as a webserver, which is freely available at http://www.cbs.dtu.dk/services/NetTurnP/. NetTurnP is the only available webserver that allows submission of multiple sequences.
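The two-class metrics quoted in the NetTurnP abstract (MCC, Qtotal, sensitivity, PPV) are all functions of a confusion matrix. The sketch below shows the standard definitions only; the function name `two_class_metrics` and the example counts are hypothetical and are not taken from the paper or the BT426 evaluation. AUC is left out because it needs per-residue scores rather than counts.

```python
import math


def two_class_metrics(tp, tn, fp, fn):
    """Standard two-class metrics from confusion-matrix counts.

    tp/fn: turn residues predicted as turn / not-turn.
    tn/fp: not-turn residues predicted as not-turn / turn.
    """
    total = tp + tn + fp + fn
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "mcc": mcc,
        "q_total": (tp + tn) / total,                     # overall accuracy
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "ppv": tp / (tp + fp) if tp + fp else 0.0,        # precision
    }


# Hypothetical counts, for illustration only.
metrics = two_class_metrics(tp=50, tn=30, fp=10, fn=10)
print(metrics)
```

Unlike Qtotal, MCC uses all four cells of the confusion matrix, which is one reason it is commonly reported for imbalanced problems such as turn versus not-turn.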

    3D Convolutional Neural Networks for Identifying Protein Interfaces

    Protein interaction is a fundamental part of nearly all biochemical processes, and proteins have evolved specific surface regions for molecular recognition and interaction. These regions differ from the remaining surface in amino acid composition, geometry and chemical properties. Detecting protein interfaces can lead to a better understanding of protein interactions, benefiting fields such as drug design and metabolic engineering. Most existing interface predictors use structured data, that is, clearly defined data types usually obtained from data sets. However, proteins are very complex molecules, and no single property can distinguish the interface from the rest of the protein surface for all types of proteins. Deep learning is therefore an appropriate approach, as it can capture features from unstructured data such as images, texts, sensor data and volumes. Here, the aim was to identify interface regions in known protein spatial structures, together with the spatial distribution of their biochemical properties, by exploring new applications of 3D convolutional neural networks. To this end, several state-of-the-art convolutional neural network architectures were explored in order to find one that suits this problem and performs well. Other state-of-the-art machine learning predictors were also considered to identify the best biochemical properties to be added as new input channels. The interface predictions are then compared with the ground truth, obtained by calculating the distances between atoms of the different chains of the protein complexes.
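The ground truth mentioned in the abstract above, derived from distances between atoms of different chains, can be illustrated with a minimal sketch. It assumes atoms are given as `(residue_id, (x, y, z))` pairs and that a residue counts as interface if any of its atoms lies within a cutoff of an atom in the other chain. The function name `interface_residues` and the 5.0 Å default are assumptions for illustration; the abstract does not state the cutoff or the exact procedure used.

```python
import math


def interface_residues(atoms_a, atoms_b, cutoff=5.0):
    """Return (residues of chain A, residues of chain B) at the interface.

    atoms_a, atoms_b: iterables of (residue_id, (x, y, z)) tuples.
    cutoff: distance threshold in angstroms (assumed value, not from the source).
    """
    res_a, res_b = set(), set()
    for rid_a, xyz_a in atoms_a:
        for rid_b, xyz_b in atoms_b:
            # Any atom pair within the cutoff marks both residues as interface.
            if math.dist(xyz_a, xyz_b) <= cutoff:
                res_a.add(rid_a)
                res_b.add(rid_b)
    return res_a, res_b


# Toy coordinates, for illustration only.
chain_a = [(1, (0.0, 0.0, 0.0)), (2, (10.0, 0.0, 0.0))]
chain_b = [(7, (3.0, 0.0, 0.0)), (8, (30.0, 0.0, 0.0))]
print(interface_residues(chain_a, chain_b))
```

The all-pairs loop is quadratic in the number of atoms; for real complexes a spatial index such as a k-d tree would be the usual choice, but the labelling rule stays the same.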

    Integration of Pre-trained Protein Language Models into Geometric Deep Learning Networks

    Geometric deep learning has recently achieved great success in non-Euclidean domains, and learning on 3D structures of large biomolecules is emerging as a distinct research area. However, its efficacy is largely constrained by the limited quantity of structural data. Meanwhile, protein language models trained on substantial 1D sequences have shown burgeoning capabilities with scale in a broad range of applications. Several previous studies consider combining these different protein modalities to promote the representation power of geometric neural networks, but fail to present a comprehensive understanding of their benefits. In this work, we integrate the knowledge learned by well-trained protein language models into several state-of-the-art geometric networks and evaluate a variety of protein representation learning benchmarks, including protein-protein interface prediction, model quality assessment, protein-protein rigid-body docking, and binding affinity prediction. Our findings show an overall improvement of 20% over baselines. Strong evidence indicates that the incorporation of protein language models' knowledge enhances geometric networks' capacity by a significant margin and can be generalized to complex tasks.

    Homology modeling in the time of collective and artificial intelligence

    Homology modeling is a method for building protein 3D structures using protein primary sequence and utilizing prior knowledge gained from structural similarities with other proteins. The homology modeling process is done in sequential steps where sequence/structure alignment is optimized, then a backbone is built and later, side-chains are added. Once the low-homology loops are modeled, the whole 3D structure is optimized and validated. In the past three decades, a few collective and collaborative initiatives allowed for continuous progress in both homology and ab initio modeling. Critical Assessment of protein Structure Prediction (CASP) is a worldwide community experiment that has historically recorded the progress in this field. Folding@Home and Rosetta@Home are examples of crowd-sourcing initiatives where the community is sharing computational resources, whereas RosettaCommons is an example of an initiative where a community is sharing a codebase for the development of computational algorithms. Foldit is another initiative where participants compete with each other in a protein folding video game to predict 3D structure. In the past few years, deep machine learning on contact maps was introduced to the 3D structure prediction process, adding more information and increasing the accuracy of models significantly. In this review, we will take the reader on a journey of exploration from the beginnings to the most recent turnabouts, which have revolutionized the field of homology modeling. Moreover, we discuss the new trends emerging in this rapidly growing field.