Protein-ligand binding region prediction (PLB-SAVE) based on geometric features and CUDA acceleration
Background
Protein-ligand interactions are key processes in triggering and controlling biological functions within cells. Prediction of protein binding regions on the protein surface assists in understanding the mechanisms and principles of molecular recognition. In silico geometrical shape analysis is a primary step in analyzing the spatial characteristics of protein binding regions and facilitates applications of bioinformatics in drug discovery and design. Here, we describe the novel software, PLB-SAVE, which uses parallel processing technology and is ideally suited to extract the geometrical construct of solid angles from surface atoms. Representative clusters and corresponding anchors were identified from all surface elements and were assigned according to the ranking of their solid angles. In addition, cavity depth indicators were obtained by proportional transformation of solid angles, and cavity volumes were calculated by scanning multiple directional vectors within each selected cavity. Both depth and volume characteristics were combined with various weighting coefficients to rank predicted potential binding regions.
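The final ranking step described above, combining depth and volume with weighting coefficients, can be illustrated with a minimal Python sketch. This is not the PLB-SAVE implementation: the `Cavity` class, the `rank_cavities` function, the min-max normalisation, and the linear weighted sum are all assumptions made for illustration, and the abstract does not state which weights or combination rule the authors used.

```python
from dataclasses import dataclass


@dataclass
class Cavity:
    """Hypothetical container for one candidate cavity."""
    name: str
    depth: float   # depth indicator (assumed to be precomputed)
    volume: float  # cavity volume (assumed to be precomputed)


def _min_max(values):
    """Scale values to [0, 1]; all zeros if every value is equal."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_cavities(cavities, w_depth=0.5, w_volume=0.5):
    """Return (name, score) pairs sorted by a weighted sum of
    normalised depth and volume, highest score first.

    The linear combination is an assumed scoring rule, not the
    published one.
    """
    if not cavities:
        return []
    depths = _min_max([c.depth for c in cavities])
    volumes = _min_max([c.volume for c in cavities])
    scored = [
        (w_depth * d + w_volume * v, c)
        for c, d, v in zip(cavities, depths, volumes)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(c.name, round(s, 3)) for s, c in scored]
```

Changing `w_depth` and `w_volume` shifts the ranking toward deeper or larger cavities, which is presumably why several weighting coefficients were explored.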
Results
Two test datasets from LigASite, each containing 388 bound and unbound structures, were used to predict binding regions using PLB-SAVE and two well-known prediction systems, SiteHound and MetaPocket2.0 (MPK2). PLB-SAVE outperformed the other programs with accuracy rates of 94.3% for unbound proteins and 95.5% for bound proteins via a tenfold cross-validation process. Additionally, because the parallel processing architecture was designed to enhance computational efficiency, we obtained an average 160-fold speedup in computation.
Conclusions
In silico binding region prediction is considered the initial stage in structure-based drug design. To improve the efficacy of biological experiments for drug development, we developed PLB-SAVE, which uses only geometrical features of proteins and achieves a good overall performance for protein-ligand binding region prediction. Based on the same approach and rationale, this method can also be applied to predict carbohydrate-antibody interactions for further design and development of carbohydrate-based vaccines. PLB-SAVE is available at http://save.cs.ntou.edu.tw.
Geometric algorithms for cavity detection on protein surfaces
Macromolecular structures such as proteins drive many cellular processes and functions.
These biological functions result from interactions between proteins and peptides,
catalytic substrates, nucleotides or even human-made chemicals. Thus, several
interactions can be distinguished: protein-ligand, protein-protein, protein-DNA,
and so on. Furthermore, those interactions only happen under chemical- and shape-complementarity
conditions, and usually take place in regions known as binding sites.
Typically, a protein consists of four structural levels. The primary structure of a protein
is made up of its amino acid sequences (or chains). Its secondary structure essentially
comprises α-helices and β-sheets, which are sub-sequences (or sub-domains) of amino
acids of the primary structure. Its tertiary structure results from the composition of
sub-domains into domains, which represent the geometric shape of the protein. Finally,
the quaternary structure of a protein results from the aggregate of two or more
tertiary structures, usually known as a protein complex.
This thesis fits in the scope of structure-based drug design and protein docking. Specifically,
it addresses the fundamental problem of detecting and identifying protein
cavities, which are often seen as tentative binding sites for ligands in protein-ligand
interactions. In general, cavity prediction algorithms split into three main categories:
energy-based, geometry-based, and evolution-based. Evolutionary methods build upon
evolutionary sequence conservation estimates; that is, these methods allow us to detect
functional sites through the computation of the evolutionary conservation of the
positions of amino acids in proteins. Energy-based methods build upon the computation
of interaction energies between protein and ligand atoms. In turn, geometry-based algorithms
build upon the analysis of the geometric shape of the protein (i.e., its tertiary
structure) to identify cavities. This thesis focuses on geometric methods.
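To make the geometry-based category concrete, the following Python sketch estimates how buried a probe point is by casting rays in many directions and counting how many are blocked by atoms. It is a generic illustration only, not one of the thesis algorithms: the function names, the Fibonacci-sphere sampling, the sphere model of atoms, and the `max_dist` cutoff are all assumptions.

```python
import math


def fibonacci_sphere(n):
    """Return n roughly uniformly spread unit vectors."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    dirs = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        theta = golden * i
        dirs.append((r * math.cos(theta), r * math.sin(theta), z))
    return dirs


def ray_hits_sphere(origin, direction, center, radius, max_dist):
    """True if a ray from origin (assumed outside the sphere) along a
    unit direction enters the sphere within max_dist."""
    oc = tuple(c - o for c, o in zip(center, origin))
    t = sum(a * b for a, b in zip(oc, direction))  # closest approach along ray
    if t < 0.0:
        return False  # sphere lies behind the ray origin
    d2 = sum(a * a for a in oc) - t * t  # squared distance from centre to ray
    if d2 > radius * radius:
        return False
    t_hit = t - math.sqrt(radius * radius - d2)  # entry distance
    return t_hit <= max_dist


def buriedness(point, atoms, n_rays=200, max_dist=10.0):
    """Fraction of rays from point that are blocked by any atom.

    atoms is a list of (center, radius) pairs; values near 1 suggest
    an enclosed, cavity-like location, values near 0 an exposed one.
    """
    dirs = fibonacci_sphere(n_rays)
    blocked = sum(
        1
        for d in dirs
        if any(ray_hits_sphere(point, d, c, r, max_dist) for c, r in atoms)
    )
    return blocked / n_rays
```

A point whose buriedness exceeds some threshold could then be treated as a cavity candidate; choosing that threshold is itself a design decision that the sketch leaves open.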
We introduce here three new geometry-based algorithms for protein cavity detection.
The main contribution of this thesis lies in the use of computer graphics techniques
in the analysis and recognition of cavities in proteins, much in the spirit of molecular
graphics and modeling. As seen further ahead, these techniques include field-of-view
(FoV), voxel ray casting, back-face culling, shape diameter functions, Morse theory,
and critical points. The leading idea is to come up with protein shape segmentation,
much like we commonly do in mesh segmentation in computer graphics. In practice,
protein cavity algorithms are nothing more than segmentation algorithms designed for
proteins.
CavBench: a benchmark for protein cavity detection methods
Extensive research has been applied to discover new techniques and methods to model protein-ligand interactions. In particular, considerable efforts have focused on identifying candidate binding sites, which quite often are active sites that correspond to protein pockets or cavities. Thus, these cavities play an important role in molecular docking. However, there is no established benchmark to assess the accuracy of new cavity detection methods. In practice, each new technique is evaluated using a small set of proteins with known binding sites as ground truth. However, studies supported by large datasets of known cavities and/or binding sites and statistical classification (i.e., false positives, false negatives, true positives, and true negatives) would yield much stronger and more reliable assessments. To this end, we propose CavBench, a generic and extensible benchmark to compare different cavity detection methods relative to diverse ground truth datasets (e.g., PDBsum) using statistical classification methods.
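The statistical classification mentioned above can be sketched in Python as follows. This is not CavBench code: comparing predictions and ground truth as sets of residue identifiers, the function names, and the choice of reported metrics are assumptions made for illustration.

```python
import math


def confusion_counts(predicted, truth, universe):
    """Count TP, FP, FN, TN for predicted vs. ground-truth residue sets.

    universe is the set of all residues in the protein; predicted and
    truth are assumed to be subsets of it.
    """
    predicted, truth, universe = set(predicted), set(truth), set(universe)
    tp = len(predicted & truth)
    fp = len(predicted - truth)
    fn = len(truth - predicted)
    tn = len(universe - predicted - truth)
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Return common metrics; a ratio with a zero denominator is 0.0."""
    def ratio(num, den):
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    accuracy = ratio(tp + tn, tp + fp + fn + tn)
    f1 = ratio(2 * precision * recall, precision + recall)
    mcc = ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
    }
```

For example, with ten residues, ground truth {1, 2, 3, 4}, and prediction {3, 4, 5}, the counts are 2 true positives, 1 false positive, 2 false negatives, and 5 true negatives, giving a precision of 2/3 and a recall of 1/2.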