MEDock: a web server for efficient prediction of ligand binding sites based on a novel optimization algorithm
The prediction of ligand binding sites is an essential part of the drug discovery process. Knowing the location of binding sites greatly facilitates the search for hits, the lead optimization process, the design of site-directed mutagenesis experiments and the hunt for structural features that influence the selectivity of binding in order to minimize the drug's adverse effects. However, docking is still the rate-limiting step for such predictions; consequently, much more efficient algorithms are required. This article describes the design of the MEDock web server, whose goal is to provide an efficient utility for predicting ligand binding sites. The MEDock web server incorporates a global search strategy that exploits the maximum entropy property of the Gaussian probability distribution in the context of information theory. As a result of this global search strategy, the optimization algorithm incorporated in MEDock is significantly superior when dealing with very rugged energy landscapes, which usually have insurmountable barriers. The article describes four benchmark cases that span a diverse set of ligand binding interactions. The benchmark results were compared with those of the Lamarckian genetic algorithm (LGA), which is the major workhorse of the well-known AutoDock program. The results demonstrate that MEDock consistently converged to the correct binding modes with significantly fewer energy evaluations than the LGA required. When judged by a threshold on the number of energy evaluations consumed in the docking simulation, MEDock also greatly raises the rate of accurate predictions for all benchmark cases. MEDock is available at and
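The abstract does not spell out MEDock's optimization algorithm, so the following is only a generic illustration of Gaussian-sampling global search on a rugged landscape, not the MEDock method. It uses a cross-entropy-style loop (sample from a Gaussian, refit the Gaussian to the best samples) and counts energy evaluations, the cost measure the abstract uses for its comparison. The Rastrigin test function, the names `rastrigin` and `gaussian_search`, and all parameter values are assumptions made for this sketch.

```python
import math
import random


def rastrigin(point):
    """Rugged test landscape: many local minima, global minimum 0 at the origin."""
    return sum(x * x + 10.0 * (1.0 - math.cos(2.0 * math.pi * x)) for x in point)


def gaussian_search(energy, start, sigma=3.0, pop=200, elite=20, iters=60, seed=0):
    """Cross-entropy-style search (illustrative only, not MEDock's algorithm).

    Each iteration samples `pop` candidates from an axis-aligned Gaussian,
    then refits the mean and spread to the `elite` lowest-energy candidates.
    Returns (best point, best energy, number of energy evaluations).
    """
    rng = random.Random(seed)
    dim = len(start)
    mean = list(start)
    sigmas = [sigma] * dim
    best_point, best_energy = list(start), energy(start)
    evaluations = 1
    for _ in range(iters):
        samples = [[rng.gauss(mean[d], sigmas[d]) for d in range(dim)]
                   for _ in range(pop)]
        scored = sorted(((energy(s), s) for s in samples), key=lambda pair: pair[0])
        evaluations += pop
        if scored[0][0] < best_energy:
            best_energy, best_point = scored[0][0], scored[0][1]
        top = [s for _, s in scored[:elite]]
        for d in range(dim):
            mean[d] = sum(s[d] for s in top) / elite
            var = sum((s[d] - mean[d]) ** 2 for s in top) / elite
            # A small floor keeps the sampling distribution from collapsing.
            sigmas[d] = max(math.sqrt(var), 1e-6)
    return best_point, best_energy, evaluations
```

Calling `gaussian_search(rastrigin, [3.0, -2.5])` returns the best point found, its energy, and the evaluation count. In a docking setting the energy function would score a ligand pose instead of a test function.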
A Study on Expediting Analysis of Protein Substructures
One of the fundamental issues in drug design is the analysis of protein-ligand interactions. Detailed and accurate analysis of these interactions requires calculating the binding free energy between atoms, which draws on thermodynamics and even quantum mechanics. However, such calculations are very expensive in computing time. As a result, conformational and structural analysis of proteins and ligands has been widely employed as a screening process in computer-aided drug design. One notable observation in this regard is that the binding of a protein to a ligand is usually determined by a small substructure on the protein surface, while most of the protein has no decisive influence on binding. Therefore, quickly identifying the residues that lie on the surface or in the cavities of a protein's tertiary structure helps expedite the whole analysis. In this thesis, an efficient filtering algorithm is proposed whose time complexity is expressed in terms of the number of residues in the protein. The algorithm builds on a kernel density estimation algorithm recently developed in our laboratory. Compared with the α-hull algorithm, which is widely used in computer graphics for identifying the points on the contour of a 3-dimensional object, the proposed filtering process has a lower time complexity, which markedly improves the performance of the filtering program.
Experimental results show that the proposed filtering mechanism speeds up the analysis by a factor of 24.91 to 83.53 without any loss of accuracy. The algorithm has also been implemented as a software package that searches the PDB (Protein Data Bank) for proteins with similar substructures, and it has been applied to find proteins containing a binding site similar to that of a well-studied crystal structure. The results provide biochemists with valuable clues for in-depth studies.
1 Introduction
2 Related Work
2.1 Similarity Analysis
2.2 Amino Acid Similarity Matrices
3 Algorithm
3.1 Workflow
3.2 Protein Analysis
3.3 Residue Filtering
3.4 Geometric Hashing
3.5 Refinement
4 Software Implementation
4.1 Residue Filtering
4.2 Geometric Hashing
4.3 Scoring
4.4 Refinement
5 Experiments
6 Conclusion
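The residue-filtering idea from the abstract above (use kernel density estimation to keep residues near the protein's contour) can be illustrated with a minimal sketch. It assumes one representative 3-D coordinate per residue (for example the Cα atom) and treats low estimated density as a sign that a residue is exposed. It uses naive pairwise summation, which is quadratic in the number of residues, so it does not reproduce the faster kernel density estimation algorithm the thesis relies on. The function names, `bandwidth`, and `keep_fraction` are hypothetical choices for illustration.

```python
import math


def kernel_density(points, bandwidth):
    """Unnormalized Gaussian kernel density at each point.

    Naive pairwise summation: O(n^2) in the number of points.
    """
    inv = 1.0 / (2.0 * bandwidth * bandwidth)
    densities = []
    for p in points:
        total = 0.0
        for q in points:
            total += math.exp(-math.dist(p, q) ** 2 * inv)
        densities.append(total / len(points))
    return densities


def surface_candidates(points, bandwidth=6.0, keep_fraction=0.5):
    """Indices of the lowest-density points (least buried), in ascending order.

    `bandwidth` (same unit as the coordinates) and `keep_fraction`
    are illustrative values, not taken from the thesis.
    """
    dens = kernel_density(points, bandwidth)
    order = sorted(range(len(points)), key=lambda i: dens[i])
    keep = max(1, int(len(points) * keep_fraction))
    return sorted(order[:keep])
```

On a toy 3 x 3 x 3 lattice of points, the central point has the highest density and is filtered out, while the eight corner points are kept. Real input would come from residue coordinates parsed from a PDB entry.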