347 research outputs found

    MGOS: A library for molecular geometry and its operating system

    The geometry of atomic arrangement underpins the structural understanding of molecules in many fields. However, no general framework of mathematical/computational theory for the geometry of atomic arrangement exists. Here we present "Molecular Geometry (MG)" as a theoretical framework accompanied by the "MG Operating System (MGOS)", which consists of callable functions implementing the MG theory. MG allows researchers to model complicated molecular structure problems in terms of elementary yet standard notions of volume, area, etc., and MGOS frees them from the hard and tedious task of developing/implementing geometric algorithms so that they can focus more on their primary research issues. MG facilitates simpler modeling of molecular structure problems; MGOS functions can be conveniently embedded in application programs for the efficient and accurate solution of geometric queries involving atomic arrangements. The use of MGOS in problems involving spherical entities is akin to the use of math libraries in general purpose programming languages in science and engineering. © 2019 The Author(s). Published by Elsevier B.V.

    PocketPicker: analysis of ligand binding-sites with shape descriptors

    Background: Identification and evaluation of surface binding-pockets and occluded cavities are initial steps in protein structure-based drug design. Characterizing the active site's shape as well as the distribution of surrounding residues plays an important role for a variety of applications such as automated ligand docking or in situ modeling. Comparing the shape similarity of binding site geometries of related proteins provides further insights into the mechanisms of ligand binding. Results: We present PocketPicker, an automated grid-based technique for the prediction of protein binding pockets that specifies the shape of a potential binding-site with regard to its buriedness. The method was applied to a representative set of protein-ligand complexes and their corresponding apo-protein structures to evaluate the quality of binding-site predictions. The performance of the pocket detection routine was compared to results achieved with the existing methods CAST, LIGSITE, LIGSITEcs, PASS and SURFNET. Success rates of PocketPicker were comparable to those of LIGSITEcs and exceeded those of the other tools. We introduce a descriptor that translates the arrangement of grid points delineating a detected binding-site into a correlation vector. We show that this shape descriptor is suited for comparative analyses of similar binding-site geometry by examining induced-fit phenomena in aldose reductase. This new method uses information derived from calculations of the buriedness of potential binding-sites. Conclusions: The pocket prediction routine of PocketPicker is a useful tool for identification of potential protein binding-pockets. It produces a convenient representation of binding-site shapes including an intuitive description of their accessibility. The shape-descriptor for automated classification of binding-site geometries can be used as an additional tool complementing elaborate manual inspections.
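    The abstract above describes a grid-based notion of buriedness without giving details. One common way to realize such a measure is to cast rays from a grid point and count how many are blocked by protein atoms. The following is a minimal sketch of that idea, not PocketPicker's actual routine: the function names, the number of rays, the ray length, and the uniform atom radius are all assumptions chosen for illustration.

```python
import math

def sphere_directions(n):
    """Return n roughly evenly spread unit vectors (Fibonacci lattice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    dirs = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        theta = golden_angle * i
        dirs.append((r * math.cos(theta), r * math.sin(theta), z))
    return dirs

def ray_blocked(origin, direction, center, atom_radius, ray_length):
    """True if a ray of finite length passes within atom_radius of center."""
    oc = [c - o for c, o in zip(center, origin)]
    # Parameter of the closest point on the ray, clamped to the segment.
    t = sum(a * b for a, b in zip(oc, direction))
    t = min(max(t, 0.0), ray_length)
    closest = [o + t * d for o, d in zip(origin, direction)]
    return math.dist(closest, center) <= atom_radius

def buriedness(point, atoms, n_rays=30, ray_length=10.0, atom_radius=1.8):
    """Count rays from a grid point that are blocked by at least one atom.

    All default values are illustrative assumptions, not published parameters.
    """
    return sum(
        any(ray_blocked(point, d, a, atom_radius, ray_length) for a in atoms)
        for d in sphere_directions(n_rays)
    )
```

    Under these assumptions, a grid point in open solvent scores 0, a point enclosed on all sides scores `n_rays`, and points in clefts fall in between, so thresholding the score gives a crude pocket filter.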

    Geometric algorithms for cavity detection on protein surfaces

    Macromolecular structures such as proteins drive many cellular processes and functions. These biological functions result from interactions between proteins and peptides, catalytic substrates, nucleotides or even human-made chemicals. Thus, several kinds of interaction can be distinguished: protein-ligand, protein-protein, protein-DNA, and so on. Furthermore, those interactions only happen under conditions of chemical and shape complementarity, and usually take place in regions known as binding sites. Typically, a protein is described at four structural levels. The primary structure of a protein is made up of its amino acid sequences (or chains). Its secondary structure essentially comprises α-helices and β-sheets, which are sub-sequences (or sub-domains) of amino acids of the primary structure. Its tertiary structure results from the composition of sub-domains into domains, which represent the geometric shape of the protein. Finally, the quaternary structure of a protein results from the aggregation of two or more tertiary structures, usually known as a protein complex. This thesis fits in the scope of structure-based drug design and protein docking. Specifically, it addresses the fundamental problem of detecting and identifying protein cavities, which are often seen as putative binding sites for ligands in protein-ligand interactions. In general, cavity prediction algorithms split into three main categories: energy-based, geometry-based, and evolution-based. Evolutionary methods build upon estimates of evolutionary sequence conservation; that is, these methods detect functional sites by computing the evolutionary conservation of amino acid positions in proteins. Energy-based methods build upon the computation of interaction energies between protein and ligand atoms. In turn, geometry-based algorithms build upon the analysis of the geometric shape of the protein (i.e., its tertiary structure) to identify cavities.
This thesis focuses on geometric methods. We introduce here three new geometry-based algorithms for protein cavity detection. The main contribution of this thesis lies in the use of computer graphics techniques in the analysis and recognition of cavities in proteins, much in the spirit of molecular graphics and modeling. As seen further ahead, these techniques include field-of-view (FoV), voxel ray casting, back-face culling, shape diameter functions, Morse theory, and critical points. The leading idea is to segment the protein shape, much as meshes are segmented in computer graphics. In practice, protein cavity detection algorithms are nothing more than segmentation algorithms designed for proteins.

    Characterization of 3D Voronoi Tessellation Nearest Neighbor Lipid Shells Provides Atomistic Lipid Disruption Profile of Protein Containing Lipid Membranes

    Quantifying protein-induced lipid disruptions at the atomistic level is a challenging problem in membrane biophysics. Here we propose a novel 3D Voronoi tessellation nearest-atom-neighbor shell method to classify and characterize lipid domains into discrete concentric lipid shells surrounding membrane proteins in structurally heterogeneous lipid membranes. This method needs only the coordinates of the system and is independent of force fields and simulation conditions. As a proof-of-principle, we use this multiple lipid shell method to analyze the lipid disruption profiles of three simulated membrane systems: phosphatidylcholine, phosphatidylcholine/cholesterol, and beta-amyloid/phosphatidylcholine/cholesterol. We observed different atomic volume disruption mechanisms due to cholesterol and beta-amyloid. Additionally, several lipid fractional groups and lipid-interfacial water did not converge to their control values with increasing distance or shell order from the protein. This volume divergent behavior was confirmed by bilayer thickness and chain orientational order calculations. Our method can also be used to analyze high-resolution structural experimental data
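    The concentric-shell idea in this abstract can be illustrated as a breadth-first search over a Voronoi neighbor graph: atoms whose cells touch the protein form shell 1, their unlabeled neighbors form shell 2, and so on. The sketch below assumes the adjacency (which atoms share a Voronoi face) has already been computed by some tessellation tool; the function name, the dictionary layout, and the atom labels are hypothetical, and the abstract does not say how atom-level shells are aggregated into lipid-level shells, so that step is left out.

```python
from collections import deque

def assign_shells(neighbors, protein_atoms):
    """Label every reachable atom with its shell order.

    neighbors: dict mapping an atom id to the ids of atoms whose Voronoi
               cells share a face with it (assumed precomputed).
    protein_atoms: ids of the protein atoms, which get shell order 0.
    """
    shell = {atom: 0 for atom in protein_atoms}
    queue = deque(protein_atoms)
    while queue:
        atom = queue.popleft()
        for other in neighbors.get(atom, ()):
            if other not in shell:
                # First visit is along a shortest path, so this is the shell order.
                shell[other] = shell[atom] + 1
                queue.append(other)
    return shell

# Toy adjacency: one protein atom P1 and a chain of three lipid atoms.
toy_neighbors = {
    "P1": ["L1"],
    "L1": ["P1", "L2"],
    "L2": ["L1", "L3"],
    "L3": ["L2"],
}
toy_shells = assign_shells(toy_neighbors, ["P1"])
```

    Because the labeling depends only on which cells are adjacent, it needs nothing beyond coordinates, which is consistent with the abstract's remark that the method is independent of force fields and simulation conditions.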

    Geometrically centered region: A "wet" model of protein binding hot spots not excluding water molecules

    A protein interface can be as "wet" as a protein surface in terms of the number of immobilized water molecules. This important water information has not been explicitly taken into account by computational methods that model and identify protein binding hot spots, which therefore overlook the role of water in forming interface hydrogen bonds and in filling cavities. Hot spot residues are usually clustered at the core of protein binding interfaces. However, traditional machine learning methods often identify hot spot residues individually, breaking the cooperativity of their energetic contribution. Our idea in this work is to explore the role of immobilized water while capturing two essential properties of hot spots: compactness in contact and remoteness from bulk solvent. Our model is named the geometrically centered region (GCR). The detection of GCRs is based on novel tripartite graphs and on atom burial levels, a concept more intuitive than solvent-accessible surface area (SASA). Applied to a data set containing 355 mutations, the model achieved an F measure of 0.6414 when ΔΔG ≥ 1.0 kcal/mol was used to define hot spots. This performance is better than that of Robetta, a benchmark method in the field. We found that all but one of the GCRs contain water to a certain degree, and most of the outstanding hot spot residues have water-mediated contacts. If the water is excluded, the burial level values are poorly related to ΔΔG, and the model's performance drops markedly. We also present a definition of the O-ring of a GCR as the set of immediate neighbors of the residues in the GCR. Comparative analysis between the O-rings and GCRs reveals that the newly defined O-ring is indeed energetically less important than the GCR hot spot, confirming a long-standing hypothesis. Proteins 2010. © 2010 Wiley-Liss, Inc.
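    For readers unfamiliar with the evaluation metric mentioned in this abstract, the sketch below shows how an F measure is typically computed when hot spots are defined by a ΔΔG threshold. It uses the standard definition (harmonic mean of precision and recall); the function name and the toy values are made up for illustration and are not data from the paper.

```python
def f_measure(ddg_values, predicted, threshold=1.0):
    """F measure of hot-spot predictions against a ddG cutoff (kcal/mol).

    ddg_values: measured ddG per mutation.
    predicted:  booleans, True where the model calls the residue a hot spot.
    """
    actual = [ddg >= threshold for ddg in ddg_values]
    tp = sum(a and p for a, p in zip(actual, predicted))
    fp = sum((not a) and p for a, p in zip(actual, predicted))
    fn = sum(a and (not p) for a, p in zip(actual, predicted))
    if tp == 0:
        return 0.0  # No true positives: precision and recall are both zero.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up example: two true hot spots, one found, one false alarm.
toy_score = f_measure([2.0, 0.5, 1.0, 0.1], [True, True, False, False])
```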

    Three-dimensional alpha shapes

    Frequently, data in scientific computing is in its abstract form a finite point set in space, and it is sometimes useful or required to compute what one might call the "shape" of the set. For that purpose, this paper introduces the formal notion of the family of $\alpha$-shapes of a finite point set in $\mathbb{R}^3$. Each shape is a well-defined polytope, derived from the Delaunay triangulation of the point set, with a parameter $\alpha \in \mathbb{R}$ controlling the desired level of detail. An algorithm is presented that constructs the entire family of shapes for a given set of size $n$ in time $O(n^2)$, worst case. A robust implementation of the algorithm is discussed and several applications in the area of scientific computing are mentioned. Comment: 32 pages
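    To make the role of the parameter concrete, the sketch below filters the tetrahedra of an already computed Delaunay triangulation by circumsphere radius, keeping those whose radius is smaller than alpha. This is only the tetrahedron part of the construction and rests on several assumptions: the triangulation is supplied by the caller, alpha is interpreted as a radius (some software uses the squared radius or its reciprocal instead), and the additional conditions that decide which triangles, edges and vertices belong to the shape are omitted. The function names are hypothetical.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def circumradius(p0, p1, p2, p3):
    """Circumsphere radius of a tetrahedron; infinite if it is degenerate."""
    # Edge vectors from p0; the center x (relative to p0) solves row . x = |row|^2 / 2.
    rows = [[q[k] - p0[k] for k in range(3)] for q in (p1, p2, p3)]
    rhs = [0.5 * sum(v * v for v in row) for row in rows]
    d = det3(rows)
    if abs(d) < 1e-12:
        return math.inf
    center = []
    for col in range(3):
        # Cramer's rule: replace one column with the right-hand side.
        m = [row[:] for row in rows]
        for i in range(3):
            m[i][col] = rhs[i]
        center.append(det3(m) / d)
    return math.sqrt(sum(v * v for v in center))

def alpha_tetrahedra(points, tetrahedra, alpha):
    """Keep the Delaunay tetrahedra whose circumradius is below alpha."""
    return [t for t in tetrahedra
            if circumradius(*(points[i] for i in t)) < alpha]
```

    With a small alpha almost nothing survives and the shape approaches the bare point set; with a large alpha every tetrahedron is kept and the shape approaches the convex hull, which is the "level of detail" behavior the abstract refers to.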