4 research outputs found

    A global optimization algorithm for protein surface alignment

    Background: A relevant problem in drug design is the comparison and recognition of protein binding sites. Binding site recognition is generally based on geometry, often combined with the physico-chemical properties of the site, since the conformation, size and chemical composition of the protein surface are all relevant for the interaction with a specific ligand. Several matching strategies have been designed for the recognition of protein-ligand binding sites and of protein-protein interfaces, but the problem cannot be considered solved. Results: In this paper we propose a new method for local structural alignment of protein surfaces based on continuous global optimization techniques. Given the three-dimensional structures of two proteins, the method finds the isometric transformation (rotation plus translation) that best superimposes active regions of the two structures. We draw our inspiration from the well-known Iterative Closest Point (ICP) method for three-dimensional (3D) shape registration. Our main contribution is the adoption of a controlled random search as a more efficient global optimization approach, along with a new dissimilarity measure. The reported computational experience and comparison show the viability of the proposed approach. Conclusions: Our method performs well at detecting similarity between binding sites when such similarity in fact exists. In the future we plan to carry out a more comprehensive evaluation of the method by considering large datasets of non-redundant proteins and applying a clustering technique to the results of all comparisons in order to classify binding sites.
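As a rough illustration of the optimization idea in this abstract, the sketch below applies a basic controlled random search to a toy 2D alignment problem. It is not the authors' implementation: the 2D setting, the mean squared closest-point dissimilarity, the function names, and all parameter values (`pop_size`, `iterations`, `seed`, the search bounds) are assumptions made for this example, and the paper's own dissimilarity measure is not reproduced.

```python
import math
import random

def transform(points, theta, tx, ty):
    """Rotate 2D points by theta, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def dissimilarity(params, moving, fixed):
    """Mean squared distance from each transformed point to its closest fixed point."""
    moved = transform(moving, *params)
    total = 0.0
    for mx, my in moved:
        total += min((mx - fx) ** 2 + (my - fy) ** 2 for fx, fy in fixed)
    return total / len(moved)

def controlled_random_search(objective, bounds, pop_size=40, iterations=3000, seed=0):
    """Basic controlled random search: reflect a random member through the
    centroid of other random members and replace the worst member on improvement."""
    rng = random.Random(seed)
    dim = len(bounds)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    values = [objective(p) for p in population]
    for _ in range(iterations):
        # Choose dim + 1 distinct members; the last one is the reflected point.
        chosen = rng.sample(range(pop_size), dim + 1)
        centroid = [sum(population[i][d] for i in chosen[:-1]) / dim for d in range(dim)]
        trial = [2.0 * centroid[d] - population[chosen[-1]][d] for d in range(dim)]
        if any(not (lo <= t <= hi) for t, (lo, hi) in zip(trial, bounds)):
            continue  # discard trial points that leave the search box
        worst = max(range(pop_size), key=values.__getitem__)
        trial_value = objective(trial)
        if trial_value < values[worst]:
            population[worst], values[worst] = trial, trial_value
    best = min(range(pop_size), key=values.__getitem__)
    return population[best], values[best]

# Hypothetical toy data: "moving" is a rotated and shifted copy of "fixed".
fixed = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0), (1.0, 1.5)]
moving = transform(fixed, -0.6, 0.5, -1.0)
bounds = [(-math.pi, math.pi), (-5.0, 5.0), (-5.0, 5.0)]  # theta, tx, ty

def objective(params):
    return dissimilarity(params, moving, fixed)

best_params, best_value = controlled_random_search(objective, bounds)
print("best (theta, tx, ty):", best_params)
print("dissimilarity:", best_value)
```

Because the worst population member is replaced only by a strictly better trial point, the best value found never gets worse as iterations proceed. A 3D version would search over three rotation and three translation parameters instead of the three used here.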

    Protein Binding Ligand Prediction Using Moments-Based Methods

    Structural genomics initiatives have started to accumulate protein structures of unknown function at an increasing pace. Conventional sequence-based function prediction methods are not able to provide useful functional information for most such structures. Thus, structure-based approaches have been developed, which predict the function of proteins by capturing structural characteristics of functional sites. In particular, several approaches have been proposed to identify potential ligand binding sites in a query protein structure and to compare them with known ligand binding sites. In this chapter, we introduce computational methods for describing and comparing ligand binding sites using two-dimensional and three-dimensional moments. An advantage of moment-based methods is that the tertiary structure of pocket shapes is described compactly as a vector of coefficients of a series expansion. Thus, a search against an entire PDB-scale database can be performed in real time. We evaluate two binding pocket representations, one based on two-dimensional pseudo-Zernike moments and the other based on three-dimensional Zernike moments. A recent development in pocket comparison is also described, which allows partial matching of pockets by using local patch descriptors.
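The reason a compact coefficient vector makes database search fast can be sketched as follows. This example covers only the comparison stage and does not compute Zernike or pseudo-Zernike moments. The Euclidean distance, the function names, and the four-component toy vectors are assumptions for illustration and are not values from the chapter.

```python
import math

def descriptor_distance(a, b):
    """Euclidean distance between two fixed-length descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_pockets(query, database, top_k=3):
    """Return the top_k database pockets closest to the query descriptor."""
    scored = sorted(
        (descriptor_distance(query, vector), name)
        for name, vector in database.items()
    )
    return [(name, dist) for dist, name in scored[:top_k]]

# Hypothetical toy descriptors; real moment vectors would be much longer.
toy_database = {
    "pocket_A": [0.90, 0.10, 0.40, 0.00],
    "pocket_B": [0.20, 0.80, 0.50, 0.30],
    "pocket_C": [0.85, 0.15, 0.35, 0.05],
}
query = [0.88, 0.12, 0.38, 0.02]

ranking = rank_pockets(query, toy_database, top_k=2)
for name, dist in ranking:
    print(name, round(dist, 3))
```

Each comparison costs one pass over a short vector and needs no structural superposition, which is what makes scanning a large database practical.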

    Protein contour modelling and computation for complementarity detection and docking

    The aim of this thesis is the development and application of a model that effectively and efficiently integrates the evaluation of geometric and electrostatic complementarity for the protein-protein docking problem. Proteins perform their biological roles by interacting with other biomolecules and forming macromolecular complexes. The structural characterization of protein complexes is important to understand the underlying biological processes. Unfortunately, there are several limitations to the available experimental techniques, leaving the vast majority of these complexes to be determined by means of computational methods such as protein-protein docking. The ultimate goal of the protein-protein docking problem is the in silico prediction of the three-dimensional structure of complexes of two or more interacting proteins, as occurring in living organisms, which can later be verified in vitro or in vivo. These interactions are highly specific and take place due to the simultaneous formation of multiple weak bonds: the geometric complementarity of the contours of the interacting molecules is a fundamental requirement in order to enable and maintain these interactions. However, shape complementarity alone cannot guarantee highly accurate docking predictions, as there are several physicochemical factors, such as Coulomb potentials, van der Waals forces and hydrophobicity, affecting the formation of protein complexes. In order to set up correct and efficient methods for protein-protein docking, it is necessary to provide a single representation which integrates geometric and physicochemical criteria in the complementarity evaluation. To this end, a novel local surface descriptor, capable of capturing both the shape and electrostatic distribution properties of macromolecular surfaces, has been designed and implemented.
The proposed methodology effectively integrates the evaluation of geometrical and electrostatic distribution complementarity of molecular surfaces, while maintaining efficiency in the descriptor comparison phase. The descriptor is based on the 3D Zernike invariants, which possess several attractive features, such as a compact representation and rotational and translational invariance, and which have been shown to adequately capture global and local protein surface shape similarity and to naturally represent physicochemical properties on the molecular surface. Locally, the geometric similarity between two portions of protein surface implies a certain degree of complementarity, but the same cannot be stated about electrostatic distributions. Complementarity in electrostatic distributions is more complex to handle, as charges must be matched with opposite ones even if they do not have the same magnitude. The proposed method overcomes this limitation as follows. From a single electrostatic distribution function, two separate distribution functions are obtained, one for the positive and one for the negative charges, and both functions are normalised to [0, 1]. Descriptors are computed separately for the positive and negative charge distributions, and complementarity evaluation is then done by cross-comparing descriptors of distributions of charges of opposite signs. The proposed descriptor uses a discrete voxel-based representation of the Connolly surface on which the corresponding electrostatic potentials have been mapped. Voxelised surface representations have received a lot of interest in several bioinformatics and computational biology applications as a simple and effective way of jointly representing geometric and physicochemical properties of proteins and other biomolecules by mapping auxiliary information in each voxel.
Moreover, the voxel grid can be defined at different resolutions, thus giving the means to effectively control the degree of detail in the discrete representation, along with the possibility of producing multiple representations of the same molecule at different resolutions. A specific algorithm has been designed for the efficient computation of voxelised macromolecular surfaces at arbitrary resolutions, starting from experimentally-derived structural data (X-ray crystallography, NMR spectroscopy or cryo-electron microscopy). Fast surface generation is achieved by adapting an approximate Euclidean Distance Transform algorithm in the Connolly surface computation step and by exploiting the geometrical relationship between the latter and the Solvent Accessible surface. This algorithm is the basis of VoxSurf (Voxelised Surface calculation program), a tool which can produce discrete representations of macromolecules at very high resolutions starting from the three-dimensional information of their corresponding PDB files. By employing compact data structures and implementing a spatial slicing protocol, the proposed tool can calculate the three main molecular surfaces at high resolutions with limited memory demands. To reduce the surface computation time without affecting the accuracy of the representation, two parallel algorithms for the computation of voxelised macromolecular surfaces, based on a spatial slicing procedure, have been introduced. The molecule is sliced into a user-defined number of parts, and the portion of the overall surface belonging to each slice can be calculated in parallel. The slicing planes are perpendicular to the abscissa axis of the Cartesian coordinate system defined in the molecule's PDB entry. The first algorithm uses an overlapping margin of one probe-sphere radius between slices in order to guarantee the correctness of the Euclidean Distance Transform.
Because of this margin, the Connolly surface can be computed nearly independently for each slice. Communication among processes is necessary only during the pocket identification procedure, which ensures that pockets spanning more than one slice are correctly identified and discriminated from solvent-excluded cavities inside the molecule. In the second parallel algorithm, the size of the overlapping margin between slices has been reduced to one voxel by adapting a multi-step region-growing Euclidean Distance Transform algorithm. At each step, distance values are first calculated independently for every slice; then a small portion of the border information is exchanged between adjacent slices. The proposed methodologies will serve as a basis for a full-fledged protein-protein docking protocol based on local feature matching. Rigorous benchmark tests have shown that the combined geometric and electrostatic descriptor can effectively identify shape and electrostatic distribution complementarity in the binding sites of protein-protein complexes, by efficiently comparing circular surface patches and significantly decreasing the number of false positives obtained when using a purely geometric descriptor. In the validation experiments, the contours of the two interacting proteins are divided into circular patches: all possible patch pairs from the two proteins are then evaluated in terms of complementarity and a general ranking is produced. Results show that native patch pairs obtain higher ranks with the newly proposed descriptor than with the purely geometric one.
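The charge-splitting step described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis implementation: normalisation is assumed to mean division by the maximum, `toy_descriptor` is a simple histogram standing in for the 3D Zernike invariants, and all names and sample values are hypothetical.

```python
import math

def split_and_normalise(potential):
    """Split signed per-voxel potentials into positive and negative parts,
    each scaled to [0, 1] (assumption: scaling divides by the maximum)."""
    positive = [max(v, 0.0) for v in potential]
    negative = [max(-v, 0.0) for v in potential]

    def scale(values):
        peak = max(values, default=0.0)
        return [v / peak for v in values] if peak > 0.0 else list(values)

    return scale(positive), scale(negative)

def toy_descriptor(values, bins=4):
    """Stand-in descriptor: a normalised histogram (NOT 3D Zernike invariants)."""
    hist = [0.0] * bins
    for v in values:
        hist[min(int(v * bins), bins - 1)] += 1.0
    return [h / len(values) for h in hist]

def distance(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def complementarity_score(potential_a, potential_b):
    """Cross-compare opposite-sign distributions; lower means more complementary."""
    pos_a, neg_a = split_and_normalise(potential_a)
    pos_b, neg_b = split_and_normalise(potential_b)
    return (distance(toy_descriptor(pos_a), toy_descriptor(neg_b))
            + distance(toy_descriptor(neg_a), toy_descriptor(pos_b)))

# Hypothetical patches: patch_b has opposite signs and different magnitudes.
patch_a = [1.0, 0.5, -0.2, 0.0]
patch_b = [-2.0, -1.0, 0.4, 0.0]

print(complementarity_score(patch_a, patch_b))  # opposite charges
print(complementarity_score(patch_a, patch_a))  # same charges
```

Normalising each sign separately is what lets opposite charges of different magnitude match: the positive part of `patch_a` and the negative part of `patch_b` become the same distribution after scaling.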

    Cavity Detection and Matching for Binding Site Recognition

    We developed a suite of methods for the problem of protein binding site recognition, based on a representation of the protein structures by a collection of spin-images. A procedure for cavity detection is coupled with a method previously developed for the recognition of similar regions in two proteins, and applied to the comparison of two proteins' cavities, the all-to-all pairwise comparison of a set of cavities, and the recognition of multiple binding sites in one cavity. All the presented methods can be used to screen large collections of proteins. The detection of cavities in a given protein is often the preliminary step in protein binding site recognition, since binding sites usually lie in cavities. The comparison of two cavities identifies two similar regions in the two cavities, and hints at a common functional structure when one or both regions include a binding site. The results of the all-to-all pairwise comparison of a set of cavities are clustered according to the similarity of the cavities, yielding a clustering that groups together cavities with the same binding sites when their structures are similar enough. Recognition of multiple binding sites in one cavity is performed by comparing a cavity, called the background cavity, with a dataset of cavities, and clustering those of its residues that match the residues of other cavities in the dataset. The four methods are benchmarked on different databases, and their effectiveness is discussed.
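For readers unfamiliar with spin-images, the sketch below builds one for a single oriented surface point using the standard definition, which the abstract itself does not spell out. For each neighbouring point, it bins the signed height along the normal (beta) and the radial distance from the normal axis (alpha). The function name, bin size, image dimensions, and sample points are assumptions for this example.

```python
import math

def spin_image(origin, normal, points, bin_size=1.0, n_alpha=4, n_beta=4):
    """Accumulate a 2D histogram over (alpha, beta) for one oriented point.

    beta  = signed distance of a neighbour along the surface normal
    alpha = distance of that neighbour from the axis defined by the normal
    """
    length = math.sqrt(sum(c * c for c in normal))
    unit = [c / length for c in normal]
    image = [[0] * n_alpha for _ in range(n_beta)]
    for point in points:
        d = [pc - oc for pc, oc in zip(point, origin)]
        beta = sum(dc * uc for dc, uc in zip(d, unit))
        alpha = math.sqrt(max(sum(dc * dc for dc in d) - beta * beta, 0.0))
        col = math.floor(alpha / bin_size)
        # Shift beta so the image covers heights both above and below the point.
        row = math.floor((beta + n_beta * bin_size / 2.0) / bin_size)
        if 0 <= col < n_alpha and 0 <= row < n_beta:
            image[row][col] += 1
    return image

# Two hypothetical neighbours that differ only by a rotation about the normal.
origin, normal = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
image_a = spin_image(origin, normal, [(1.5, 0.0, 0.5)])
image_b = spin_image(origin, normal, [(0.0, 1.5, 0.5)])
print(image_a == image_b)
```

Since alpha and beta do not depend on the angle around the normal, the image is unchanged when the neighbourhood is rotated about that axis, a property that lets surface regions be compared without first aligning them.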