445 research outputs found
Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening
This work introduces a number of algebraic topology approaches, such as
multicomponent persistent homology, multi-level persistent homology and
electrostatic persistence for the representation, characterization, and
description of small molecules and biomolecular complexes. Multicomponent
persistent homology retains critical chemical and biological information during
the topological simplification of biomolecular geometric complexity.
Multi-level persistent homology enables a tailored topological description of
inter- and/or intra-molecular interactions of interest. Electrostatic
persistence incorporates partial charge information into topological
invariants. These topological methods are paired with Wasserstein distance to
characterize similarities between molecules and are further integrated with a
variety of machine learning algorithms, including k-nearest neighbors, ensemble
of trees, and deep convolutional neural networks, to manifest their descriptive
and predictive powers for chemical and biological problems. Extensive numerical
experiments involving more than 4,000 protein-ligand complexes from the PDBBind
database and nearly 100,000 ligands and decoys in the DUD database are performed
to test respectively the scoring power and the virtual screening power of the
proposed topological approaches. It is demonstrated that the present approaches
outperform the modern machine learning based methods in protein-ligand binding
affinity predictions and ligand-decoy discrimination
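The abstract above pairs topological descriptors with a Wasserstein distance to compare molecules. As a rough illustration only, the following sketch computes a 1-Wasserstein distance between two very small persistence diagrams by brute force. The choice of order 1, the L-infinity ground metric, the function name `wasserstein_1`, and the toy diagrams are assumptions made for this example; the abstract does not state which variant or implementation the authors used.

```python
from itertools import permutations


def wasserstein_1(diagram_a, diagram_b):
    """1-Wasserstein distance between two small persistence diagrams.

    Each diagram is a list of (birth, death) pairs. Points may be matched
    to each other or to the diagonal. Brute force: only for tiny inputs.
    """
    n, m = len(diagram_a), len(diagram_b)
    size = n + m  # rows: A points + m diagonal slots; cols: B points + n diagonal slots

    def cost(i, j):
        if i < n and j < m:
            (b1, d1), (b2, d2) = diagram_a[i], diagram_b[j]
            return max(abs(b1 - b2), abs(d1 - d2))  # L-infinity ground metric (assumed)
        if i < n:  # point of A matched to the diagonal
            b, d = diagram_a[i]
            return (d - b) / 2
        if j < m:  # point of B matched to the diagonal
            b, d = diagram_b[j]
            return (d - b) / 2
        return 0.0  # diagonal slot matched to diagonal slot

    return min(
        sum(cost(i, j) for i, j in enumerate(perm))
        for perm in permutations(range(size))
    )


# Hypothetical toy diagrams, not data from the paper.
molecule_a = [(0.0, 2.0), (1.0, 5.0)]
molecule_b = [(0.0, 3.0)]
print(wasserstein_1(molecule_a, molecule_b))
```

The factorial search is only practical for a handful of points; a real pipeline would rely on an optimal-assignment solver instead.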
Structural and Functional Analysis of Multi-Interface Domains
10.1371/journal.pone.0050821, PLoS ONE 71
Multiscale Simulation and Analysis of Structured Ribonucleic Acids
I present the results of three projects in the course of my scientific work in the context of native structure-based models (SBMs) for regulatory RNA. They comprise a new and openly accessible software implementation of native structure-based model generation and evaluation, a study that employs a multiscale model to investigate cotranscriptional riboswitch folding and advances to a novel approach in the field of RNA tertiary structure prediction
Engineering naturally occurring trans-acting non-coding RNAs to sense molecular signals
Non-coding RNAs (ncRNAs) are versatile regulators in cellular networks. While most trans-acting ncRNAs possess well-defined mechanisms that can regulate transcription or translation, they generally lack the ability to directly sense cellular signals. In this work, we describe a set of design principles for fusing ncRNAs to RNA aptamers to engineer allosteric RNA fusion molecules that modulate the activity of ncRNAs in a ligand-inducible way in Escherichia coli. We apply these principles to ncRNA regulators that can regulate translation (IS10 ncRNA) and transcription (pT181 ncRNA), and demonstrate that our design strategy exhibits high modularity between the aptamer ligand-sensing motif and the ncRNA target-recognition motif, which allows us to reconfigure these two motifs to engineer orthogonally acting fusion molecules that respond to different ligands and regulate different targets in the same cell. Finally, we show that the same ncRNA fused with different sensing domains results in a sensory-level NOR gate that integrates multiple input signals to perform genetic logic. These ligand-sensing ncRNA regulators provide useful tools to modulate the activity of structurally related families of ncRNAs, and building upon the growing body of RNA synthetic biology, our ability to design aptamer–ncRNA fusion molecules offers new ways to engineer ligand-sensing regulatory circuits
On the origins of enzyme inhibitor selectivity and promiscuity: a case study of protein kinase binding to staurosporine
Protein kinases are important regulatory enzymes in signal transduction and in cell regulation. Understanding inhibition mechanisms of kinases is important for the further development of new therapies for cancer and inflammatory diseases. I have developed a statistical approach based on the Mantel test to find the relationship between the shapes of ATP binding sites and their affinities for inhibitors. My shape-based dendrogram shows clustering of the kinases based on similarity in shape. I investigate the pocket in terms of conservation of surrounding amino acids and atoms in order to identify the key determinants of ligand binding. I find that the most conserved regions are the main chain atoms in the hinge region and I show that the tetrahydropyran ring of staurosporine causes induced-fit of the glycine rich loop. I apply multiple linear regression to select distances measured between the distinctive parts of residues which correlate with the binding constants. This method allows me to understand the importance of the size of the gatekeeper residue and the closure between the first glycine of the GXGXXG motif and the aspartate of the DFG loop, which act together to promote tight binding to staurosporine. I also find that the greater the number of hydrogen bonds made by the kinase around the methylamine group of staurosporine, the tighter the binding to staurosporine. The website I have developed allows a better understanding of cross reactivity and may be useful for narrowing down the options for a synthetic strategy to design kinase inhibitors. This work was supported by the Royal Thai Government
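The Mantel test mentioned in the abstract above correlates two distance matrices and judges significance by permuting the labels of one of them. The sketch below is a generic, minimal version of that idea, not the author's implementation: the function names, the one-sided p-value, the permutation count, and the toy matrices are all assumptions made for illustration.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def upper_triangle(matrix, order):
    """Flatten the strict upper triangle after relabelling items by `order`."""
    pairs = combinations(range(len(order)), 2)
    return [matrix[order[i]][order[j]] for i, j in pairs]


def mantel(dist_a, dist_b, n_permutations=999, seed=0):
    """One-sided Mantel test; returns (r, p) for two symmetric distance matrices."""
    identity = list(range(len(dist_a)))
    b_flat = upper_triangle(dist_b, identity)
    r_obs = pearson(upper_triangle(dist_a, identity), b_flat)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of dist_a together
        if pearson(upper_triangle(dist_a, order), b_flat) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_permutations + 1)


# Hypothetical toy data: five items placed on a line, and a second matrix
# that is simply twice the first (so the correlation is perfect).
positions = [0.0, 1.0, 3.0, 6.0, 10.0]
shape_dist = [[abs(p - q) for q in positions] for p in positions]
affinity_dist = [[2.0 * d for d in row] for row in shape_dist]
r, p = mantel(shape_dist, affinity_dist)
print(r, p)
```

In the setting described in the abstract, one matrix would hold pairwise differences in binding-site shape and the other pairwise differences in inhibitor affinity; how those distances were actually defined is not stated here.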
FUNCTION-DRIVEN APPROACHES TO THE DESIGN OF OPTOGENETIC TOOLS
Proteins play a wide variety of roles in biology despite being produced from a small set of common subunits; this commonality can be exploited to understand the dynamics by which proteins fold into structures and perform their manifold functions and, subsequently, design new proteins for use both in research and as nanoscale machines in industry. While this design process has classically involved residue-level redesign of existing protein backbones and, more recently, the de novo design of backbones according to geometrical parameters, the increasing complexity of optogenetic photosystems, biosensors, and other mechanisms for making use of proteins with specific functions has established a need for a design protocol that can reconcile their various structural exigencies with the function-specific elements of as wide an array of proteins as possible in order to make best use of them. Requirement-driven design eschews specific structural templates in favor of general requirements dependent on the intended function of the design, and so can exploit the vastness of protein structural space in finding solutions to increasingly complex design problems. Here, we present three new advances in the requirement-driven design of proteins as diagnostic tools, including a more general photosystem for the direct optogenetic control of protein-protein interactions, a series of algorithmic improvements to the leading implementation of requirement-driven design in the Rosetta macromolecular design software suite, and a new version of that algorithm capable of performing requirement-driven backbone design and residue-level backbone optimization simultaneously. These technologies collectively represent a significant improvement in our ability to control the activity of proteins with a wide variety of control schemes and produce functional proteins for arbitrary requirement sets more generally. Doctor of Philosoph
Efficient search and comparison algorithms for 3D protein binding site retrieval and structure alignment from large-scale databases
Finding similar 3D structures is crucial for discovering potential structural, evolutionary, and functional relationships among proteins. As the number of known protein structures has dramatically increased, traditional methods can no longer provide the life science community with the adequate informatics capability needed to conduct large-scale and complex analyses. A suite of high-throughput and accurate protein structure search and comparison methods is essential. To meet the needs of the community, we developed several bioinformatics methods for protein binding site comparison and global structure alignment. First, we developed an efficient protein binding site search that is based on extracting geometric features both locally and globally. The main idea of this work was to capture spatial relationships among landmarks of binding site surfaces and build a vocabulary of visual words to represent the characteristics of the surfaces. A vector model was then used to speed up the search of similar surfaces that share similar visual words with the query interface. Second, we developed an approach for accurate protein binding site comparison. Our algorithm provides an accurate binding site alignment by applying a two-level heuristic process which progressively refines alignment results from coarse surface point level to accurate residue atom level. This setting allowed us to explore different combinations of pairs of corresponding residues, thus improving the alignment quality of the binding site surfaces. Finally, we introduced a parallel algorithm for global protein structure alignment. Specifically, to speed up the time-consuming structure alignment process of protein 3D structures, we designed a parallel protein structure alignment framework to exploit the parallelism of Graphics Processing Units (GPUs). As a general-purpose GPU platform, the framework is capable of parallelizing traditional structure alignment algorithms.
Our findings can be applied in various research areas, such as prediction of protein inte
Study of complex RNA function modulated by small molecules: the development of RNA directed small molecule library and probing the S-adenosyl methionine discrimination between on and off conformational states of the SAM-I riboswitch
Until recently, RNA remained unexploited as a drug target, but it is now drawing interest. The methodology and available drug libraries for RNA targeting/screening are in rudimentary stages. The interactions made by ligands with RNA can be explored for RNA based drug development. The dissertation is composed of 4 chapters. The first chapter focuses on the structural features of RNA and the attempts made to target RNA previously. The second chapter focuses on the development of a small molecule library enriched with substructures derived from RNA binding ligands. For this study a fragment-based approach (detailed in chapter 2) is used in order to accommodate the conformational flexibility of RNA. The library molecules are used for screening against suitable RNA targets using NMR. We identified at least 5 ligands, 2 of which are novel ligands binding to the ribosomal 16S rRNA. The third chapter is focused on the role of small molecules in inducing conformational changes in an RNA genetic regulatory element, the S-adenosyl methionine (SAM)-binding SAM-I riboswitch. Mechanistic features of the SAM-I riboswitch are reported in order to understand the basis for its specificity and discrimination and its gene regulation mechanism. To address the conformational dynamics of the Bacillus subtilis and Thermoanaerobacter tengcongensis SAM-I riboswitches in response to SAM binding, several conformer mimics are designed, synthesized and characterized using NMR, equilibrium dialysis, and inline probing. The study shows that apart from the conserved residues of the binding pocket, residues downstream of the binding pocket are involved in detecting SAM and assist the binding of SAM to the riboswitch with weak affinity. Our data highlight the capacity of a so-called antiterminator helix from the expression platform to assist the formation of a partial P1 helix of the aptamer domain. A stable P1 is involved in recognition and tight binding of SAM.
Our in vitro experiments suggest that the riboswitch could switch from an unbound conformation to a tightly SAM-bound structure through weakly binding intermediate structures in the presence of the small molecule SAM. The future directions are included in the fourth chapter along with the conclusions
- …