3 research outputs found

    Design of photoactivatable inhibitors for spatiotemporal control of GEF activity

    Protein design methods have been applied to engineer novel protein folds, enzymes, and materials with atomic-level accuracy. However, little work has been done to apply them to engineer novel proteins that can be used in vivo to dissect biological processes. Here we use protein design to study the cellular signaling involved in cell motility. Cell motility is driven by the reorganization of the cytoskeleton, a process regulated by the Rho family of small GTPases. These molecules are activated at precise subcellular locations, and with fine temporal control, by guanine nucleotide exchange factors (GEFs). Understanding the biological role of these molecules requires investigating them at the subcellular level in living cells. To this end, we developed photoactivatable GEF inhibitors that allow spatiotemporal control of these GTPases. The GEFs targeted in this work were GEF-H1 and members of the Vav family, specifically Vav2. The two serve orthogonal roles in cell motility: GEF-H1 acts on RhoA at the retracting edge, whereas Vav targets Rac1 at the leading edge. Our computational work generated a library of GEF-H1 inhibitors, and two experimentally validated inhibitors for the Vav family displaying high specificity in silico.
    Bachelor of Art

    Conditional Generation of Protein Sequence and Structure

    Thesis (Ph.D.)--University of Washington, 2023
    The advent of atomic-accuracy prediction of protein structure from sequence with deep learning networks, spurred by AlphaFold, has had a remarkable impact on the field of biochemistry. It has resulted in rapid progress in protein design because it allows structural hypotheses to be interrogated quickly, without the need to acquire experimental data, which is expensive and time-consuming. However, it remains unclear how to properly use these deep learning models to generate protein sequences and structures with user-defined functional and biochemical properties. This was the focus of my dissertation work. I first interrogated this question by taking pretrained structure prediction networks, namely RoseTTAFold, and applying techniques from image processing to make them generative, in a method termed “constrained hallucination”. I applied the technique to optimize sequences such that their predicted structures contain desired functional sites, on a range of design problems including epitope scaffolding, metal binding, and protein binding. Experimental characterization of these designs demonstrates that they have the desired activities. In follow-up work, I improved upon joint sequence-structure generation by employing the denoising diffusion probabilistic framework popularized in image generation. I developed ProteinGenerator, a sequence-space diffusion model based on RoseTTAFold that simultaneously generates protein sequences and structures. Beginning from random amino acid sequences, the model generates sequence and structure pairs by iterative denoising, guided by any desired sequence and structural protein attributes. To explore the versatility of this approach, I designed and tested proteins enriched for specific amino acids, with internal sequence repeats, with masked bioactive peptides, with state-dependent structures, and with key sequence features of specific protein families.
Lastly, looking to the future, for particularly difficult protein design problems such as the design of highly active enzymes, experimental feedback is necessary to improve functionality with minimal design iterations. Active learning (AL) and Bayesian optimization (BO) approaches provide a principled way to incorporate experimental feedback into the design process, and thereby minimize the number of cycles between computation and experimental testing needed to optimize the desired function. However, these approaches do not incorporate strong generative priors to bias exploration and exploitation toward valid regions of protein space. Therefore, to improve upon current BO and AL methods, I hypothesize that coupling a joint sequence and structure diffusion model with Bayesian optimization methods will allow a more efficient search of the sequence-activity landscape to find highly active variants. To this end, I developed a joint sequence and structure denoising generative model, ProteinGenerator2 (PG2), in which I bias generation with zero-shot predictors to yield diverse pools of sequences predicted to be highly active for testing.

    Scaffolding protein functional sites using deep learning

    The binding and catalytic functions of proteins are generally mediated by a small number of functional residues held in place by the overall protein structure. Here, we describe deep learning approaches for scaffolding such functional sites without needing to prespecify the fold or secondary structure of the scaffold. The first approach, "constrained hallucination," optimizes sequences such that their predicted structures contain the desired functional site. The second approach, "inpainting," starts from the functional site and fills in additional sequence and structure to create a viable protein scaffold in a single forward pass through a specifically trained RoseTTAFold network. We use these two methods to design candidate immunogens, receptor traps, metalloproteins, enzymes, and protein-binding proteins and validate the designs using a combination of in silico and experimental tests.