
    Fold Family-Regularized Bayesian Optimization for Directed Protein Evolution

    Directed Evolution (DE) is a technique for protein engineering that involves iterative rounds of mutagenesis and screening to search for sequences that optimize a given property (e.g., binding affinity to a specified target). Unfortunately, the underlying optimization problem is under-determined, so mutations introduced to improve the specified property may come at the expense of unmeasured but nevertheless important properties (e.g., subcellular localization). We seek to address this issue by incorporating a fold-specific regularization factor into the optimization problem. The regularization factor biases the search towards designs that resemble sequences from the fold family to which the protein belongs. We applied our method to a large library of protein GB1 mutants with binding affinity measurements to IgG-Fc. Our results demonstrate that the regularized optimization problem produces more native-like GB1 sequences with only a minor decrease in binding affinity. Specifically, the log-odds of our designs under a generative model of the GB1 fold family are 41-45% higher than those obtained without regularization, with only a 7% drop in binding affinity. Thus, our method is capable of making a trade-off between competing traits. Moreover, we demonstrate that our active-learning-driven approach reduces the wet-lab burden of identifying optimal GB1 designs by 67%, relative to recent results from the Arnold lab on the same data.

    A Probability-Based Similarity Measure for Saupe Alignment Tensors with Applications to Residual Dipolar Couplings in NMR Structural Biology

    High-throughput NMR structural biology and NMR structural genomics pose a fascinating set of geometric challenges. A key bottleneck in NMR structural biology is the resonance assignment problem. We seek to accelerate protein NMR resonance assignment and structure determination by exploiting a priori structural information. In particular, Nuclear Vector Replacement (NVR) has been proposed as a method for solving the assignment problem given a priori structural information [24,25]. Among several different kinds of input data, NVR uses a particular type of NMR data known as residual dipolar couplings (RDCs). The basic physics of residual dipolar couplings tells us that the data should be explainable by a structural model and a set of parameters contained within the Saupe alignment tensor. In the NVR algorithm, one estimates the Saupe alignment tensors and then proceeds to refine those estimates. We would like to quantify the accuracy of such estimates by comparing the estimated Saupe matrix to the correct Saupe matrix. In this work, we propose a way to quantify this comparison. Given a correct Saupe matrix and an estimated Saupe matrix, we compute an upper bound on the probability that a randomly rotated Saupe tensor would have an error smaller than the estimated Saupe matrix. This has the advantage of being a quantified upper bound that also has a clear interpretation in terms of geometry and probability. While the specific application of our rotation probability results is to NVR, our novel methods can be used with any RDC-based algorithm to bound the accuracy of the estimated alignment tensors. Furthermore, they could also be used in X-ray crystallography or molecular docking to quantify the accuracy of calculated rotations of proteins, protein domains, nucleic acids, or small molecules.

    Efficient construction of an assembly string graph using the FM-index

    Motivation: Sequence assembly is a difficult problem whose importance has grown again recently as the cost of sequencing has dramatically dropped. Most new sequence assembly software has started by building a de Bruijn graph, avoiding the overlap-based methods used previously because of their computational cost and complexity with very large numbers of short reads. Here, we show how to use suffix array-based methods that have formed the basis of recent very fast sequence mapping algorithms to find overlaps and generate assembly string graphs asymptotically faster than previously described algorithms.

    Recombinants between Deformed wing virus and Varroa destructor virus-1 may prevail in Varroa destructor-infested honeybee colonies

    We have used high-throughput Illumina sequencing to identify novel recombinants between deformed wing virus (DWV) and Varroa destructor virus-1 (VDV-1), which accumulate to higher levels than DWV in both honeybees and Varroa destructor mites. The recombinants, VDV-1VVD and VDV-1DVD, exhibit crossovers between the 5’-untranslated region (5’-UTR) and/or the regions encoding the structural (capsid) and non-structural viral proteins. This implies that the genomes are modular and that each region may evolve independently, as demonstrated in human enteroviruses. Individual honeybee pupae were infected with a mixture of observed recombinants and DWV. A strong correlation was observed between VDV-1DVD levels in honeybee pupae and in the associated mites, suggesting that this recombinant, with a DWV-derived 5’-UTR and non-structural protein region flanking a VDV-1-derived capsid-encoding region, is better adapted to transmission between V. destructor and honeybees than the parental DWV or a recombinant bearing the VDV-1-derived 5’-UTR (VDV-1VVD).

    Bitopic binding mode of an M1 muscarinic acetylcholine receptor agonist associated with adverse clinical trial outcomes

    The realisation of the therapeutic potential of targeting the M1 muscarinic acetylcholine receptor (M1 mAChR) for the treatment of cognitive decline in Alzheimer's disease has prompted the discovery of M1 mAChR ligands showing efficacy in alleviating cognitive dysfunction in both rodents and humans. Among these is GSK1034702, described previously as a potent M1 receptor allosteric agonist, which showed pro-cognitive effects in rodents and improved immediate memory in a clinical nicotine withdrawal test but induced significant side-effects. Here we provide evidence using ligand binding, chemical biology and functional assays to establish that, rather than the allosteric mechanism claimed, GSK1034702 interacts in a bitopic manner at the M1 mAChR such that it can concomitantly span both the orthosteric and an allosteric binding site. The bitopic nature of GSK1034702, together with the intrinsic agonist activity and a lack of muscarinic receptor subtype selectivity reported here, all likely contribute to the adverse effects of this molecule in clinical trials. We conclude that these properties, whilst imparting beneficial effects on learning and memory, are undesirable in a clinical candidate due to the likelihood of adverse side effects. Rather, our data support the notion that "pure" positive allosteric modulators showing selectivity for the M1 mAChR with low levels of intrinsic activity would be preferable to provide clinical efficacy with low adverse responses.

    CloudAligner: A fast and full-featured MapReduce based tool for sequence mapping

    Background: Research in genetics has developed rapidly recently due to the aid of next generation sequencing (NGS). However, massively-parallel NGS produces enormous amounts of data, which leads to storage, compatibility, scalability, and performance issues. The Cloud Computing and MapReduce framework, which utilizes hundreds or thousands of shared computers to map sequencing reads quickly and efficiently to reference genome sequences, appears to be a very promising solution for these issues. Consequently, it has been adopted by many organizations recently, and the initial results are very promising. However, since these are only initial steps toward this trend, the developed software does not provide adequate primary functions like bisulfite, pair-end mapping, etc., in on-site software such as RMAP or BS Seeker. In addition, existing MapReduce-based applications were not designed to process the long reads produced by the most recent second-generation and third-generation NGS instruments and, therefore, are inefficient. Last, it is difficult for a majority of biologists untrained in programming skills to use these tools because most were developed on Linux with a command line interface. Results: To urge the trend of using Cloud technologies in genomics and prepare for advances in second- and third-generation DNA sequencing, we have built a Hadoop MapReduce-based application, CloudAligner, which achieves higher performance, covers most primary features, is more accurate, and has a user-friendly interface. It was also designed to be able to deal with long sequences. The performance gain of CloudAligner over Cloud-based counterparts (35 to 80%) mainly comes from the omission of the reduce phase. In comparison to local-based approaches, the performance gain of CloudAligner is from the partition and parallel processing of the huge reference genome as well as the reads. The source code of CloudAligner is available at http://cloudaligner.sourceforge.net/ and its web version is at http://mine.cs.wayne.edu:8080/CloudAligner/. Conclusions: Our results show that CloudAligner is faster than CloudBurst, provides more accurate results than RMAP, and supports various input as well as output formats. In addition, with the web-based interface, it is easier to use than its counterparts.

    Monoclonal anti-β1-adrenergic receptor antibodies activate G protein signaling in the absence of β-arrestin recruitment

    Thermostabilized G protein-coupled receptors used as antigens for in vivo immunization have resulted in the generation of functional agonistic anti-β1-adrenergic receptor (β1AR) monoclonal antibodies (mAbs). The focus of this study was to examine the pharmacology of these antibodies to evaluate their mechanistic activity at β1AR. Immunization with the β1AR stabilized receptor yielded five stable hybridoma clones, four of which expressed functional IgG, as determined in cell-based assays used to evaluate cAMP stimulation. The antibodies bind diverse epitopes associated with low nanomolar agonist activity at β1AR, and they appeared to show some degree of biased signaling as they were inactive in an assay measuring signaling through β-arrestin. In vitro characterization also verified different antibody-receptor interactions reflecting the different epitopes on the extracellular surface of β1AR to which the mAbs bind. The anti-β1AR mAbs demonstrated agonist activity only in the dimeric antibody format, not in the monomeric Fab format, suggesting that agonist activation may be mediated through promoting receptor dimerization. Finally, we have also shown that at least one of these antibodies exhibits in vivo functional activity at a therapeutically-relevant dose, producing an increase in heart rate consistent with β1AR agonism.