5,041 research outputs found

    MOMCMC: An Efficient Monte Carlo Method for Multi-Objective Sampling Over Real Parameter Space

    In this paper, we present a new population-based Monte Carlo method, MOMCMC (Multi-Objective Markov Chain Monte Carlo), for sampling in the presence of multiple objective functions in real parameter space. The MOMCMC method is designed to address the multi-objective sampling problem, which concerns exploring diversified solutions not only at the Pareto optimal front in the function space of multiple objective functions, but also near the front. MOMCMC integrates Differential Evolution (DE) style crossover into Markov Chain Monte Carlo (MCMC) to adaptively propose new solutions from the current population. The significance of dominance is taken into consideration in MOMCMC's fitness assignment scheme while balancing the solution's optimality and diversity. Moreover, the acceptance rate in MOMCMC is used to control the sampling bandwidth of the solutions near the Pareto optimal front. Computational results of MOMCMC on the high-dimensional ZDT benchmark functions demonstrate its efficiency in obtaining solution samples at or near the Pareto optimal front. Compared to MOSCEM (Multiobjective Shuffled Complex Evolution Metropolis), an existing Monte Carlo sampling method for multi-objective optimization, MOMCMC exhibits significantly faster convergence to the Pareto optimal front. Furthermore, with a small population size, MOMCMC is also effective in sampling complicated multiobjective function spaces.
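
    The abstract names two building blocks, Pareto dominance and a DE-style proposal drawn from the current population, without giving their equations. The sketch below illustrates both under stated assumptions: minimization of every objective, and the generic DE-MC proposal x_i + gamma * (x_a - x_b) + noise. The names `dominates`, `de_proposal`, `gamma` and `noise` are hypothetical, and MOMCMC's actual fitness assignment and acceptance rule are not reproduced here.

```python
import random


def dominates(f_a, f_b):
    """Return True if objective vector f_a Pareto-dominates f_b.

    Assumes every objective is minimized: f_a must be no worse in all
    objectives and strictly better in at least one.
    """
    no_worse = all(a <= b for a, b in zip(f_a, f_b))
    strictly_better = any(a < b for a, b in zip(f_a, f_b))
    return no_worse and strictly_better


def de_proposal(population, i, gamma=0.5, noise=1e-4, rng=random):
    """Propose a new point for chain i from two other population members.

    Generic DE-style move (an assumption, not the paper's exact rule):
    x_i + gamma * (x_a - x_b) + small Gaussian noise, with a, b != i.
    """
    others = [k for k in range(len(population)) if k != i]
    a, b = rng.sample(others, 2)  # two distinct partner chains
    return [
        xi + gamma * (xa - xb) + rng.gauss(0.0, noise)
        for xi, xa, xb in zip(population[i], population[a], population[b])
    ]
```

    Because the step is built from differences between population members, its scale adapts to the current spread of the population, which is the usual motivation for DE-style proposals.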

    A novel framework for engineering protein loops exploring length and compositional variation

    Insertions and deletions (indels) are known to affect function, biophysical properties and substrate specificity of enzymes, and they play a central role in evolution. Despite such clear significance, this class of mutation remains an underexploited tool in protein engineering with few available platforms capable of systematically generating and analysing libraries of varying sequence composition and length. We present a novel DNA assembly platform (InDel assembly), based on cycles of endonuclease restriction digestion and ligation of standardised dsDNA building blocks, that can generate libraries exploring both composition and sequence length variation. In addition, we developed a framework to analyse the output of selection from InDel-generated libraries, combining next generation sequencing and alignment-free strategies for sequence analysis. We demonstrate the approach by engineering the well-characterized TEM-1 β-lactamase Ω-loop, involved in substrate specificity, identifying multiple novel extended spectrum β-lactamases with loops of modified length and composition—areas of the sequence space not previously explored. Together, the InDel assembly and analysis platforms provide an efficient route to engineer protein loops or linkers where sequence length and composition are both essential functional parameters

    Coalescent-based species delimitation in the sand lizards of the Liolaemus wiegmannii complex (Squamata: Liolaemidae)

    Coalescent-based algorithms coupled with access to genome-wide data have become powerful tools for assessing questions on recent or rapid diversification, as well as delineating species boundaries in the absence of reciprocal monophyly. In southern South America, the diversification of Liolaemus lizards during the Pleistocene is well documented and has been attributed to the climatic changes that characterized this recent period of time. Past climatic changes had harsh effects at extreme latitudes, including Patagonia, but habitat changes at intermediate latitudes of South America have also been recorded, including the expansion of sand fields over northern Patagonia and the Pampas. In this work, we apply a coalescent-based approach to study the diversification of the Liolaemus wiegmannii species complex, a morphologically conservative clade that inhabits sandy soils across northwest and south-central Argentina, and the south shores of Uruguay. Using four standard sequence markers (mitochondrial DNA and three nuclear loci) along with ddRADseq data, we inferred species limits and a time-calibrated species tree for the L. wiegmannii complex in order to evaluate the influence of Quaternary sand expansion/retraction cycles on diversification. We also evaluated the evolutionary independence of the recently described L. gardeli and inferred its phylogenetic position relative to L. wiegmannii. We find strong evidence for six allopatric candidate species within L. wiegmannii, which diversified during the Pleistocene. The Great Patagonian Glaciation (∼1 million years before present) likely split the species complex into two main groups: one composed of lineages associated with sub-Andean sedimentary formations, and the other mostly related to sand fields in the Pampas and northern Patagonia. We hypothesize that early speciation within L. wiegmannii was influenced by the expansion of sand dunes throughout central Argentina and the Pampas. Finally, L. gardeli is supported as a distinct lineage nested within the L. wiegmannii complex.
    Affiliation: Villamil, Joaquín. Universidad de la República. Facultad de Ciencias; Uruguay
    Affiliation: Avila, Luciano Javier. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Centro Nacional Patagónico. Instituto Patagónico para el Estudio de los Ecosistemas Continentales; Argentina
    Affiliation: Morando, Mariana. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Centro Nacional Patagónico. Instituto Patagónico para el Estudio de los Ecosistemas Continentales; Argentina
    Affiliation: Sites, Jack W. University Brigham Young; United States
    Affiliation: Leaché, Adam D. University of Washington; United States
    Affiliation: Maneyro, Raúl. Universidad de la República. Facultad de Ciencias; Uruguay
    Affiliation: Camargo Bentaberry, Arley. Universidad de la República; Uruguay

    Assessment of the Food Habits of the Moroccan Dorcas Gazelle in M’Sabih Talaa, West Central Morocco, Using the trnL Approach

    Food habits of the Moroccan dorcas gazelle, Gazella dorcas massaesyla, previously investigated in the 1980s using microhistological fecal analysis in the M’Sabih Talaa Reserve, west central Morocco, were re-evaluated over three seasons (spring, summer and autumn 2009) using the trnL approach to determine the diet composition and its seasonal variation from fecal samples. Taxonomic identification was carried out using a database built from EMBL and the list of plant species within the reserve. The total taxonomic richness in the reserve was 130 species, compared with 171 in the 1980s. The diet composition proved to be much more diversified (71 plant taxa belonging to 57 genera and 29 families) than it was 22 years ago (29 identified taxa). Thirty-four taxa were newly identified in the diet, while 13 reported in 1986–87 were not found. The Moroccan dorcas gazelle showed a high preference for Acacia gummifera, Anagallis arvensis, Glebionis coronaria, Cladanthus arabicus, Diplotaxis tenuisiliqua, Erodium salzmannii, Limonium thouini, Lotus arenarius and Zizyphus lotus. Seasonal variations occurred in both the number (40–41 taxa in spring-summer and 49 taxa in autumn vs. respectively 23–22 and 26 in 1986–1987) and the taxonomic type of plant taxa eaten. This dietary diversification could be attributed either to the difference in methods of analysis, the trnL approach having a higher taxonomic resolution, or to a potential change in the nutritional quality of plants over time.

    Navigating the Extremes of Biological Datasets for Reliable Structural Inference and Design

    Structural biologists currently confront serious challenges in the effective interpretation of experimental data due to two contradictory situations: a severe lack of structural data for certain classes of proteins, and an incredible abundance of data for other classes. The challenge with small data sets is how to extract sufficient information to draw meaningful conclusions, while the challenge with large data sets is how to curate, categorize, and search the data to allow for its meaningful interpretation and application to scientific problems. Here, we develop computational strategies to address both sparse and abundant data sets. In the category of sparse data sets, we focus our attention on the problem of transmembrane (TM) protein structure determination. As X-ray crystallography and NMR data is notoriously difficult to obtain for TM proteins, we develop a novel algorithm which uses low-resolution data from protein cross-linking or scanning mutagenesis studies to produce models of TM helix oligomers and show that our method produces models with an accuracy on par with X-ray crystallography or NMR for a test set of known TM proteins. Turning to instances of data abundance, we examine how to mine the vast stores of protein structural data in the Protein Data Bank (PDB) to aid in the design of proteins with novel binding properties. We show how the identification of an anion binding motif in an antibody structure allowed us to develop a phosphate binding module that can be used to produce novel antibodies to phosphorylated peptides - creating antibodies to 7 novel phospho-peptides to illustrate the utility of our approach. We then describe a general strategy for designing binders to a target protein epitope based upon recapitulating protein interaction geometries which are over-represented in the PDB. 
We follow this by using data describing the transition probabilities of amino acids to develop a novel set of degenerate codons to create more efficient gene libraries. We conclude by describing a novel, real-time, all-atom structural search engine, giving researchers the ability to quickly search known protein structures for a motif of interest and providing a new interactive paradigm of protein design
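
    The abstract does not list its degenerate codon set, so the following is only a sketch of how any degenerate codon can be evaluated for library design: expand it with the IUPAC nucleotide codes, translate each member with the standard genetic code, and count codons, distinct amino acids and stop codons. The function names `expand` and `summarize` are hypothetical, and NNK is used as a common textbook example rather than one taken from this work.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons enumerated in TCAG order ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(codon):
    """List every concrete codon encoded by a degenerate codon."""
    return ["".join(p) for p in product(*(IUPAC[ch] for ch in codon.upper()))]


def summarize(codon):
    """Return (number of codons, sorted amino acids, number of stop codons)."""
    residues = [CODON_TABLE[c] for c in expand(codon)]
    amino_acids = sorted(set(residues) - {"*"})
    return len(residues), amino_acids, residues.count("*")


n_codons, amino_acids, n_stops = summarize("NNK")
print(n_codons, len(amino_acids), n_stops)  # 32 20 1
```

    A lower ratio of codons to encoded amino acids, and fewer stop codons, means less of the library is spent on redundant or truncated variants.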

    Optimisation of T cell receptors using in vivo recombination and selection

    The αβT cell receptor (TCR) orchestrates immunity through the recognition of peptides, derived from degraded proteins, presented on major histocompatibility complex (MHC) molecules. The remarkable ability of the receptor to respond to a vast plethora of antigens is driven by V(D)J recombination, a process which generates a highly diverse TCR repertoire by somatic gene rearrangement of coding DNA. TCR diversity is confined to three short hairpin loops on each TCR chain, called the complementarity determining region (CDR), which form the antigen-binding site. The germline-encoded CDR1 and CDR2 loops predominantly contact MHC, whereas the hypervariable CDR3 are non-germline and primarily bind to the MHC-bound peptide. In this study, we developed a novel in vivo mutagenesis approach which redirects somatic gene rearrangement using V(D)J recombination machinery to diversify and optimise TCR binding. This approach involves embedding a gene recombination cassette into the peptide-binding CDR3β region of established TCRs. A retrogenic system was employed to facilitate the in vivo processes necessary for gene rearrangement and thymic selection. We demonstrate that the recombination cassette can successfully induce gene rearrangement and introduce variation to the targeted CDR3β site. Thymocytes expressing the diversified TCRs can be selected on MHC and develop into functional peripheral T cells. Subsequent exposure to cognate ligands also allowed us to identify optimised and ‘immunodominant’ TCRs. In addition, we produced a novel chimeric TCR chain which comprises Vα and Cβ domains. This TCR chain forms a heterodimer with endogenous TCRα chains to form a unique Vα-Vα antigen-binding surface. Thymocytes expressing this novel form of αβTCR were able to engage efficiently with both MHC classes and develop normally into functional T cells typical of a conventional repertoire. 
Collectively, these findings suggest that the germline CDR loops are not essential for mediating MHC recognition during MHC-restricted T cell development and function.