193 research outputs found

    True prediction of lowest observed adverse effect levels

    Summary: A database of structurally heterogeneous chemicals with experimental Lowest Observed Adverse Effect Level (LOAEL) values was modeled using graph-theoretical descriptors. Variable selection for multiple linear regression (MLR) and linear discriminant analysis (LDA) was performed with the Internal Test Set (ITS) method in order to obtain true predicted LOAEL values. The results can be considered good given the structural diversity of the training set.
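The summary above does not say which graph-theoretical descriptors were used. As an illustration only, the following minimal sketch computes one classic descriptor of this family, the first-order Randić connectivity index, from a hydrogen-suppressed molecular graph. The function name `randic_index` and the bond-list encoding are assumptions made for this example, not part of the cited work.

```python
from math import sqrt


def randic_index(bonds):
    """First-order Randic connectivity index of a hydrogen-suppressed graph.

    `bonds` is a list of (atom_a, atom_b) pairs; atoms are arbitrary hashable ids.
    The index is the sum over bonds of 1 / sqrt(deg(a) * deg(b)).
    """
    degree = {}
    for a, b in bonds:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return sum(1.0 / sqrt(degree[a] * degree[b]) for a, b in bonds)


# Carbon skeletons only (hydrogens omitted), encoded by hand for illustration.
n_butane = [(0, 1), (1, 2), (2, 3)]   # linear chain C-C-C-C
isobutane = [(0, 1), (0, 2), (0, 3)]  # central carbon bonded to three carbons

print(round(randic_index(n_butane), 3))   # 1.914
print(round(randic_index(isobutane), 3))  # 1.732
```

The two isomers have the same formula but different index values, which is what lets such descriptors serve as regression variables for structure-dependent properties.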

    Development of Computer-Aided Molecular Design Methods for Bioengineering Applications

    Computer-aided molecular design (CAMD) offers a methodology for rational product design. The CAMD procedure consists of pre-design, design and post-design phases. CAMD was used to address two bioengineering problems: design of excipients for lyophilized protein formulations and design of ionic liquids for use in bioseparations. Protein stability remains a major concern during protein drug development. Lyophilization, or freeze-drying, is often sought to improve chemical stability. However, lyophilization can result in protein aggregation. Excipients, or additives, are included to stabilize proteins in lyophilized formulations. CAMD was used to rationally select or design excipients for lyophilized protein formulations. The use of solvents to aid separation is common in chemical processes. Ionic liquids offer a class of molecules with tunable properties that can be altered to find optimal solvents for a given application. CAMD was used to design ionic liquids for extractive distillation and in situ extractive fermentation processes. The pre-design phase involves experimental data gathering and problem formulation. When available, data were obtained from literature sources. For excipient design, the percentage of protein monomer remaining after lyophilization was measured for a variety of protein-excipient combinations. In problem formulation, the objective was to minimize the difference between the properties of the designed molecule and the target property values. Problem formulations resulted in either mixed-integer linear programs (MILPs) or mixed-integer non-linear programs (MINLPs). The design phase consists of the forward problem and the reverse problem. In the forward problem, linear quantitative structure-property relationships (QSPRs) were developed using connectivity indices. Chiral connectivity indices were used for excipient property models to improve fit and incorporate three-dimensional structural information. 
Descriptor selection methods were employed to find models that minimized Mallows' Cp statistic, obtaining models with good fit while avoiding overfitting. Cross-validation was performed to assess predictive capabilities. Group contribution models and non-linear QSPRs were also developed. A UNIFAC model was developed to predict the thermodynamic properties of ionic liquids. In the reverse problem of the design phase, molecules were proposed with optimal property values. Deterministic methods were used to design ionic liquid entrainers for azeotropic distillation. Tabu search, a stochastic optimization method, was applied to both ionic liquid and excipient design to provide novel molecular candidates. Tabu search was also compared to a genetic algorithm for CAMD applications. Tuning was performed using a test case to determine parameter values for both methods. After tuning, both stochastic methods were used with design cases to provide optimal excipient stabilizers for lyophilized protein formulations. Results suggested that the genetic algorithm provided a faster time to solution, while the tabu search provided quality solutions more consistently. The post-design phase provides solution analysis and verification. Process simulation was used to evaluate the energy requirements of azeotropic separations using designed ionic liquids. Results demonstrated that less energy was required than in processes using conventional entrainers or ionic liquids that were not optimally designed. Molecular simulation was used to guide protein formulation design and may prove to be a useful tool in post-design verification. Finally, prediction intervals were used for properties predicted from linear QSPRs to quantify the prediction error in the CAMD solutions. Overlapping prediction intervals indicate solutions with statistically similar property values. 
Prediction interval analysis showed that tabu search returns many results with statistically similar property values in the design of carbohydrate glass formers for lyophilized protein formulations. The best solutions from tabu search and the genetic algorithm were shown to be statistically similar for all design cases considered. Overall, the CAMD method developed here provides a comprehensive framework for the design of novel molecules for bioengineering applications.
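The descriptor-selection step mentioned in this abstract can be illustrated with a small sketch. It assumes ordinary least squares with an intercept, the standard definition of Mallows' Cp, Cp = SSE_p / s² − n + 2p (with s² the residual variance of the full model and p the number of fitted parameters), and an exhaustive search over small subsets. The function names and the synthetic data are hypothetical; the thesis's actual selection procedure is not described in the abstract.

```python
import itertools

import numpy as np


def sse(X, y):
    """Residual sum of squares of a least-squares fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def mallows_cp(X, y, subset):
    """Mallows' Cp for the model that uses only the descriptor columns in `subset`."""
    n, k = X.shape
    sigma2 = sse(X, y) / (n - k - 1)  # residual variance of the full model
    p = len(subset) + 1               # fitted parameters, including the intercept
    return sse(X[:, list(subset)], y) / sigma2 - n + 2 * p


def best_subset(X, y, max_size):
    """Exhaustively search descriptor subsets up to `max_size` for the lowest Cp."""
    k = X.shape[1]
    candidates = (
        s
        for r in range(1, max_size + 1)
        for s in itertools.combinations(range(k), r)
    )
    return min(candidates, key=lambda s: mallows_cp(X, y, s))


# Synthetic stand-in for a descriptor matrix: only columns 0 and 2 affect y.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 4))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 2] + 0.1 * rng.normal(size=40)
print(best_subset(X_demo, y_demo, max_size=4))
```

A useful sanity check is that the full model always has Cp equal to its own parameter count, so a subset is only preferred when dropping descriptors costs less in fit than it saves in the 2p penalty. Exhaustive search is feasible only for a handful of descriptors; larger pools need a heuristic search.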

    Investigating the Effect of Coarse-Graining on Chemical Compound Space


    Environment matters : the impact of urea and macromolecular crowding on proteins

    [eng] This work aims to understand analytically the impact of two diametrically opposed environments on protein structure and dynamics, and to compare them with the most common solvent on Earth: water. The first environment is a popular denaturing solution (8 M urea), which has served for years in protein-science laboratories to investigate protein stability; still, many questions regarding its mechanism of action remain open. The second environment instead moves towards a more physiological representation of proteins. The cell interior is, in fact, a crowded solution populated predominantly by proteins, but studies on protein structure and dynamics have so far led to confusing or even contradictory observations. The lack of a consensus view on both phenomena possibly derives from the bias of the system under study. This work attempts a comparative study using the most general systems: a diverse spectrum of protein folds, different stages along the reaction path (early stages or end-point), and/or different protein force fields. Our main objective was to derive common patterns and general rules valid at the proteome level, focusing on three major aspects of proteins: structure, dynamics, and interactions with solvent molecules. Molecular dynamics simulation appeared to be the most suitable tool because of its ability to i) analyze proteins at a broad range of resolutions; ii) access the time-resolved dynamics of the system directly; and iii) dissect the specific interactions that arise in the new settings. Specifically, studying urea-induced unfolding requires a system for which the folded and unfolded states can be clearly identified; globular proteins are therefore the most suitable. We extracted general rules on the folded/unfolded transition by studying the two end-points of the folding/unfolding reaction independently. 
We simulated the urea-induced unfolded state of a model protein, ubiquitin, to understand the energetics stabilizing unfolded structures in urea. We found that unfolded ubiquitin in 8 M urea is fully extended and flexible, and efficiently captures urea molecules in its first solvation shell. Dispersion, rather than electrostatics, appears to be the main energetic contribution to the stabilization of the unfolded state. We then simulated the early stages of urea-induced unfolding on a large dataset of folded proteins representing the major folds of globular proteins, aiming also to investigate the kinetic role of urea in triggering protein unfolding. We found that partially unfolded proteins expose the apolar residues buried in the protein interior, mainly via cavitation. As in the unfolded state, dispersion interactions drive urea accumulation in the solvation shell, but here urea molecules take advantage of microscopic unfolding events to penetrate the protein interior. Macromolecular crowding, instead, is a phenomenon that universally affects all proteins. We simulated a system that included as crowding agents proteins with different conformational landscapes (a globular protein, an intrinsically disordered protein, and a molten globule) arranged to reach cell-like concentrations. We conclude that the universal effect of crowding, valid for all protein types, is exerted via nonspecific interactions and favors open and moderately extended conformations with higher secondary-structure content. This phenomenon counterbalances volume exclusion, which prevails at higher crowding concentrations. The impact of crowding is proportional to the degree of disorder of the protein: for folded proteins crowding favors structural rearrangements, while unfolded structures experience a stronger stabilization and a higher secondary-structure content. 
The synthetic crowder PEG does not reproduce any of these effects, raising concerns about its use in studying cell-like environments.

    Kinetic model construction using chemoinformatics

    Kinetic models of chemical processes not only provide an alternative to costly experiments; they also have the potential to accelerate the pace of innovation in developing new chemical processes or in improving existing ones. Kinetic models are most powerful when they reflect the underlying chemistry by incorporating elementary pathways between individual molecules. The downside of this high level of detail is that the complexity and size of the models also steadily increase, such that the models eventually become too difficult to be manually constructed. Instead, computers are programmed to automate the construction of these models, and make use of graph theory to translate chemical entities such as molecules and reactions into computer-understandable representations. This work studies the use of automated methods to construct kinetic models. More particularly, the need to account for the three-dimensional arrangement of atoms in molecules and reactions of kinetic models is investigated and illustrated by two case studies. First of all, the thermal rearrangement of two monoterpenoids, cis- and trans-2-pinanol, is studied. A kinetic model that accounts for the differences in reactivity and selectivity of both pinanol diastereomers is proposed. Secondly, a kinetic model for the pyrolysis of the fuel “JP-10” is constructed and highlights the use of state-of-the-art techniques for the automated estimation of thermochemistry of polycyclic molecules. A new code is developed for the automated construction of kinetic models and takes advantage of the advances made in the field of chemo-informatics to tackle fundamental issues of previous approaches. Novel algorithms are developed for three important aspects of automated construction of kinetic models: the estimation of symmetry of molecules and reactions, the incorporation of stereochemistry in kinetic models, and the estimation of thermochemical and kinetic data using scalable structure-property methods. 
Finally, the application of the code is illustrated by the automated construction of a kinetic model for alkylsulfide pyrolysis
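To make the graph-based representation of molecules mentioned in this abstract concrete, here is a minimal sketch that stores a molecule as a labeled graph and groups topologically equivalent atoms by iterative neighborhood refinement (in the spirit of the Morgan algorithm). It is an illustration under stated assumptions, not the algorithm developed in the thesis: the function name `equivalent_atoms` and the input encoding are hypothetical, stereochemistry and bond orders are ignored, and this kind of refinement is known to leave some non-equivalent atoms grouped together in highly regular graphs.

```python
def equivalent_atoms(elements, bonds):
    """Group atoms that refinement cannot tell apart by element and connectivity.

    `elements` lists the element symbol of each atom (index = atom id);
    `bonds` lists (atom_a, atom_b) index pairs. Returns sorted lists of atom ids.
    """
    n = len(elements)
    neighbors = {i: [] for i in range(n)}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # Initial label: element symbol plus number of bonded neighbors.
    labels = [(el, len(neighbors[i])) for i, el in enumerate(elements)]

    for _ in range(n):
        # Compress the current labels to integer ranks.
        ranks = {lab: r for r, lab in enumerate(sorted(set(labels)))}
        current = [ranks[lab] for lab in labels]
        # New label: own rank plus the sorted ranks of the neighbors.
        refined = [
            (current[i], tuple(sorted(current[j] for j in neighbors[i])))
            for i in range(n)
        ]
        if len(set(refined)) == len(set(labels)):
            break  # the partition is stable; no class was split
        labels = refined

    classes = {}
    for atom, lab in enumerate(labels):
        classes.setdefault(lab, []).append(atom)
    return sorted(classes.values())


# Carbon skeleton of n-pentane (hydrogens omitted): C0-C1-C2-C3-C4.
print(equivalent_atoms(["C"] * 5, [(0, 1), (1, 2), (2, 3), (3, 4)]))
# [[0, 4], [1, 3], [2]]
```

The sizes of these equivalence classes are the kind of information a symmetry estimate needs, for example when counting how many distinct sites a reaction family can act on.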

    Theoretical-experimental study on protein-ligand interactions based on thermodynamics methods, molecular docking and perturbation models

    This doctoral thesis focuses on understanding the thermodynamic events of protein-ligand interactions, which have been of paramount importance from traditional Medicinal Chemistry to Nanobiotechnology. Particular attention has been paid to applying state-of-the-art methodologies to thermodynamic studies of protein-ligand interactions by integrating structure-based molecular docking techniques, classical fractal approaches to solve protein-ligand complementarity problems, perturbation models to study allosteric signal propagation, and predictive nano-quantitative structure-toxicity relationship models, coupled with powerful experimental validation techniques. The contributions of this work could open new horizons in the fields of Drug Discovery, Materials Sciences, Molecular Diagnosis, and Environmental Health Sciences.