CurlySMILES: a chemical language to customize and annotate encodings of molecular and nanodevice structures
CurlySMILES is a chemical line notation which extends SMILES with annotations for storage, retrieval and modeling of interlinked, coordinated, assembled and adsorbed molecules in supramolecular structures and nanodevices. Annotations are enclosed in curly braces and anchored to an atomic node or at the end of the molecular graph, depending on the annotation type. CurlySMILES includes predefined annotations for stereogenicity, electron delocalization charges, extra-molecular interactions and connectivity, surface attachment, solutions, and crystal structures, and allows extensions for domain-specific annotations. CurlySMILES provides a shorthand format to encode molecules with repetitive substructural parts or motifs, such as monomer units in macromolecules and amino acids in peptide chains. CurlySMILES further accommodates special formats for non-molecular materials that are commonly denoted by composition of atoms or substructures rather than complete atom connectivity.
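The anchoring idea in this abstract (brace-enclosed annotations attached either after an atomic node or at the end of the graph) can be illustrated with a minimal Python sketch. It is not the CurlySMILES parser: the function name `split_annotations`, the flat (non-nested) brace handling, and the placeholder annotation texts `a` and `b` are assumptions made only for illustration and do not correspond to real CurlySMILES annotation markers.

```python
def split_annotations(text):
    """Separate brace-enclosed annotations from a line-notation string.

    Returns (core, annotations). Each annotation is a pair
    (position, body), where position is the length of the core string
    at the point where the annotation appeared. A position equal to
    len(core) means the annotation sits at the end of the string.

    Assumption: braces are not nested and are not used for anything else.
    """
    core = []
    annotations = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "{":
            end = text.find("}", i + 1)
            if end == -1:
                raise ValueError(f"unclosed annotation starting at index {i}")
            annotations.append((len(core), text[i + 1:end]))
            i = end + 1
        elif ch == "}":
            raise ValueError(f"unmatched closing brace at index {i}")
        else:
            core.append(ch)
            i += 1
    return "".join(core), annotations


# Hypothetical input: 'a' and 'b' are placeholder annotation bodies.
core, notes = split_annotations("C{a}CO{b}")
print(core)   # CCO
print(notes)  # [(1, 'a'), (3, 'b')]
```

In this sketch the first annotation is anchored after the first atom symbol, while the second has a position equal to the length of the stripped string and so corresponds to an end-of-graph annotation.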
Consumer satisfaction with primary care provider choice and associated trust
BACKGROUND: Development of managed care, characterized by limited provider choice, is believed to undermine trust. Provider choice has been identified as strongly associated with physician trust. Stakeholders in a competitive healthcare market have competing agendas related to choice. The purpose of this study is to analyze variables associated with consumers' satisfaction that they have enough choice when selecting their primary care provider (PCP), and to analyze the importance of these variables for provider trust. METHODS: A 1999 randomized national cross-sectional telephone survey of United States residential households was selected for secondary data analysis; eligible respondents had a telephone, had seen a medical professional at least twice in the past two years, and were aged ≥ 20 years. Among 1,117 households interviewed, 564 were selected as the final sample. Subjects responded to a core set of questions related to provider trust, and a subset of questions related to trust in the insurer. A previously developed conceptual framework was adopted. Linear and logistic regressions were performed based on this framework. RESULTS: Results affirmed 'satisfaction with amount of PCP choice' was significantly (p < .001) associated with provider trust. 'PCP's care being extremely effective' was strongly associated with 'satisfaction with amount of PCP choice' and 'provider trust'. Having sought one or more second opinions was associated with lower trust. 'Spoke to the PCP outside the medical office,' 'satisfaction with the insurer' and 'insurer charges less if PCP within network' were all variables associated with 'satisfaction with amount of PCP choice' (all p < .05). CONCLUSION: This study confirmed the association of 'satisfaction with amount of PCP choice' with provider trust. Results affirmed 'enough PCP choice' was a strong predictor of provider trust. 'Second opinion on PCP' may indicate distrust in the provider.
Data such as 'trust in providers in general' and 'the role of provider performance information' in choice, though important in PCP choice, were not available for analysis and should be explored in future studies. Results have implications for rethinking the relationships among consumer choice, consumer behaviors in making trade-offs in PCP choice, and the role of healthcare experiences in 'satisfaction with amount of PCP choice' or 'provider trust'.
Artificial intelligence in biological activity prediction
Artificial intelligence has become an indispensable resource in chemoinformatics. Numerous machine learning algorithms for activity prediction have recently emerged, offering a key approach to mining chemical information from large compound datasets. These approaches enable the automation of compound discovery to find biologically active molecules with important properties. Here, we present a review of some of the main machine learning studies in biological activity prediction of compounds, in particular for sweetness prediction. We discuss some of the most used compound featurization techniques and the major databases of chemical compounds relevant to these tasks. This study was supported by the European Commission through project SHIKIFACTORY100 - Modular cell factories for the production of 100 compounds from the shikimate pathway (Reference 814408), and by the Portuguese FCT under the scope of the strategic funding of UID/BIO/04469/2019 unit and BioTecNorte operation (NORTE-01-0145-FEDER-000004) funded by the European Regional Development Fund under the scope of Norte2020.
Functional Group and Substructure Searching as a Tool in Metabolomics
BACKGROUND: A direct link between the names and structures of compounds and the functional groups contained within them is important, not only because biochemists frequently rely on literature that uses a free-text format to describe functional groups, but also because metabolic models depend upon the connections between enzymes and substrates being known and appropriately stored in databases. METHODOLOGY: We have developed a database named "Biochemical Substructure Search Catalogue" (BiSSCat), which contains 489 functional groups, >200,000 compounds and >1,000,000 different computationally constructed substructures, to allow identification of chemical compounds of biological interest. CONCLUSIONS: This database and its associated web-based search program (http://bisscat.org/) can be used to find compounds containing selected combinations of substructures and functional groups. It can be used to determine possible additional substrates for known enzymes and for putative enzymes found in genome projects. Its applications to enzyme inhibitor design are also discussed.
Interpreting linear support vector machine models with heat map molecule coloring
BACKGROUND: Model-based virtual screening plays an important role in the early drug discovery stage. The outcomes of high-throughput screenings are a valuable source for machine learning algorithms to infer such models. Besides strong performance, the interpretability of a machine learning model is a desired property to guide the optimization of a compound in later drug discovery stages. Linear support vector machines have been shown to perform convincingly on large-scale data sets. The goal of this study is to present a heat map molecule coloring technique to interpret linear support vector machine models. Based on the weights of a linear model, the visualization approach colors each atom and bond of a compound according to its importance for activity. RESULTS: We evaluated our approach on a toxicity data set, a chromosome aberration data set, and the maximum unbiased validation data sets. The experiments show that our method sensibly visualizes structure-property and structure-activity relationships of a linear support vector machine model. The coloring of ligands in the binding pocket of several crystal structures of a maximum unbiased validation data set target indicates that our approach helps to determine the correct ligand orientation in the binding pocket. Additionally, the heat map coloring enables the identification of substructures important for the binding of an inhibitor. CONCLUSIONS: In combination with heat map coloring, linear support vector machine models can help to guide the modification of a compound in later stages of drug discovery. In particular, substructures identified as important by our method might be a starting point for optimization of a lead compound. The heat map coloring should be considered complementary to structure-based modeling approaches. As such, it helps to give a better understanding of the binding mode of an inhibitor.
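The general idea of coloring atoms by the weights of a linear model can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' method: it assumes each model feature is a substructure with a known list of atom indices, splits each feature weight equally among those atoms, and ignores bonds. The names `atom_scores` and `score_to_rgb`, the feature labels `f1` and `f2`, and the red/blue color scale are all hypothetical.

```python
def atom_scores(num_atoms, feature_atoms, weights):
    """Sum linear-model weights onto the atoms each feature covers.

    feature_atoms maps a feature name to the atom indices it covers;
    weights maps a feature name to its linear-model weight.
    Assumption: a feature's weight is shared equally among its atoms.
    """
    scores = [0.0] * num_atoms
    for feature, atoms in feature_atoms.items():
        if not atoms:
            continue
        share = weights.get(feature, 0.0) / len(atoms)
        for atom in atoms:
            scores[atom] += share
    return scores


def score_to_rgb(score, max_abs):
    """Map a score to an RGB triple: positive is red, negative is blue,
    zero is white. max_abs sets the score that gives full saturation."""
    if max_abs <= 0:
        return (255, 255, 255)
    t = max(-1.0, min(1.0, score / max_abs))
    if t >= 0:
        fade = int(255 * (1 - t))
        return (255, fade, fade)
    fade = int(255 * (1 + t))
    return (fade, fade, 255)


# Hypothetical three-atom molecule with two overlapping features.
scores = atom_scores(
    3,
    {"f1": [0, 1], "f2": [1, 2]},
    {"f1": 1.0, "f2": -0.5},
)
limit = max(abs(s) for s in scores)
colors = [score_to_rgb(s, limit) for s in scores]
print(scores)  # [0.5, 0.25, -0.25]
print(colors)
```

Equal sharing is one simple choice among several; other ways of distributing a feature weight over its atoms would change the per-atom scores but not the overall approach.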
Evolutionarily Conserved Substrate Substructures for Automated Annotation of Enzyme Superfamilies
The evolution of enzymes affects how well a species can adapt to new environmental conditions. During enzyme evolution, certain aspects of molecular function are conserved while other aspects can vary. Aspects of function that are more difficult to change or that need to be reused in multiple contexts are often conserved, while those that vary may indicate functions that are more easily changed or that are no longer required. In analogy to the study of conservation patterns in enzyme sequences and structures, we have examined the patterns of conservation and variation in enzyme function by analyzing graph isomorphisms among enzyme substrates of a large number of enzyme superfamilies. This systematic analysis of substrate substructures establishes the conservation patterns that typify individual superfamilies. Specifically, we determined the chemical substructures that are conserved among all known substrates of a superfamily and the substructures that are reacting in these substrates and then examined the relationship between the two. Across the 42 superfamilies that were analyzed, substantial variation was found in how much of the conserved substructure is reacting, suggesting that superfamilies may not be easily grouped into discrete and separable categories. Instead, our results suggest that many superfamilies may need to be treated individually for analyses of evolution, function prediction, and guiding enzyme engineering strategies. Annotating superfamilies with these conserved and reacting substructure patterns provides information that is orthogonal to information provided by studies of conservation in superfamily sequences and structures, thereby improving the precision with which we can predict the functions of enzymes of unknown function and direct studies in enzyme engineering. 
Because the method is automated, it is suitable for large-scale characterization and comparison of fundamental functional capabilities of both characterized and uncharacterized enzyme superfamilies.