Search CORE

2,184 research outputs found

Development of selected mesoscopic physical models with the aid of machine learning methods and their applications in studies of molecular systems

Author: Dziubiński Maciej

This dissertation is concerned with the development and application of unsupervised machine learning methods in theoretical biophysics and bioinformatics. The machine learning approach offers a powerful framework for extracting and purifying valuable information from large, multi-dimensional data sets generated in simulations and experiments on biomolecular systems. Ready-made machine learning methods do not, however, offer infallible means of dealing with all sorts of complex and partially chaotic data encountered in computational biophysics and structural biology. A large portion of this work is therefore devoted to adapting unsupervised machine learning techniques to our particular purposes. In this dissertation, we employed unsupervised machine learning strategies to address two problems arising in theoretical biophysics and bioinformatics. The first problem was the identification of quasi-rigid structural parts in proteins, whereas the second was the discovery of the internal cooperation of molecular subsystems that propels a conformational transition. Both problems involved dynamical properties of molecular systems, and the analyses presented in this dissertation allowed for a simplified description of these phenomena. We demonstrate how the unsupervised machine learning approach can help explain intricacies hidden within seemingly chaotic molecular dynamics simulation data. The methods developed in this thesis increase our ability to understand complex molecular phenomena, but we also point out potential problems associated with applying unsupervised machine learning algorithms in molecular biophysics.

Repozytorium UW

Understanding virtual solvent through large-scale ligand discovery

Author: Stein Reed
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Predicting new ligands and their binding poses for a protein target relies on an understanding of the physical forces that exist between the water-submerged protein and ligand. The relative favorability of these molecular and atomic interactions between the protein and ligand, compared with their interactions with water, determines the binding affinity, which in turn can be converted into a binding free energy. Protein-ligand binding energetics are, with varying levels of success, encoded into scoring functions, which at their best can only partially emulate the true binding affinity of a protein-ligand binding event. In the context of virtually screening millions or hundreds of millions of drug-like ligands, molecular docking algorithms take advantage of scoring functions to rank the binding energies of these molecules relative to one another to help prioritize the most promising ligands. The focus of this dissertation is the balance between scoring function energy terms with an emphasis on water energetics, specifically the desolvation of the protein upon ligand binding. It is thought that our limited understanding of water is largely responsible for our limitations in discovering and designing drugs. This is due to the large number of roles that water can play, as well as its significant, and even dominant, contribution to protein-ligand binding energetics, which, in the realm of molecular docking, is typically under-modeled or completely neglected. First, I focus on the incorporation of receptor desolvation into the standard DOCK3.7 scoring function to more accurately model protein-ligand binding interactions by including further contributions of water. This is the original implementation of Grid Inhomogeneous Solvation Theory (GIST), applied to the model cavity, cytochrome c peroxidase, and was spearheaded by Trent Balius and Marcus Fischer.
Second, I discuss an extension of GIST in DOCK3.7, a new implementation that relies on pre-computed Gaussian-weighted GIST receptor desolvation enthalpies. This results in negligible slowdown of the standard DOCK3.7 scoring function, similar performance to the original implementation of GIST, and the identification of new ligands for the drug-like model system, AmpC β-lactamase. The work on receptor desolvation in these two chapters inspired the name of this thesis; it was started in my rotation and continued until the end. Third, I focus on the use of property-matched and property-unmatched decoys in retrospective enrichment calculations prior to running a large-scale molecular docking virtual screen. Decoy molecules share the same physical properties as ligands that bind a protein but are topologically dissimilar to ensure that they do not actually bind the protein. We found that charge mismatching between ligands and decoys could bias one’s docking setup towards artifactually strong performance. Chapter 3 focuses on how we both decreased and increased the property space of decoys relative to ligands to safeguard against these docking setup biases. Fourth, I employ this knowledge of protein-ligand binding affinities to identify novel selective melatonin receptor ligands that are active in in vivo circadian rhythm assays. Finally, I discuss my current project on the CB1 cannabinoid receptor in the context of analgesia, followed by future directions.

eScholarship - University of California

Characterizing Interdisciplinarity of Researchers and Research Topics Using Web Search Engines

Author: AA Hagberg
AL Barabási
AL Porter
CS Wagner
D Sullivan
DJ de Solla Price
F Janssens
F Åström
H Kautz
Hiroki Sayama
I Rafols
J Akaishi
J Mori
Jin Akaishi
JP Eaton
K Börner
L Leydesdorff
MEJ Newman
NE Friedkin
P Levy
P Mika
R Klavans
Renaud Lambiotte
RR Braam
S Wasserman
SD Dionne
SH Lee
TW Malone
X Liu
Y Asada
Y Matsuo
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2012

Researchers' networks have been subject to active modeling and analysis. Earlier literature mostly focused on citation or co-authorship networks reconstructed from annotated scientific publication databases, which have several limitations. Recently, general-purpose web search engines have also been utilized to collect information about social networks. Here we reconstructed, using web search engines, a network representing the relatedness of researchers to their peers as well as to various research topics. Relatedness between researchers and research topics was characterized by visibility boost, i.e., the increase of a researcher's visibility obtained by focusing on a particular topic. It was observed that researchers who had high visibility boosts by the same research topic tended to be close to each other in their network. We calculated correlations between visibility boosts by research topics and researchers' interdisciplinarity at the individual level (diversity of topics related to the researcher) and at the social level (his/her centrality in the researchers' network). We found that visibility boosts by certain research topics were positively correlated with researchers' individual-level interdisciplinarity despite their negative correlations with the general popularity of researchers. It was also found that visibility boosts by network-related topics had positive correlations with researchers' social-level interdisciplinarity. Research topics' correlations with researchers' individual- and social-level interdisciplinarities were found to be nearly independent from each other. These findings suggest that the notion of "interdisciplinarity" of a researcher should be understood as a multi-dimensional concept that should be evaluated using multiple assessment means. Comment: 20 pages, 7 figures. Accepted for publication in PLoS On

arXiv.org e-Print Archive

The Open Repository @Binghamton (The ORB)

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

Author: Bastien Olivier
Birkholtz Lyn-Marie
Breton Vincent
Grando Delphine
Hofmann-Apitius Martin
Jacq Nicolas
Joubert Fourie
Kasam Vinod
Louw Abraham I
Maréchal Eric
Ortet Philippe
Roy Sylvaine
Saïdani Nadia
Wells Gordon
Zimmermann Marc
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

The organization and mining of malaria genomic and post-genomic data is strongly motivated by the need to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data from Plasmodium falciparum, but also using the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences, including compositionally atypical malaria sequences, 2) the high-throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journa

Hal - Université Grenoble Alpes

HAL AMU

Fraunhofer-ePrints

HAL Clermont Université

HAL Descartes

HAL-CEA

ProdInra

arXiv.org e-Print Archive

HAL-IN2P3

Springer - Publisher Connector

PubMed Central

UPSpace at the University of Pretoria

Reliable estimation of prediction uncertainty for physico-chemical property models

Author: Bishop C. M.
Chernick M. R.
Davison A. C.
Eaton J. W.
Fan Y.-P.
Gentle J. E.
Gütlich P.
Hastie T.
Jonny Proppe
Markus Reiher
Rasmussen
Schwabl F.
Publication venue: 'American Chemical Society (ACS)'
Publication date: 01/01/2017

The predictions of parametric property models and their uncertainties are sensitive to systematic errors such as inconsistent reference data, parametric model assumptions, or inadequate computational methods. Here, we discuss the calibration of property models in the light of bootstrapping, a sampling method akin to Bayesian inference that can be employed for identifying systematic errors and for reliable estimation of the prediction uncertainty. We apply bootstrapping to assess a linear property model linking the 57Fe Moessbauer isomer shift to the contact electron density at the iron nucleus for a diverse set of 44 molecular iron compounds. The contact electron density is calculated with twelve density functionals across Jacob's ladder (PWLDA, BP86, BLYP, PW91, PBE, M06-L, TPSS, B3LYP, B3PW91, PBE0, M06, TPSSh). We provide systematic-error diagnostics and reliable, locally resolved uncertainties for isomer-shift predictions. Pure and hybrid density functionals yield average prediction uncertainties of 0.06-0.08 mm/s and 0.04-0.05 mm/s, respectively, the latter being close to the average experimental uncertainty of 0.02 mm/s. Furthermore, we show that both model parameters and prediction uncertainty depend significantly on the composition and number of reference data points. Accordingly, we suggest that rankings of density functionals based on performance measures (e.g., the coefficient of correlation, r², or the root-mean-square error, RMSE) should not be inferred from a single data set. This study presents the first statistically rigorous calibration analysis for theoretical Moessbauer spectroscopy, which is of general applicability for physico-chemical property models and not restricted to isomer-shift predictions. We provide the statistically meaningful reference data set MIS39 and a new calibration of the isomer shift based on the PBE0 functional. Comment: 49 pages, 9 figures, 7 table
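
As an illustration of the bootstrapping idea mentioned in this abstract, the following is a minimal sketch of a generic pairs bootstrap for a linear model, not the authors' actual procedure. The function names (`fit_line`, `bootstrap_prediction`), the synthetic data, and the choice of 2000 resamples are all assumptions made for the example. The spread it reports is that of the fitted-line prediction across resamples, which is only one possible ingredient of a prediction uncertainty.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def bootstrap_prediction(xs, ys, x_new, n_boot=2000, seed=0):
    """Resample (x, y) pairs with replacement, refit the line each time,
    and return the mean and standard deviation of the prediction at x_new."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            # Degenerate resample: a slope cannot be fitted, so skip it.
            continue
        a, b = fit_line(bx, by)
        preds.append(a + b * x_new)
    return statistics.fmean(preds), statistics.stdev(preds)


# Hypothetical synthetic data: y = 2.0 - 0.3*x plus small fixed "errors".
xs = [float(i) for i in range(10)]
errors = [0.05, -0.03, 0.02, -0.04, 0.01, 0.03, -0.02, 0.04, -0.05, 0.0]
ys = [2.0 - 0.3 * x + e for x, e in zip(xs, errors)]

mean_pred, spread = bootstrap_prediction(xs, ys, x_new=4.5)
print(f"prediction at x=4.5: {mean_pred:.3f} +/- {spread:.3f}")
```

Because each resample changes which reference points enter the fit, the spread also gives a rough sense of how strongly a calibration depends on the composition of the reference set.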

arXiv.org e-Print Archive

Repository for Publications and Research Data

Crossref

FigShare

Investigation of Membrane Receptors’ Oligomers Using Fluorescence Resonance Energy Transfer and Multiphoton Microscopy in Living Cells

Author: Mishra Ashish K.
Publication venue: UWM Digital Commons
Publication date: 01/05/2017

Investigating the quaternary structure (oligomerization) of macromolecules (such as proteins and nucleic acids) in living systems (in vivo) has been a great challenge in biophysics, due to molecular diffusion, fluctuations in several biochemical parameters such as pH, quenching of fluorescence by oxygen (when fluorescence methods are used), etc. We studied oligomerization of membrane receptors in living cells by means of Fluorescence (Förster) Resonance Energy Transfer (FRET) using fluorescent markers and two-photon excitation fluorescence micro-spectroscopy. Using suitable FRET models, we determined the stoichiometry and quaternary structure of various macromolecular complexes. The proteins of interest for this work are (1) the sigma-1 receptor (S1R) and (2) rhodopsin, described below. (1) Sigma-1 receptors are molecular chaperone proteins, which also regulate ion channels. S1R seems to be involved in substance abuse, as well as several diseases such as Alzheimer’s. We studied S1R in the presence and absence of its ligands haloperidol (an antagonist) and pentazocine +/- (an agonist), and found that at low concentration the receptors reside as a mixture of monomers and dimers and that they may form higher-order oligomers at higher concentrations. (2) Rhodopsin is a prototypical G protein-coupled receptor (GPCR) and is directly involved in vision. GPCRs form a large family of receptors that participate in cell signaling by responding to external stimuli such as drugs, thus being a major drug target (more than 40% of drugs target GPCRs). Their oligomerization has been largely controversial. Understanding it may help to clarify the functional role of GPCR oligomerization, and may lead to the discovery of more drugs targeting GPCR oligomers. It may also contribute toward finding a cure for Retinitis Pigmentosa, a disease that causes blindness, has no cure so far, and is caused by a mutation (G188R) in rhodopsin. Comparing healthy rhodopsin’s oligomeric structure with that of the mutant may give clues toward finding a cure.

University of Wisconsin-Milwaukee

Modeling single microtubules as a colloidal system to measure the harmonic interactions between tubulin dimers in bovine brain derived versus cancer cell derived microtubules

Author: Aslam Arooj
Publication venue: Digital Commons @ NJIT
Publication date: 31/05/2020

The local properties of tubulin dimers dictate the properties of the larger microtubule assembly. In order to elucidate this connection, tubulin-tubulin interactions are modeled as harmonic interactions to map the stiffness matrix along the length of the microtubule. The strength of the interactions is measured by imaging and tracking the movement of segments along the microtubule over time, and then performing a Fourier transform to extract the natural vibrational frequencies. Using this method, the first experimental phonon spectrum of the microtubule is reported. This method can also be applied to other biological materials, and opens new doors for structural analysis in the life sciences. Methods used in colloidal soft matter physics were also adapted to the study of the microtubule to develop new methods to measure local stiffness in biological materials. Using this method, it is shown that there is local variability in the mechanical properties of bovine brain derived versus cancer cell derived microtubules. This provides insight into how local changes affect the dynamic instability of microtubules of different types. Finally, a nanofluidic device to isolate single microtubules is also reported, and is designed to be used for the study of any biological polymer. It can also be adapted to incorporate nano-scale electrodes for the sensing and actuation of single isolated proteins.
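
The step of Fourier-transforming a tracked displacement to recover a vibrational frequency can be sketched as follows. This is a hedged illustration only: the function name `dominant_frequency`, the synthetic two-component signal, and the 100 Hz sampling rate are assumptions for the example and do not come from the thesis. A plain discrete Fourier transform is written out with the standard library so the sketch has no dependencies; a real analysis would more likely use an FFT routine.

```python
import cmath
import math


def dominant_frequency(signal, sample_rate):
    """Return the frequency (in Hz) of the strongest non-zero-frequency
    component of a uniformly sampled signal, using a direct DFT."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the constant offset
    best_k, best_power = 0, -1.0
    for k in range(1, n // 2 + 1):  # positive frequencies up to Nyquist
        coeff = sum(
            c * cmath.exp(-2j * math.pi * k * t / n)
            for t, c in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return best_k * sample_rate / n


# Hypothetical tracked displacement: a 5 Hz mode plus a weaker 12 Hz mode,
# sampled at 100 frames per second for 2 seconds.
sample_rate = 100.0
n_frames = 200
displacement = [
    math.sin(2 * math.pi * 5.0 * t / sample_rate)
    + 0.3 * math.sin(2 * math.pi * 12.0 * t / sample_rate)
    for t in range(n_frames)
]

print(dominant_frequency(displacement, sample_rate))
```

The frequency resolution of such an estimate is the sampling rate divided by the number of frames (0.5 Hz here), so longer tracking runs resolve closely spaced modes better.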

Digital Commons @ New Jersey Institute of Technology (NJIT)

Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models

Author: Chen Huajun
Chen Zhuo
Fan Xiaohui
Fang Yin
Huang Rui
Liang Xiaozhuan
Liu Kangwei
Zhang Ningyu
Publication date: 29/08/2023

Large Language Models (LLMs), with their remarkable task-handling capabilities and innovative outputs, have catalyzed significant advancements across a spectrum of fields. However, their proficiency within specialized domains such as biomolecular studies remains limited. To address this challenge, we introduce Mol-Instructions, a meticulously curated, comprehensive instruction dataset expressly designed for the biomolecular realm. Mol-Instructions is composed of three pivotal components: molecule-oriented instructions, protein-oriented instructions, and biomolecular text instructions, each curated to enhance the understanding and prediction capabilities of LLMs concerning biomolecular features and behaviors. Through extensive instruction tuning experiments on a representative LLM, we underscore the potency of Mol-Instructions to enhance the adaptability and cognitive acuity of large models within the complex sphere of biomolecular studies, thereby promoting advancements in the biomolecular research community. Mol-Instructions is made publicly accessible for future research endeavors and will be subjected to continual updates for enhanced applicability. Comment: Project homepage: https://github.com/zjunlp/Mol-Instructions. Add quantitative evaluation

arXiv.org e-Print Archive