22 research outputs found
Recommended from our members
Tail-regression estimator for heavy-tailed distributions of known tail indices and its application to continuum quantum Monte Carlo data.
Standard statistical analysis is unable to provide reliable confidence intervals on expectation values of probability distributions that do not satisfy the conditions of the central limit theorem. We present a regression-based estimator of an arbitrary moment of a probability distribution with power-law heavy tails that exploits knowledge of the exponents of its asymptotic decay to bypass this issue entirely. Our method is applied to synthetic data and to energy and atomic force data from variational and diffusion quantum Monte Carlo calculations, whose distributions have known asymptotic forms [J. R. Trail, Phys. Rev. E 77, 016703 (2008), doi:10.1103/PhysRevE.77.016703; A. Badinski et al., J. Phys.: Condens. Matter 22, 074202 (2010), doi:10.1088/0953-8984/22/7/074202]. We obtain convergent, accurate confidence intervals on the variance of the local energy of an electron gas and on the Hellmann-Feynman force on an atom in the all-electron carbon dimer. In these two cases the uncertainty on our estimator is 45% smaller and 60 times smaller, respectively, than the nominal (ill-defined) standard error.
Machine learning to predict mesenchymal stem cell efficacy for cartilage repair.
Inconsistent therapeutic efficacy of mesenchymal stem cells (MSCs) in regenerative medicine has been documented in many clinical trials. Precise prediction of the therapeutic outcome of an MSC therapy based on the patient's condition would provide a valuable reference for clinicians deciding on treatment strategies. In this article, we performed a meta-analysis of MSC therapies for cartilage repair using machine learning. A small database was generated from published in vivo and clinical studies. The unique features of our neural network model in handling missing data and calculating prediction uncertainty enabled precise prediction of post-treatment cartilage repair scores, with a coefficient of determination of 0.637 ± 0.005. From this model, we identified defect area percentage, defect depth percentage, implantation cell number, body weight, tissue source, and the type of cartilage damage as critical properties that significantly impact cartilage repair. A dosage of 17-25 million MSCs was found to achieve optimal cartilage repair. Further, critical thresholds at 6% and 64% of cartilage damage in area, and 22% and 56% in depth, were predicted to significantly compromise the efficacy of MSC therapy. This study demonstrated, for the first time, machine learning of patient-specific cartilage repair after MSC therapy. This approach can be applied to identify and investigate more critical properties involved in MSC-induced cartilage repair, and adapted for other clinical indications.
Imputation versus prediction: applications in machine learning for drug discovery
Imputation is a powerful statistical method that is distinct from the predictive modelling techniques more commonly used in drug discovery. Imputation uses sparse experimental data in an incomplete dataset to predict missing values by leveraging correlations between experimental assays. This contrasts with quantitative structure–activity relationship methods that use only descriptor–assay correlations. We summarize three recent imputation strategies (heterogeneous deep imputation, assay profile methods and matrix factorization) and compare these with quantitative structure–activity relationship methods, including deep learning, in drug discovery settings. We comment on the value added by imputation methods when used in an ongoing project and find that imputation produces stronger models, earlier in the project, over activity and absorption, distribution, metabolism and elimination end points.
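As an illustration of the matrix factorization strategy mentioned in this abstract, the following is a minimal sketch, not the method of any of the reviewed papers. It assumes a compounds-by-assays matrix with missing entries marked as NaN and fits a low-rank model by alternating least squares. The function name `impute_low_rank`, its parameters, the positive random initialisation and the toy data are all hypothetical choices made for this example.

```python
import numpy as np

def impute_low_rank(X, rank=1, n_iter=200, reg=1e-6, seed=0):
    """Fill NaN entries of X with values from a rank-`rank` factorization U @ V.T.

    Illustrative alternating-least-squares sketch; all names are hypothetical.
    """
    mask = ~np.isnan(X)                      # True where a measurement exists
    n_rows, n_cols = X.shape
    rng = np.random.default_rng(seed)
    # Positive random start (an assumption that suits non-negative toy data).
    U = rng.uniform(0.5, 1.5, size=(n_rows, rank))
    V = rng.uniform(0.5, 1.5, size=(n_cols, rank))
    ridge = reg * np.eye(rank)               # small ridge term keeps solves well posed
    for _ in range(n_iter):
        # Update each row factor using only the observed entries in that row.
        for i in range(n_rows):
            A = V[mask[i]]
            U[i] = np.linalg.solve(A.T @ A + ridge, A.T @ X[i, mask[i]])
        # Update each column factor using only the observed entries in that column.
        for j in range(n_cols):
            A = U[mask[:, j]]
            V[j] = np.linalg.solve(A.T @ A + ridge, A.T @ X[mask[:, j], j])
    # Keep measured values; use the model only where data were missing.
    return np.where(mask, X, U @ V.T)

# Toy example: an exactly rank-1 "compound x assay" table with three hidden entries.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([1.0, 0.5, 2.0])
full = np.outer(a, b)
observed = full.copy()
observed[0, 1] = observed[2, 2] = observed[3, 0] = np.nan
completed = impute_low_rank(observed, rank=1)
```

The point of the sketch is that the hidden entries are recovered from correlations between columns (assays) rather than from compound descriptors, which is the distinction the abstract draws between imputation and structure-activity models.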
Au-Ge Alloys for Wide-Range Low-Temperature On-Chip Thermometry
We present results for an Au-Ge alloy that is useful as a resistance-based thermometer from room temperature down to at least 0.2 K. Over a wide range, the electrical resistivity of the alloy shows a logarithmic temperature dependence, which retains the sensitivity required for practical thermometry while maintaining a relatively modest and easily measurable value of resistivity. We characterize the sensitivity of the alloy as a possible thermometer and show that it compares favorably with commercially available temperature sensors. We experimentally identify that the characteristic logarithmic temperature dependence of the alloy stems from Kondo-like behavior induced by the specific heat treatment it undergoes. J.R.A.D., P.C.V., G.J.C., and V.N. acknowledge funding from the Engineering and Physical Sciences Research Council, United Kingdom. G.J.C. and S.E.R. acknowledge funding from the Royal Society, United Kingdom. J.F.O. thanks the Brazilian Agency CNPq. A.D. and S.K-N. acknowledge financial support through a European Research Council Starting Grant (Grant No. ERC-2014-STG-639526, NANOGEN).
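To make the idea of a logarithmic resistance thermometer concrete, here is a minimal sketch that assumes the simple calibration form R(T) = a - b ln(T), fits a and b to calibration points by least squares, and inverts the relation to read a temperature from a measured resistance. The functional form, the function names and all numerical values are illustrative assumptions, not data or procedures from the paper.

```python
import numpy as np

def fit_log_calibration(temps_k, resistances):
    """Least-squares fit of R(T) = a - b * ln(T); returns (a, b)."""
    slope, intercept = np.polyfit(np.log(temps_k), resistances, 1)
    return intercept, -slope

def temperature_from_resistance(r, a, b):
    """Invert R = a - b * ln(T) to obtain T in kelvin."""
    return float(np.exp((a - r) / b))

# Hypothetical calibration points generated from made-up coefficients.
a_true, b_true = 100.0, 5.0
temps = np.array([0.2, 1.0, 4.2, 77.0, 300.0])
res = a_true - b_true * np.log(temps)

a_fit, b_fit = fit_log_calibration(temps, res)

# Read back the temperature for the resistance expected at 1.5 K.
t_est = temperature_from_resistance(a_true - b_true * np.log(1.5), a_fit, b_fit)
```

Under this assumed form the sensitivity |dR/dT| equals b/T, so it grows as the temperature falls while R itself changes only logarithmically, which is consistent with the abstract's remark that the resistivity stays modest and easy to measure.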
Deep imputation on large‐scale drug discovery data
More accurate predictions of the biological properties of chemical compounds would guide the selection and design of new compounds in drug discovery and help to address the enormous cost and low success rate of pharmaceutical R&D. However, this domain presents a significant challenge for AI methods due to the sparsity of compound data and the noise inherent in results from biological experiments. In this paper, we demonstrate how data imputation using deep learning provides substantial improvements over quantitative structure-activity relationship (QSAR) machine learning models that are widely applied in drug discovery. We present the largest-to-date successful application of deep-learning imputation to datasets which are comparable in size to the corporate data repository of a pharmaceutical company (678,994 compounds by 1166 endpoints). We demonstrate this improvement for three areas of practical application linked to distinct use cases: i) target activity data compiled from a range of drug discovery projects, ii) a high-value and heterogeneous dataset covering complex absorption, distribution, metabolism and elimination properties, and iii) high-throughput screening data, testing the algorithm's limits on early-stage noisy and very sparse data. Achieving median coefficients of determination, R², of 0.69, 0.36 and 0.43 respectively across these applications, the deep learning imputation method offers an unambiguous improvement over random forest QSAR methods, which achieve median R² values of 0.28, 0.19 and 0.23 respectively. We also demonstrate that robust estimates of the uncertainties in the predicted values correlate strongly with the accuracies in prediction, enabling greater confidence in decision-making based on the imputed values. Optibrium Ltd, Intellegens Ltd, Takeda, Royal Societ
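For readers unfamiliar with the metric quoted above, the following sketch computes a coefficient of determination using the common definition 1 - SS_res / SS_tot. Whether the paper uses exactly this definition (rather than, say, a squared correlation coefficient) is an assumption, and the function name and sample values are hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after using the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])    # exact predictions
baseline = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])   # always predict the mean
```

With this definition a perfect model scores 1 and a model no better than predicting the mean scores 0, which gives a scale for comparing the imputation and random forest figures reported in the abstract.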
OPTIMADE, an API for exchanging materials data
The Open Databases Integration for Materials Design (OPTIMADE) consortium has designed a universal application programming interface (API) to make materials databases accessible and interoperable. We outline the first stable release of the specification, v1.0, which is already supported by many leading databases and several software packages. We illustrate the advantages of the OPTIMADE API through worked examples on each of the public materials databases that support the full API specification.
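As a small illustration of what a query against an OPTIMADE v1 endpoint looks like, the sketch below builds a request URL for the `structures` endpoint with a `filter` expression and a `page_limit`. The base URL `https://example.org/optimade` is a placeholder, not a real provider, and the helper name `structures_url` is hypothetical; no network request is made.

```python
from urllib.parse import urlencode, quote, urlsplit, parse_qs

def structures_url(base_url, filter_expr, page_limit=10):
    """Build a URL for the OPTIMADE /v1/structures endpoint (illustrative helper)."""
    query = urlencode(
        {"filter": filter_expr, "page_limit": page_limit},
        quote_via=quote,  # percent-encode spaces and quotes in the filter string
    )
    return f"{base_url.rstrip('/')}/v1/structures?{query}"

# Filter: structures containing both Si and O, and exactly two elements in total.
filter_expr = 'elements HAS ALL "Si","O" AND nelements=2'

# Placeholder base URL; substitute the base URL of an actual provider.
url = structures_url("https://example.org/optimade/", filter_expr)

# Decoding the query string recovers the original filter expression.
decoded = parse_qs(urlsplit(url).query)
```

Because every compliant database accepts the same endpoint and filter grammar, the same helper could in principle be pointed at different providers by changing only the base URL, which is the interoperability the abstract describes.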
An Open Drug Discovery Competition: Experimental Validation of Predictive Models in a Series of Novel Antimalarials.
The Open Source Malaria (OSM) consortium is developing compounds that kill the human malaria parasite, Plasmodium falciparum, by targeting PfATP4, an essential ion pump on the parasite surface. The structure of PfATP4 has not been determined. Here, we describe a public competition created to develop a predictive model for the identification of PfATP4 inhibitors, thereby reducing project costs associated with the synthesis of inactive compounds. Competition participants could see all entries as they were submitted. In the final round, featuring private sector entrants specializing in machine learning methods, the best-performing models were used to predict novel inhibitors, of which several were synthesized and evaluated against the parasite. Half possessed biological activity, with one featuring a motif that the human chemists familiar with this series would have dismissed as "ill-advised". Since all data and participant interactions remain in the public domain, this research project "lives" and may be improved by others.