
    A comparative evaluation of the generalised predictive ability of eight machine learning algorithms across ten clinical metabolomics data sets for binary classification

    Introduction: Metabolomics is increasingly being used in the clinical setting for disease diagnosis, prognosis and risk prediction. Machine learning algorithms are particularly important in the construction of multivariate metabolite prediction. Historically, partial least squares (PLS) regression has been the gold standard for binary classification. Nonlinear machine learning methods such as random forests (RF), kernel support vector machines (SVM) and artificial neural networks (ANN) may be more suited to modelling possible nonlinear metabolite covariance, and thus provide better predictive models. Objectives: We hypothesise that for binary classification using metabolomics data, non-linear machine learning methods will provide superior generalised predictive ability when compared to linear alternatives, in particular when compared with the current gold standard PLS discriminant analysis. Methods: We compared the general predictive performance of eight archetypal machine learning algorithms across ten publicly available clinical metabolomics data sets. The algorithms were implemented in the Python programming language. All code and results have been made publicly available as Jupyter notebooks. Results: There was only marginal improvement in predictive ability for SVM and ANN over PLS across all data sets. RF performance was comparatively poor. The use of out-of-bag bootstrap confidence intervals provided a measure of uncertainty of model prediction such that the quality of metabolomics data was observed to be a bigger influence on generalised performance than model choice. Conclusion: The size of the data set, and choice of performance metric, had a greater influence on generalised predictive performance than the choice of machine learning algorithm
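The out-of-bag bootstrap confidence intervals mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' published notebook code: the nearest-centroid classifier, the accuracy metric, the synthetic data, and all function names are assumptions chosen to keep the example self-contained.

```python
import numpy as np

def nearest_centroid_predict(X_train, y_train, X_test):
    """Toy classifier standing in for any of the compared models (assumption)."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = np.linalg.norm(X_test - c0, axis=1)
    d1 = np.linalg.norm(X_test - c1, axis=1)
    return (d1 < d0).astype(int)

def oob_bootstrap_accuracy(X, y, n_boot=200, seed=0):
    """Median and 95% percentile interval of out-of-bag accuracy."""
    rng = np.random.default_rng(seed)
    n = len(y)
    scores = []
    for _ in range(n_boot):
        in_bag = rng.integers(0, n, size=n)        # resample with replacement
        oob = np.setdiff1d(np.arange(n), in_bag)   # samples never drawn
        # Skip degenerate resamples (no held-out data or a single class in-bag).
        if oob.size == 0 or np.unique(y[in_bag]).size < 2:
            continue
        pred = nearest_centroid_predict(X[in_bag], y[in_bag], X[oob])
        scores.append(np.mean(pred == y[oob]))
    scores = np.asarray(scores)
    low, high = np.percentile(scores, [2.5, 97.5])
    return float(np.median(scores)), float(low), float(high)

# Synthetic two-class data (hypothetical; not one of the ten data sets).
rng = np.random.default_rng(42)
n_per_class, n_features = 40, 5
X = np.vstack([rng.normal(0.0, 1.0, (n_per_class, n_features)),
               rng.normal(1.0, 1.0, (n_per_class, n_features))])
y = np.repeat([0, 1], n_per_class)

median_acc, ci_low, ci_high = oob_bootstrap_accuracy(X, y)
print(f"OOB accuracy: {median_acc:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Each resample trains on the in-bag samples and scores only the samples left out, so the spread of the scores gives a rough measure of how much predictive performance depends on which samples happen to be available.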

    Migrating from partial least squares discriminant analysis to artificial neural networks: A comparison of functionally equivalent visualisation and feature contribution tools using Jupyter Notebooks

    Introduction: Metabolomics data is commonly modelled multivariately using partial least squares discriminant analysis (PLS-DA). Its success is primarily due to ease of interpretation, through projection to latent structures, and transparent assessment of feature importance using regression coefficients and Variable Importance in Projection scores. In recent years several non-linear machine learning (ML) methods have grown in popularity but with limited uptake, essentially due to convoluted optimisation and interpretation. Artificial neural networks (ANNs) are a non-linear projection-based ML method that shares a structural equivalence with PLS, and as such should be amenable to equivalent optimisation and interpretation methods. Objectives: We hypothesise that standardised optimisation, visualisation, evaluation and statistical inference techniques commonly used by metabolomics researchers for PLS-DA can be migrated to a non-linear, single hidden layer ANN. Methods: We compared a standardised workflow of optimisation, visualisation, evaluation and statistical inference techniques for PLS with the proposed ANN workflow. Both workflows were implemented in the Python programming language. All code and results have been made publicly available as Jupyter notebooks on GitHub. Results: The migration of the PLS workflow to a non-linear, single hidden layer ANN was successful. There was a similarity in the significant metabolites determined using PLS model coefficients and the ANN Connection Weight Approach. Conclusion: We have shown that it is possible to migrate the standardised PLS-DA workflow to simple non-linear ANNs. This result opens the door for more widespread use and to the investigation of transparent interpretation of more complex ANN architectures.
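The Connection Weight Approach named in the abstract above is commonly described as summing, for each input feature, the products of its input-to-hidden weights and the corresponding hidden-to-output weights. The sketch below assumes that formulation for a single hidden layer and one output node; the weight values and function name are hypothetical and are not taken from the paper's notebooks.

```python
import numpy as np

def connection_weights(w_input_hidden, w_hidden_output):
    """Per-feature contribution for a single hidden layer, single output ANN.

    w_input_hidden: array of shape (n_features, n_hidden)
    w_hidden_output: array of shape (n_hidden,)
    Returns an array of shape (n_features,), one signed score per feature.
    """
    return w_input_hidden @ w_hidden_output

# Hypothetical trained weights: 3 metabolites, 2 hidden nodes.
w_ih = np.array([[0.8, -0.2],
                 [0.1,  0.4],
                 [-0.5, 0.3]])
w_ho = np.array([1.5, -1.0])

cw = connection_weights(w_ih, w_ho)
ranking = np.argsort(-np.abs(cw))  # features ordered by absolute contribution

print(cw)       # signed contribution of each feature
print(ranking)  # indices of features, most influential first
```

The signed scores play a role similar to PLS regression coefficients: the sign indicates the direction of a feature's influence on the output and the magnitude its relative strength.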

    No surviving evolved companions to the progenitor of supernova SN 1006

    Type Ia supernovae are thought to occur as a white dwarf made of carbon and oxygen accretes sufficient mass to trigger a thermonuclear explosion^{1}. The accretion could occur slowly from an unevolved (main-sequence) or evolved (subgiant or giant) star^{2,3}, that being dubbed the single-degenerate channel, or rapidly as it breaks up a smaller orbiting white dwarf (the double-degenerate channel)^{3,4}. Obviously, a companion will survive the explosion only in the single-degenerate channel^{5}. Both channels might contribute to the production of type Ia supernovae^{6,7} but their relative proportions still remain a fundamental puzzle in astronomy. Previous searches for remnant companions have revealed one possible case for SN 1572^{8,9}, though that has been criticized^{10}. More recently, observations have restricted surviving companions to be small, main-sequence stars^{11,12,13}, ruling out giant companions, though still allowing the single-degenerate channel. Here we report the result of a search for surviving companions to the progenitor of SN 1006^{14}. None of the stars within 4' of the apparent site of the explosion is associated with the supernova remnant, so we can firmly exclude all giant and subgiant companions to the progenitor. Combined with the previous results, less than 20 per cent of type Ia supernovae occur through the single-degenerate channel. Comment: Published as a letter in Nature (2012 September 27)

    Toward collaborative open data science in metabolomics using Jupyter Notebooks and cloud computing

    Background A lack of transparency and reporting standards in the scientific community has led to increasing and widespread concerns relating to reproduction and integrity of results. As an omics science, which generates vast amounts of data and relies heavily on data science for deriving biological meaning, metabolomics is highly vulnerable to irreproducibility. The metabolomics community has made substantial efforts to align with FAIR data standards by promoting open data formats, data repositories, online spectral libraries, and metabolite databases. Open data analysis platforms also exist; however, they tend to be inflexible and rely on the user to adequately report their methods and results. To enable FAIR data science in metabolomics, methods and results need to be transparently disseminated in a manner that is rapid, reusable, and fully integrated with the published work. To ensure broad use within the community such a framework also needs to be inclusive and intuitive for both computational novices and experts alike. Aim of Review To encourage metabolomics researchers from all backgrounds to take control of their own data science, mould it to their personal requirements, and enthusiastically share resources through open science. Key Scientific Concepts of Review This tutorial introduces the concept of interactive web-based computational laboratory notebooks. The reader is guided through a set of experiential tutorials specifically targeted at metabolomics researchers, based around the Jupyter Notebook web application, GitHub data repository, and Binder cloud computing platform

    Chemi-Structural Stabilization of Formamidinium Lead Iodide Perovskite by Using Embedded Quantum Dots

    The approaches to stabilize the perovskite structure of formamidinium lead iodide (FAPI) commonly result in a blue shift of the band gap, which limits the maximum photoconversion efficiency. Here, we report the use of PbS colloidal quantum dots (QDs) as a stabilizing agent, preserving the original low band gap of 1.5 eV. The surface chemistry of PbS plays a pivotal role by developing strong bonds with the black phase but weak ones with the yellow phase. As a result, a stable perovskite FAPI black phase can be formed at temperatures as low as 85 °C in just 10 min, setting a record of concomitantly fast and low-temperature formation for FAPI, with important consequences for industrialization. FAPI thin films obtained through this procedure reach an open-circuit potential (Voc) of 1.105 V, 91% of the maximum theoretical Voc, and preserve the efficiency for more than 700 h. These findings reveal the potential of strategies exploiting the chemi-structural properties of external additives to relax the tolerance factor and optimize the optoelectronic performance of perovskite materials

    Chronic Glaucoma Using Biodegradable Microspheres to Induce Intraocular Pressure Elevation. Six-Month Follow-Up

    Other grants: Rio Hortega Research Grant M17/00213, Research Group UCM 920415, UCM-Santander fellowship (CT17/17-CT17-18). Background: To compare two prolonged animal models of glaucoma over 24 weeks of follow-up. A novel pre-trabecular model of chronic glaucoma was achieved by injection of biodegradable poly lactic-co-glycolic acid (PLGA) microspheres (10-20 µm) (Ms20/10) into the ocular anterior chamber to progressively increase ocular hypertension (OHT). Methods: Rat right eyes were injected to induce OHT: 50% received a suspension of Ms20/10 in the anterior chamber at 0, 2, 4, 8, 12, 16 and 20 weeks, and the other 50% received a sclerosing episcleral vein injection biweekly (EPIm). Ophthalmological clinical signs, intraocular pressure (IOP), and neuroretinal functionality measured by electroretinography (ERG) were assessed, and structural analysis of the retina, retinal nerve fiber layer (RNFL) and ganglion cell layer (GCL) was performed using optical coherence tomography (OCT) protocols and histological exams. Results: Both models showed progressive neuroretinal degeneration (p < 0.05) and involvement of the contralateral eye. The Ms20/10 model showed a more progressive increase in IOP and better preservation of the ocular surface. Although no statistical differences were found between models, the EPIm model tended to produce a thicker retina and a thinner GCL, slower latency and smaller amplitude as measured using ERG, and more aggressive disturbances in retinal histology. In both models, while the GCL showed the greatest percentage loss of thickness, the RNFL showed the greatest and earliest rate of thickness loss. Conclusions: The intracameral model with biodegradable microspheres more closely resembled the conditions observed in humans. It was obtained by a less aggressive mechanism, which allows for adequate study of the pathology over longer periods.

    An assessment of phytoplankton primary productivity in the Arctic Ocean from satellite ocean color/in situ chlorophyll-a based models

    We investigated 32 net primary productivity (NPP) models by assessing skills to reproduce integrated NPP in the Arctic Ocean. The models were provided with two sources each of surface chlorophyll-a concentration (chlorophyll), photosynthetically available radiation (PAR), sea surface temperature (SST), and mixed-layer depth (MLD). The models were most sensitive to uncertainties in surface chlorophyll, generally performing better with in situ chlorophyll than with satellite-derived values. They were much less sensitive to uncertainties in PAR, SST, and MLD, possibly due to relatively narrow ranges of input data and/or relatively little difference between input data sources. Regardless of type or complexity, most of the models were not able to fully reproduce the variability of in situ NPP, whereas some of them exhibited almost no bias (i.e., reproduced the mean of in situ NPP). The models performed relatively well in low-productivity seasons as well as in sea ice-covered/deep-water regions. Depth-resolved models correlated more with in situ NPP than other model types, but had a greater tendency to overestimate mean NPP, whereas absorption-based models exhibited the lowest bias associated with weaker correlation. The models performed better when a subsurface chlorophyll-a maximum (SCM) was absent. As a group, the models overestimated mean NPP; however, this was partly offset by some models underestimating NPP when an SCM was present. Our study suggests that NPP models need to be carefully tuned for the Arctic Ocean because most of the models performing relatively well were those that used Arctic-relevant parameters.

    Metabolite signatures associated with microRNA miR-143-3p serve as drivers of poor lung function trajectories in childhood asthma

    Background: Lung function trajectories (LFTs) have been shown to be an important measure of long-term health in asthma. While there is a growing body of metabolomic studies on asthma status and other phenotypes, there are no prospective studies of the relationship between metabolomics and LFTs or their genomic determinants. Methods: We utilized ordinal logistic regression to identify plasma metabolite principal components associated with four previously published LFTs in children from the Childhood Asthma Management Program (CAMP) (n = 660). The top significant metabolite principal component (PCLF) was evaluated in an independent cross-sectional child cohort, the Genetic Epidemiology of Asthma in Costa Rica Study (GACRS) (n = 1151), and evaluated for association with spirometric measures. Using meta-analysis of CAMP and GACRS, we identified associations between PCLF and microRNA, and SNPs in their target genes. Statistical significance was determined using a false discovery rate-adjusted Q-value. Findings: The top metabolite principal component, PCLF, was significantly associated with better LFTs after multiple-testing correction (Q-value = 0.03). PCLF is composed of the urea cycle, caffeine, corticosteroid, carnitine, and potential microbial (secondary bile acid, tryptophan, linoleate, histidine metabolism) metabolites. Higher levels of PCLF were also associated with increases in lung function measures and decreased circulating neutrophil percentage in both CAMP and GACRS. PCLF was also significantly associated with microRNA miR-143-3p, and SNPs in three miR-143-3p target genes: CCZ1 (P-value = 2.6 × 10^{-5}), SLC8A1 (P-value = 3.9 × 10^{-5}), and TENM4 (P-value = 4.9 × 10^{-5}). Interpretation: This study reveals associations between metabolites, miR-143-3p and LFTs in children with asthma, offering insights into asthma physiology and possible interventions to enhance lung function and long-term health.
Funding: Molecular data for CAMP and GACRS via the Trans-Omics in Precision Medicine (TOPMed) program was supported by the National Heart, Lung, and Blood Institute (NHLBI)
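The abstract above reports significance as a false discovery rate-adjusted Q-value but does not state which adjustment procedure was used. As a minimal sketch, assuming the widely used Benjamini-Hochberg procedure, adjusted values can be computed as follows. The p-values are hypothetical and are not results from the study.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                        # ascending p-values
    scaled = p[order] * m / np.arange(1, m + 1)  # p_(i) * m / i
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(scaled, 0.0, 1.0)
    return q

# Hypothetical raw p-values for five principal components.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.27]
q_vals = benjamini_hochberg(p_vals)
print(q_vals)
```

A component would then be called significant when its adjusted value falls below the chosen false discovery rate threshold.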

    Measurement of the Hadronic Photon Structure Function F_2^gamma at LEP2

    The hadronic structure function of the photon F_2^gamma is measured as a function of Bjorken x and of the factorisation scale Q^2 using data taken by the OPAL detector at LEP. Previous OPAL measurements of the x dependence of F_2^gamma are extended to an average Q^2 of 767 GeV^2. The Q^2 evolution of F_2^gamma is studied for average Q^2 between 11.9 and 1051 GeV^2. As predicted by QCD, the data show positive scaling violations in F_2^gamma. Several parameterisations of F_2^gamma are in agreement with the measurements whereas the quark-parton model prediction fails to describe the data. Comment: 4 pages, 2 figures, to appear in the proceedings of Photon 2001, Ascona, Switzerland

    Search for the Standard Model Higgs Boson with the OPAL Detector at LEP

    This paper summarises the search for the Standard Model Higgs boson in e+e- collisions at centre-of-mass energies up to 209 GeV performed by the OPAL Collaboration at LEP. The consistency of the data with the background hypothesis and various Higgs boson mass hypotheses is examined. No indication of a signal is found in the data and a lower bound of 112.7 GeV/c^2 is obtained on the mass of the Standard Model Higgs boson at the 95% CL. Comment: 51 pages, 21 figures