84 research outputs found

    Machine learning on normalized protein sequences

    Background: Machine learning techniques have been widely applied to biological sequences, e.g. to predict drug resistance in HIV-1 from sequences of drug target proteins and protein functional classes. As deletions and insertions are frequent in biological sequences, a major limitation of current methods is the inability to handle varying sequence lengths.
    Findings: We propose to normalize sequences to uniform length. To this end, we tested one linear and four different non-linear interpolation methods for the normalization of sequence lengths of 19 classification datasets. Classification tasks included prediction of HIV-1 drug resistance from drug target sequences and sequence-based prediction of protein function. We applied random forests to the classification of sequences into "positive" and "negative" samples. Statistical tests showed that the linear interpolation outperforms the non-linear interpolation methods in most of the analyzed datasets, while in a few cases non-linear methods had a small but significant advantage. Compared to other published methods, our prediction scheme leads to an improvement in prediction accuracy by up to 14%.
    Conclusions: We found that machine learning on sequences normalized by simple linear interpolation gave better or at least competitive results compared to state-of-the-art procedures, and thus, is a promising alternative to existing methods, especially for protein sequences of variable length.
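    The length normalization described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how residues are turned into numbers, so the Kyte-Doolittle hydropathy scale used here is an assumption, and the function names `encode` and `normalize_length` are hypothetical. The sketch only shows how linear interpolation maps sequences of different lengths to a fixed-length feature vector that a classifier such as a random forest could consume.

```python
# Assumption: residues are encoded with Kyte-Doolittle hydropathy values.
# The abstract does not state which numeric descriptor was actually used.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(seq):
    """Map an amino-acid string to a list of numeric descriptor values."""
    return [HYDROPATHY[aa] for aa in seq.upper()]


def normalize_length(values, target_len):
    """Resample `values` to `target_len` points by linear interpolation.

    The first and last input values are preserved; intermediate points are
    placed at evenly spaced positions along the original sequence.
    """
    n = len(values)
    if n == 0:
        raise ValueError("cannot normalize an empty sequence")
    if target_len < 1:
        raise ValueError("target_len must be at least 1")
    if n == 1 or target_len == 1:
        # Nothing to interpolate between: repeat the first value.
        return [values[0]] * target_len

    step = (n - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * step
        lo = min(int(pos), n - 2)  # left neighbour, clamped at the end
        frac = pos - lo            # fractional distance to right neighbour
        out.append(values[lo] * (1 - frac) + values[lo + 1] * frac)
    return out


if __name__ == "__main__":
    # Two sequences of different length end up as vectors of equal length.
    for seq in ("MKVLAT", "MKVLATGGP"):
        print(seq, len(normalize_length(encode(seq), 8)))
```

    Vectors produced this way all have the same length, so they can be stacked into a feature matrix regardless of insertions or deletions in the original sequences.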

    Global patterns in endemicity and vulnerability of soil fungi

    Fungi are highly diverse organisms, which provide multiple ecosystem services. However, compared with charismatic animals and plants, the distribution patterns and conservation needs of fungi have been little explored. Here, we examined endemicity patterns, global change vulnerability and conservation priority areas for functional groups of soil fungi based on six global surveys using a high-resolution, long-read metabarcoding approach. We found that the endemicity of all fungi and most functional groups peaks in tropical habitats, including Amazonia, Yucatan, West-Central Africa, Sri Lanka, and New Caledonia, with a negligible island effect compared with plants and animals. We also found that fungi are predominantly vulnerable to drought, heat and land-cover change, particularly in dry tropical regions with high human population density. Fungal conservation areas of highest priority include herbaceous wetlands, tropical forests, and woodlands. We stress that more attention should be focused on the conservation of fungi, especially root symbiotic arbuscular mycorrhizal and ectomycorrhizal fungi in tropical regions as well as unicellular early-diverging groups and macrofungi in general. Given the low overlap between the endemicity of fungi and macroorganisms, but high conservation needs in both groups, detailed analyses on distribution and conservation requirements are warranted for other microorganisms and soil organisms.

    Mesenchymal stem/stromal cells as a delivery platform in cell and gene therapies

    Mendelian randomization analyses in cardiometabolic disease: the challenge of rigorous interpretations of causality

    The contribution of Paris to limit global warming to 2 °C

    The international community has set a goal to limit global warming to 2 °C. Limiting global warming to 2 °C is a challenging goal and will entail a dramatic transformation of the global energy system, largely complete by 2040. As part of the work toward this goal, countries have been submitting their Intended Nationally Determined Contributions (INDCs) to the United Nations Framework Convention on Climate Change, indicating their emissions reduction commitments through 2025 or 2030, in advance of the 21st Conference of the Parties (COP21) in Paris in December 2015. In this paper, we use the Global Change Assessment Model (GCAM) to analyze the near versus long-term energy and economic-cost implications of these INDCs. The INDCs imply near-term actions that reduce the level of mitigation needed in the post-2030 period, particularly when compared with an alternative path in which nations are unable to undertake emissions mitigation until after 2030. We find that the latter case could require up to 2300 GW of premature retirements of fossil fuel power plants and up to 2900 GW of additional low-carbon power capacity installations within a five-year period of 2031–2035. INDCs have the effect of reducing premature retirements and new-capacity installations after 2030 by 50% and 34%, respectively. However, if presently announced INDCs were strengthened to achieve greater near-term emissions mitigation, the 2031–2035 transformation could be tempered to require 84% fewer premature retirements of power generation capacity and 56% fewer new-capacity additions. Our results suggest that the INDCs delivered for COP21 in Paris will have important contributions in reducing the challenges of achieving the goal of limiting global warming to 2 °C.