488 research outputs found

    Robust Principal Component Analysis-based Prediction of Protein-Protein Interaction Hot spots (RBHS)

    Proteins often exert their function by binding to other cellular partners. Hot spots are key residues for protein-protein binding. Their identification may shed light on the impact of disease-associated mutations on protein complexes and help design protein-protein interaction inhibitors for therapy. Unfortunately, current machine learning methods for predicting hot spots suffer from limitations caused by gross errors in the data matrices. Here, we present a novel data pre-processing pipeline that overcomes this problem by recovering a low-rank matrix with reduced noise using Robust Principal Component Analysis. Application to existing databases shows the predictive power of the method.

    Robust principal component analysis-based prediction of protein-protein interaction hot spots.

    Abstract: Proteins often exert their function by binding to other cellular partners. Hot spots are key residues for protein‐protein binding. Their identification may shed light on the impact of disease‐associated mutations on protein complexes and help design protein‐protein interaction inhibitors for therapy. Unfortunately, current machine learning methods for predicting hot spots suffer from limitations caused by gross errors in the data matrices. Here, we present a novel data pre‐processing pipeline that overcomes this problem by recovering a low‐rank matrix with reduced noise using Robust Principal Component Analysis. Application to existing databases shows the predictive power of the method.

    A transferable machine-learning framework linking interstice distribution and plastic heterogeneity in metallic glasses

    When metallic glasses (MGs) are subjected to mechanical loads, the plastic response of atoms is non-uniform. However, the extent and manner in which atomic environment signatures present in the undeformed structure determine this plastic heterogeneity remain elusive. Here, we demonstrate that novel site environment features that characterize interstice distributions around atoms, combined with machine learning (ML), can reliably identify plastic sites in several Cu-Zr compositions. Using only quenched structural information as input, the ML-based plastic probability estimates ("quench-in softness" metric) can identify plastic sites that could activate at high strains, losing predictive power only upon the formation of shear bands. Moreover, we reveal that a quench-in softness model trained on a single composition and quenching rate substantially improves upon previous models in generalizing to different compositions and completely different MG systems (Ni62Nb38, Al90Sm10 and Fe80P20). Our work presents a general, data-centric framework that could potentially be used to address the structural origin of any site-specific property in MGs.

    Rapidly predicting Kohn–Sham total energy using data-centric AI

    Predicting material properties by solving the Kohn-Sham (KS) equation, which is the basis of modern computational approaches to electronic structures, has provided significant improvements in materials sciences. Despite its contributions, both DFT and DFTB calculations are limited by the number of electrons and atoms, which translates into increasingly longer run-times. In this work we introduce a novel, data-centric machine learning framework that is used to rapidly and accurately predict the KS total energy of anatase TiO2 nanoparticles (NPs) at different temperatures using only a small amount of theoretical data. The proposed framework, which we call co-modeling, eliminates the need for experimental data and is general enough to be applied to any NPs to determine electronic structure and, consequently, to study physical and chemical properties more efficiently. We include a web service to demonstrate the effectiveness of our approach. © 2022, The Author(s).

    The spatial dynamics of invasive para grass on a monsoonal floodplain, Kakadu National Park, northern Australia

    Abstract: African para grass (Urochloa mutica) is an invasive weed that has become prevalent across many important freshwater wetlands of the world. In northern Australia, including the World Heritage landscape of Kakadu National Park (KNP), its dense cover can displace ecologically, genetically and culturally significant species, such as the Australian native rice (Oryza spp.). In regions under management for biodiversity conservation, para grass is often beyond eradication. However, its targeted control is also necessary to manage and preserve site-specific wetland values. This requires an understanding of para grass spread patterns and its potential impacts on valuable native vegetation. We apply a multi-scale approach to examine the spatial dynamics and impact of para grass cover across a 181 km² floodplain of KNP. First, we measure the overall displacement of different native vegetation communities across the floodplain from 1986 to 2006. Using high spatial resolution satellite imagery in conjunction with historical aerial-photo mapping, we then measure finer-scale, inter-annual changes between successive dry seasons from 1990 to 2010 (for a 48 km² focus area). Para grass presence-absence maps from satellite imagery (2002 to 2010) were produced with an object-based machine-learning approach (stochastic gradient boosting). Changes over time in mapped para grass areas were then related to maps of depth-habitat and inter-annual fire histories. Para grass invasion and establishment patterns varied greatly in time and space. Wild rice communities were the most frequently invaded, but the establishment and persistence of para grass fluctuated greatly between years, even within previously invaded communities. However, these different patterns were also shown to vary with different depth-habitat and recent fire history. These dynamics have not been previously documented, and this understanding presents opportunities for intensive para grass management in areas of high conservation value, such as those occupied by wild rice.