11,353 research outputs found

    Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening

    Full text link
    This work introduces a number of algebraic topology approaches, such as multicomponent persistent homology, multi-level persistent homology and electrostatic persistence for the representation, characterization, and description of small molecules and biomolecular complexes. Multicomponent persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensemble of trees, and deep convolutional neural networks, to manifest their descriptive and predictive powers for chemical and biological problems. Extensive numerical experiments involving more than 4,000 protein-ligand complexes from the PDBBind database and near 100,000 ligands and decoys in the DUD database are performed to test respectively the scoring power and the virtual screening power of the proposed topological approaches. It is demonstrated that the present approaches outperform the modern machine learning based methods in protein-ligand binding affinity predictions and ligand-decoy discrimination

    NILM techniques for intelligent home energy management and ambient assisted living: a review

    Get PDF
    The ongoing deployment of smart meters and different commercial devices has made electricity disaggregation feasible in buildings and households, based on a single measure of the current and, sometimes, of the voltage. Energy disaggregation is intended to separate the total power consumption into specific appliance loads, which can be achieved by applying Non-Intrusive Load Monitoring (NILM) techniques with a minimum invasion of privacy. NILM techniques are becoming more and more widespread in recent years, as a consequence of the interest companies and consumers have in efficient energy consumption and management. This work presents a detailed review of NILM methods, focusing particularly on recent proposals and their applications, particularly in the areas of Home Energy Management Systems (HEMS) and Ambient Assisted Living (AAL), where the ability to determine the on/off status of certain devices can provide key information for making further decisions. As well as complementing previous reviews on the NILM field and providing a discussion of the applications of NILM in HEMS and AAL, this paper provides guidelines for future research in these topics.Agência financiadora: Programa Operacional Portugal 2020 and Programa Operacional Regional do Algarve 01/SAICT/2018/39578 Fundação para a Ciência e Tecnologia through IDMEC, under LAETA: SFRH/BSAB/142998/2018 SFRH/BSAB/142997/2018 UID/EMS/50022/2019 Junta de Comunidades de Castilla-La-Mancha, Spain: SBPLY/17/180501/000392 Spanish Ministry of Economy, Industry and Competitiveness (SOC-PLC project): TEC2015-64835-C3-2-R MINECO/FEDERinfo:eu-repo/semantics/publishedVersio

    Quantitative toxicity prediction using topology based multi-task deep neural networks

    Full text link
    The understanding of toxicity is of paramount importance to human health and environmental protection. Quantitative toxicity analysis has become a new standard in the field. This work introduces element specific persistent homology (ESPH), an algebraic topology approach, for quantitative toxicity prediction. ESPH retains crucial chemical information during the topological abstraction of geometric complexity and provides a representation of small molecules that cannot be obtained by any other method. To investigate the representability and predictive power of ESPH for small molecules, ancillary descriptors have also been developed based on physical models. Topological and physical descriptors are paired with advanced machine learning algorithms, such as deep neural network (DNN), random forest (RF) and gradient boosting decision tree (GBDT), to facilitate their applications to quantitative toxicity predictions. A topology based multi-task strategy is proposed to take the advantage of the availability of large data sets while dealing with small data sets. Four benchmark toxicity data sets that involve quantitative measurements are used to validate the proposed approaches. Extensive numerical studies indicate that the proposed topological learning methods are able to outperform the state-of-the-art methods in the literature for quantitative toxicity analysis. Our online server for computing element-specific topological descriptors (ESTDs) is available at http://weilab.math.msu.edu/TopTox/Comment: arXiv admin note: substantial text overlap with arXiv:1703.1095

    TopologyNet: Topology based deep convolutional neural networks for biomolecular property predictions

    Full text link
    Although deep learning approaches have had tremendous success in image, video and audio processing, computer vision, and speech recognition, their applications to three-dimensional (3D) biomolecular structural data sets have been hindered by the entangled geometric complexity and biological complexity. We introduce topology, i.e., element specific persistent homology (ESPH), to untangle geometric complexity and biological complexity. ESPH represents 3D complex geometry by one-dimensional (1D) topological invariants and retains crucial biological information via a multichannel image representation. It is able to reveal hidden structure-function relationships in biomolecules. We further integrate ESPH and convolutional neural networks to construct a multichannel topological neural network (TopologyNet) for the predictions of protein-ligand binding affinities and protein stability changes upon mutation. To overcome the limitations to deep learning arising from small and noisy training sets, we present a multitask topological convolutional neural network (MT-TCNN). We demonstrate that the present TopologyNet architectures outperform other state-of-the-art methods in the predictions of protein-ligand binding affinities, globular protein mutation impacts, and membrane protein mutation impacts.Comment: 20 pages, 8 figures, 5 table

    Damage identification in structural health monitoring: a brief review from its implementation to the Use of data-driven applications

    Get PDF
    The damage identification process provides relevant information about the current state of a structure under inspection, and it can be approached from two different points of view. The first approach uses data-driven algorithms, which are usually associated with the collection of data using sensors. Data are subsequently processed and analyzed. The second approach uses models to analyze information about the structure. In the latter case, the overall performance of the approach is associated with the accuracy of the model and the information that is used to define it. Although both approaches are widely used, data-driven algorithms are preferred in most cases because they afford the ability to analyze data acquired from sensors and to provide a real-time solution for decision making; however, these approaches involve high-performance processors due to the high computational cost. As a contribution to the researchers working with data-driven algorithms and applications, this work presents a brief review of data-driven algorithms for damage identification in structural health-monitoring applications. This review covers damage detection, localization, classification, extension, and prognosis, as well as the development of smart structures. The literature is systematically reviewed according to the natural steps of a structural health-monitoring system. This review also includes information on the types of sensors used as well as on the development of data-driven algorithms for damage identification.Peer ReviewedPostprint (published version
    • …
    corecore