128 research outputs found

    Recent Advances in Variational Autoencoders With Representation Learning for Biomedical Informatics: A Survey

    Variational autoencoders (VAEs) are deep latent space generative models that have been immensely successful in multiple exciting applications in biomedical informatics such as molecular design, protein design, medical image classification and segmentation, integrated multi-omics data analyses, and large-scale biological sequence analyses, among others. The fundamental idea in VAEs is to learn the distribution of data in such a way that new meaningful data with more intra-class variations can be generated from the encoded distribution. The ability of VAEs to synthesize new data with more representation variance at state-of-the-art levels provides hope that the chronic scarcity of labeled data in the biomedical field can be resolved. Furthermore, VAEs have made nonlinear latent variable models tractable for modeling complex distributions. This has allowed for efficient extraction of relevant biomedical information from learned features for biological data sets, referred to as unsupervised feature representation learning. In this article, we review the various recent advancements in the development and application of VAEs for biomedical informatics. We discuss challenges and future opportunities for biomedical research with respect to VAEs. https://doi.org/10.1109/ACCESS.2020.304830
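    As a point of reference for the VAE formulation summarized in this abstract, the sketch below shows a minimal variational autoencoder: an encoder maps input features to the mean and log-variance of a Gaussian latent distribution, a reparameterized sample is drawn, and a decoder reconstructs the input, with a loss combining reconstruction error and a KL divergence term. The use of PyTorch, the layer sizes, and the 64-dimensional toy input are illustrative assumptions, not details taken from the surveyed work.

```python
# Minimal VAE sketch (illustrative; layer sizes and the 64-dimensional
# toy input are assumptions, not details from the surveyed paper).
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, x_dim=64, h_dim=128, z_dim=16):
        super().__init__()
        self.enc = nn.Linear(x_dim, h_dim)
        self.mu = nn.Linear(h_dim, z_dim)       # mean of q(z|x)
        self.logvar = nn.Linear(h_dim, z_dim)   # log-variance of q(z|x)
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim))

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: sample z = mu + sigma * eps
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.dec(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    # Reconstruction term plus KL divergence between q(z|x) and the unit Gaussian prior.
    rec = F.mse_loss(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl

x = torch.randn(32, 64)                  # a batch of synthetic feature vectors
model = VAE()
x_hat, mu, logvar = model(x)
vae_loss(x, x_hat, mu, logvar).backward()
```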

    Protein-Ligand Binding Affinity Directed Multi-Objective Drug Design Based on Fragment Representation Methods

    Drug discovery is a challenging process with a vast molecular space to be explored and numerous pharmacological properties to be appropriately considered. Among various drug design protocols, fragment-based drug design is an effective way of constraining the search space and better utilizing biologically active compounds. Motivated by fragment-based drug search for a given protein target and the emergence of artificial intelligence (AI) approaches in this field, this work advances the field of in silico drug design by (1) integrating a graph fragmentation-based deep generative model with a deep evolutionary learning process for large-scale multi-objective molecular optimization, and (2) applying protein-ligand binding affinity scores together with other desired physicochemical properties as objectives. Our experiments show that the proposed method can generate novel molecules with improved property values and binding affinities.
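    The multi-objective selection step described above can be pictured with a simple Pareto-dominance filter over candidate molecules, each scored on a binding affinity objective plus physicochemical properties. The objective names, SMILES strings, and numeric scores below are hypothetical placeholders, not outputs of the authors' pipeline, and the dominance filter is only a generic stand-in for their deep evolutionary learning procedure.

```python
# Pareto-dominance sketch for multi-objective molecular selection.
# Objective values are hypothetical placeholders; in the paper's setting they
# would come from an affinity predictor and property calculators.
from dataclasses import dataclass

@dataclass
class Candidate:
    smiles: str
    objectives: tuple  # (binding_affinity, drug_likeness, synthesizability), higher is better

def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a.objectives, b.objectives)) and \
           any(x > y for x, y in zip(a.objectives, b.objectives))

def pareto_front(population):
    """Keep candidates that are not dominated by any other candidate."""
    return [c for c in population
            if not any(dominates(o, c) for o in population if o is not c)]

population = [
    Candidate("CCO", (7.2, 0.41, 0.90)),                 # placeholder scores
    Candidate("c1ccccc1O", (6.8, 0.55, 0.85)),
    Candidate("CC(=O)Nc1ccc(O)cc1", (7.9, 0.60, 0.70)),
]
print([c.smiles for c in pareto_front(population)])
```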

    Application of Generative Models on Modeling Biological Molecules

    The last decade has been the stage for many groundbreaking Artificial Intelligence technologies, such as revolutionary language models: generative models capable of synthesizing surprisingly unique data. Such novelty also brings public concerns, primarily due to the "black box" nature of state-of-the-art models. One of the domains that has quickly adopted the generative deep learning paradigm is drug discovery, which, from a pharmaceutical industry point of view, is an extremely expensive and time-consuming process. However, the inner workings of such models are not inherently understandable by humans, causing hesitation to fully trust their results. The concept of disentanglement is one of the fundamental requirements for explaining generative models, determining the extent to which steerability and navigation can be achieved in the latent space. Unfortunately, the application potential of interpretability approaches has some limitations depending on the availability of generative latent factors. This work aims to shed some light on the synthesized latent spaces of state-of-the-art molecular generative models: a couple of basic assumptions made about latent space characteristics are analyzed, and potential pitfalls related to domain, architecture, and molecule representation preferences are addressed. The degree to which steerability in the latent space is achieved is quantified by implementing a novel interpretability approach, providing the basis for comparing alternative model configurations. The experiments further revealed that modeling decisions have a direct impact on achievable interpretability, albeit limited by the intricacies of the medicinal chemistry domain.
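    One simple way to picture the kind of steerability probe discussed above is a latent traversal: vary a single latent coordinate while holding the others fixed, decode each point, and check how strongly a property of interest responds. In the sketch below, decode() and property_of() are placeholder stand-ins for a trained molecular decoder and a property calculator, and the correlation score is only a generic proxy, not the interpretability metric proposed in the paper.

```python
# Latent-traversal sketch for probing steerability of single latent dimensions.
# decode() and property_of() are hypothetical stand-ins; the correlation score
# is a generic proxy, not the paper's interpretability measure.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 8))          # toy "decoder" weights

def decode(z):
    return np.tanh(W @ z)             # placeholder for a molecular decoder

def property_of(x):
    return float(x.sum())             # placeholder for e.g. logP or QED

def steerability(dim, z0, span=3.0, steps=21):
    """Correlation between a traversed latent coordinate and the decoded property."""
    shifts = np.linspace(-span, span, steps)
    props = []
    for s in shifts:
        z = z0.copy()
        z[dim] += s
        props.append(property_of(decode(z)))
    return abs(np.corrcoef(shifts, props)[0, 1])

z0 = rng.normal(size=8)
scores = [steerability(d, z0) for d in range(8)]
print("most steerable dimension:", int(np.argmax(scores)))
```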

    Neural Embeddings for Dimensionality Reduction of Complex Topology Feature Spaces

    This study focuses on the central role of neural embeddings and their design and optimization in the context of Artificial Intelligence (AI), particularly in the fields of Deep Learning and Explainable AI (XAI). It explores how neural embeddings of data characterized by complex topology are crucial for addressing challenges and developments in data dimensionality reduction and network prediction analysis. In this thesis, two independent but connected investigations were carried out to study the effect of the neural encoding generated by a network for its target task in the case of data with a graph structure. The first project involved the study, design, and analysis of neural embeddings of synthetic polymers through the development of two Graph Variational Autoencoder neural networks. The goal is to generate new polymers that incorporate additional structural information specific to the compounds, such as stoichiometry and chain architecture. The results were analyzed through several evaluation metrics that compare the two models and highlight the weaknesses and strengths of both approaches. A qualitative investigation of the networks' latent spaces showed that the neural embeddings encode different information depending on the decoder model trained for generation, confirming and justifying the results obtained. In the second work, a graph neural network capable of predicting the bioactivity of molecules toward specific proteins was developed, employing neural embeddings to condense the chemical information of the input data. Next, a hierarchical XAI methodology was devised to obtain additional interpretability information on the molecular moieties that are relevant for the prediction, thus helping to clarify the model's decision-making process. The results obtained through explainability contribute to a deeper understanding of the data and the underlying problem. Through these studies, the importance of neural embedding design and optimization for data and features with complex topology is highlighted, showing how deep neural networks, given properly conducted training, embed all the information needed for the target task in an encoded representation.
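    To give a concrete, if simplified, picture of the graph-based embedding used in the second project, the sketch below implements a tiny message-passing network in PyTorch: node features are updated from their neighbors via the adjacency matrix, pooled into a single molecular embedding, and passed to a bioactivity head. The two-layer design, feature sizes, mean-pooling readout, and sigmoid output are assumptions for illustration, not the architecture developed in the thesis.

```python
# Minimal message-passing sketch for graph-level bioactivity prediction.
# Layer sizes, the mean-pooling readout, and the sigmoid output are illustrative
# assumptions; they do not reproduce the thesis architecture.
import torch
import torch.nn as nn

class TinyGNN(nn.Module):
    def __init__(self, in_dim=8, hid_dim=32):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hid_dim)
        self.lin2 = nn.Linear(hid_dim, hid_dim)
        self.head = nn.Linear(hid_dim, 1)           # bioactivity score

    def forward(self, x, adj):
        # x: (num_nodes, in_dim) atom features; adj: (num_nodes, num_nodes) adjacency with self-loops
        h = torch.relu(self.lin1(adj @ x))          # aggregate neighbor features, then transform
        h = torch.relu(self.lin2(adj @ h))
        g = h.mean(dim=0)                           # readout: mean-pool nodes into a molecule embedding
        return torch.sigmoid(self.head(g))          # predicted probability of bioactivity

num_nodes = 5
x = torch.randn(num_nodes, 8)                       # toy atom features
adj = torch.eye(num_nodes)                          # self-loops
adj[0, 1] = adj[1, 0] = adj[1, 2] = adj[2, 1] = 1.0 # a small chain of bonds
print(TinyGNN()(x, adj))
```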