
    Highly Accurate Quantum Chemical Property Prediction with Uni-Mol+

    Recent developments in deep learning have made remarkable progress in speeding up the prediction of quantum chemical (QC) properties by removing the need for expensive electronic structure calculations like density functional theory. However, previous methods learned from 1D SMILES sequences or 2D molecular graphs failed to achieve high accuracy as QC properties primarily depend on the 3D equilibrium conformations optimized by electronic structure methods, far different from the sequence-type and graph-type data. In this paper, we propose a novel approach called Uni-Mol+ to tackle this challenge. Uni-Mol+ first generates a raw 3D molecule conformation from inexpensive methods such as RDKit. Then, the raw conformation is iteratively updated to its target DFT equilibrium conformation using neural networks, and the learned conformation will be used to predict the QC properties. To effectively learn this update process towards the equilibrium conformation, we introduce a two-track Transformer model backbone and train it with the QC property prediction task. We also design a novel approach to guide the model's training process. Our extensive benchmarking results demonstrate that the proposed Uni-Mol+ significantly improves the accuracy of QC property prediction in various datasets. We have made the code and model publicly available at https://github.com/dptech-corp/Uni-Mol.

    MolFM: A Multimodal Molecular Foundation Model

    Molecular knowledge resides within three different modalities of information sources: molecular structures, biomedical documents, and knowledge bases. Effective incorporation of molecular knowledge from these modalities holds paramount significance in facilitating biomedical research. However, existing multimodal molecular foundation models exhibit limitations in capturing intricate connections between molecular structures and texts, and more importantly, none of them attempt to leverage a wealth of molecular expertise derived from knowledge graphs. In this study, we introduce MolFM, a multimodal molecular foundation model designed to facilitate joint representation learning from molecular structures, biomedical texts, and knowledge graphs. We propose cross-modal attention between atoms of molecular structures, neighbors of molecule entities and semantically related texts to facilitate cross-modal comprehension. We provide theoretical analysis that our cross-modal pre-training captures local and global molecular knowledge by minimizing the distance in the feature space between different modalities of the same molecule, as well as molecules sharing similar structures or functions. MolFM achieves state-of-the-art performance on various downstream tasks. On cross-modal retrieval, MolFM outperforms existing models with 12.13% and 5.04% absolute gains under the zero-shot and fine-tuning settings, respectively. Furthermore, qualitative analysis showcases MolFM's implicit ability to provide grounding from molecular substructures and knowledge graphs. Code and models are available on https://github.com/BioFM/OpenBioMed. Comment: 31 pages, 15 figures, and 15 tables

    Multi-dimensional omics approaches to dissect natural immune control mechanisms associated with RNA virus infections

    In recent decades, global health has been challenged by emerging and re-emerging viruses such as severe acute respiratory syndrome coronavirus 2 (SARS-CoV2), human immunodeficiency virus type 1 (HIV-1), and Crimean–Congo hemorrhagic fever virus (CCHFV). Studies have shown dysregulation of host metabolic processes in response to SARS-CoV2 and HIV-1 infections, while research on CCHFV infection is still in its infancy. Hence, understanding host metabolic re-programming at the reaction level in infectious disease has therapeutic importance. The thesis uses systems biology methods to investigate the host metabolic alterations in response to SARS-CoV2, HIV-1, and CCHFV infections. The three viruses induce distinct effects on human metabolism that nevertheless show some commonalities. We have identified alterations in various immune cell types in patients during infection with each of the three viruses. Further, differential expression analysis identified that COVID-19 causes disruptions in pathways related to antiviral response and metabolism (fructose and mannose metabolism, oxidative phosphorylation (OXPHOS), and the pentose phosphate pathway). Up-regulation of OXPHOS and ROS pathways, with most changes in OXPHOS complexes I, III, and IV, was identified in people living with HIV on treatment (PLWHART). The acute phase of CCHFV infection was found to be linked with OXPHOS, glycolysis, N-glycan biosynthesis, and NOD-like receptor signaling pathways. The dynamic nature of the metabolic process and the adaptive immune response in CCHFV pathogenesis were also observed. Further, we have identified differences in metabolic flux in reactions transporting TCA cycle intermediates from the cytosol to mitochondria in COVID-19 patients. Genes such as monocarboxylate transporter (SLC16A6) and nucleoside transporter (SLC29A1) and metabolites such as α-ketoglutarate, succinate, and malate were found to be linked with COVID-19 disease response.
    Metabolic reactions associated with amino acid, carbohydrate, and energy metabolism pathways and various transporter reactions were observed to be uniquely disrupted in PLWHART, along with increased production of α-ketoglutarate (αKG) and ATP molecules. Changes in essential (leucine and threonine) and non-essential (arginine, alanine, and glutamine) amino acid transport were found to be caused by acute CCHFV infection. Altered flux of reactions involving TCA cycle compounds such as pyruvate, isocitrate, and α-ketoglutarate was also observed in CCHFV infection. The research described in the thesis revealed dysregulation of similar metabolic processes across the three viral infections. However, further downstream analysis unveiled unique alterations in several metabolic reactions specific to each virus within the same metabolic pathways, showing the importance of increasing the resolution of knowledge about host metabolism in infectious diseases.

    Introduction to Facial Micro Expressions Analysis Using Color and Depth Images: A Matlab Coding Approach (Second Edition, 2023)

    The book offers a gentle introduction to the field of Facial Micro Expressions Recognition (FMER) using Color and Depth images, with the aid of the MATLAB programming environment. FMER is a subset of image processing and a multidisciplinary topic to analyze. It therefore requires familiarity with other topics of Artificial Intelligence (AI) such as machine learning, digital image processing, psychology and more. This makes it a great opportunity to write a book which covers all of these topics for beginner to professional readers in the field of AI, and even for readers without a background in AI. Our goal is to provide a standalone introduction to the field of FMER analysis in the form of theoretical descriptions for readers with no background in image processing, together with reproducible MATLAB practical examples. We also describe the basic definitions for FMER analysis and the MATLAB libraries used in the text, which helps the reader apply the experiments in real-world applications. We believe that this book is suitable for students, researchers, and professionals alike, who need to develop practical skills, along with a basic understanding of the field. We expect that, after reading this book, the reader will feel comfortable with different key stages such as color and depth image processing, color and depth image representation, classification, machine learning, facial micro-expressions recognition, feature extraction and dimensionality reduction. Comment: This is the second edition of the book

    Chemometric approach to characterization of the selected grape seed oils based on their fatty acids composition and FTIR spectroscopy

    Addressing the issues arising from the production and trade of low-quality foods necessitates developing new quality control methods. Cooking oils, especially those produced from grape seeds, are an example of food products that often suffer from questionable quality due to various adulterations and low-quality fruits used for their production. Among many methods allowing for fast and efficient food quality control, the combination of experimental and advanced mathematical approaches seems most reliable. In this work, a method for the compositional characterization of grape seed oils based on infrared (FTIR) spectroscopy and fatty acid profiles is reported. Also, the relevant parameters of the oils are characterized using a combination of standard techniques such as Principal Component Analysis, k-Means, and Gaussian Mixture Model (GMM) fitting parameters. Two different approaches to perform unsupervised clustering using GMM were investigated. The first approach relies on the profile of fatty acids, while the second is FTIR spectroscopy-based. The GMM fitting parameters in both approaches were compared. The results obtained from both approaches are consistent and complementary and provide the tools to address the characterization and clustering issues in grape seed oils.

    Identifying Appropriate Intellectual Property Protection Mechanisms for Machine Learning Models: A Systematization of Watermarking, Fingerprinting, Model Access, and Attacks

    The commercial use of Machine Learning (ML) is spreading; at the same time, ML models are becoming more complex and more expensive to train, which makes Intellectual Property Protection (IPP) of trained models a pressing issue. Unlike other domains that can build on a solid understanding of the threats, attacks and defenses available to protect their IP, the ML-related research in this regard is still very fragmented. This is also due to the lack of a unified view and a common taxonomy of these aspects. In this paper, we systematize our findings on IPP in ML, while focusing on threats and attacks identified and defenses proposed at the time of writing. We develop a comprehensive threat model for IP in ML, categorizing attacks and defenses within a unified and consolidated taxonomy, thus bridging research from both the ML and security communities.

    Security and Privacy Problems in Voice Assistant Applications: A Survey

    Voice assistant applications have become ubiquitous nowadays. Two models that provide the two most important functions for real-life applications (e.g., Google Home, Amazon Alexa, Siri, etc.) are Automatic Speech Recognition (ASR) models and Speaker Identification (SI) models. According to recent studies, security and privacy threats have also emerged with the rapid development of the Internet of Things (IoT). The security issues researched include attack techniques toward machine learning models and other hardware components widely used in voice assistant applications. The privacy issues include information stealing at the technical level and privacy breaches at the policy level. Voice assistant applications take a steadily growing market share every year, but their privacy and security issues have never stopped causing huge economic losses and endangering users' sensitive personal information. Thus, it is important to have a comprehensive survey that outlines the categorization of current research on the security and privacy problems of voice assistant applications. This paper summarizes and assesses five kinds of security attacks and three types of privacy threats in papers published in the top-tier conferences of the cyber security and voice domains. Comment: 5 figures