
    Beyond Chemical Language: A Multimodal Approach to Enhance Molecular Property Prediction

    We present a novel multimodal language model approach for predicting molecular properties by combining a chemical language representation with physicochemical features. Our approach, MULTIMODAL-MOLFORMER, uses a causal multistage feature selection method that identifies physicochemical features based on their direct causal effect on a specific target property. These causal features are then integrated with the vector space generated by molecular embeddings from MOLFORMER. In particular, we employ Mordred descriptors as physicochemical features and identify the Markov blanket of the target property, which theoretically contains the most relevant features for accurate prediction. Our results demonstrate the superior performance of the proposed approach compared with existing state-of-the-art algorithms, including the chemical language-based MOLFORMER and graph neural networks, on complex tasks such as biodegradability and PFAS toxicity estimation. Moreover, we demonstrate the effectiveness of our feature selection method in reducing the dimensionality of the Mordred feature space while maintaining or improving the model's performance. Our approach opens up promising avenues for future research in molecular property prediction by harnessing the synergistic potential of both chemical language and physicochemical features, leading to enhanced performance and advances in the field.
    Comment: 14 pages, 6 figures, 5 tables. Submitted to NeurIPS 2023, under review.
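
    A minimal sketch of the fusion step described above, assuming the pooled MOLFORMER embedding and the causally selected Mordred descriptors are already available as vectors; the module name, dimensions, and regression head are illustrative assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class EmbeddingDescriptorFusion(nn.Module):
        """Concatenate a chemical-language embedding with selected
        physicochemical descriptors and regress a target property.
        Dimensions are hypothetical, not the paper's architecture."""
        def __init__(self, emb_dim=768, desc_dim=32, hidden=256):
            super().__init__()
            self.head = nn.Sequential(
                nn.Linear(emb_dim + desc_dim, hidden),
                nn.ReLU(),
                nn.Linear(hidden, 1),
            )

        def forward(self, molformer_emb, mordred_feats):
            # molformer_emb: (batch, emb_dim) pooled MOLFORMER embedding
            # mordred_feats: (batch, desc_dim) Mordred descriptors retained by
            # the causal selection (e.g. the Markov blanket of the target)
            fused = torch.cat([molformer_emb, mordred_feats], dim=-1)
            return self.head(fused)

    # Toy usage with random tensors standing in for real features
    model = EmbeddingDescriptorFusion()
    print(model(torch.randn(4, 768), torch.randn(4, 32)).shape)  # torch.Size([4, 1])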

    Comparison of deep-learning data fusion strategies in mandibular osteoradionecrosis prediction modelling using clinical variables and radiation dose distribution volumes

    Purpose. Normal tissue complication probability (NTCP) modelling is rapidly embracing deep learning (DL) methods as the need to include spatial dose information is acknowledged. Finding the most appropriate way of combining radiation dose distribution images and clinical data involves technical challenges and requires domain knowledge. We propose different data fusion strategies that we hope will serve as a starting point for future DL NTCP studies. Methods. Early, joint and late DL multi-modality fusion strategies were compared using clinical variables and mandibular radiation dose distribution volumes. The discriminative performance of the multi-modality models was compared to that of single-modality models. All the experiments were conducted on a control-case matched cohort of 92 osteoradionecrosis (ORN) cases and 92 controls from a single institution. Results. The highest ROC AUC score was obtained with the late fusion model (0.70), but no statistically significant differences in discrimination performance were observed between strategies. While late fusion was the least technically complex strategy, its design did not model the inter-modality interactions that are required for NTCP modelling. Joint fusion involved the most complex design but resulted in a single network training process that included intra- and inter-modality interactions in its model parameter optimisation. Conclusions. This is the first study to compare different strategies for including image data in DL NTCP models in combination with lower-dimensional data such as clinical variables. The discrimination performance of such multi-modality NTCP models and the choice of fusion strategy will depend on the distribution and quality of both types of data. We encourage future DL NTCP studies to report on different fusion strategies to better justify their choice of DL pipeline.
    Comment: 10 pages, 4 figures, 3 tables.
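
    A hedged sketch of the joint and late fusion patterns compared above, for a 3D dose distribution volume plus tabular clinical variables; the layer sizes, the tiny 3D CNN, and the averaging rule for late fusion are assumptions for illustration, not the study's architecture.

    import torch
    import torch.nn as nn

    class DoseBranch(nn.Module):
        """Tiny 3D CNN encoder for a mandibular dose distribution volume."""
        def __init__(self, out_dim=32):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool3d(1),
            )
            self.fc = nn.Linear(8, out_dim)

        def forward(self, x):
            return self.fc(self.conv(x).flatten(1))

    class ClinicalBranch(nn.Module):
        """Small MLP encoder for tabular clinical variables."""
        def __init__(self, in_dim=10, out_dim=16):
            super().__init__()
            self.fc = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU())

        def forward(self, x):
            return self.fc(x)

    class JointFusion(nn.Module):
        """Joint fusion: one network trained end to end on concatenated
        per-modality features, so intra- and inter-modality interactions
        are part of the same parameter optimisation."""
        def __init__(self):
            super().__init__()
            self.dose, self.clin = DoseBranch(), ClinicalBranch()
            self.head = nn.Linear(32 + 16, 1)

        def forward(self, dose_vol, clin_vars):
            feats = torch.cat([self.dose(dose_vol), self.clin(clin_vars)], dim=-1)
            return self.head(feats)

    def late_fusion(p_dose, p_clin):
        """Late fusion: combine outputs of independently trained
        single-modality models; no inter-modality interactions are learned."""
        return 0.5 * (p_dose + p_clin)

    # Toy usage
    model = JointFusion()
    dose = torch.randn(2, 1, 16, 16, 16)   # (batch, channel, D, H, W)
    clin = torch.randn(2, 10)
    print(torch.sigmoid(model(dose, clin)).shape)  # torch.Size([2, 1])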

    Prediction of Total Drug Clearance in Humans Using Animal Data: Proposal of a Multimodal Learning Method Based on Deep Learning

    Research into pharmacokinetics plays an important role in the development process of new drugs. Accurately predicting human pharmacokinetic parameters from preclinical data can increase the success rate of clinical trials. Since clearance (CL), which indicates the capacity of the entire body to process a drug, is one of the most important parameters, many prediction methods have been developed. However, there is still room for improvement for practical use in drug discovery research, namely "improving CL prediction accuracy" and "understanding the chemical structure of compounds in terms of pharmacokinetics". To address these points, this research proposes a multimodal learning method based on deep learning that takes not only the chemical structure of a drug but also rat CL as inputs. Good results were obtained compared with the conventional animal scale-up method: the geometric mean fold error was 2.68 and the proportion of compounds with prediction errors of 2-fold or less was 48.5%. Furthermore, partial structures useful for CL prediction could be inferred with a structure-contributing-factor inference method, and the validity of this structural interpretation with respect to metabolic stability was confirmed by chemists.
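
    For reference, a small sketch of how the two reported metrics are commonly computed, namely the geometric mean fold error (GMFE) and the proportion of compounds predicted within 2-fold of the observed human CL; the toy values below are illustrative and are not the study's data.

    import numpy as np

    def gmfe(pred, obs):
        """Geometric mean fold error: 10 ** mean(|log10(pred / obs)|)."""
        pred, obs = np.asarray(pred, float), np.asarray(obs, float)
        return 10 ** np.mean(np.abs(np.log10(pred / obs)))

    def within_2fold(pred, obs):
        """Fraction of compounds predicted within 2-fold of the observed value."""
        pred, obs = np.asarray(pred, float), np.asarray(obs, float)
        fold = np.maximum(pred / obs, obs / pred)
        return np.mean(fold <= 2.0)

    # Toy example: predicted vs. observed human CL (mL/min/kg)
    pred = [5.0, 12.0, 1.5, 30.0]
    obs = [4.0, 25.0, 1.2, 28.0]
    print(round(gmfe(pred, obs), 2), within_2fold(pred, obs))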

    Artificial General Intelligence for Radiation Oncology

    The emergence of artificial general intelligence (AGI) is transforming radiation oncology. As prominent vanguards of AGI, large language models (LLMs) such as GPT-4 and PaLM 2 can process extensive texts, and large vision models (LVMs) such as the Segment Anything Model (SAM) can process extensive imaging data, to enhance the efficiency and precision of radiation therapy. This paper explores full-spectrum applications of AGI across radiation oncology, including initial consultation, simulation, treatment planning, treatment delivery, treatment verification, and patient follow-up. The fusion of vision data with LLMs also creates powerful multimodal models that elucidate nuanced clinical patterns. Together, AGI promises to catalyze a shift towards data-driven, personalized radiation therapy. However, these models should complement human expertise and care. This paper provides an overview of how AGI can transform radiation oncology to elevate the standard of patient care, with the key insight being AGI's ability to exploit multimodal clinical data at scale.

    Morphological Profiling for Drug Discovery in the Era of Deep Learning

    Morphological profiling is a valuable tool in phenotypic drug discovery. The advent of high-throughput automated imaging has enabled the capture of a wide range of morphological features of cells or organisms in response to perturbations at single-cell resolution. Concurrently, significant advances in machine learning and deep learning, especially in computer vision, have led to substantial improvements in analyzing large-scale high-content images at high throughput. These efforts have facilitated understanding of compound mechanism-of-action (MOA), drug repurposing, and characterization of cell morphodynamics under perturbation, ultimately contributing to the development of novel therapeutics. In this review, we provide a comprehensive overview of recent advances in the field of morphological profiling. We summarize the image profiling analysis workflow, survey a broad spectrum of analysis strategies encompassing feature engineering- and deep learning-based approaches, and introduce publicly available benchmark datasets. We place a particular emphasis on the application of deep learning in this pipeline, covering cell segmentation, image representation learning, and multimodal learning. Additionally, we illuminate the application of morphological profiling in phenotypic drug discovery and highlight potential challenges and opportunities in this field.
    Comment: 44 pages, 5 figures, 5 tables.
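
    As a small illustration of the profiling workflow summarised above, a sketch of one common feature-engineering step: aggregating per-cell measurements into per-well morphological profiles and standardising them. The feature names, wells, and data below are hypothetical.

    import numpy as np
    import pandas as pd

    # Hypothetical single-cell feature table: one row per segmented cell,
    # with the well it came from plus a few morphology features.
    rng = np.random.default_rng(0)
    cells = pd.DataFrame({
        "well": rng.choice(["A01", "A02", "B01"], size=300),
        "area": rng.normal(250, 40, 300),
        "eccentricity": rng.uniform(0, 1, 300),
        "intensity_mean": rng.normal(1.0, 0.2, 300),
    })
    cells["compound"] = cells["well"].map(
        {"A01": "DMSO", "A02": "cmpd_1", "B01": "cmpd_2"}
    )
    feature_cols = ["area", "eccentricity", "intensity_mean"]

    # Aggregate single-cell measurements to well-level profiles (median),
    # then z-score each feature across wells so profiles are comparable.
    profiles = cells.groupby(["well", "compound"])[feature_cols].median()
    profiles = (profiles - profiles.mean()) / profiles.std()
    print(profiles)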