Beyond Chemical Language: A Multimodal Approach to Enhance Molecular Property Prediction
We present a novel multimodal language model approach for predicting
molecular properties by combining chemical language representation with
physicochemical features. Our approach, MULTIMODAL-MOLFORMER, utilizes a causal
multistage feature selection method that identifies physicochemical features
based on their direct causal effect on a specific target property. These causal
features are then integrated with the vector space generated by molecular
embeddings from MOLFORMER. In particular, we employ Mordred descriptors as
physicochemical features and identify the Markov blanket of the target
property, which theoretically contains the most relevant features for accurate
prediction. Our results demonstrate a superior performance of our proposed
approach compared to existing state-of-the-art algorithms, including the
chemical language-based MOLFORMER and graph neural networks, in predicting
complex tasks such as biodegradability and PFAS toxicity estimation. Moreover,
we demonstrate the effectiveness of our feature selection method in reducing
the dimensionality of the Mordred feature space while maintaining or improving
the model's performance. Our approach opens up promising avenues for future
research in molecular property prediction by harnessing the synergistic
potential of both chemical language and physicochemical features, leading to
enhanced performance and advancements in the field.
Comment: 14 pages, 6 figures, 5 tables. Submitted to NeurIPS 2023, under review.
Comparison of deep-learning data fusion strategies in mandibular osteoradionecrosis prediction modelling using clinical variables and radiation dose distribution volumes
Purpose. Normal tissue complication probability (NTCP) modelling is rapidly
embracing deep learning (DL) methods as the need to
include spatial dose information is acknowledged. Finding the most appropriate
way of combining radiation dose distribution images and clinical data involves
technical challenges and requires domain knowledge. We propose different data
fusion strategies that we hope will serve as a starting point for future DL
NTCP studies. Methods. Early, joint and late DL multi-modality fusion
strategies were compared using clinical variables and mandibular radiation dose
distribution volumes. The discriminative performance of the multi-modality
models was compared to that of single-modality models. All the experiments were
conducted on a control-case matched cohort of 92 ORN cases and 92 controls from
a single institution. Results. The highest ROC AUC score was obtained with the
late fusion model (0.70), but no statistically significant differences in
discrimination performance were observed between strategies. While late fusion
was the least technically complex strategy, its design did not model the
inter-modality interactions that are required for NTCP modelling. Joint fusion
involved the most complex design but resulted in a single network training
process which included intra- and inter-modality interactions in its model
parameter optimisation. Conclusions. This is the first study that compares
different strategies for including image data into DL NTCP models in
combination with lower dimensional data such as clinical variables. The
discrimination performance of such multi-modality NTCP models and the choice of
fusion strategy will depend on the distribution and quality of both types of
data. We encourage future DL NTCP studies to report on different fusion
strategies to better justify their choice of DL pipeline.
Comment: 10 pages, 4 figures, 3 tables.
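The difference between early and late fusion described in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the per-patient feature vectors, and the equal weighting are all assumptions made for illustration, and the dose volume is assumed to have already been reduced to a flat feature vector. Joint fusion, in which a single network learns both modalities end to end, is omitted because it requires a trainable model.

```python
def early_fusion(clinical_rows, dose_rows):
    """Early fusion: concatenate both modalities per patient so that a
    single downstream model sees one combined feature vector."""
    if len(clinical_rows) != len(dose_rows):
        raise ValueError("both modalities must cover the same patients")
    return [list(c) + list(d) for c, d in zip(clinical_rows, dose_rows)]


def late_fusion(prob_clinical, prob_dose, weight=0.5):
    """Late fusion: combine the output probabilities of two models that
    were trained independently, one per modality (weighted average)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(prob_clinical) != len(prob_dose):
        raise ValueError("both models must score the same patients")
    return [weight * pc + (1.0 - weight) * pd
            for pc, pd in zip(prob_clinical, prob_dose)]


# Hypothetical toy inputs: two patients, two clinical variables,
# three dose-derived features each.
clinical = [[1, 0], [0, 1]]
dose = [[0.5, 0.7, 0.9], [0.1, 0.2, 0.3]]
fused_features = early_fusion(clinical, dose)

# Hypothetical per-patient complication probabilities from two
# single-modality models.
combined = late_fusion([0.2, 0.8], [0.4, 0.6])
```

Because late fusion only averages the final scores, neither model ever sees the other modality's inputs, which is why the abstract notes that it cannot model inter-modality interactions.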
Prediction of Total Drug Clearance in Humans Using Animal Data: Proposal of a Multimodal Learning Method Based on Deep Learning
Research into pharmacokinetics plays an important role in the development of new drugs. Accurately predicting human pharmacokinetic parameters from preclinical data can increase the success rate of clinical trials. Because clearance (CL), which indicates the capacity of the entire body to process a drug, is one of the most important parameters, many methods have been developed to predict it. However, there is still room for improvement for practical use in drug discovery research in two respects: "improving CL prediction accuracy" and "understanding the chemical structure of compounds in terms of pharmacokinetics". To address both, this research proposes a multimodal learning method based on deep learning that takes not only the chemical structure of a drug but also rat CL as inputs. Good results were obtained compared with the conventional animal scale-up method: the geometric mean fold error was 2.68, and the proportion of compounds with prediction errors of 2-fold or less was 48.5%. Furthermore, it was found to be possible to infer the partial structures useful for CL prediction using a structure contributing factor inference method. The validity of these structural interpretations of metabolic stability was confirmed by chemists.
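The two accuracy figures quoted in this abstract can be computed as in the sketch below. The abstract does not define them, so this assumes the definitions commonly used in pharmacokinetics: the fold error of a prediction is the larger of predicted/observed and observed/predicted, and the geometric mean fold error (GMFE) is the geometric mean of those values. The function names and the toy clearance values are hypothetical.

```python
import math


def fold_errors(predicted, observed):
    """Per-compound fold error: max(pred/obs, obs/pred), always >= 1.
    Assumes all values are strictly positive."""
    return [max(p / o, o / p) for p, o in zip(predicted, observed)]


def geometric_mean_fold_error(predicted, observed):
    """GMFE: exponential of the mean log fold error."""
    errors = fold_errors(predicted, observed)
    return math.exp(sum(math.log(e) for e in errors) / len(errors))


def fraction_within_fold(predicted, observed, fold=2.0):
    """Share of compounds whose fold error is at most `fold`."""
    errors = fold_errors(predicted, observed)
    return sum(e <= fold for e in errors) / len(errors)


# Hypothetical clearance values (arbitrary units), not data from the paper.
predicted_cl = [2.0, 5.0, 40.0, 1.0]
observed_cl = [1.0, 10.0, 10.0, 1.0]

gmfe = geometric_mean_fold_error(predicted_cl, observed_cl)
within_2_fold = fraction_within_fold(predicted_cl, observed_cl)
```

Under these definitions, a GMFE of 2.68 means predictions are off by a factor of about 2.7 on average, in either direction.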
Artificial General Intelligence for Radiation Oncology
The emergence of artificial general intelligence (AGI) is transforming
radiation oncology. As prominent vanguards of AGI, large language models (LLMs)
such as GPT-4 and PaLM 2 can process extensive texts and large vision models
(LVMs) such as the Segment Anything Model (SAM) can process extensive imaging
data to enhance the efficiency and precision of radiation therapy. This paper
explores full-spectrum applications of AGI across radiation oncology including
initial consultation, simulation, treatment planning, treatment delivery,
treatment verification, and patient follow-up. The fusion of vision data with
LLMs also creates powerful multimodal models that elucidate nuanced clinical
patterns. Together, AGI promises to catalyze a shift towards data-driven,
personalized radiation therapy. However, these models should complement human
expertise and care. This paper provides an overview of how AGI can transform
radiation oncology to elevate the standard of patient care in radiation
oncology, with the key insight being AGI's ability to exploit multimodal
clinical data at scale.
Morphological Profiling for Drug Discovery in the Era of Deep Learning
Morphological profiling is a valuable tool in phenotypic drug discovery. The
advent of high-throughput automated imaging has enabled the capturing of a wide
range of morphological features of cells or organisms in response to
perturbations at the single-cell resolution. Concurrently, significant advances
in machine learning and deep learning, especially in computer vision, have led
to substantial improvements in analyzing large-scale high-content images at
high-throughput. These efforts have facilitated understanding of compound
mechanism-of-action (MOA), drug repurposing, and characterization of cell
morphodynamics under perturbation, ultimately contributing to the
development of novel therapeutics. In this review, we provide a comprehensive
overview of the recent advances in the field of morphological profiling. We
summarize the image profiling analysis workflow, survey a broad spectrum of
analysis strategies encompassing feature engineering- and deep learning-based
approaches, and introduce publicly available benchmark datasets. We place a
particular emphasis on the application of deep learning in this pipeline,
covering cell segmentation, image representation learning, and multimodal
learning. Additionally, we illuminate the application of morphological
profiling in phenotypic drug discovery and highlight potential challenges and
opportunities in this field.
Comment: 44 pages, 5 figures, 5 tables.