
12 research outputs found

Investigation of a Data Split Strategy Involving the Time Axis in Adverse Event Prediction Using Machine Learning

Author: Kusuhara Hiroyuki
Mizuno Tadahaya
Morita Katsuhisa
Publication date: 19/04/2022

Adverse events are a serious issue in drug development and many prediction methods using machine learning have been developed. The random split cross-validation is the de facto standard for model building and evaluation in machine learning, but care should be taken in adverse event prediction because this approach tends to be overoptimistic compared with the real-world situation. The time split, which uses the time axis, is considered suitable for real-world prediction. However, the differences in model performance obtained using the time and random splits are not fully understood. To understand the differences, we compared the model performance between the time and random splits using eight types of compound information as input, eight adverse events as targets, and six machine learning algorithms. The random split showed higher area under the curve values than did the time split for six of eight targets. The chemical spaces of the training and test datasets of the time split were similar, suggesting that the concept of applicability domain is insufficient to explain the differences derived from the splitting. The area under the curve differences were smaller for the protein interaction than for the other datasets. Subsequent detailed analyses suggested the danger of confounding in the use of knowledge-based information in the time split. These findings indicate the importance of understanding the differences between the time and random splits in adverse event prediction and suggest that appropriate use of the splitting strategies and interpretation of results are necessary for the real-world prediction of adverse events.

Comment: 20 pages, 4 figures

arXiv.org e-Print Archive

Difficulty in learning chirality for Transformer fed with SMILES

Author: Kusuhara Hiroyuki
Mizuno Tadahaya
Nemoto Shumpei
Yoshikai Yasuhiro
Publication date: 05/04/2023

Recent years have seen development of descriptor generation based on representation learning of extremely diverse molecules, especially those that apply natural language processing (NLP) models to SMILES, a literal representation of molecular structure. However, little research has been done on how these models understand chemical structure. To address this, we investigated the relationship between the learning progress of SMILES and chemical structure using a representative NLP model, the Transformer. The results suggest that while the Transformer learns partial structures of molecules quickly, it requires extended training to understand overall structures. Consistently, the accuracy of molecular property predictions using descriptors generated from models at different learning steps was similar from the beginning to the end of training. Furthermore, we found that the Transformer requires particularly long training to learn chirality and sometimes stagnates with low translation accuracy due to misunderstanding of enantiomers. These findings are expected to deepen understanding of NLP models in chemistry.

Comment: 20 pages, 6 figures

arXiv.org e-Print Archive

How does Transformer model evolve to learn diverse chemical structures?

Author: Tadahaya Mizuno (10579584)
Publication date: 17/02/2024

source code

FigShare

Difficulty in chirality recognition for Transformer architectures learning chemical structures from string representations

Author: Hiroyuki Kusuhara
Shumpei Nemoto
Tadahaya Mizuno
Yasuhiro Yoshikai
Publication venue: Nature Portfolio
Publication date: 01/02/2024

Recent years have seen rapid development of descriptor generation based on representation learning of extremely diverse molecules, especially those that apply natural language processing (NLP) models to SMILES, a literal representation of molecular structure. However, little research has been done on how these models understand chemical structure. To address this black box, we investigated the relationship between the learning progress of SMILES and chemical structure using a representative NLP model, the Transformer. We show that while the Transformer learns partial structures of molecules quickly, it requires extended training to understand overall structures. Consistently, the accuracy of molecular property predictions using descriptors generated from models at different learning steps was similar from the beginning to the end of training. Furthermore, we found that the Transformer requires particularly long training to learn chirality and sometimes stagnates with low performance due to misunderstanding of enantiomers. These findings are expected to deepen the understanding of NLP models in chemistry.

Directory of Open Access Journals

Development of a Novel Platform of Proteome Profiling Based on an Easy-to-Handle and Informative 2D-DIGE System

Author: Hiroyuki Kusuhara
Megumi Hori
Michiaki Kohno
Setsuo Kinoshita
Tadahaya Mizuno
Publication venue: Pharmaceutical Society of Japan

Crossref

Differential Roles of Ubiquitination in the Degradation Mechanism of Cell Surface–Resident Bile Salt Export Pump and Multidrug Resistance–Associated Protein 2

Author: Hisamitsu Hayashi
Kaori Inamura
Kensuke Aida
Tadahaya Mizuno
Yuichi Sugiyama
Publication venue: American Society for Pharmacology & Experimental Therapeutics (ASPET)

Crossref

Evaluation of Organic Anion Transporter 1A2-knock-in Mice as a Model of Human Blood-brain Barrier

Author: Eckhardt
Gao
Hiroyuki Kusuhara
Lowry
Mina Umetsu
Sprowl
Tadahaya Mizuno
Tatsuki Mochizuki
Tetsuya Terasaki
Yamato Sano
Yasuo Uchida
Publication venue: American Society for Pharmacology & Experimental Therapeutics (ASPET)

Crossref

Intestinal Atp8b1 dysfunction causes hepatic choline deficiency and steatohepatitis

Author: Akinari Fukuda
Ayano Inui
Daiki Abukawa
Hiroyuki Kusuhara
Hisamitsu Hayashi
Mitsuyoshi Suzuki
Mureo Kasahara
Ryutaro Tamura
Satoru Takahashi
Satoshi Nakano
Seiichi Shimizu
Seisuke Sakamoto
Seiya Mizuno
Shunsaku Kaji
Tadahaya Mizuno
Tatsuya Okamoto
Tomohiro Ando
Yoh Zen
Yoshihiro Azuma
Yusuke Sabu
Publication venue: Nature Portfolio
Publication date: 01/11/2023

Choline is an essential nutrient, and its deficiency causes steatohepatitis. Dietary phosphatidylcholine (PC) is digested into lysoPC (LPC), glycerophosphocholine, and choline in the intestinal lumen and is the primary source of systemic choline. However, the major PC metabolites absorbed in the intestinal tract remain unidentified. ATP8B1 is a P4-ATPase phospholipid flippase expressed in the apical membrane of the epithelium. Here, we use intestinal epithelial cell (IEC)-specific Atp8b1-knockout (Atp8b1IEC-KO) mice. These mice progress to steatohepatitis by 4 weeks. Metabolomic analysis and cell-based assays show that loss of Atp8b1 in IEC causes LPC malabsorption and thereby hepatic choline deficiency. Feeding choline-supplemented diets to lactating mice achieves complete recovery from steatohepatitis in Atp8b1IEC-KO mice. Analysis of samples from pediatric patients with ATP8B1 deficiency suggests its translational potential. This study indicates that Atp8b1 regulates hepatic choline levels through intestinal LPC absorption, encouraging the evaluation of choline supplementation therapy for steatohepatitis caused by ATP8B1 dysfunction.

Directory of Open Access Journals