Search CORE

9 research outputs found

Efficient enumeration of monocyclic chemical graphs with given path frequencies.

Author: Akutsu Tatsuya
Nagamochi Hiroshi
Suzuki Masaki
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/05/2014

[Background] The enumeration of chemical graphs (molecular graphs) satisfying given constraints is one of the fundamental problems in chemoinformatics and bioinformatics because it leads to a variety of useful applications including structure determination and development of novel chemical compounds. [Results] We consider the problem of enumerating chemical graphs with monocyclic structure (a graph structure that contains exactly one cycle) from a given set of feature vectors, where a feature vector represents the frequency of the prescribed paths in a chemical compound to be constructed and the set is specified by a pair of upper and lower feature vectors. To enumerate all tree-like (acyclic) chemical graphs from a given set of feature vectors, Shimizu et al. and Suzuki et al. proposed efficient branch-and-bound algorithms based on a fast tree enumeration algorithm. In this study, we devise a novel method for extending these algorithms to enumeration of chemical graphs with monocyclic structure by designing a fast algorithm for testing uniqueness. The results of computational experiments reveal that the computational efficiency of the new algorithm is as good as those for enumeration of tree-like chemical compounds. [Conclusions] We succeed in expanding the class of chemical graphs that are able to be enumerated efficiently.
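
To make the notion of a path-frequency feature vector concrete, here is a minimal Python sketch. It is not the authors' algorithm: it only counts label sequences of simple paths with at most `max_len` edges in a small graph and checks the counts against a pair of lower and upper vectors. The function names, the graph encoding (atom labels plus an edge list, bond multiplicity ignored), and the convention that a path missing from `upper` is forbidden are all assumptions made for illustration.

```python
from collections import Counter


def path_frequencies(labels, edges, max_len):
    """Count label sequences of simple paths with at most max_len edges.

    labels: list of atom labels, indexed by vertex (hypothetical encoding).
    edges:  list of undirected (u, v) pairs; bond multiplicity is ignored.
    """
    adj = {v: [] for v in range(len(labels))}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    directed = Counter()

    def extend(path):
        seq = tuple(labels[v] for v in path)
        # A path and its reverse are the same undirected path: use one key.
        directed[min(seq, seq[::-1])] += 1
        if len(path) - 1 < max_len:
            for w in adj[path[-1]]:
                if w not in path:  # keep the path simple
                    extend(path + [w])

    for start in adj:
        extend([start])

    # Paths with at least one edge were walked from both ends, so halve them.
    return {seq: (n if len(seq) == 1 else n // 2) for seq, n in directed.items()}


def within_bounds(freq, lower, upper):
    """True if lower <= freq <= upper componentwise (missing entries count as 0)."""
    keys = set(freq) | set(lower) | set(upper)
    return all(lower.get(k, 0) <= freq.get(k, 0) <= upper.get(k, 0) for k in keys)


# A monocyclic toy graph: a three-carbon ring with one oxygen attached.
ring_labels = ["C", "C", "C", "O"]
ring_edges = [(0, 1), (1, 2), (2, 0), (0, 3)]
ring_freq = path_frequencies(ring_labels, ring_edges, max_len=2)
```

In this toy graph, `ring_freq` maps each label sequence such as `("C", "C", "O")` to the number of matching simple paths; an enumeration procedure would keep a candidate graph only if `within_bounds` holds for the given pair of vectors.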

Crossref

PubMed Central

Kyoto University Research Information Repository

A novel method for inference of chemical compounds of cycle index two with desired properties based on artificial neural networks and integer programming

Author: Akutsu Tatsuya
Nagamochi Hiroshi
Shurbevski Aleksandar
Wang Chenxi
Zhu Jianshen
Publication venue: 'MDPI AG'
Publication date: 01/05/2020

Inference of chemical compounds with desired properties is important for drug design, chemo-informatics, and bioinformatics, to which various algorithmic and machine learning techniques have been applied. Recently, a novel method has been proposed for this inference problem using both artificial neural networks (ANN) and mixed integer linear programming (MILP). This method consists of the training phase and the inverse prediction phase. In the training phase, an ANN is trained so that the output of the ANN takes a value nearly equal to a given chemical property for each sample. In the inverse prediction phase, a chemical structure is inferred using MILP and enumeration so that the structure can have a desired output value for the trained ANN. However, the framework has been applied only to the case of acyclic and monocyclic chemical compounds so far. In this paper, we significantly extend the framework and present a new method for the inference problem for rank-2 chemical compounds (chemical graphs with cycle index 2). The results of computational experiments using such chemical properties as octanol/water partition coefficient, melting point, and boiling point suggest that the proposed method is much more useful than the previous method

Multidisciplinary Digital Publishing Institute

Kyoto University Research Information Repository

Efficient enumeration of monocyclic chemical graphs with given path frequencies

Author: A Cayley
BG Buchanan
C Jordan
E Byvatov
EM Luks
F Harary
Faulon J-L
G Pólya
GH Bakir
H Fujiwara
H Kashima
H Mauser
H Nagamochi
Hiroshi Nagamochi
JL Faulon
K Funatsu
L Bytautas
L Kier
LH Hall
M Deshpande
M Fürer
M Kanehisa
M Shimizu
M Suzuki
Masaki Suzuki
N Ueda
R Gugisch
S Nakano
T Akutsu
T Fink
T Miyao
Tatsuya Akutsu
WWL Wong
Y Ishida
Publication venue: 'Springer Science and Business Media LLC'

Crossref

A novel method for inference of acyclic chemical compounds with bounded branch-height based on artificial neural networks and integer programming

Author: Akutsu Tatsuya
Azam Naveed Ahmed
Nagamochi Hiroshi
Shi Yu
Shurbevski Aleksandar
Sun Yanming
Zhao Liang
Zhu Jianshen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2021

Analysis of chemical graphs is becoming a major research topic in computational molecular biology due to its potential applications to drug design. One of the major approaches in such a study is inverse quantitative structure activity/property relationship (inverse QSAR/QSPR) analysis, which is to infer chemical structures from given chemical activities/properties. Recently, a novel two-phase framework has been proposed for inverse QSAR/QSPR, where in the first phase an artificial neural network (ANN) is used to construct a prediction function. In the second phase, a mixed integer linear program (MILP) formulated on the trained ANN and a graph search algorithm are used to infer desired chemical structures. The framework has been applied to the case of chemical compounds with cycle index up to 2 so far. The computational results conducted on instances with n non-hydrogen atoms show that a feature vector can be inferred by solving an MILP for up to n=40, whereas graphs can be enumerated for up to n=15. When applied to the case of chemical acyclic graphs, the maximum computable diameter of a chemical structure was up to 8. In this paper, we introduce a new characterization of graph structure, called “branch-height” based on which a new MILP formulation and a new graph search algorithm are designed for chemical acyclic graphs. The results of computational experiments using such chemical properties as octanol/water partition coefficient, boiling point and heat of combustion suggest that the proposed method can infer chemical acyclic graphs with around n=50 and diameter 30

Directory of Open Access Journals

Kyoto University Research Information Repository

A Novel Method for Inference of Acyclic Chemical Compounds with Bounded Branch-height Based on Artificial Neural Networks and Integer Programming

Author: Akutsu Tatsuya
Azam Naveed Ahmed
Nagamochi Hiroshi
Shi Yu
Shurbevski Aleksandar
Sun Yanming
Zhao Liang
Zhu Jianshen
Publication date: 21/09/2020

Analysis of chemical graphs is a major research topic in computational molecular biology due to its potential applications to drug design. One approach is inverse quantitative structure activity/property relationship (inverse QSAR/QSPR) analysis, which is to infer chemical structures from given chemical activities/properties. Recently, a framework has been proposed for inverse QSAR/QSPR using artificial neural networks (ANN) and mixed integer linear programming (MILP). This method consists of a prediction phase and an inverse prediction phase. In the first phase, a feature vector $f(G)$ of a chemical graph $G$ is introduced and a prediction function $\psi$ on a chemical property $\pi$ is constructed with an ANN. In the second phase, given a target value $y^*$ of property $\pi$, a feature vector $x^*$ is inferred by solving an MILP formulated from the trained ANN so that $\psi(x^*)$ is close to $y^*$ and then a set of chemical structures $G^*$ such that $f(G^*) = x^*$ is enumerated by a graph search algorithm. The framework has been applied to the case of chemical compounds with cycle index up to 2. The computational results conducted on instances with $n$ non-hydrogen atoms show that a feature vector $x^*$ can be inferred for up to around $n=40$ whereas graphs $G^*$ can be enumerated for up to $n=15$. When applied to the case of chemical acyclic graphs, the maximum computable diameter of $G^*$ was up to around 8. We introduce a new characterization of graph structure, "branch-height," based on which an MILP formulation and a graph search algorithm are designed for chemical acyclic graphs. The results of computational experiments using properties such as octanol/water partition coefficient, boiling point and heat of combustion suggest that the proposed method can infer chemical acyclic graphs $G^*$ with $n=50$ and diameter 30
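
The inverse prediction step in this abstract (find a feature vector whose predicted value is close to a target value) can be illustrated with a toy Python sketch. This is not the MILP formulation of the paper: a fixed linear function stands in for the trained ANN, and a brute-force search over a small integer box stands in for the MILP solver. All names, weights, bounds, and the tolerance are hypothetical.

```python
from itertools import product


def psi(x, weights, bias):
    """Stand-in prediction function: a linear model instead of a trained ANN."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def infer_feature_vectors(y_target, weights, bias, lower, upper, tol):
    """Return every integer vector x with lower <= x <= upper and
    |psi(x) - y_target| <= tol, found by exhaustive search (not by MILP)."""
    ranges = [range(lo, hi + 1) for lo, hi in zip(lower, upper)]
    return [
        x
        for x in product(*ranges)
        if abs(psi(x, weights, bias) - y_target) <= tol
    ]


# Hypothetical two-component feature space and target value.
candidates = infer_feature_vectors(
    y_target=1.0,
    weights=[0.5, -1.0],
    bias=0.0,
    lower=[0, 0],
    upper=[4, 2],
    tol=0.25,
)
```

Each vector in `candidates` plays the role of an inferred feature vector; the framework described above would then pass it to a graph search algorithm that enumerates chemical graphs with exactly that feature vector. Exhaustive search is only viable for tiny boxes, which is one reason an MILP formulation is attractive at realistic sizes.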

arXiv.org e-Print Archive

Directory of Open Access Journals

Kyoto University Research Information Repository

Supplementary information Efficient Enumeration of Monocyclic Chemical Graphs with Given Path Frequencies Masaki Suzuki, Hiroshi Nagamochi

Author: Tatsuya Akutsu

tree G = T + xy has no heavy edge incident to C and we check if π(G) = T according to the definition of parent in Case 1. Then we compute σ*(v) for all vertices v ∈ V(T) and σ*(e) for all simple edges e in C, as discussed in the definition of parent π in Case 1. If xy ∈ E*(C) then T + xy is a child of T; otherwise T + xy is not a child of T. To compute σ*, we need to know the signature σ(T_w) of each tree T_w with w ∈ N(v) and v ∈ V(T). Since C contains the root of T, all these trees T_w appear as subtrees of T rooted at w and we know their signature σ(T_w) from the codes δ(τ) and M(τ) on the labeling dfs(τ). With an adequate data structure, we can compute σ(T_w) of each tree T_w in O(1) time and testing whether T = π(T + xy) can be done in O(|V(C)|^2) time due to verifying that c*(xy) is the lexicographically maximum code among c*(e) for at most |V(C)| simple edges e in C, whose lengths are 2|V(C)| − 1. Case II. C does not contain the root (centroid) of T: Since the centroid is not in C, G = T + xy has a heavy edge v*w* incident to a vertex v* in C and we check if π(G) = T according to the definition of parent in Case 2. In Case 2, only one of the two edges in C incident to v* is removed from G to define its parent π(G). Hence G = T + xy can be a child of T only when xy is incident to v*, i.e., one of x and y (say x) is equal to v* and the other y is a descendant of v* = x (otherwis

CiteSeerX

RESEARCH ARTICLE

Efficient enumeration of monocyclic chemical graphs with given path frequencies. Masaki Suzuki 1, Hiroshi Nagamochi 1* and Tatsuya Akutsu 2*. Background: The enumeration of chemical graphs (molecular graphs) satisfying given constraints is one of the fundamental problems in chemoinformatics and bioinformatics because it leads to a variety of useful applications including structure determination and development of novel chemical compounds. Results: We consider the problem of enumerating chemical graphs with monocyclic structure (a graph structure that contains exactly one cycle) from a given set of feature vectors, where a feature vector represents the frequency of the prescribed paths in a chemical compound to be constructed and the set is specified by a pair of upper and lower feature vectors. To enumerate all tree-like (acyclic) chemical graphs from a given set of feature vectors, Shimizu et al. and Suzuki et al. proposed efficient branch-and-bound algorithms based on a fast tree enumeration algorithm. In this study, we devise a novel method for extending these algorithms to enumeration of chemical graphs with monocyclic structure by designing a fast algorithm for testing uniqueness. The results of computational experiments reveal that the computational efficiency of the new algorithm is as good as those for enumeration of tree-like chemical compounds. Conclusions: We succeed in expanding the class of chemical graphs that are able to be enumerated efficiently.

CiteSeerX