Search CORE

5 research outputs found

On Maximum Common Subgraph Problems in Series-Parallel Graphs

Author: Kriege Nils
Kurpicz Florian
Mutzel Petra
Publication venue: Springer International Publishing
Publication date: 20/08/2021
Field of study

Inductive queries for a drug designing robot scientist

Author: A. Lingas
C. Hansch
C.A. Lipinski
D.R. Jones
D.R. Jones
H. Blockeel
J. Matousek
L. Raedt De
R.D. King
R.D. King
T. Gärtner
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010
Field of study

It is increasingly clear that machine learning algorithms need to be integrated in an iterative scientific discovery loop, in which data is queried repeatedly by means of inductive queries and where the computer provides guidance to the experiments that are being performed. In this chapter, we summarise several key challenges in achieving this integration of machine learning and data mining algorithms in methods for the discovery of Quantitative Structure Activity Relationships (QSARs). We introduce the concept of a robot scientist, in which all steps of the discovery process are automated; we discuss the representation of molecular data such that knowledge discovery tools can analyse it, and we discuss the adaptation of machine learning and data mining algorithms to guide QSAR experiments

Lirias

Crossref

Bournemouth University Research Online

The University of Manchester - Institutional Repository

DIAL UCLouvain

Open Babel: An open chemical toolbox

Author: A Amini
A Andronico
A Bender
A Gakh
A Karwath
A Maunz
A Maunz
A Poater
A Rappe
AA Gakh
AD Hill
B-b Yan
BD McKay
C Helma
C Reynès
Chris Morley
CR Jacob
Craig A James
CW Bullock
D Filimonov
D Lagorce
D Lagorce
D Weininger
DC Bas
DC Lonie
DR Koes
F Fontaine
Geoffrey R Hutchison
GL Holliday
HL Morgan
I Wallach
I Wallach
IV Filippov
IV Tetko
J Ahmed
J Ahmed
J Kazius
J Myers
J Wang
J Wang
JH Chen
JJ Langham
JL Melville
JL Sharman
K Fogel
K Martin
L Fabian
L Liu
L Schietgat
M Brüstle
M Buehler
M Dehmer
M Konyk
M Krier
M Kuhn
MA Meineke
MA Miteva
Michael Banck
MJ Gómez
N O'Boyle
N Zonta
NM O'Boyle
NM O'Boyle
Noel M O'Boyle
O Sperandio
P Lind
P Murray-Rust
P Murray-Rust
P Murray-Rust
P Murray-Rust
P Rydberg
P Tosco
P Tosco
R Esposito
RA Bauer
RA Bauer
RS Armen
S Arbor
S Ingsriswang
SV Trepalin
T Cheng
T Halgren
T Halgren
T Halgren
T Halgren
T Halgren
T Kogej
T Pencheva
Tim Vandermeersch
TWH Backman
U Schmidt
VV Mihaleva
William H Green
X Jiang
X Wang
YD Paila
Z Huang
Publication venue: BioMed Central
Publication date: 01/01/2011
Field of study

Background: A frequent problem in computational modeling is the interconversion of chemical structures between different formats. While standard interchange formats exist (for example, Chemical Markup Language) and de facto standards have arisen (for example, SMILES format), the need to interconvert formats is a continuing problem due to the multitude of different application areas for chemistry data, differences in the data stored by different formats (0D versus 3D, for example), and competition between software along with a lack of vendorneutral formats. Results: We discuss, for the first time, Open Babel, an open-source chemical toolbox that speaks the many languages of chemical data. Open Babel version 2.3 interconverts over 110 formats. The need to represent such a wide variety of chemical and molecular data requires a library that implements a wide range of cheminformatics algorithms, from partial charge assignment and aromaticity detection, to bond order perception and canonicalization. We detail the implementation of Open Babel, describe key advances in the 2.3 release, and outline a variety of uses both in terms of software products and scientific research, including applications far beyond simple format interconversion. Conclusions: Open Babel presents a solution to the proliferation of multiple chemical file formats. In addition, it provides a variety of useful utilities from conformer searching and 2D depiction, to filtering, batch conversion, and substructure and similarity searching. For developers, it can be used as a programming library to handle chemical data in areas such as organic chemistry, drug design, materials science, and computational chemistry. It is freely available under an open-source license fro

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

Irish Universities

PubMed Central

Cork Open Research Archive

An efficiently computable graph-based metric for the classification of small molecules

Author: Blockeel Hendrik
Bruynooghe Maurice
Ramon Jan
Schietgat Leander
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/10/2008
Field of study

In machine learning, there has been an increased interest in metrics on structured data. The application we focus on is drug discovery. Although graphs have become very popular for the representation of molecules, a lot of operations on graphs are NP-complete. Representing the molecules as outerplanar graphs, a subclass within general graphs, and using the block-and-bridge preserving subgraph isomorphism, we define a metric and we present an algorithm for computing it in polynomial time. We evaluate this metric and more generally also the block-and-bridge preserving matching operator on a large dataset of molecules, obtaining favorable results.status: publishe

Lirias

Comparing graphs

Author: Kriege Nils Morten
Publication venue
Publication date
Field of study

Graphs are a well-studied mathematical concept, which has become ubiquitous to represent structured data in many application domains like computer vision, social network analysis or chem- and bioinformatics. The ever-increasing amount of data in these domains requires to efficiently organize and extract information from large graph data sets. In this context techniques for comparing graphs are fundamental, e.g., in order to obtain meaningful similarity measures between graphs. These are a prerequisite for the application of a variety of data mining algorithms to the domain of graphs. Hence, various approaches to graph comparison evolved and are wide-spread in practice. This thesis is dedicated to two different strategies for comparing graphs: maximum common subgraph problems and graph kernels. We study maximum common subgraph problems, which are based on classical graph-theoretical concepts for graph comparison and are NP-hard in the general case. We consider variants of the maximum common subgraph problem in restricted graph classes, which are highly relevant for applications in cheminformatics. We develop a polynomial-time algorithm, which allows to compute a maximum common subgraph under block and bridge preserving isomorphism in series-parallel graphs. This generalizes the problem of computing maximum common biconnected subgraphs in series-parallel graphs. We show that previous approaches to this problem, which are based on the separators represented by standard graph decompositions, fail. We introduce the concept of potential separators to overcome this issue and use them algorithmically to solve the problem in series-parallel graphs. We present algorithms with improved bounds on running time for the subclass of outerplanar graphs. Finally, we establish a sufficient condition for maximum common subgraph variants to allow derivation of graph distance metrics. This leads to polynomial-time computable graph distance metrics in restricted graph classes. This progress constitutes a step towards solving practically relevant maximum common subgraph problems in polynomial time. The second contribution of this thesis is to graph kernels, which have their origin in specific data mining algorithms. A key property of graph kernels is that they allow to consider a large (possibly infinite) number of features and can support graphs with arbitrary annotation, while being efficiently computable. The main contributions of this part of the thesis are (i) the development of novel graph kernels, which are especially designed for attributed graphs with arbitrary annotations and (ii) the systematic study of implicit and explicit mapping into a feature space for computation of graph kernels w.r.t. its impact on the running time and the ability to consider arbitrary annotations. We propose graph kernels based on bijections between subgraphs and walks of fixed length. In an experimental study we show that these approaches provide a viable alternative to known techniques, in particular for graphs with complex annotations

Eldorado - Ressourcen aus und für Lehre, Studium und Forschung