15,974 research outputs found
QSAR study for carcinogenicity in a large set of organic compounds
In our continuing efforts to find out acceptable Absorption, Distribution, Metabolization, Elimination and Toxicity (ADMET) properties of organic compounds, we establish linear QSAR models for the carcinogenic potential prediction of 1464 compounds taken from the "Galvez data set", that include many marketed drugs. More than a thousand of geometry-independent molecular descriptors are simultaneously analyzed, obtained with the softwares E-Dragon and Recon. The variable subset selection method employed is the Replacement Method, and also the improved version Enhanced Replacement Method. The established models are properly validated through an external test set of compounds, and by means of the Leave-Group-Out Cross Validation method. In addition, we apply the Y-Randomization strategy and analyze the Applicability Domain of the developed model. Finally, we compare the results obtained in present study with the previous ones from the literature. The novelty of present work relies on the development of an alternative predictive structure-carcinogenicity relationship in a large heterogeneous set of organic compounds, by only using a reduced number of geometry independent molecular descriptors.Fil: Duchowicz, Pablo Román. Consejo Nacional de Investigaciones CientĂficas y TĂ©cnicas. Centro CientĂfico TecnolĂłgico Conicet - La Plata. Instituto de Investigaciones FisicoquĂmicas TeĂłricas y Aplicadas. Universidad Nacional de La Plata. Facultad de Ciencias Exactas. Instituto de Investigaciones FisicoquĂmicas TeĂłricas y Aplicadas; ArgentinaFil: Comelli, Nieves Carolina. Universidad Nacional de Catamarca. Facultad de Ciencias Agrarias; ArgentinaFil: Ortiz, Erlinda del Valle. Universidad Nacional de Catamarca. Facultad de TecnologĂa y Ciencias Aplicadas; ArgentinaFil: Castro, Eduardo Alberto. Consejo Nacional de Investigaciones CientĂficas y TĂ©cnicas. Centro CientĂfico TecnolĂłgico Conicet - La Plata. Instituto de Investigaciones FisicoquĂmicas TeĂłricas y Aplicadas. Universidad Nacional de La Plata. Facultad de Ciencias Exactas. Instituto de Investigaciones FisicoquĂmicas TeĂłricas y Aplicadas; Argentin
kLog: A Language for Logical and Relational Learning with Kernels
We introduce kLog, a novel approach to statistical relational learning.
Unlike standard approaches, kLog does not represent a probability distribution
directly. It is rather a language to perform kernel-based learning on
expressive logical and relational representations. kLog allows users to specify
learning problems declaratively. It builds on simple but powerful concepts:
learning from interpretations, entity/relationship data modeling, logic
programming, and deductive databases. Access by the kernel to the rich
representation is mediated by a technique we call graphicalization: the
relational representation is first transformed into a graph --- in particular,
a grounded entity/relationship diagram. Subsequently, a choice of graph kernel
defines the feature space. kLog supports mixed numerical and symbolic data, as
well as background knowledge in the form of Prolog or Datalog programs as in
inductive logic programming systems. The kLog framework can be applied to
tackle the same range of tasks that has made statistical relational learning so
popular, including classification, regression, multitask learning, and
collective classification. We also report about empirical comparisons, showing
that kLog can be either more accurate, or much faster at the same level of
accuracy, than Tilde and Alchemy. kLog is GPLv3 licensed and is available at
http://klog.dinfo.unifi.it along with tutorials
The benefits of in silico modeling to identify possible small-molecule drugs and their off-target interactions
Accepted for publication in a future issue of Future Medicinal Chemistry.The research into the use of small molecules as drugs continues to be a key driver in the development of molecular databases, computer-aided drug design software and collaborative platforms. The evolution of computational approaches is driven by the essential criteria that a drug molecule has to fulfill, from the affinity to targets to minimal side effects while having adequate absorption, distribution, metabolism, and excretion (ADME) properties. A combination of ligand- and structure-based drug development approaches is already used to obtain consensus predictions of small molecule activities and their off-target interactions. Further integration of these methods into easy-to-use workflows informed by systems biology could realize the full potential of available data in the drug discovery and reduce the attrition of drug candidates.Peer reviewe
ChemRL-GEM: Geometry Enhanced Molecular Representation Learning for Property Prediction
Effective molecular representation learning is of great importance to
facilitate molecular property prediction, which is a fundamental task for the
drug and material industry. Recent advances in graph neural networks (GNNs)
have shown great promise in applying GNNs for molecular representation
learning. Moreover, a few recent studies have also demonstrated successful
applications of self-supervised learning methods to pre-train the GNNs to
overcome the problem of insufficient labeled molecules. However, existing GNNs
and pre-training strategies usually treat molecules as topological graph data
without fully utilizing the molecular geometry information. Whereas, the
three-dimensional (3D) spatial structure of a molecule, a.k.a molecular
geometry, is one of the most critical factors for determining molecular
physical, chemical, and biological properties. To this end, we propose a novel
Geometry Enhanced Molecular representation learning method (GEM) for Chemical
Representation Learning (ChemRL). At first, we design a geometry-based GNN
architecture that simultaneously models atoms, bonds, and bond angles in a
molecule. To be specific, we devised double graphs for a molecule: The first
one encodes the atom-bond relations; The second one encodes bond-angle
relations. Moreover, on top of the devised GNN architecture, we propose several
novel geometry-level self-supervised learning strategies to learn spatial
knowledge by utilizing the local and global molecular 3D structures. We compare
ChemRL-GEM with various state-of-the-art (SOTA) baselines on different
molecular benchmarks and exhibit that ChemRL-GEM can significantly outperform
all baselines in both regression and classification tasks. For example, the
experimental results show an overall improvement of 8.8% on average compared to
SOTA baselines on the regression tasks, demonstrating the superiority of the
proposed method
Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?
The organization and mining of malaria genomic and post-genomic data is
highly motivated by the necessity to predict and characterize new biological
targets and new drugs. Biological targets are sought in a biological space
designed from the genomic data from Plasmodium falciparum, but using also the
millions of genomic data from other species. Drug candidates are sought in a
chemical space containing the millions of small molecules stored in public and
private chemolibraries. Data management should therefore be as reliable and
versatile as possible. In this context, we examined five aspects of the
organization and mining of malaria genomic and post-genomic data: 1) the
comparison of protein sequences including compositionally atypical malaria
sequences, 2) the high throughput reconstruction of molecular phylogenies, 3)
the representation of biological processes particularly metabolic pathways, 4)
the versatile methods to integrate genomic data, biological representations and
functional profiling obtained from X-omic experiments after drug treatments and
5) the determination and prediction of protein structures and their molecular
docking with drug candidate structures. Progresses toward a grid-enabled
chemogenomic knowledge space are discussed.Comment: 43 pages, 4 figures, to appear in Malaria Journa
Gode -- Integrating Biochemical Knowledge Graph into Pre-training Molecule Graph Neural Network
The precise prediction of molecular properties holds paramount importance in
facilitating the development of innovative treatments and comprehending the
intricate interplay between chemicals and biological systems. In this study, we
propose a novel approach that integrates graph representations of individual
molecular structures with multi-domain information from biomedical knowledge
graphs (KGs). Integrating information from both levels, we can pre-train a more
extensive and robust representation for both molecule-level and KG-level
prediction tasks with our novel self-supervision strategy. For performance
evaluation, we fine-tune our pre-trained model on 11 challenging chemical
property prediction tasks. Results from our framework demonstrate our
fine-tuned models outperform existing state-of-the-art models.Comment: It's an ongoing work. We're exploring the ability of Gode on other
task
- …