    Combining multiple classifications of chemical structures using consensus clustering

    Consensus clustering involves combining multiple clusterings of the same set of objects to achieve a single clustering that will, hopefully, provide a better picture of the groupings that are present in a dataset. This Letter reports the use of consensus clustering methods on sets of chemical compounds represented by 2D fingerprints. Experiments with DUD, IDAlert, MDDR and MUV data suggest that consensus methods are unlikely to result in significant improvements in clustering effectiveness as compared to the use of a single clustering method.
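
    For readers unfamiliar with the technique, a common way to combine clusterings is evidence accumulation: count how often each pair of compounds co-clusters across several base partitions, then cluster the resulting co-association matrix. The Python sketch below illustrates this idea with scikit-learn and SciPy; the base clusterer, parameter values and function name are illustrative assumptions, not the methods evaluated in the Letter.

        import numpy as np
        from scipy.cluster.hierarchy import fcluster, linkage
        from scipy.spatial.distance import squareform
        from sklearn.cluster import KMeans

        def consensus_cluster(X, n_clusters=10, n_runs=20, seed=0):
            # Evidence-accumulation consensus: count how often each pair of molecules
            # lands in the same cluster across n_runs base k-means partitions.
            rng = np.random.default_rng(seed)
            n = X.shape[0]
            coassoc = np.zeros((n, n))
            for _ in range(n_runs):
                labels = KMeans(n_clusters=n_clusters, n_init=10,
                                random_state=int(rng.integers(1 << 31))).fit_predict(X)
                coassoc += (labels[:, None] == labels[None, :])
            coassoc /= n_runs
            # Consensus partition: average-linkage clustering of 1 - co-association.
            dist = 1.0 - coassoc
            np.fill_diagonal(dist, 0.0)
            Z = linkage(squareform(dist, checks=False), method="average")
            return fcluster(Z, t=n_clusters, criterion="maxclust")

        # X = ...  # (n_molecules, n_bits) array of unfolded 2D fingerprints
        # labels = consensus_cluster(X)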

    Clustering files of chemical structures using the Szekely-Rizzo generalization of Ward's method

    Ward's method is extensively used for clustering chemical structures represented by 2D fingerprints. This paper compares Ward clusterings of 14 datasets (containing between 278 and 4332 molecules) with those obtained using the Székely–Rizzo clustering method, a generalization of Ward's method. The clusters resulting from these two methods were evaluated by the extent to which the various classifications were able to group active molecules together, using a novel criterion of clustering effectiveness. Analysis of a total of 1400 classifications (Ward and Székely–Rizzo clustering methods, 14 different datasets, 5 different fingerprints and 10 different distance coefficients) demonstrated the general superiority of the Székely–Rizzo method. The distance coefficient first described by Soergel performed extremely well in these experiments, and this was also the case when it was used in simulated virtual screening experiments.
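
    For binary fingerprints, the Soergel distance sum(|x_i - y_i|) / sum(max(x_i, y_i)) reduces to one minus the Tanimoto similarity, which is why it pairs naturally with fingerprint-based clustering. The SciPy sketch below computes the pairwise Soergel matrix and hands it to an agglomerative linkage; it is an illustrative reconstruction under those assumptions, not the authors' exact protocol.

        import numpy as np
        from scipy.cluster.hierarchy import fcluster, linkage
        from scipy.spatial.distance import squareform

        def soergel_distance_matrix(fps):
            # Soergel distance: sum(|x_i - y_i|) / sum(max(x_i, y_i)); for binary
            # fingerprints this equals 1 - Tanimoto similarity.
            fps = np.asarray(fps, dtype=float)
            inter = fps @ fps.T                                  # |A AND B|
            counts = fps.sum(axis=1)
            union = counts[:, None] + counts[None, :] - inter    # |A OR B|
            sim = np.divide(inter, union, out=np.ones_like(inter), where=union > 0)
            dist = 1.0 - sim
            np.fill_diagonal(dist, 0.0)
            return dist

        # fps = ...  # (n_molecules, n_bits) 0/1 fingerprint matrix
        # dist = soergel_distance_matrix(fps)
        # SciPy applies the chosen linkage update to whatever dissimilarities it is
        # given, mirroring common chemoinformatics practice with 1 - Tanimoto distances.
        # Z = linkage(squareform(dist, checks=False), method="ward")
        # labels = fcluster(Z, t=50, criterion="maxclust")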

    Machine learning and its applications in reliability analysis systems

    In this thesis, we are interested in exploring some aspects of Machine Learning (ML) and its application in Reliability Analysis systems (RAs). We begin by investigating some ML paradigms and their techniques, go on to discuss the possible applications of ML in improving RAs performance, and lastly give guidelines for the architecture of learning RAs. Our survey of ML covers both neural-network learning and symbolic learning. For symbolic learning, five types of learning and their applications are discussed: rote learning, learning from instruction, learning from analogy, learning from examples, and learning from observation and discovery. The Reliability Analysis systems (RAs) presented in this thesis are mainly designed for maintaining plant safety, supported by two functions: a risk analysis function, i.e., failure mode and effect analysis (FMEA); and a diagnosis function, i.e., real-time fault location (RTFL). Three approaches to creating RAs are discussed. Based on our survey, we suggest that the best current design of RAs is to embed model-based RAs, i.e., MORA (as software), in a neural-network-based computer system (as hardware). However, further improvements can be made through the application of Machine Learning. By implanting a 'learning element', MORA becomes the learning MORA (La MORA) system, a learning Reliability Analysis system with the power of automatic knowledge acquisition, inconsistency checking, and more. To conclude the thesis, we propose an architecture for La MORA.

    Thresholds of Toxicological Concern for Cosmetics-Related Substances: New Database, Thresholds, and Enrichment of Chemical Space

    A new dataset of cosmetics-related chemicals for the Threshold of Toxicological Concern (TTC) approach has been compiled, comprising 552 chemicals with 219, 40, and 293 chemicals in Cramer Classes I, II, and III, respectively. Data were integrated and curated to create a database of No-/Lowest-Observed-Adverse-Effect Level (NOAEL/LOAEL) values, from which the final COSMOS TTC dataset was developed. Criteria for study inclusion and NOAEL decisions were defined, and rigorous quality control was performed for study details and assignment of Cramer classes. From the final COSMOS TTC dataset, human exposure thresholds of 42 and 7.9 ÎŒg/kg bw/day were derived for Cramer Classes I and III, respectively. The size of Cramer Class II was insufficient for derivation of a TTC value. The COSMOS TTC dataset was then federated with the dataset of Munro and colleagues, previously published in 1996, after updating the latter using the quality control processes for this project. This federated dataset expands the chemical space and provides more robust thresholds. The 966 substances in the federated database comprise 245, 49 and 672 chemicals in Cramer Classes I, II and III, respectively. The corresponding TTC values of 46, 6.2 and 2.3 ÎŒg/kg bw/day are broadly similar to those of the original Munro dataset.
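
    The thresholds follow the usual TTC logic: fit a distribution to the NOAEL values of a Cramer class, take a low percentile (conventionally the 5th), and apply a 100-fold uncertainty factor. The sketch below is a generic illustration of that calculation only; the distributional fit, percentile and unit handling are assumptions, not the COSMOS project's exact statistical protocol.

        import numpy as np
        from scipy import stats

        def ttc_ug_per_kg_bw_day(noaels_mg_kg_bw_day, uncertainty_factor=100.0):
            # Fit a log-normal distribution to the class's NOAEL values, take the
            # 5th percentile, apply the uncertainty factor, and convert mg -> ug.
            log_noaels = np.log10(np.asarray(noaels_mg_kg_bw_day, dtype=float))
            mu, sigma = log_noaels.mean(), log_noaels.std(ddof=1)
            p5_mg = 10 ** stats.norm.ppf(0.05, loc=mu, scale=sigma)
            return p5_mg / uncertainty_factor * 1000.0

        # Example with made-up NOAELs (mg/kg bw/day):
        # print(ttc_ug_per_kg_bw_day([5.0, 12.0, 30.0, 75.0, 150.0, 400.0]))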

    ClassyFire: automated chemical classification with a comprehensive, computable taxonomy

    Additional file 5. Use cases. Text-based search on the ClassyFire web server. (A) Building the query. (B) Sparteine, one of the returned compounds.
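
    Beyond the web interface shown in this additional file, ClassyFire classifications are also retrievable programmatically. The snippet below assumes the commonly cited REST pattern of the ClassyFire server (entity lookup by InChIKey returning JSON); the endpoint, response fields and the example InChIKey (caffeine) are assumptions to be checked against the current API documentation.

        import requests

        # Assumed REST pattern for the ClassyFire web server: entity lookup by
        # InChIKey returning JSON (verify against the current API documentation).
        inchikey = "RYYVLZVUVIJVGH-UHFFFAOYSA-N"   # caffeine, used purely as an example
        url = f"http://classyfire.wishartlab.com/entities/{inchikey}.json"
        resp = requests.get(url, headers={"Accept": "application/json"}, timeout=30)
        resp.raise_for_status()
        entity = resp.json()
        for level in ("kingdom", "superclass", "class", "subclass"):
            node = entity.get(level)
            if node:
                print(f"{level:>10}: {node.get('name')}")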

    Computational Approaches to Drug Profiling and Drug-Protein Interactions

    Despite substantial increases in R&D spending within the pharmaceutical industry, de novo drug design has become a time-consuming endeavour. High attrition rates have led to a long period of stagnation in drug approvals. Given the extreme costs of bringing a drug to market, locating and understanding the reasons for clinical failure is key to future productivity. As part of this PhD, three main contributions were made in this respect. First, the web platform LigNFam enables users to interactively explore similarity relationships between ‘drug-like’ molecules and the proteins they bind. Secondly, two deep-learning-based binding-site comparison tools were developed, competitive with the state of the art on benchmark datasets. The models can predict off-target interactions and potential candidates for target-based drug repurposing. Finally, the open-source ScaffoldGraph software was presented for the analysis of hierarchical scaffold relationships and has already been used in multiple projects, including integration into a virtual screening pipeline to increase the tractability of ultra-large screening experiments. Together, and with existing tools, these contributions will aid the understanding of drug-protein relationships, particularly in the fields of off-target prediction and drug repurposing, helping to design better drugs faster.
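
    ScaffoldGraph builds on the Bemis-Murcko scaffold decomposition. The RDKit sketch below shows only that underlying step, grouping molecules by their Murcko scaffold; it does not use ScaffoldGraph's own API, and the example SMILES are arbitrary.

        from rdkit import Chem
        from rdkit.Chem.Scaffolds import MurckoScaffold

        # Group a few (arbitrary) molecules by their Bemis-Murcko scaffold.
        smiles = [
            "CCOc1ccc2nc(S(N)(=O)=O)sc2c1",   # benzothiazole sulfonamide
            "Cc1ccc2nc(N)sc2c1",              # aminobenzothiazole
            "c1ccc2[nH]ccc2c1",               # indole
        ]
        by_scaffold = {}
        for smi in smiles:
            mol = Chem.MolFromSmiles(smi)
            if mol is None:
                continue
            core = MurckoScaffold.GetScaffoldForMol(mol)
            by_scaffold.setdefault(Chem.MolToSmiles(core), []).append(smi)

        for core_smi, members in by_scaffold.items():
            print(core_smi, "<-", members)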

    Gamma-based clustering via ordered means with application to gene-expression analysis

    Discrete mixture models provide a well-known basis for effective clustering algorithms, although technical challenges have limited their scope. In the context of gene-expression data analysis, a model is presented that mixes over a finite catalog of structures, each one representing equality and inequality constraints among latent expected values. Computations depend on the probability that independent gamma-distributed variables attain each of their possible orderings. Each ordering event is equivalent to an event in independent negative-binomial random variables, and this finding guides a dynamic-programming calculation. The structuring of mixture-model components according to constraints among latent means leads to strict concavity of the mixture log likelihood. In addition to its beneficial numerical properties, the clustering method shows promising results in an empirical study. Comment: Published at http://dx.doi.org/10.1214/10-AOS805 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
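
    In the two-variable case the identity alluded to above can be written explicitly: for independent gamma variables with integer shapes, an ordering probability is a negative-binomial tail probability, which is what enables the dynamic-programming calculation. The display below is a sketch of that standard identity, not the paper's general recursion.

        % Two-variable case: X ~ Gamma(a, lambda) independent of Y ~ Gamma(b, mu),
        % with integer shape parameters a and b.
        \[
          P(X < Y)
          = \mathbb{E}\!\left[\,P\bigl(\mathrm{Poisson}(\lambda Y) \ge a\bigr)\right]
          = \sum_{k \ge a} \binom{k+b-1}{k}
              \left(\frac{\mu}{\mu+\lambda}\right)^{b}
              \left(\frac{\lambda}{\mu+\lambda}\right)^{k},
        \]
        % so the ordering event {X < Y} is the upper-tail event {N >= a} of a
        % negative-binomial count N obtained by mixing Poisson(lambda*Y) over Y.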

    Active Learning for drug discovery


    Development of soft computing and applications in agricultural and biological engineering

    Soft computing is a set of “inexact” computing techniques that are able to model and analyze very complex problems for which more conventional methods have not been able to produce cost-effective, analytical, or complete solutions. Soft computing has been extensively studied and applied over the last three decades in scientific research and engineering computing. In agricultural and biological engineering, researchers and engineers have developed methods of fuzzy logic, artificial neural networks, genetic algorithms, decision trees, and support vector machines to study soil and water regimes related to crop growth, analyze the operation of food processing, and support decision-making in precision farming. This paper reviews the development of soft computing techniques and, building on these concepts and methods, presents applications of soft computing in agricultural and biological engineering, especially in the soil and water context for crop management and decision support in precision agriculture. The future development and application of soft computing in agricultural and biological engineering are discussed.
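
    As a concrete illustration of the kind of soft-computing component the review covers, the toy sketch below implements a two-rule fuzzy controller mapping soil moisture to a recommended irrigation depth; the membership functions, rule consequents and units are invented for illustration and are not taken from the paper.

        import numpy as np

        def tri(x, a, b, c):
            # Triangular membership function with feet at a and c and peak at b.
            return float(np.clip(min((x - a) / (b - a), (c - x) / (c - b)), 0.0, 1.0))

        def irrigation_mm(soil_moisture_pct):
            dry = tri(soil_moisture_pct, 0.0, 10.0, 25.0)    # "soil is dry"
            ok = tri(soil_moisture_pct, 20.0, 35.0, 50.0)    # "moisture is adequate"
            # Rule 1: dry soil -> irrigate ~20 mm; Rule 2: adequate -> irrigate ~2 mm.
            # Defuzzify with a weighted average of the rule consequents.
            weights, outputs = np.array([dry, ok]), np.array([20.0, 2.0])
            return float(weights @ outputs / max(weights.sum(), 1e-9))

        print(irrigation_mm(12.0))   # drier soil -> larger recommended irrigation depth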
    • 

    corecore