Substituting random forest for multiple linear regression improves binding affinity prediction of scoring functions: Cyscore as a case study

Publication venue: BioMed Central

Springer - Publisher Connector

Rational Design of Novel Antiviral Compounds Through Computational Approaches

Author: MUSMUCA IRA
Publication date: 04/02/2011

Pubblicazioni Aperte Digitali Interateneo Sapienza

Archivio della ricerca- Università di Roma La Sapienza

In Silico and In Vitro Investigation into the Next Generation of New Psychoactive Substances

Author: Botha Michelle Jennifer
Publication date: 29/03/2019

New Psychoactive Substances (NPS) were designed to be legal alternatives to established recreational drugs. They quickly became very popular and, up until 2016, were legal, cheap and freely accessible via the internet and high street “head shops”. The rapid expansion in the number of these drugs has reached epidemic proportions, with hundreds of NPS developed and sold within the last five-year period. As NPS are synthesized in clandestine laboratories, there is little to no control over the manufacture, dosage and packaging of these drugs. The public health risks posed by these drugs are therefore far-reaching. Fatalities and severe adverse reactions associated with these compounds have become an ongoing challenge to healthcare services, primarily because these drugs have not previously been abused and there is therefore little pharmacological information available about them. A number of different biological receptors are implicated in the effects of NPS, and the mechanism of action for the majority of these drugs is still largely unknown. It is of great importance to establish an understanding of how the various classes of NPS interact at a molecular level. In this thesis, structure-based and ligand-based in silico methodologies were employed to gain a better understanding of how NPS may interact with monoamine transporters (MAT). Key findings came from molecular docking studies and a number of robust and predictive QSAR models for the dopamine and serotonin transporters, which provided insight into how promiscuity of NPS between the different MAT isoforms could arise. In addition, pharmacophore models were generated to identify chemical entities that were structurally dissimilar to known NPS but had the potential to interact with the cannabinoid 1 receptor (CB1), and hence were hypothesised to elicit biological responses similar to those of known potent synthetic cannabinoids. Thirteen of these compounds were identified and carried forward for in vitro and ex vivo analyses, where preliminary results have shown that two compounds activate the CB1 receptor. Further optimisation of these compounds could yield a novel, previously unseen synthetic cannabinoid (SC) scaffold. Additionally, the compounds identified and the methodology employed in generating these new chemical scaffolds could be used to guide Early Warning Systems (EWS) and assist law enforcement with respect to emergent NPS.

University of Hertfordshire Research Archive

Evaluating Overfit and Underfit in Models of Network Community Structure

Author: Clauset Aaron
Ghasemian Amir
Hosseinmardi Homa
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2019

A common data mining task on networks is community detection, which seeks an unsupervised decomposition of a network into structural groups based on statistical regularities in the network's connectivity. Although many methods exist, the No Free Lunch theorem for community detection implies that each makes some kind of tradeoff, and no algorithm can be optimal on all inputs. Thus, different algorithms will over- or underfit on different inputs, finding more, fewer, or just different communities than is optimal, and evaluation methods that use a metadata partition as a ground truth will produce misleading conclusions about general accuracy. Here, we present a broad evaluation of over- and underfitting in community detection, comparing the behavior of 16 state-of-the-art community detection algorithms on a novel and structurally diverse corpus of 406 real-world networks. We find that (i) algorithms vary widely both in the number of communities they find and in their corresponding composition, given the same input, (ii) algorithms can be clustered into distinct high-level groups based on similarities of their outputs on real-world networks, and (iii) these differences induce wide variation in accuracy on link prediction and link description tasks. We introduce a new diagnostic for evaluating overfitting and underfitting in practice, and use it to roughly divide community detection methods into general and specialized learning algorithms. Across methods and inputs, Bayesian techniques based on the stochastic block model and a minimum description length approach to regularization represent the best general learning approach, but can be outperformed under specific circumstances. These results introduce both a theoretically principled approach to evaluate over- and underfitting in models of network community structure and a realistic benchmark by which new methods may be evaluated and compared.
Comment: 22 pages, 13 figures, 3 tables
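To make the link prediction idea in this abstract concrete, here is a minimal Python sketch. It is not the diagnostic introduced in the paper: the scoring rule (a node pair scores 1 if both nodes share a community, else 0), the function name `link_prediction_auc`, and the toy six-node data are all assumptions chosen for illustration.

```python
def link_prediction_auc(communities, held_out_edges, non_edges):
    """AUC of a scorer that rates a node pair 1.0 if both nodes share a community, else 0.0."""
    def score(pair):
        u, v = pair
        return 1.0 if communities[u] == communities[v] else 0.0

    wins = 0.0
    for pos in held_out_edges:
        for neg in non_edges:
            if score(pos) > score(neg):
                wins += 1.0      # held-out edge ranked above the non-edge
            elif score(pos) == score(neg):
                wins += 0.5      # ties count as half a win
    return wins / (len(held_out_edges) * len(non_edges))


# Hypothetical toy data: edges hidden from the algorithm, and pairs known to be unconnected.
held_out_edges = [(0, 1), (3, 4)]
non_edges = [(0, 5), (2, 3)]

# Three hypothetical partitions of the same six nodes (index = node, value = community label).
partitions = {
    "two groups": [0, 0, 0, 1, 1, 1],
    "singletons (overfit)": [0, 1, 2, 3, 4, 5],
    "one group (underfit)": [0, 0, 0, 0, 0, 0],
}
aucs = {
    name: link_prediction_auc(labels, held_out_edges, non_edges)
    for name, labels in partitions.items()
}
```

Under this toy scorer, a partition that splits every node into its own community and one that merges all nodes both carry no information about which pairs are linked, which is one simple way to see why too many or too few communities can hurt link prediction.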

arXiv.org e-Print Archive

Crossref

Machine-learning scoring functions for identifying native poses of ligands docked to known and novel proteins

Publication venue: BioMed Central
Publication date: 17/04/2015

Springer - Publisher Connector

Consensus clustering and functional interpretation of gene-expression data

Author: Kellam P.
Liu X.
Martin Nigel
Orengo C.A.
Swift S.
Tucker A.
Vinciotti V.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2004

Microarray analysis using clustering algorithms can suffer from a lack of inter-method consistency in assigning related gene-expression profiles to clusters. Obtaining a consensus set of clusters from a number of clustering methods should improve confidence in gene-expression analysis. Here we introduce consensus clustering, which provides such an advantage. When coupled with a statistically based gene functional analysis, our method allowed the identification of novel genes regulated by NFκB and the unfolded protein response in certain B-cell lymphomas.
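The abstract does not say how the consensus is computed, so the following Python sketch only illustrates the general idea under stated assumptions: it builds a pairwise agreement matrix from several clusterings and then groups items whose agreement exceeds a threshold. The function names, the 0.5 threshold, the connected-components grouping, and the toy label lists are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def agreement_matrix(labelings):
    """Fraction of clusterings that place items i and j in the same cluster."""
    n = len(labelings[0])
    agree = [[0.0] * n for _ in range(n)]
    for i in range(n):
        agree[i][i] = 1.0
    for i, j in combinations(range(n), 2):
        votes = sum(1 for labels in labelings if labels[i] == labels[j])
        agree[i][j] = agree[j][i] = votes / len(labelings)
    return agree


def consensus_clusters(labelings, threshold=0.5):
    """Group items linked by pairwise agreement above `threshold` (union-find)."""
    n = len(labelings[0])
    agree = agreement_matrix(labelings)
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if agree[i][j] > threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Three hypothetical clusterings of six gene-expression profiles (label per profile).
runs = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 2, 2],
    [0, 0, 0, 0, 1, 1],
]
clusters = consensus_clusters(runs)
```

Cluster labels are compared only within each run, so the methods do not need to agree on label names. Connected components are a deliberately simple grouping rule; a single high-agreement pair can chain two groups together, which a more careful consensus procedure would guard against.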

Springer - Publisher Connector

UCL Discovery

PubMed Central

Birkbeck Institutional Research Online

Spiral - Imperial College Digital Repository

Brunel University Research Archive

Empirical Potential Function for Simplified Protein Models: Combining Contact and Local Sequence-Structure Descriptors

Author: Adamian
Anfinsen
Avbelj
Bahar
Bastolla
Betancourt
Brooks
Buchete
Cannata
Chiu
Cline
Crasto
Dill
Dill
Dima
Dobson
Fain
Fitzkee
Fletcher
Friedrichs
Gan
Goldstein
Guntert
Hao
Head-Gordon
Hou
Hu
Hunter
Joachims
Kolinski
Kolodny
Kuang
Lazaridis
Levinthal
Levitt
Lezon
Li
Li
Liang
Loose
Lu
Maiorov
McConkey
McGuffin
Mirny
Miyazawa
Murphy
Park
Park
Pearlman
Pei
Przytycka
Riddle
Sagot
Samudrala
Samudrala
Schölkopf
Shortle
Shortle
Simons
Simons
Simons
Thomas
Tobi
Tobi
Tsai
Vendruscolo
Vendruscolo
Vriend
Wang
Wang
Xia
Xia
Zhang
Zhang
Zhang
Zhou
Publication venue: Wiley
Publication date: 01/01/2006

An effective potential function is critical for protein structure prediction and folding simulation. Simplified protein models such as those requiring only $C_\alpha$ or backbone atoms are attractive because they enable efficient search of the conformational space. We show that residue-specific reduced discrete-state models can represent the backbone conformations of proteins with small RMSD values. However, no potential functions exist that are designed for such simplified protein models. In this study, we develop optimal potential functions by combining contact interaction descriptors and local sequence-structure descriptors. The form of the potential function is a weighted linear sum of all descriptors, and the optimal weight coefficients are obtained through optimization using both native and decoy structures. The performance of the potential function in discriminating native protein structures from decoys is evaluated using several benchmark decoy sets. Our potential function requiring only backbone atoms or $C_\alpha$ atoms has comparable or better performance than several residue-based potential functions that require additional coordinates of side chain centers or coordinates of all side chain atoms. By reducing the residue alphabet down to size 5 for the local structure-sequence relationship, the performance of the potential function can be further improved. Our results also suggest that local sequence-structure correlation may play an important role in reducing the entropic cost of protein folding.
Comment: 20 pages, 5 figures, 4 tables. In press, Protein
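The "weighted linear sum of all descriptors" can be sketched in a few lines of Python. The abstract does not name the optimization method, so the perceptron-style update below is a stand-in chosen for brevity, not the authors' procedure. The descriptor vectors, learning rate, margin, and function names are likewise hypothetical.

```python
def energy(weights, descriptors):
    """Potential function: weighted linear sum of descriptor values (lower is better)."""
    return sum(w * d for w, d in zip(weights, descriptors))


def fit_weights(native, decoys, epochs=100, lr=0.1, margin=1.0):
    """Adjust weights until the native structure scores below every decoy by `margin`."""
    weights = [0.0] * len(native)
    for _ in range(epochs):
        violated = False
        for decoy in decoys:
            if energy(weights, decoy) - energy(weights, native) < margin:
                # Raise the decoy's energy relative to the native's.
                weights = [w + lr * (d - n) for w, d, n in zip(weights, decoy, native)]
                violated = True
        if not violated:
            break  # every decoy already satisfies the margin
    return weights


# Hypothetical descriptor vectors, e.g. counts of contact types or local
# sequence-structure patterns in one native structure and three decoys.
native = [3.0, 1.0, 0.0]
decoys = [[1.0, 2.0, 1.0], [2.0, 0.0, 2.0], [0.0, 3.0, 1.0]]

weights = fit_weights(native, decoys)
native_ranked_first = all(energy(weights, d) > energy(weights, native) for d in decoys)
```

Because the potential is linear in the weights, the requirement that the native structure has lower energy than a decoy becomes a linear inequality on the weight vector, which is why simple linear-classifier updates can be used in a sketch like this.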

arXiv.org e-Print Archive

Crossref

Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening

Author: Cang Zixuan
Mu Lin
Wei Guowei
Publication venue: Public Library of Science (PLoS)
Publication date: 27/08/2017

This work introduces a number of algebraic topology approaches, such as multicomponent persistent homology, multi-level persistent homology and electrostatic persistence, for the representation, characterization, and description of small molecules and biomolecular complexes. Multicomponent persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with the Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensembles of trees, and deep convolutional neural networks, to demonstrate their descriptive and predictive power for chemical and biological problems. Extensive numerical experiments involving more than 4,000 protein-ligand complexes from the PDBBind database and nearly 100,000 ligands and decoys in the DUD database are performed to test, respectively, the scoring power and the virtual screening power of the proposed topological approaches. It is demonstrated that the present approaches outperform modern machine learning based methods in protein-ligand binding affinity prediction and ligand-decoy discrimination.

arXiv.org e-Print Archive

Directory of Open Access Journals

FigShare