
16 research outputs found

Blinded Predictions and Post Hoc Analysis of the Second Solubility Challenge Data: Exploring Training Data and Feature Set Selection for Machine and Deep Learning Models

Author: Baxter Andrew
Carter James W.
Conn Jonathan G.M.
Conn Justin J.A.
Engkvist Ola
Llinas Antonio
Mcdonagh James L.
Palmer David S.
Pickett Stephen D.
Ratkova Ekaterina L.
Subramanian Vigneshwari
Publication venue: 'American Chemical Society (ACS)'
Publication date: 01/01/2022

Accurate methods to predict solubility from molecular structure are highly sought after in the chemical sciences. To assess the state of the art, the American Chemical Society organized a "Second Solubility Challenge" in 2019, in which competitors were invited to submit blinded predictions of the solubilities of 132 drug-like molecules. In the first part of this article, we describe the development of two models that were submitted to the Blind Challenge in 2019 but which have not previously been reported. These models were based on computationally inexpensive molecular descriptors and traditional machine learning algorithms and were trained on a relatively small data set of 300 molecules. In the second part of the article, to test the hypothesis that predictions would improve with more advanced algorithms and higher volumes of training data, we compare these original predictions with those made after the deadline using deep learning models trained on larger solubility data sets consisting of 2999 and 5697 molecules. The results show that there are several algorithms that are able to obtain near state-of-the-art performance on the solubility challenge data sets, with the best model, a graph convolutional neural network, resulting in an RMSE of 0.86 log units. Critical analysis of the models reveals systematic differences between the performance of models using certain feature sets and training data sets. The results suggest that careful selection of high quality training data from relevant regions of chemical space is critical for prediction accuracy but that other methodological issues remain problematic for machine learning solubility models, such as the difficulty in modeling complex chemical spaces from sparse training data sets.

Chalmers Research

Blinded predictions and post-hoc analysis of the second solubility challenge data : exploring training data and feature set selection for machine and deep learning models

Author: Baxter Andrew
Carter James W.
Conn Jonathan G. M.
Conn Justin J. A.
Engkvist Ola
Llinas Antonio
McDonagh James L
Palmer David S.
Pickett Stephen D.
Ratkova Ekaterina L.
Subramanian Vigneshwari
Publication venue: 'American Chemical Society (ACS)'
Publication date: 01/01/2023

Accurate methods to predict solubility from molecular structure are highly sought after in the chemical sciences. To assess the state of the art, the American Chemical Society organised a “Second Solubility Challenge” in 2019, in which competitors were invited to submit blinded predictions of the solubilities of 132 drug-like molecules. In the first part of this article, we describe the development of two models that were submitted to the Blind Challenge in 2019, but which have not previously been reported. These models were based on computationally inexpensive molecular descriptors and traditional machine learning algorithms, and were trained on a relatively small dataset of 300 molecules. In the second part of the article, to test the hypothesis that predictions would improve with more advanced algorithms and higher volumes of training data, we compare these original predictions with those made after the deadline using deep learning models trained on larger solubility datasets consisting of 2999 and 5697 molecules. The results show that there are several algorithms that are able to obtain near state-of-the-art performance on the solubility challenge datasets, with the best model, a graph convolutional neural network, resulting in an RMSE of 0.86 log units. Critical analysis of the models reveals systematic differences between the performance of models using certain feature sets and training datasets. The results suggest that careful selection of high quality training data from relevant regions of chemical space is critical for prediction accuracy, but that other methodological issues remain problematic for machine learning solubility models, such as the difficulty in modelling complex chemical spaces from sparse training datasets.

University of Strathclyde Institutional Repository

Chalmers Research

MIBiG 3.0: a community-driven effort to annotate experimentally validated biosynthetic gene clusters

Author: Aguilar C
Al-Salihi SAA
Alanjary M
Aleti G
Augustijn HE
Avalon NE
Avelar-Rivas JA
Avitia-Domínguez LA
Barona-Gómez F
Bernaldo-Agüero J
Bielinski VA
Biermann F
Blin K
Booth TJ
Carrion Bravo VJ
Castelo-Branco R
Chagas FO
Chevrette MG
Collemare J
Cruz-Morales P
Du C
Duncan KR
Egbert S
Gavriilidou A
Gayrard D
Gutiérrez-García K
Haslinger K
Helfrich EJN
Jati AP
Kalkreuter E
Kalyvas N
Kang KB
Kautsar S
Kim W
Kunjapur AM
Lee S
Li Y-X
Lin G-M
Linington RG
Loureiro C
Louwen JJR
Louwen NLL
Lund G
Medema MH
Meijer D
Navarro-Muñoz JC
Parra J
Philmus B
Pourmohsenin B
Pronk LJU
Recchia MJJ
Rego A
Reitz ZL
Rex DAB
Robinson S
Rosas-Becerra LR
Roxborough ET
Schorn MA
Scobie DJ
Selem-Mojica N
Singh KS
Sokolova N
Tang X
Terlouw BR
Tørring T
Udwary D
van der Hooft JJJ
van Santen JA
Vigneshwari A
Vind K
Vromans SPJM
Waschulin V
Weber T
Williams SE
Winter JM
Witte TE
Xie H
Yang D
Yu J
Zaroubi L
Zdouc M
Zhong Z
Publication venue: 'Oxford University Press (OUP)'

Newcastle University E-Prints

MIBiG 3.0: a community-driven effort to annotate experimentally validated biosynthetic gene clusters

Author: Aguilar C.
Al-Salihi S.A.A.
Alanjary M.
Aleti G.
Augustijn H.E.
Avelar-Rivas J.A.
Avalon N.E.
Avitia-Domínguez L.A.
Barona-Gómez F.
Bernaldo-Agüero J.
Bielinski V.A.
Biermann F.
Blin K.
Booth T.J.
Carrion Bravo V.J.
Castelo-Branco R.
Chagas F.O.
Chevrette M.G.
Collemare J.
Cruz-Morales P.
Du C.
Duncan K.R.
Egbert S.
Gavriilidou A.
Gayrard D.
Gutiérrez-García K.
Haslinger K.
Helfrich E.J.N.
Hooft J.J.J. van der
Jati A.P.
Kalkreuter E.
Kalyvas N.
Kang K.B.
Kautsar S.
Kim W.
Kunjapur A.M.
Lee S.
Li Y.X.
Lin G.
Linington R.G.
Loureiro C.
Louwen J.J.R.
Louwen N.L.L.
Lund G.
Medema M.H.
Meijer D.
Navarro-Muñoz J.C.
Parra J.
Philmus B.
Pronk L.J.U.
Pourmohsenin B.
Recchia M.J.J.
Rego A.
Reitz Z.L.
Rex D.A.B.
Robinson S.
Rosas Becerra L.R.
Roxborough E.T.
Santen J.A. van
Schorn M.A.
Scobie D.J.
Selem-Mojica N.
Singh K.S.
Sokolova N.
Tang X.
Terlouw B.R.
Tørring T.
Udwary D.
Vigneshwari A.
Vind K.
Vromans S.P.J.M.
Waschulin V.
Weber T.
Williams S.E.
Winter J.M.
Witte T.E.
Xie H.
Yang D.
Yu J.
Zaroubi L.
Zdouc M.
Zhong Z.
Publication date: 18/11/2022

Microbial Biotechnolog

Leiden University Scholarly Publications

MIBiG 3.0 : a community-driven effort to annotate experimentally validated biosynthetic gene clusters

Author: Aguilar César
Al-Salihi Suhad A.A.
Alanjary Mohammad
Aleti Gajender
Augustijn Hannah E.
Avalon Nicole E.
Avelar-Rivas J. Abraham
Avitia-Domínguez Luis A.
Balaya Rex Devasahayam Arokia
Barona-Gómez Francisco
Bernaldo-Agüero Jordan
Bielinski Vincent A.
Biermann Friederike
Blin Kai
Booth Thomas J.
Carrion Bravo Victor J.
Castelo-Branco Raquel
Chagas Fernanda O.
Chevrette Marc G.
Collemare Jérôme
Cruz-Morales Pablo
Du Chao
Duncan Katherine R.
Egbert Susan
Gavriilidou Athina
Gayrard Damien
Gutiérrez-García Karina
Haslinger Kristina
Helfrich Eric J.N.
Jati Afif P.
Kalkreuter Edward
Kalyvas Nikolaos
Kang Kyo B.
Kautsar Satria
Kim Wonyong
Kunjapur Aditya M.
Lee Sanghoon
Li Yong-Xin
Lin Geng-Min
Linington Roger G.
Loureiro Catarina
Louwen Joris J.R.
Louwen Nico L.L.
Lund George
Medema Marnix H.
Meijer David
Navarro-Muñoz Jorge C.
Parra Jonathan
Philmus Benjamin
Pourmohsenin Bita
Pronk Lotte J.U.
Recchia Michael J.J.
Rego Adriana
Reitz Zachary L.
Robinson Serina
Rosas-Becerra L. Rodrigo
Roxborough Eve T.
Schorn Michelle A.
Scobie Darren J.
Selem-Mojica Nelly
Singh Kumar Saurabh
Sokolova Nika
Tang Xiaoyu
Terlouw Barbara R.
Tørring Thomas
Udwary Daniel
van der Hooft Justin J.J.
van Santen Jeffrey A.
Vigneshwari Aruna
Vind Kristiina
Vromans Sophie P.J.M.
Waschulin Valentin
Weber Tilmann
Williams Sam E.
Winter Jaclyn M.
Witte Thomas E.
Xie Huali
Yang Dong
Yu Jingwei
Zaroubi Liana
Zdouc Mitja
Zhong Zheng
Publication date: 18/11/2022

With an ever-increasing amount of (meta)genomic data being deposited in sequence databases, (meta)genome mining for natural product biosynthetic pathways occupies a critical role in the discovery of novel pharmaceutical drugs, crop protection agents and biomaterials. The genes that encode these pathways are often organised into biosynthetic gene clusters (BGCs). In 2015, we defined the Minimum Information about a Biosynthetic Gene cluster (MIBiG): a standardised data format that describes the minimally required information to uniquely characterise a BGC. We simultaneously constructed an accompanying online database of BGCs, which has since been widely used by the community as a reference dataset for BGCs and was expanded to 2021 entries in 2019 (MIBiG 2.0). Here, we describe MIBiG 3.0, a database update comprising large-scale validation and re-annotation of existing entries and 661 new entries. Particular attention was paid to the annotation of compound structures and biological activities, as well as protein domain selectivities. Together, these new features keep the database up-to-date, and will provide new opportunities for the scientific community to use its freely available data, e.g. for the training of new machine learning models to predict sequence-structure-function relationships for diverse natural products. MIBiG 3.0 is accessible online at https://mibig.secondarymetabolites.org/

University of Strathclyde Institutional Repository

ZENODO

eScholarship - University of California

Warwick Research Archives Portal Repository

Digital.CSIC

Online Research Database In Technology

Explore Bristol Research

Morpho-physiological characterization of barnyard millet mutants for salt tolerance

Author: Hemavathy A T
Meena S
Ramesh T
Vanniarajan C
Vetriventhan M
Vigneshwari L
Publication venue: Indian Society of Plant Breeders
Publication date: 20/10/2023

The Indian barnyard millet (Echinochloa frumentacea) is a climate-resilient crop noted for its wide adaptability, short growth cycle, high nutritional value and stress tolerance. The present study was conducted to assess the salt tolerance level of barnyard millet mutants. Twenty-five barnyard millet mutants along with checks (MDU1 and CO(KV)2) were subjected to varying concentrations of sodium bicarbonate (NaHCO3) salt stress under controlled conditions. Germination percentage, root and shoot length, and fresh and dry weight of seedlings were recorded, and stress tolerance indices were calculated. The analysis of variance revealed significant variation among mutants in response to salt stress. Highly tolerant mutants exhibited improved germination percentages and maintained favourable water relations under stress. The Relative Salt Injury Rate (RSIR) increased with higher salt levels, indicating increased sensitivity. Correlation analysis revealed a significant relationship between salt tolerance traits. Principal Component Analysis (PCA) helped to identify the main characteristics that caused variations among mutants. The significant contributors to this variation were Vigour Index (VI), Relative Growth Rate (RGR), Relative Water Content (RWC), and RSIR. Cluster analysis categorized mutants into four clusters, clearly distinguishing highly tolerant mutants from susceptible ones. Based on the findings, promising salt-tolerant mutants, such as ACM21022, ACM21016, ACM21017, ACM21024, and ACM21014, were identified with the potential to contribute to future millet breeding programs.

ICRISAT Open Access Repository

Exploring the therapeutic potential of Neem (Azadirachta Indica) for the treatment of prostate cancer: a literature review.

Author: Batra Neelu
De Souza Cristabelle
Ghosh Paramita M
Kumar Vigneshwari Easwar
Le Uyen
Nambiar Roshni
Verma Rashmi
Vinall Ruth L
Yuen Ashley
Publication venue: eScholarship, University of California
Publication date: 01/07/2022

Background and objective: Multiple studies have demonstrated the medical potency of plant extracts and specific phytochemicals as therapeutics for prostate cancer (PCa) patients. Of note, the Neem plant, known for its role as an antibiotic and anti-inflammatory, is underexplored, with untapped potential for further development. This review focuses on extracts and phytochemicals derived from the Neem tree (Latin name: Azadirachta indica), commonly used throughout Southeast Asia for the prevention and treatment of a wide array of diseases including cancer. To date, more than 130 biologically active compounds have been isolated from the Neem tree, including azadirachtin, nimbolinin, nimbin, nimbidin, and nimbidol, which have demonstrated a wide range of biological activities including anti-microbial, anti-fertility, anti-inflammatory, anti-arthritic, hepatoprotective, anti-diabetic, anti-ulcer, and anti-cancer effects. Very few scientific reports focus on the benefits of Neem in PCa, even though this herb has been used to prevent the disease and its progression for years in complementary and alternative medicine. Methods: We used search engines such as PubMed, InCommon and Google with the key words "Neem", "Cancer", "Prostate Cancer" and related words to find information and data from 1980 to 2022 for this review. Key content and findings: Here, we provide an overview of Neem extracts and phytochemical derivatives with a focus on their known potential and ability to inhibit specific cellular signaling pathways and processes which drive PCa incidence and progression. Conclusions: The information presented here indicates that Neem and its derivatives have therapeutic potential for the treatment of PCa when used as a single agent or in combination with conventional chemotherapeutics.

PubMed Central

eScholarship - University of California


CRISPR-Cas9 screen of E3 ubiquitin ligases identifies TRAF2 and UHRF1 as regulators of HIV latency in primary human T cells

Author: Bediako Yaw
Bouhaddou Mehdi
Braberg Hannes
Eckhardt Manon
Haas Kelsey M
Haas Paige
Hiatt Joseph
Hultquist Judd F
Kaake Robyn M
Krogan Nevan J
Kumar Vigneshwari Easwar
Marson Alexander
McGregor Michael J
Olwal Charles Ochieng'
Ott Melanie
Rathore Ujjwal
Soucheray Margaret
Stevenson Erica
Swaney Danielle L
Turner-Groth Autumn
Zuliani-Alvarez Lorena
Publication venue: eScholarship, University of California
Publication date: 10/04/2024

During HIV infection of CD4+ T cells, ubiquitin pathways are essential to viral replication and host innate immune response; however, the role of specific E3 ubiquitin ligases is not well understood. Proteomics analyses identified 116 single-subunit E3 ubiquitin ligases expressed in activated primary human CD4+ T cells. Using a CRISPR-based arrayed spreading infectivity assay, we systematically knocked out 116 E3s from activated primary CD4+ T cells and infected them with NL4-3 GFP reporter HIV-1. We found that 10 E3s significantly affected HIV infection, positively or negatively, in activated primary CD4+ T cells, including UHRF1 (pro-viral) and TRAF2 (anti-viral). Furthermore, deletion of either TRAF2 or UHRF1 in three JLat models of latency spontaneously increased HIV transcription. To verify this effect, we developed a CRISPR-compatible resting primary human CD4+ T cell model of latency. Using this system, we found that deletion of TRAF2 or UHRF1 initiated latency reactivation and increased virus production from primary human resting CD4+ T cells, suggesting these two E3s represent promising targets for future HIV latency reversal strategies.
Importance: HIV, the virus that causes AIDS, heavily relies on the machinery of human cells to infect and replicate. Our study focuses on the host cell's ubiquitination system, which is crucial for numerous cellular processes. Many pathogens, including HIV, exploit this system to enhance their own replication and survival. E3 proteins are part of the ubiquitination pathway and are useful drug targets for host-directed therapies. We interrogated the 116 E3s found in human immune cells known as CD4+ T cells, since these are the target cells infected by HIV. Using CRISPR, a gene-editing tool, we individually removed each of these enzymes and observed the impact on HIV infection in human CD4+ T cells isolated from healthy donors. We discovered that 10 of the E3 enzymes had a significant effect on HIV infection. Two of them, TRAF2 and UHRF1, modulated HIV activity within the cells and triggered an increased release of HIV from previously dormant or "latent" cells in a new primary T cell assay. This finding could guide strategies to perturb hidden HIV reservoirs, a major hurdle to curing HIV. Our study offers insights into HIV-host interactions, identifies new factors that influence HIV infection in immune cells, and introduces a novel methodology for studying HIV infection and latency in human immune cells.

eScholarship - University of California

Efficient generation of isogenic primary human myeloid cells using CRISPR-Cas9 ribonucleoproteins.

Author: Bouzidi Mohamed S
Budzik Jonathan M
Cavero Devin A
Cox Jeffery S
Dang Eric V
Ernst Joel D
Fontaine Krystal A
Gordon David E
Haas Kelsey M
Hiatt Joseph
Hultquist Judd F
Krogan Nevan J
Kumar Vigneshwari Easwar
Lee Youjin
Marson Alexander
McGregor Michael J
Meyer-Franke Anke
Pillai Satish K
Rathore Ujjwal
Roth Theodore L
Shifrut Eric
Wojcechowskyj Jason A
Wu David
Zheng Weihao
Publication venue: eScholarship, University of California
Publication date: 01/05/2021

Genome engineering of primary human cells with CRISPR-Cas9 has revolutionized experimental and therapeutic approaches to cell biology, but human myeloid-lineage cells have remained largely genetically intractable. We present a method for the delivery of CRISPR-Cas9 ribonucleoprotein (RNP) complexes by nucleofection directly into CD14+ human monocytes purified from peripheral blood, leading to high rates of precise gene knockout. These cells can be efficiently differentiated into monocyte-derived macrophages or dendritic cells. This process yields genetically edited cells that retain transcript and protein markers of myeloid differentiation and phagocytic function. Genetic ablation of the restriction factor SAMHD1 increased HIV-1 infection >50-fold, demonstrating the power of this system for genotype-phenotype interrogation. This fast, flexible, and scalable platform can be used for genetic studies of human myeloid cells in immune signaling, inflammation, cancer immunology, host-pathogen interactions, and beyond, and could facilitate the development of myeloid cellular therapies

PubMed Central

eScholarship - University of California

Overview of Multifaceted Role and Significance of Heat Shock Proteins During Inflammation, Apoptosis and Other Diseases

Author: A Bloemendal
A Kannappan
A Pugliese
A Scheffold
AS Lee
B Balasubramaniam
B Jin
B Shanmuganathan
BE Oyinloye
CJ Workman
D Lanneau
D Rajagopal
DG Garbuz
E Marcenaro
F Hauet-Broere
F Stelter
G Verdile
GM Shankar
H Jee
IM de Kleer
JC Young
K Lundberg
L Ji
L Vigneshwari
M Ghayour-Mobarhan
M Miyara
MG Kim
P Srivastava
S Gowrishankar
S Han
S Marudhupandiyan
S Muthamil
S Sathya
SL DeMeester
SW Lowe
T Hoshino
T Koliński
U Prithika
V Lobo
W Van Eden
YH Edrey
Publication venue: 'Springer Science and Business Media LLC'

Crossref