26 research outputs found

Deep Learning Accurately Predicts Estrogen Receptor Status in Breast Cancer Metabolomics Data

Author: Fadhl M. Alakwaa (4620529)
Kumardeep Chaudhary (326177)
Lana X. Garmire (203041)

Metabolomics holds promise as a new technology for diagnosing highly heterogeneous diseases. Conventionally, metabolomics data analysis for diagnosis relies on various statistical and machine learning based classification methods. However, it remains unknown whether deep neural networks, a class of increasingly popular machine learning methods, are suitable for classifying metabolomics data. Here we use a cohort of 271 breast cancer tissues, 204 estrogen receptor positive (ER+) and 67 estrogen receptor negative (ER−), to test the accuracy of feed-forward networks, a deep learning (DL) framework, as well as six widely used machine learning models: random forest (RF), support vector machines (SVM), recursive partitioning and regression trees (RPART), linear discriminant analysis (LDA), prediction analysis for microarrays (PAM), and generalized boosted models (GBM). The DL framework has the highest area under the curve (AUC), 0.93, in classifying ER+/ER− patients, compared with the other six machine learning algorithms. Furthermore, the biological interpretation of the first hidden layer reveals eight commonly enriched significant metabolomics pathways (adjusted P-value < 0.05) that cannot be discovered by the other machine learning methods. Among them, the protein digestion and absorption pathway and the ATP-binding cassette (ABC) transporters pathway are also confirmed in an integrated analysis of metabolomics and gene expression data from these samples. In summary, the deep learning method shows advantages for metabolomics based breast cancer ER status classification, with both the highest prediction accuracy (AUC = 0.93) and better revelation of disease biology. We encourage the adoption of feed-forward network based deep learning methods in the metabolomics research community for classification.
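The abstract above compares classifiers by area under the ROC curve (AUC). As a minimal sketch of what that metric measures, and not the authors' evaluation code, the function below computes AUC as the fraction of (positive, negative) sample pairs in which the positive sample receives the higher score, counting ties as half. The function name `roc_auc` and the toy scores are hypothetical and chosen only for illustration.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic).

    scores: predicted scores, higher meaning "more likely positive" (e.g. ER+).
    labels: 1 for positive samples, 0 for negative samples.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example (hypothetical scores, not data from the study).
    print(roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

The pairwise form is quadratic in the number of samples, which is fine for a cohort of a few hundred tissues; a rank-based implementation would be the usual choice for larger datasets.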

FigShare

Additional file 1: Figure S1. of Prediction of anticancer molecules using hybrid model developed on molecules screened against NCI-60 cancer cell lines

Author: Ankur Gautam (326176)
Gajendra Raghava (3427013)
Harinder Singh (205467)
Kumardeep Chaudhary (326177)
Rahul Kumar (136775)
Sandeep Singh (66811)

Counts of functional groups present in anticancer and non-anticancer molecules. Table S1. Frequency of occurrence of MCS in anticancer and non-anticancer compounds according to the LibMCS module of Chemaxon. Structures were searched using the jcsearch module of Chemaxon with the substructure search option. Table S2. The individual performance of the best 126 selected fingerprints using an MCC based approach. Table S3. Performance of the hybrid method developed using 126 fingerprints at different sensitivities. (DOC 356 kb)

Springer - Publisher Connector

FigShare

The performance of motif-based model developed on main dataset.

Author: Ankur Gautam (326176)
Gajendra P. S. Raghava (50436)
Kumardeep Chaudhary (326177)
Pallavi Kapoor (326175)
Rahul Kumar (136775)
Sudheer Gupta (458415)

PCP, probability of correct prediction.

FigShare

Sequence logos of (A) first ten residues of N-terminus and (B) last ten residues of C-terminus of toxic peptides, where size of residue is proportional to its propensity (main dataset).

Author: Ankur Gautam (326176)
Gajendra P. S. Raghava (50436)
Kumardeep Chaudhary (326177)
Pallavi Kapoor (326175)
Rahul Kumar (136775)
Sudheer Gupta (458415)

Sequence logos of (A) first ten residues of N-terminus and (B) last ten residues of C-terminus of toxic peptides, where size of residue is proportional to its propensity (main dataset).

FigShare

The performance of quantitative matrix based method on various datasets.

Author: Ankur Gautam (326176)
Gajendra P. S. Raghava (50436)
Kumardeep Chaudhary (326177)
Pallavi Kapoor (326175)
Rahul Kumar (136775)
Sudheer Gupta (458415)

MCC, Matthews correlation coefficient; AUC, area under the curve.

FigShare

Overview of datasets’ creation.

Author: Ankur Gautam (326176)
Gajendra P. S. Raghava (50436)
Kumardeep Chaudhary (326177)
Pallavi Kapoor (326175)
Rahul Kumar (136775)
Sudheer Gupta (458415)

Overview of datasets’ creation.

FigShare

Schematic representation of ToxinPred webserver.

Author: Ankur Gautam (326176)
Gajendra P. S. Raghava (50436)
Kumardeep Chaudhary (326177)
Pallavi Kapoor (326175)
Rahul Kumar (136775)
Sudheer Gupta (458415)

Schematic representation of ToxinPred webserver.

FigShare

Data_Sheet_1.DOC

Author: Gajendra P. S. Raghava (50436)
Kumardeep Chaudhary (326177)
Meenu Sharma (430808)
Piyush Agrawal (1557022)
Rajesh Kumar (72124)
Sherry Bhalla (4886938)

This paper describes in silico models developed using a wide range of peptide features for predicting antifungal peptides (AFPs). Our analyses indicate that certain types of residues (e.g., C, G, H, K, R, Y) are more abundant in AFPs. The positional residue preference analysis reveals the prominence of particular residues (e.g., R, V, K) at the N-terminus and of others (e.g., C, H) at the C-terminus. In this study, models have been developed for predicting AFPs using a wide range of peptide features (such as residue composition, binary profile, and terminal residues). The support vector machine based model developed using compositional features of peptides achieved a maximum accuracy of 88.78% on the training dataset and 83.33% on the independent (validation) dataset. Our model developed using binary patterns of terminal residues of peptides achieved a maximum accuracy of 84.88% on the training dataset and 84.64% on the validation dataset. We benchmarked the models developed in this study and existing methods on a dataset containing compositionally similar AFPs and non-AFPs. The binary based model developed in this study performs better than any other model or method. To serve the scientific community, we developed a mobile app, a standalone version, and a user-friendly web server, ‘Antifp’ (http://webs.iiitd.edu.in/raghava/antifp).
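The abstract above mentions two kinds of peptide features: residue composition and binary profiles of terminal residues. The sketch below illustrates one common way such features are encoded. It is an assumption made for illustration, not the Antifp implementation: the function names, the fixed ordering of the 20 standard amino acids, and the default window of 5 terminal residues are all hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, fixed order (assumed)


def residue_composition(peptide):
    """Percent composition of each standard residue, in AMINO_ACIDS order."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must be non-empty")
    return [100.0 * peptide.count(aa) / len(peptide) for aa in AMINO_ACIDS]


def binary_profile(peptide, n_residues=5, from_c_terminus=False):
    """One-hot encode the first (or last) n_residues of a peptide.

    Each residue becomes a 20-element 0/1 vector, so the result has
    20 * n_residues entries. Non-standard residues encode as all zeros.
    """
    peptide = peptide.upper()
    if len(peptide) < n_residues:
        raise ValueError("peptide is shorter than the requested window")
    window = peptide[-n_residues:] if from_c_terminus else peptide[:n_residues]
    profile = []
    for residue in window:
        profile.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return profile


if __name__ == "__main__":
    seq = "RVKACCH"  # made-up sequence, not taken from the paper's dataset
    print(residue_composition(seq))
    print(binary_profile(seq, n_residues=3))
    print(binary_profile(seq, n_residues=2, from_c_terminus=True))
```

Feature vectors of this kind could then be passed to a classifier such as a support vector machine; the kernel and parameters used in the study are not stated in this abstract, so they are left out here.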

FigShare

Maximum and minimum scoring residues at every position as observed in quantitative matrix (main dataset).

Author: Ankur Gautam (326176)
Gajendra P. S. Raghava (50436)
Kumardeep Chaudhary (326177)
Pallavi Kapoor (326175)
Rahul Kumar (136775)
Sudheer Gupta (458415)

Maximum and minimum scoring residues at every position as observed in quantitative matrix (main dataset).

FigShare

Overall architecture of TumorHoPe database.

Author: Ankur Gautam (326176)
Gajendra P. S. Raghava (50436)
Harinder Singh (205467)
Kumardeep Chaudhary (326177)
Pallavi Kapoor (326175)
Rahul Kumar (136775)

Overall architecture of TumorHoPe database.

FigShare