28 research outputs found

    Additional file 4: Table S3. of Network-based analysis of transcriptional profiles from chemical perturbations experiments

    No full text
    Aggregate Network-related modules with connectivity specifically altered by compound groups. (XLSX 12 kb)

    Genomic Models of Short-Term Exposure Accurately Predict Long-Term Chemical Carcinogenicity and Identify Putative Mechanisms of Action

    No full text
    Background: Despite an overall decrease in incidence of and mortality from cancer, about 40% of Americans will be diagnosed with the disease in their lifetime, and around 20% will die of it. Current approaches to test carcinogenic chemicals adopt the 2-year rodent bioassay, which is costly and time-consuming. As a result, fewer than 2% of the chemicals on the market have actually been tested. However, evidence accumulated to date suggests that gene expression profiles from model organisms exposed to chemical compounds reflect underlying mechanisms of action, and that these toxicogenomic models could be used in the prediction of chemical carcinogenicity.

    Results: In this study, we used a rat-based microarray dataset from the NTP DrugMatrix Database to test the ability of toxicogenomics to model carcinogenicity. We analyzed 1,221 gene-expression profiles obtained from rats treated with 127 well-characterized compounds, including genotoxic and non-genotoxic carcinogens. We built a classifier that predicts a chemical's carcinogenic potential with an AUC of 0.78, and validated it on an independent dataset from the Japanese Toxicogenomics Project consisting of 2,065 profiles from 72 compounds. Finally, we identified differentially expressed genes associated with chemical carcinogenesis, and developed novel data-driven approaches for the molecular characterization of the response to chemical stressors.

    Conclusion: Here, we validate a toxicogenomic approach to predict carcinogenicity and provide strong evidence that, with a larger set of compounds, we should be able to improve the sensitivity and specificity of the predictions. We found that the prediction of carcinogenicity is tissue-dependent and that the results also confirm and expand upon previous studies implicating DNA damage, the peroxisome proliferator-activated receptor, the aryl hydrocarbon receptor, and regenerative pathology in the response to carcinogen exposure.

    Putative Modes of Action of carcinogenic chemical compounds.

    No full text
    a) Classification performance (AUC, averaged over 100 iterations of random resampling) of a random forest classifier as a function of the number of gene sets used as predictors. 150 gene sets are needed to reach maximum AUC, while 50 are sufficient to get 99% of the expected maximum AUC. b) Heatmaps of the top 50 pathways as ranked by their variable importance derived from a random forest classifier of hepato-carcinogenicity. Rows correspond to pathways, clustered into biological processes; columns correspond to chemical compounds. The left and right heatmaps show all non-carcinogenic and carcinogenic compounds, respectively. Only profiles corresponding to maximum duration and dose treatments, with replicates averaged, are displayed. A detailed version of the right heatmap with all pathways and compounds labeled is available in Figure S11. c) Details of the biological processes associated with the clustering, showing the single differentially regulated pathways and their variable importance ranking, as well as the driving genes.

    ROC curve and variable importance for carcinogenicity prediction.

    No full text
    ROC curves of random forest classification in liver of: a) genotoxicity and b) carcinogenicity. For carcinogenicity, tissue-specific class labels from the Carcinogenic Potency Database (CPDB) were used. The red curves show the mean of the 200 reruns, whereas the dashed curves indicate the first and third quartiles, respectively. The teal dot indicates a classifier assigning equal costs to false positives (FP) and false negatives (FN) (zero-one loss), whereas the blue dot indicates a classifier assigning a cost of 5 for FN and 1 for FP. c) Variable importance of the random forest model. Blue denotes genes that are down-regulated in the carcinogenic group, whereas red denotes up-regulation.
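The two marked operating points in this caption differ only in how false negatives are weighted against false positives. The sketch below is a minimal illustration of that idea, not the authors' code: it scans candidate thresholds on classifier scores and picks the one with the lowest total cost. The toy scores, labels, and the function names `confusion_counts` and `best_operating_point` are hypothetical.

```python
def confusion_counts(scores, labels, threshold):
    """Count (TP, FP, TN, FN) when predicting positive for score >= threshold."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        predicted_positive = s >= threshold
        if predicted_positive and y:
            tp += 1
        elif predicted_positive and not y:
            fp += 1
        elif not predicted_positive and y:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def best_operating_point(scores, labels, cost_fp=1.0, cost_fn=1.0):
    """Return (threshold, FPR, TPR) minimizing cost_fp*FP + cost_fn*FN."""
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    # Candidate thresholds: every observed score, plus one above the maximum
    # (the classifier that predicts everything negative).
    candidates = sorted(set(scores)) + [max(scores) + 1.0]
    best = None
    for t in candidates:
        tp, fp, tn, fn = confusion_counts(scores, labels, t)
        cost = cost_fp * fp + cost_fn * fn
        # Strict "<" keeps the lowest threshold among ties.
        if best is None or cost < best[0]:
            best = (cost, t, fp / n_neg, tp / n_pos)
    return best[1], best[2], best[3]


# Hypothetical toy data: classifier scores and true labels (1 = carcinogenic).
scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]
labels = [0, 0, 1, 0, 0, 1, 0, 1, 1]

equal_cost = best_operating_point(scores, labels, cost_fp=1, cost_fn=1)
fn_heavy = best_operating_point(scores, labels, cost_fp=1, cost_fn=5)
print("zero-one loss:", equal_cost)
print("FN cost 5, FP cost 1:", fn_heavy)
```

On this toy data, weighting false negatives five times as heavily moves the operating point to a lower threshold, trading a higher false positive rate for a higher true positive rate.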

    AUC for different time points and doses in TG-GATEs.

    No full text
    Comparison of the prediction results based on different time points and doses in the repeat subset of TG-GATEs. Each classification was performed 200 times. The table reports the mean AUC as well as the 95% confidence intervals.
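As a rough sketch of how a mean AUC and a 95% interval can be summarized over 200 runs: the caption does not say how the intervals were computed, so the percentile interval used here is an assumption, as is the bootstrap resampling of a fixed set of scores (the study reran the classification itself). The toy data and the function names are hypothetical.

```python
import random
import statistics


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


def resampled_aucs(scores, labels, runs=200, seed=0):
    """Bootstrap the (score, label) pairs and compute one AUC per resample."""
    rng = random.Random(seed)
    n = len(scores)
    results = []
    while len(results) < runs:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # AUC needs both classes present
            results.append(auc([scores[i] for i in idx], ys))
    return results


def summarize(aucs):
    """Return (mean, 2.5th percentile, 97.5th percentile)."""
    cuts = statistics.quantiles(aucs, n=40)  # cut points every 2.5%
    return statistics.mean(aucs), cuts[0], cuts[-1]


# Hypothetical toy data (1 = carcinogenic).
scores = [0.15, 0.22, 0.30, 0.41, 0.48, 0.55, 0.35, 0.52, 0.63, 0.71, 0.80, 0.92]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

aucs = resampled_aucs(scores, labels, runs=200)
mean_auc, lower, upper = summarize(aucs)
print(f"mean AUC = {mean_auc:.3f}, 95% interval = [{lower:.3f}, {upper:.3f}]")
```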

    Defining the carcinogenome.

    No full text
    a) Hierarchical clustering of 191 profiles/138 compounds (columns) and genes (rows), with each compound represented by the vector of ‘treatment vs. control’ differential expression t-scores. The heatmap is color-coded according to the significance level (q-values) of the corresponding t-scores. Notice the right cluster (top purple color bar) and its enrichment in carcinogenic (red) compounds (Fisher test p = 8.5×10⁻⁶). b) Top 10 genes ranked according to the number of compounds inducing their significant up-/down-regulation (FDR≤0.01 and fold-change≥1.5. See complete list in Table S28 in File S2: http://www.plosone.org/article/info:doi/10.1371/journal.pone.0102579#pone.0102579.s002). Each gene was also tested for its association with carcinogenicity across compounds (‘Enrichment’ columns) by performing a Fisher test between the gene status (0: not differentially expressed; 1: differentially expressed) and the compounds' status (+ = carcinogenic; − = non-carcinogenic). c) Contingency table detailing the distribution of the genes whose compound-induced up-/down-regulation pattern is significantly associated with carcinogenicity status of the compounds.
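The per-gene enrichment test described in this caption is a Fisher test on a 2×2 table of gene status against compound status. The following is a minimal sketch of such a test, assuming a two-sided alternative (the caption does not state which alternative was used); the counts in the usage example are hypothetical and the function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: gene differentially expressed (yes / no).
    Columns: compound carcinogenic (+ / -).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def weight(x):
        # Number of ways to obtain a table with x in the top-left cell,
        # given fixed row and column totals (hypergeometric numerator).
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over all tables that are as likely as, or less likely than, the
    # observed one. Integer arithmetic keeps the comparison exact.
    extreme = sum(w for w in (weight(x) for x in range(lowest, highest + 1))
                  if w <= observed)
    return extreme / total_tables


# Hypothetical counts for one gene across compounds:
#                         carcinogenic   non-carcinogenic
# differentially expressed      3               1
# not diff. expressed           1               3
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p_value:.4f}")
```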

    Classification learning curves as a function of the number of chemicals for: a) genotoxicity and b) carcinogenicity in liver.

    No full text
    The actual AUC values are in red and include the 95% confidence interval for each value. The predicted values of a fitted linear regression model are shown in blue.

    Classification results overview.

    No full text
    Random resampling classification results on the DrugMatrix (top) as well as the TG-GATEs (bottom) datasets using 200 iterations. In addition, the results of a model trained on all DrugMatrix samples and tested on TG-GATEs (middle) are shown. Results based on the regular gene expression data and on the data projected onto pathway space (canonical pathways of MSigDB – C2:CP, see Methods: http://www.plosone.org/article/info:doi/10.1371/journal.pone.0102579#s4) are reported. For each testing scheme, area under the receiver operating characteristic (ROC) curve (AUC), as well as accuracy, sensitivity and specificity of a classifier trained with a zero-one loss function (FP:FN = 1:1), and 95% confidence intervals are reported.

    Validation of prediction using pathological items.

    No full text
    The first column shows the concordance between the high confidence predicted liver samples that were treated for 29 days at the highest dose level and fully data-driven histopathological score (H-score_d), whereas the second column indicates the concordance with the manually derived score (H-score_m).