52 research outputs found
Additional file 2 of IntEREst: intron-exon retention estimator
Tab-delimited text file listing the differentially retained introns found when comparing the ZRSR2mut samples to the controls using IntEREst-DESeq2. (TSV 634 kb)
Classification performance.
The mean accuracy values obtained over the 30 bootstrap iterations. Acc is the overall accuracy, F is the F-score, and G is the G-score. The highest values are highlighted in bold. Note: all the corresponding standard deviations are less than 0.02.
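As an illustration of the metrics named in this caption, the following is a minimal sketch, not the authors' code. It assumes a binary problem and takes the G-score to be the geometric mean of precision and recall; the caption does not define G, and other definitions exist (for example, the geometric mean of sensitivity and specificity). The function names `binary_metrics` and `mean_over_runs` are hypothetical.

```python
import math

def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, F-score, G-score) for one run of a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    acc = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-score: harmonic mean of precision and recall.
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    # ASSUMPTION: G-score as the geometric mean of precision and recall.
    g = math.sqrt(precision * recall)
    return acc, f, g

def mean_over_runs(per_run):
    """Average (acc, f, g) tuples over the bootstrap iterations."""
    n = len(per_run)
    return tuple(sum(run[i] for run in per_run) / n for i in range(3))
```

With 30 bootstrap iterations, `per_run` would hold 30 tuples, one per iteration.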
Evaluation process.
The full dataset is a matrix with thousands of features (e.g. genes) in rows and tens or hundreds of samples (belonging to different classes) in columns. For each sample, the outcome (class) is given. The dataset is randomly divided into training and test sets using a stratified random selection (1). Within the training set, relevant features are selected using the compared methods (2). The FPRF method identifies a wide set of relevant features using a fuzzy pattern discovery technique and ranks them by applying an RF-based procedure (3). The n most relevant features are then selected, with n = 30, 50, 100, 150 and 200 (4). The different sets of features are used to evaluate the stability and the corresponding classification performance. For each set of selected features, an RF-based classifier is trained on the training set (5). After training, the classifiers are asked to predict the outcome of the test set patients (6). The predicted outcome is compared with the true outcome and the number of correctly classified samples is noted. Steps 1–6 are repeated 30 times, and the resulting evaluation metrics are obtained by averaging over the 30 runs.
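The six steps above can be sketched as a short loop. This is a sketch under stated assumptions, not the published pipeline: samples are stored as rows (the transpose of the layout described above), the 30% test fraction is an arbitrary choice, and the ranking and classifier are simple stand-ins (a class-mean spread ranking and a nearest-centroid classifier) in place of FPRF and the RF-based classifier. All function names are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction, rng):
    """Step 1: split sample indices so each class keeps its proportion."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = max(1, round(len(indices) * test_fraction))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test

def rank_by_class_mean_spread(x, y):
    """Stand-in ranking (NOT FPRF): order features by spread of class means."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(x[0])):
        means = [sum(r[j] for r, c in zip(x, y) if c == k) / y.count(k)
                 for k in classes]
        scores.append(max(means) - min(means))
    return sorted(range(len(scores)), key=lambda j: -scores[j])

def fit_centroids(x, y):
    """Stand-in classifier (NOT a random forest): one centroid per class."""
    model = {}
    for k in set(y):
        rows = [r for r, c in zip(x, y) if c == k]
        model[k] = [sum(col) / len(col) for col in zip(*rows)]
    return model

def predict_centroids(model, x):
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(model, key=lambda k: dist(row, model[k])) for row in x]

def evaluate(samples, labels, rank_features, fit, predict,
             sizes=(30, 50, 100, 150, 200), runs=30, test_fraction=0.3, seed=0):
    """Repeat steps 1-6 `runs` times; return mean test accuracy per size n."""
    rng = random.Random(seed)
    totals = {n: 0.0 for n in sizes}
    for _ in range(runs):
        train, test = stratified_split(labels, test_fraction, rng)
        x_train = [samples[i] for i in train]
        y_train = [labels[i] for i in train]
        # Steps 2-3: rank features using the training set only.
        ranking = rank_features(x_train, y_train)
        for n in sizes:
            top = ranking[:n]                                   # step 4
            model = fit([[row[j] for j in top] for row in x_train], y_train)  # step 5
            preds = predict(model, [[samples[i][j] for j in top] for i in test])  # step 6
            correct = sum(p == labels[i] for p, i in zip(preds, test))
            totals[n] += correct / len(test)
    return {n: total / runs for n, total in totals.items()}
```

Passing the ranking and classifier as arguments keeps the loop independent of any particular method, so each compared feature selection method can be evaluated on identical splits by reusing the same seed.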
Selection consistency analysis.
The number of significantly self-consistent genes and of all genes selected by a given method during the 30 bootstrap iterations. ns: the number of significantly self-consistent genes found; tot: the number of different features selected over the 30 bootstrap iterations; mnsf: the mean number of selected features. The highest values are highlighted in bold.
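Two of the quantities in this caption, tot and mnsf, follow directly from the per-iteration feature lists. The sketch below computes them together with each feature's selection frequency. It does not compute ns, because the caption does not state the significance test used. The function name `selection_summary` is hypothetical.

```python
from collections import Counter

def selection_summary(selected_per_run):
    """Summarise feature selections across bootstrap iterations.

    selected_per_run: one list of selected feature names per iteration.
    Returns (tot, mnsf, frequency), where frequency maps each feature to
    the fraction of iterations in which it was selected.
    """
    runs = [set(run) for run in selected_per_run]
    counts = Counter(feature for run in runs for feature in run)
    tot = len(counts)                              # distinct features overall
    mnsf = sum(len(run) for run in runs) / len(runs)  # mean selected per run
    frequency = {f: c / len(runs) for f, c in counts.items()}
    return tot, mnsf, frequency
```

A method with stable selections gives a tot close to mnsf; a tot much larger than mnsf means the selected features change substantially between iterations.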
Running time.
Running time, reported as the mean over the 30 bootstrap iterations. All methods investigated in this study were run single-threaded. For the proposed method, the running time is the sum of the execution times of the feature selection and prioritization steps.
Overview of the analyzed datasets.
For each dataset, the number of samples, the number of features/genes after pre-processing, the number of classes, and the number of samples in each class are reported.
Additional file 1 of MVDA: a multi-view genomic data integration methodology
It contains one section for each step of the methodology, reporting the tables and figures with the results for each dataset. (PDF 1495 kb)
Venn diagrams showing the number of probes differentially expressed in response to pepsin- and trypsin-digested gliadin (PT-G) (Figure 1a) compared to medium control (MED-CTL) and the blank pepsin- and trypsin (PT) control.
The probes that were affected by PT treatment compared to MED-CTL are also displayed. Figure 1b (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0066307#pone-0066307-g001) shows the probes affected by PT, PT-G and p31-43 peptide compared to MED-CTL. The numbers in parentheses represent the number of probes obtained after multiple testing correction, as described in Materials and Methods.
Additional file 2 of MVDA: a multi-view genomic data integration methodology
Each sheet corresponds to one analysed dataset and reports the results of the single-view clustering of patients. Clustering errors for each algorithm and each cut of features are also reported. (XLSX 54 kb)
- …