23 research outputs found
Role of simulated batch effects in genetic interaction data on similarity measure performance.
<p>(A) shows the performance of similarity measures on the query side of the <i>S. cerevisiae</i> genetic interaction network when simulated intermediate batch effects were added to the data. The batch effects were added by creating random batches of size 5 and, for each batch, adding Gaussian noise (μ = 0 and σ = 0.02). Furthermore, Gaussian noise (μ = 0 and σ = 0.02) was added to the entire dataset. (B) A stronger batch effect signature and noise were added (μ = 0, σ = 0.04 for both batch effect and noise). (C) and (D) are similar plots for the query side of the <i>S. pombe</i> genetic interaction data (μ = 0, σ = 1 for (C), and μ = 0, σ = 2 for (D)). The bar plot in the upper right corner of each section shows the area under the precision-recall curve (AUPRC) above the background for each similarity measure. The area was calculated by summation of the areas of trapezoids at increments of 2<sup>n</sup> (log<sub>2</sub> units). The bars are sorted by their respective areas above background.</p>
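The batch-effect simulation described in this caption can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function name `add_batch_effects`, the list-of-rows data layout, and in particular the choice to give every query in a batch the same per-column Gaussian offset are assumptions, since the caption does not say how the noise is shared within a batch.

```python
import random


def add_batch_effects(profiles, batch_size=5, sigma_batch=0.02,
                      sigma_noise=0.02, seed=0):
    """Return a noisy copy of `profiles` (one row of scores per query).

    Assumption: queries are grouped into random batches, and each batch
    receives one Gaussian offset per column that all of its rows share.
    Independent Gaussian noise is then added to every value.
    """
    rng = random.Random(seed)
    n_rows = len(profiles)
    n_cols = len(profiles[0])

    # Randomly assign rows to batches of `batch_size`.
    order = list(range(n_rows))
    rng.shuffle(order)

    noisy = [row[:] for row in profiles]
    for start in range(0, n_rows, batch_size):
        batch = order[start:start + batch_size]
        # Shared batch signature: one offset per column (assumed).
        offsets = [rng.gauss(0.0, sigma_batch) for _ in range(n_cols)]
        for i in batch:
            for j in range(n_cols):
                noisy[i][j] += offsets[j]

    # Independent noise over the entire dataset.
    for i in range(n_rows):
        for j in range(n_cols):
            noisy[i][j] += rng.gauss(0.0, sigma_noise)
    return noisy
```

Under these assumptions, panel (B) would correspond to `sigma_batch=0.04, sigma_noise=0.04`, and panels (C) and (D) to values of 1 and 2 on the <i>S. pombe</i> score scale.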
Similarity measures evaluated in this study.
<p>Similarity measures evaluated in this study.</p>
Role of noise in the genetic interaction data on similarity measure performance.
<p>In each panel, simulated noise was added to the <i>S. cerevisiae</i> genetic interaction data, and query correlations were used for comparing the similarity measures. The simulated noise conditions are (A) false negatives - 95% of the significant interactions whose absolute interaction value is greater than 0.08 were randomly set to 0, (B) false positives - values were randomly sampled from the set of genetic interactions whose absolute interaction values were greater than 0.08 and were substituted in place of randomly selected non-interactions. This random sampling was repeated until 10 times the number of significant interactions had been added as false positives to the original data, and (C) Gaussian noise - random values from a Gaussian distribution of mean 0 and standard deviation 0.08 were added to all values (interactions and non-interactions) in the dataset. The bar plot in the upper right corner of each section shows the area under the precision-recall curve (AUPRC) above the background for each similarity measure. The area was calculated by summation of the areas of trapezoids at increments of 2<sup>n</sup> (log<sub>2</sub> units). The bars are sorted by their respective areas above background.</p>
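The three noise conditions in this caption could be simulated along the following lines. This is a hedged sketch that operates on a flat list of interaction scores; the function names, the fixed default seed, and the treatment of every score with absolute value at or below 0.08 as a "non-interaction" are assumptions made for illustration.

```python
import random

THRESHOLD = 0.08  # significance cutoff on |score| quoted in the caption


def add_false_negatives(values, fraction=0.95, rng=None):
    """(A) Set a random `fraction` of significant interactions to 0."""
    rng = rng or random.Random(0)
    out = list(values)
    significant = [i for i, v in enumerate(out) if abs(v) > THRESHOLD]
    for i in rng.sample(significant, round(fraction * len(significant))):
        out[i] = 0.0
    return out


def add_false_positives(values, multiple=10, rng=None):
    """(B) Overwrite random non-interactions with sampled significant scores."""
    rng = rng or random.Random(0)
    out = list(values)
    pool = [v for v in out if abs(v) > THRESHOLD]
    # Assumption: anything at or below the cutoff counts as a non-interaction.
    empty = [i for i, v in enumerate(out) if abs(v) <= THRESHOLD]
    n_new = min(multiple * len(pool), len(empty))
    for i in rng.sample(empty, n_new):
        out[i] = rng.choice(pool)
    return out


def add_gaussian_noise(values, sigma=0.08, rng=None):
    """(C) Add zero-mean Gaussian noise to every value."""
    rng = rng or random.Random(0)
    return [v + rng.gauss(0.0, sigma) for v in values]
```

Sampling the false-positive positions without replacement is a simplification of the repeated sampling the caption describes; it reaches the same count of added false positives whenever enough non-interactions are available.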
Comparison of similarity measures applied to genetic interaction datasets.
<p>Gene pair correlations derived from each similarity measure were benchmarked against a Gene Ontology-based standard using precision-recall statistics. The comparison was conducted on (A) <i>S. cerevisiae</i> genetic interaction data (Costanzo <i>et al.</i> 2010) - query genes' similarities, (B) <i>S. cerevisiae</i> genetic interaction data - array genes' similarities, (C) <i>S. pombe</i> genetic interaction data (Ryan <i>et al.</i> 2012) - query genes' similarities, and (D) <i>S. pombe</i> genetic interaction data - array genes' similarities. The horizontal dotted line shows the background precision expected from randomized ranking of gene pairs. The bar plot in the upper right corner of each section shows the area under the precision-recall curve (AUPRC) above the background for each similarity measure. The area was calculated by summation of the areas of trapezoids at increments of 2<sup>n</sup> (log<sub>2</sub> units). The bars are sorted by their respective areas above background.</p>
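One possible reading of the area calculation in this caption is sketched below. It assumes that precision is sampled each time the number of recovered true positives reaches a power of two, so that consecutive samples are one log<sub>2</sub> unit apart, and that the background precision is subtracted before the trapezoids are summed. The function names and this sampling scheme are assumptions; the caption does not fix them.

```python
def precision_at_powers_of_two(ranked_labels):
    """Precision each time the true-positive count hits 1, 2, 4, 8, ...

    `ranked_labels` is a list of booleans ordered from the most similar
    gene pair to the least similar; True marks a co-annotated pair.
    """
    points = []
    true_positives = 0
    target = 1
    for rank, is_positive in enumerate(ranked_labels, start=1):
        if is_positive:
            true_positives += 1
            if true_positives == target:
                points.append(true_positives / rank)
                target *= 2
    return points


def area_above_background(precisions, background):
    """Sum unit-width trapezoids of (precision - background).

    Each trapezoid spans one log2 unit on the recall axis, so its area is
    simply the mean of its two end heights. Negative excess is not clipped.
    """
    excess = [p - background for p in precisions]
    return sum((a + b) / 2.0 for a, b in zip(excess, excess[1:]))
```

For a ranked list, the background can be taken as the fraction of positives among all pairs, which matches the precision expected from a randomized ranking.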
Role of thresholding genetic interaction data in the performance of similarity measures.
<p>The precision-recall plots were compared on the query side of the <i>S. cerevisiae</i> genetic interaction data at several thresholds: (A) Δ < −0.08: only negative genetic interactions at an intermediate threshold, (B) Δ < −0.2: only negative genetic interactions at a stringent threshold, (C) Δ > 0.08: only positive genetic interactions at an intermediate threshold, (D) Δ > 0.2: only positive genetic interactions at a stringent threshold, (E) |Δ| > 0.08: negative and positive interactions at an intermediate threshold, and (F) |Δ| > 0.2: negative and positive interactions at a stringent threshold. The bar plot in the upper right corner of each section shows the area under the precision-recall curve (AUPRC) above the background for each similarity measure. The area was calculated by summation of the areas of trapezoids at increments of 2<sup>n</sup> (log<sub>2</sub> units). The bars are sorted by their respective areas above background.</p>
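The six thresholding schemes can be illustrated with a small helper. This is only a sketch: the name `threshold_scores`, the `mode` strings, and the decision to set scores that fail the threshold to 0 (rather than mark them as missing) are assumptions not stated in the caption.

```python
def threshold_scores(scores, cutoff, mode):
    """Keep only scores that pass the threshold; set the rest to 0.

    mode: "negative" keeps s < -cutoff, "positive" keeps s > cutoff,
          "both" keeps |s| > cutoff.
    """
    if mode == "negative":
        keep = lambda s: s < -cutoff
    elif mode == "positive":
        keep = lambda s: s > cutoff
    elif mode == "both":
        keep = lambda s: abs(s) > cutoff
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return [s if keep(s) else 0.0 for s in scores]


# Panels (A) to (F) would then correspond to these (cutoff, mode) pairs.
PANELS = {
    "A": (0.08, "negative"), "B": (0.2, "negative"),
    "C": (0.08, "positive"), "D": (0.2, "positive"),
    "E": (0.08, "both"), "F": (0.2, "both"),
}
```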
AdditionalDataS5
This file contains the complete list of yeast strains present on the diagnostic array, which was used for genetic interaction screens in this study
Legislative Documents
Also, variously referred to as: Senate bills; Senate documents; Senate legislative documents; legislative documents; and General Court documents
AdditionalDataS6
This file contains the trigenic interaction list of MDY2-MTC1 and the digenic interaction lists of MDY2 and MTC1 corresponding to Fig. 3. The "Tetrad Analysis" tab contains confirmation results obtained from tetrad analysis: SS is synthetic sick, SL is synthetic lethal. The "Genetic interactions" tab contains columns that are annotated with "CellMap" since they contain genetic interactions from (7) downloaded from theCellMap.org (26) as well as scores derived in this study
Data File S9. High and low interaction degree genes
This file lists the negative and positive interaction degree associated with every nonessential deletion (sn#), essential TS (tsq#), and DAmP (damp#) query mutant strain screened against the DMA ("query degree X DMA" tab) and/or TSA ("query degree X TSA" tab). A subset of strains was found to carry a second, spontaneous suppressor mutation that affected fitness of the query mutant strain. Strains carrying a suppressor mutation mapped through SGA analysis are indicated ("-supp"). Query mutants comprising the 20% highest and lowest degree groups of strains are indicated. Furthermore, a "Co-batch signal" rank is provided for every query (see "Co-batch filtering of query mutant strains"). Low ranks correspond to evidence for lingering batch effects. Another column, "Gene with correlated GI profiles that are co-annotated with the query gene (%)", provides the percent of correlated gene pairs that are co-annotated to the particular query. A low negative interaction degree (e.g. 20% lowest negative interaction degree) coupled with a low co-batch rank (e.g. < ~0.2) and a low fraction of correlated pairs that share a similar functional annotation with a given query strain (e.g. < ~0.15) may be indicative of a low confidence screen. However, these criteria should be considered loose indicators rather than definitive metrics of screen quality and thus should not be used as strict filters on the global interaction dataset. Another list ("Queries removed - batch effects" tab) indicates ~300 query strains that exhibited severe systematic batch effects and thus were removed from the indicated data set. Finally, two additional tabs provide the negative and positive interaction degree associated with every nonessential ("nonessential array degree" tab) and essential ("essential array degree" tab) array mutant, respectively
Data File S6. Genetic profile similarity-based hierarchy analysis
The first tab ("Gene to hierarchy cluster mapping") lists the clusters identified at each level of the genetic interaction-based hierarchy and the deletion and TS allele array mutants assigned to each cluster. Examples of clusters described in the main text are highlighted. The subsequent 9 tabs indicate enrichment of clusters resolved at the specified profile similarity range for specific cell compartments (Cyclops_enrich), biological processes (GO BP_enrich), protein complexes (complex_enrich) and KEGG pathways (KEGG_enrich). The final tab in the file indicates the clusters used to map the functional distribution of negative and positive interactions shown in Fig. 5D