23 research outputs found

    Role of simulated batch effects in genetic interaction data on similarity measure performance.

    (A) shows the performance of similarity measures on the query side of the S. cerevisiae genetic interaction network when simulated intermediate batch effects were added to the data. The batch effects were added by creating random batches of size 5 and adding Gaussian noise (μ = 0, σ = 0.02) to each batch. In addition, Gaussian noise (μ = 0, σ = 0.02) was added to the entire dataset. (B) shows the same analysis with a stronger batch effect signature and noise (μ = 0, σ = 0.04 for both batch effect and noise). (C) and (D) are similar plots for the query side of the S. pombe genetic interaction data (μ = 0, σ = 1 for (C), and μ = 0, σ = 2 for (D)). The bar plot in the upper right corner of each panel shows the area under the precision-recall curve (AUPRC) above the background for each similarity measure. The area was calculated by summing the areas of trapezoids at increments of 2^n (log2 units). The bars are sorted by their respective areas above background.
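
The batch-effect simulation described in this caption can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `add_batch_effects` is hypothetical, rows are assumed to be query profiles, and each batch is assumed to share one Gaussian offset per array column (the caption does not say how the per-batch noise was shaped).

```python
import random

def add_batch_effects(matrix, batch_size=5, batch_sd=0.02, noise_sd=0.02, seed=0):
    """Return a noisy copy of `matrix` (a list of rows, one row per query)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])
    order = list(range(n_rows))
    rng.shuffle(order)  # random assignment of queries to batches
    noisy = [row[:] for row in matrix]
    for start in range(0, n_rows, batch_size):
        batch = order[start:start + batch_size]
        # Assumption: one shared offset per array column for the whole batch.
        signature = [rng.gauss(0.0, batch_sd) for _ in range(n_cols)]
        for i in batch:
            for j in range(n_cols):
                noisy[i][j] += signature[j]
    # Independent Gaussian noise on every entry of the dataset.
    for i in range(n_rows):
        for j in range(n_cols):
            noisy[i][j] += rng.gauss(0.0, noise_sd)
    return noisy
```

Under these assumptions, panel (B) would correspond to `batch_sd=0.04, noise_sd=0.04`, and the S. pombe panels to standard deviations of 1 and 2.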

    Similarity measures evaluated in this study.

    Similarity measures evaluated in this study.

    Role of noise in the genetic interaction data on similarity measure performance.

    In each panel, simulated noise was added to the S. cerevisiae genetic interaction data, and query correlations were used for comparing the similarity measures. The simulated noise conditions are (A) false negatives: 95% of the significant interactions (those whose absolute interaction value is greater than 0.08) were randomly set to 0; (B) false positives: values were randomly sampled from the set of genetic interactions whose absolute interaction value was greater than 0.08 and substituted in place of randomly selected non-interactions, and this sampling was repeated until 10 times the number of significant interactions in the original data had been added as false positives; and (C) Gaussian noise: random values from a Gaussian distribution with mean 0 and standard deviation 0.08 were added to all values (interactions and non-interactions) in the dataset. The bar plot in the upper right corner of each panel shows the area under the precision-recall curve (AUPRC) above the background for each similarity measure. The area was calculated by summing the areas of trapezoids at increments of 2^n (log2 units). The bars are sorted by their respective areas above background.
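
The three noise conditions in this caption could be sketched as below, operating on a flat list of interaction scores. The function names are hypothetical, "non-interactions" are assumed to be entries with an absolute value at or below 0.08, and the cap on the number of false positives (when too few non-interactions exist) is an added safeguard rather than something the caption states.

```python
import random

def add_false_negatives(values, threshold=0.08, fraction=0.95, rng=None):
    """(A) Set a random 95% of the significant interactions to 0."""
    rng = rng or random.Random(0)
    out = list(values)
    hits = [i for i, v in enumerate(out) if abs(v) > threshold]
    for i in rng.sample(hits, round(fraction * len(hits))):
        out[i] = 0.0
    return out

def add_false_positives(values, threshold=0.08, multiple=10, rng=None):
    """(B) Overwrite random non-interactions with sampled significant scores."""
    rng = rng or random.Random(0)
    out = list(values)
    hits = [v for v in out if abs(v) > threshold]
    misses = [i for i, v in enumerate(out) if abs(v) <= threshold]
    # Assumption: cap at the number of available non-interactions.
    n_new = min(multiple * len(hits), len(misses))
    for i in rng.sample(misses, n_new):
        out[i] = rng.choice(hits)
    return out

def add_gaussian_noise(values, sd=0.08, rng=None):
    """(C) Add zero-mean Gaussian noise to every value."""
    rng = rng or random.Random(0)
    return [v + rng.gauss(0.0, sd) for v in values]
```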

    Comparison of similarity measures applied to genetic interaction datasets.

    Gene pair correlations derived from each similarity measure were benchmarked against a Gene Ontology-based standard using precision-recall statistics. The comparison was conducted on (A) S. cerevisiae genetic interaction data (Costanzo et al. 2010), query gene similarities; (B) S. cerevisiae genetic interaction data, array gene similarities; (C) S. pombe genetic interaction data (Ryan et al. 2012), query gene similarities; and (D) S. pombe genetic interaction data, array gene similarities. The horizontal dotted line shows the background precision expected from a randomized ranking of gene pairs. The bar plot in the upper right corner of each panel shows the area under the precision-recall curve (AUPRC) above the background for each similarity measure. The area was calculated by summing the areas of trapezoids at increments of 2^n (log2 units). The bars are sorted by their respective areas above background.
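
One possible reading of the area calculation is sketched below. It rests on several assumptions that the caption does not confirm: recall is taken as the cumulative number of true positives, precision is sampled each time that count reaches 1, 2, 4, 8, and so on, and the x-axis is in log2 units, so consecutive sample points are one unit apart. The function name `auprc_above_background` is hypothetical.

```python
def auprc_above_background(labels, background):
    """labels: 0/1 flags for gene pairs, sorted by decreasing similarity."""
    # Precision above background each time the true-positive count hits 1, 2, 4, ...
    heights = []
    tp, target = 0, 1
    for rank, is_positive in enumerate(labels, start=1):
        tp += is_positive
        if is_positive and tp == target:
            heights.append(tp / rank - background)
            target *= 2
    # Adjacent points are one log2 unit apart, so each trapezoid has width 1.
    return sum((a + b) / 2 for a, b in zip(heights, heights[1:]))
```

Here `background` plays the role of the horizontal dotted line, that is, the precision expected from a randomized ranking.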

    Role of thresholding genetic interaction data in the performance of similarity measures.

    The precision-recall plots were compared on the query side of the S. cerevisiae genetic interaction data at several thresholds: (A) Δ < −0.08, only negative genetic interactions at an intermediate threshold; (B) Δ < −0.2, only negative genetic interactions at a stringent threshold; (C) Δ > 0.08, only positive genetic interactions at an intermediate threshold; (D) Δ > 0.2, only positive genetic interactions at a stringent threshold; (E) |Δ| > 0.08, negative and positive interactions at an intermediate threshold; and (F) |Δ| > 0.2, negative and positive interactions at a stringent threshold. The bar plot in the upper right corner of each panel shows the area under the precision-recall curve (AUPRC) above the background for each similarity measure. The area was calculated by summing the areas of trapezoids at increments of 2^n (log2 units). The bars are sorted by their respective areas above background.
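
The six thresholding conditions can be illustrated with a small helper. The name `apply_threshold` is hypothetical, and setting scores that fail the threshold to 0 (rather than dropping or binarizing them) is an assumption; the caption only gives the cutoffs.

```python
def apply_threshold(values, cutoff=0.08, mode="both"):
    """Zero out interaction scores that fail the chosen threshold."""
    keep = {
        "negative": lambda v: v < -cutoff,     # panels (A), (B)
        "positive": lambda v: v > cutoff,      # panels (C), (D)
        "both": lambda v: abs(v) > cutoff,     # panels (E), (F)
    }[mode]
    # Assumption: scores failing the test are set to 0, not removed.
    return [v if keep(v) else 0.0 for v in values]
```

For example, panel (B) would correspond to `apply_threshold(scores, cutoff=0.2, mode="negative")`.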

    AdditionalDataS6

    This file contains the trigenic interaction list of MDY2-MTC1 and the digenic interaction lists of MDY2 and MTC1 corresponding to Fig. 3. The ‘Tetrad Analysis’ tab contains confirmation results obtained from tetrad analysis: SS is synthetic sick, SL is synthetic lethal. The ‘Genetic interactions’ tab contains columns annotated with ‘CellMap’, since they contain genetic interactions from (7) downloaded from theCellMap.org (26), as well as scores derived in this study.

    Data File S6. Genetic profile similarity-based hierarchy analysis

    The first tab (“Gene to hierarchy cluster mapping”) lists the clusters identified at each level of the genetic interaction-based hierarchy and the deletion and TS allele array mutants assigned to each cluster. Examples of clusters described in the main text are highlighted. The subsequent 9 tabs indicate enrichment of clusters resolved at the specified profile similarity range for specific cell compartments (Cyclops_enrich), biological processes (GO BP_enrich), protein complexes (complex_enrich) and KEGG pathways (KEGG_enrich). The final tab in the file indicates the clusters used to map the functional distribution of negative and positive interactions shown in Fig. 5D.

    Data File S9. High and low interaction degree genes

    This file lists the negative and positive interaction degree associated with every nonessential deletion (sn#), essential TS (tsq#), and DAmP (damp#) query mutant strain screened against the DMA (“query degree X DMA” tab) and/or TSA (“query degree X TSA” tab). A subset of strains was found to carry a second, spontaneous suppressor mutation that affected fitness of the query mutant strain. Strains carrying a suppressor mutation mapped through SGA analysis are indicated (“-supp”). Query mutants comprising the 20% highest and lowest degree groups of strains are indicated. Furthermore, a “Co-batch signal” rank is provided for every query (see “Co-batch filtering of query mutant strains”). Low ranks correspond to evidence for lingering batch effects. Another column, “Gene with correlated GI profiles that are co-annotated with the query gene (%)”, provides the percentage of correlated gene pairs that are co-annotated with the particular query. A low negative interaction degree (e.g. 20% lowest negative interaction degree) coupled with a low co-batch rank (e.g. < ~0.2) and a low fraction of correlated pairs that share a similar functional annotation with a given query strain (e.g. < ~0.15) may indicate a low-confidence screen. However, these criteria should be treated as loose indicators, not definitive metrics of screen quality, and thus should not be used as strict filters on the global interaction dataset. Another list (“Queries removed - batch effects” tab) indicates ~300 query strains that exhibited severe systematic batch effects and were therefore removed from the indicated data set. Finally, two additional tabs provide the negative and positive interaction degree associated with every nonessential (“nonessential array degree” tab) and essential (“essential array degree” tab) array mutant, respectively.