15 research outputs found

Cellular component GO terms showing highly divergent predictability by the network and the classifier ensemble

Author: Chase Krumpelman (51052)
Edward M Marcotte (16050)
Wan Kyu Kim (14123)

Specific functional biases of cellular component (CC) terms between the network and classifier ensemble are highlighted by plotting the difference between AUC_network and AUC_classifier for Gene Ontology (GO) terms showing the largest ΔAUC = max(AUC_network, 0.5) - max(AUC_classifier, 0.5). The plot was generated using the same method as in Figure 4. GO, Gene Ontology. Copyright information: Taken from "Inferring mouse gene functions from genomic-scale data using a combined functional network/classification strategy" http://genomebiology.com/2008/9/S1/S5 Genome Biology 2008;9(Suppl 1):S5-S5. Published online 27 Jun 2008. PMCID: PMC2447539.
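The ΔAUC in this caption can be sketched in a few lines, assuming the first AUC belongs to the network and the second to the classifier ensemble. Flooring each AUC at 0.5 treats any worse-than-random predictor as random, so two poor predictors yield a difference of zero. The function name and the AUC values below are hypothetical and are not taken from the paper.

```python
def delta_auc(auc_network: float, auc_classifier: float) -> float:
    """Difference of two AUCs after flooring each at 0.5 (chance level)."""
    return max(auc_network, 0.5) - max(auc_classifier, 0.5)


# Hypothetical AUC pairs (network, classifier), for illustration only.
examples = {
    "term_a": (0.90, 0.60),  # network clearly better
    "term_b": (0.40, 0.30),  # both below chance, so both are floored to 0.5
    "term_c": (0.55, 0.85),  # classifier clearly better
}

for term, (net, clf) in examples.items():
    print(f"{term}: delta AUC = {delta_auc(net, clf):+.2f}")
```

A positive value indicates a term favored by the network, a negative value a term favored by the classifier ensemble.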

The Francis Crick Institute

Diverse yeast gene loss-of-function phenotypes are predictable using guilt-by-association in a functional gene network

Author: Edward M Marcotte (16050)
Insuk Lee (41276)
Kriston L McGary (41275)

Copyright information: Taken from "Broad network-based predictability of gene loss-of-function phenotypes" http://genomebiology.com/2007/8/12/R258 Genome Biology 2007;8(12):R258-R258. Published online 5 Dec 2007. PMCID: PMC2246260. Predictability is measured in a receiver operating characteristic plot of the true positive rate (sensitivity) versus false positive rate (1 - specificity) for predicting genes giving rise to ten specific loss-of-function phenotypes, as well as for essential genes whose disruption produces nonviable yeast [4]. For each phenotype, each gene in the yeast genome was prioritized by the sum of the weights of its network linkages to the seed genes associated with the phenotype. Genes with higher scores are more tightly linked to the seed set and therefore more likely to give rise to the phenotype. Each phenotype was evaluated using leave-one-out cross-validation, omitting genes from the seed set for the purposes of evaluation. More predictable phenotypes tend toward the top-left corner of the graph; random predictability is indicated by the diagonal. For clarity, the line connecting the final point of each graph to the top right corner has been omitted. FN, false negative; FP, false positive; TN, true negative; TP, true positive.
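The scoring scheme in this caption (sum of link weights to the seed genes, with each seed gene withheld while it is scored) can be illustrated with a minimal sketch. The toy network, gene names, weights, and helper names are all hypothetical, and the rank-based AUC is one standard way to summarize the ROC curve, not necessarily the authors' exact procedure.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Turn (gene_a, gene_b, weight) triples into a symmetric adjacency map."""
    adj = defaultdict(dict)
    for a, b, w in edges:
        adj[a][b] = w
        adj[b][a] = w
    return adj


def seed_score(gene, seeds, adj):
    """Sum of the weights of the links from `gene` to the seed genes."""
    return sum(w for nbr, w in adj.get(gene, {}).items() if nbr in seeds)


def leave_one_out_scores(genes, seeds, adj):
    """Score every gene; a seed gene is removed from the seed set while scored."""
    return {g: seed_score(g, seeds - {g}, adj) for g in genes}


def auc(scores, positives):
    """Probability that a positive outranks a negative (ties count one half)."""
    pos = [s for g, s in scores.items() if g in positives]
    neg = [s for g, s in scores.items() if g not in positives]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy network: genes A, B, C share a phenotype (the seed set).
edges = [("A", "B", 2.0), ("B", "C", 1.5), ("A", "C", 1.0),
         ("C", "D", 0.2), ("D", "E", 0.5), ("E", "F", 0.3)]
adj = build_adjacency(edges)
genes = ["A", "B", "C", "D", "E", "F"]
seeds = {"A", "B", "C"}

scores = leave_one_out_scores(genes, seeds, adj)
print(scores)
print("AUC:", auc(scores, seeds))
```

In this toy case the seed genes are tightly interconnected, so they all outrank the non-seed genes; in a real network the AUC would fall somewhere between chance (0.5) and this ideal.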

The Francis Crick Institute

Biological process GO terms showing highly divergent predictability by the network and the classifier ensemble

Author: Chase Krumpelman (51052)
Edward M Marcotte (16050)
Wan Kyu Kim (14123)

Specific functional biases of biological process (BP) terms between the network and classifier ensemble are highlighted by plotting the difference between AUC_network and AUC_classifier for GO terms showing the largest ΔAUC = max(AUC_network, 0.5) - max(AUC_classifier, 0.5). The GO terms with ancestor-descendant relationships were merged to the ancestor term to remove redundancy. The Gene Ontology (GO) terms wit

The Francis Crick Institute

A plot of seed set size versus predictability of the phenotype shows no significant correlation

Author: Edward M Marcotte (16050)
Insuk Lee (41276)
Kriston L McGary (41275)

The Francis Crick Institute

Overall performance of the various algorithms' capacity to predict mouse gene GO annotation

Author: Chase Krumpelman (51052)
Edward M Marcotte (16050)
Wan Kyu Kim (14123)

The performance of each general strategy ('network_full', network-based prediction including expression data; 'network_slim', network-based prediction excluding expression data; and 'classifier', naïve Bayes classifiers) as well as several methods of combining the network_full and classifier scores ('mean', arithmetic mean of network and classifier scores; 'min', minimum of their scores; and 'max', maximum of their scores) is plotted as the mean AUC and the average APR across all Gene Ontology (GO) annotations averaged across ten-fold predictions in the indicated hierarchies (BP, biological process; CC, cellular component; MF, molecular function) and annotation specificities (terms annotating 3 to 10, 11 to 30, 31 to 100, or 101 to 300 genes). The network approach clearly outperforms the classification approach on the infrequent annotations ('3 to 10' and '11 to 30'), while the two methods perform nearly equivalently on the frequent annotations ('31 to 100' and '101 to 300'). The mean and max combinations generally perform slightly better than either of their constituents (network_full and classifier). The full network shows a significant advantage over the slim network for CC terms and, to a lesser degree, for BP and MF terms. AUC, area under the receiver operating characteristic curve. Copyright information: Taken from "Inferring mouse gene functions from genomic-scale data using a combined functional network/classification strategy" http://genomebiology.com/2008/9/S1/S5 Genome Biology 2008;9(Suppl 1):S5-S5. Published online 27 Jun 2008. PMCID: PMC2447539.
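The three combination rules named in this caption (mean, min, max) can be sketched as follows. This sketch assumes that both methods score the same genes on a comparable scale, which the caption does not state; the function name, gene labels, and score values are hypothetical.

```python
# Each combiner maps a (network score, classifier score) pair to one score.
COMBINERS = {
    "mean": lambda n, c: (n + c) / 2.0,
    "min": min,
    "max": max,
}


def combine_scores(network_scores, classifier_scores, method):
    """Combine per-gene scores from two predictors for genes scored by both."""
    combiner = COMBINERS[method]
    shared = network_scores.keys() & classifier_scores.keys()
    return {g: combiner(network_scores[g], classifier_scores[g]) for g in shared}


# Hypothetical scores for a single GO term, for illustration only.
network_scores = {"gene1": 0.8, "gene2": 0.2, "gene3": 0.5}
classifier_scores = {"gene1": 0.4, "gene2": 0.6, "gene3": 0.5}

for method in ("mean", "min", "max"):
    print(method, combine_scores(network_scores, classifier_scores, method))
```

The 'max' rule lets either predictor promote a gene on its own, whereas 'min' requires both to agree, which is one way to read why the caption reports mean and max as the better-performing combinations.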

The Francis Crick Institute

Quantitative cell morphology phenotypes are predicted significantly better than random expectation

Author: Edward M Marcotte (16050)
Insuk Lee (41276)
Kriston L McGary (41275)

Copyright information: Taken from "Broad network-based predictability of gene loss-of-function phenotypes" http://genomebiology.com/2007/8/12/R258 Genome Biology 2007;8(12):R258-R258. Published online 5 Dec 2007. PMCID: PMC2246260. In contrast, genes whose disruption decreases the population coefficient of variation (CV) were not predictable. A histogram plotting the distribution of the area under the receiver operating characteristic (ROC) curve (AUC) values for 562 quantitative morphological phenotypes shows a significantly higher proportion of high AUC values than for 1,000 size-matched random gene sets. Separate analyses of phenotypes associated with morphologic features and phenotypes associated with cell-to-cell variability in the morphologic features reveal asymmetry in predictability. Sets of genes whose disruption causes the 40 largest or smallest mean values of a morphological feature (middle plots) are significantly more predictable than random gene sets (left side). By contrast, although the sets of genes whose disruption most increases the CV tend to be predictable (high AUC), those whose disruption most decreases the CV are not (low AUC). Box-and-whisker plots are drawn as in Figure 3. A comparison of the median phenotypic CVs observed for deletion strains versus replicate analyses of wild-type cells shows that deletion strains with the most reduced CVs are essentially wild-type-like in character, whereas those with the most increased CVs show significantly more cell-to-cell variability than wild-type cells. These latter knockout strains carry deletions for genes predominantly involved in maintaining genomic integrity. This trend is therefore likely to have arisen from nonclonal genetic variation in these strains, recapitulating the classic mutator phenotype.
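The CV used in this caption as a measure of cell-to-cell variability is the standard deviation divided by the mean. A minimal sketch follows, assuming the sample standard deviation; the caption does not say which estimator the authors used, and the measurements below are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean of the measurements."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / m


# Hypothetical per-cell measurements of one morphological feature.
wild_type = [10.0, 12.0, 8.0, 10.0]
high_variability_strain = [4.0, 16.0, 2.0, 18.0]

print("wild type CV:", round(coefficient_of_variation(wild_type), 4))
print("variable strain CV:", round(coefficient_of_variation(high_variability_strain), 4))
```

Both toy populations have the same mean, so the difference in CV reflects only the spread between cells.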

The Francis Crick Institute

Yeast genes with human orthologs linked to the same diseases are predicted better than random expectation

Author: Edward M Marcotte (16050)
Insuk Lee (41276)
Kriston L McGary (41275)

The Francis Crick Institute

Quantitative gene expression assessment identifies appropriate cell line models for individual cervical cancer pathways-6

Author: Edward M Marcotte (16050)
Mark W Carlson (59001)
Vishwanath R Iyer (16049)

The Francis Crick Institute

Quantitative gene expression assessment identifies appropriate cell line models for individual cervical cancer pathways-4

Author: Edward M Marcotte (16050)
Mark W Carlson (59001)
Vishwanath R Iyer (16049)

Copyright information: Taken from "Quantitative gene expression assessment identifies appropriate cell line models for individual cervical cancer pathways" http://www.biomedcentral.com/1471-2164/8/117 BMC Genomics 2007;8():117-117. Published online 10 May 2007. PMCID: PMC1878486. [...]ey, while the pathways where only one or two cell lines are adequate models are white. The pathway example "RNA Processing" indicates that some cell lines were anti-correlated and that a quantitative analysis was therefore needed to identify better models for studying this pathway. Error bars were generated by computing the correlation of a single cell line for each pathway and calculating the standard deviation. The pathways shown here represented a minimum of four cell lines or growth conditions. Numbers in parentheses indicate how many cell lines were used to calculate the correlation. B: The highest and lowest pathway correlations between normal cervix and cervical cancer. The JNK cascade has a high correlation between normal and tumor, and is modeled well by most cell lines (Figure 5A). Mitosis and a number of other pathways involved in growth and regulation show poor correlation in their gene expression between normal and tumor, as expected. Numbers in parentheses indicate how many genes were used to calculate the Pearson correlation coefficient.

The Francis Crick Institute

Quantitative gene expression assessment identifies appropriate cell line models for individual cervical cancer pathways-1

Author: Edward M Marcotte (16050)
Mark W Carlson (59001)
Vishwanath R Iyer (16049)

Copyright information: Taken from "Quantitative gene expression assessment identifies appropriate cell line models for individual cervical cancer pathways" http://www.biomedcentral.com/1471-2164/8/117 BMC Genomics 2007;8():117-117. Published online 10 May 2007. PMCID: PMC1878486. [...]es, whereas numbered labels represent biological replicates. The primary separation occurs between cell lines (dashed bar) and cervical tissue (solid bar). All GOG samples and CHTN samples #1, 2, 8, 12 and 13 were invasive cervical cancer biopsies. CHTN samples #6, 10, and 11 were normal cervix. Most replicates clustered together, indicating the data were of high quality. Spots present on the microarray that had a median intensity over background of at least 150 and were present in 80% of the arrays were included in the analysis, resulting in 8,338 genes. B: Singular value decomposition of transcriptional profiles reveals general relationships among the samples, positioned here as the projections among the first 3 singular components (accounting for 40% of the variance) [see Additional file ]. Again, cell lines were separated from cervical tissue.

The Francis Crick Institute