
Predictive accuracy of phylogenetic profiling when we control for the influence of the Open World Assumption.

By Nives Škunca (105906) and Christophe Dessimoz (18084)

Abstract

Colours distinguish two annotation settings: experiments that include only the well-annotated proteins (purple) and experiments in which 60% of the available annotations are randomly removed (red). Line style distinguishes how genomes are sub-selected: random sub-selection (full lines) or sub-selection that keeps maximum diversity among the selected genomes (dashed lines). Each dot is the mean AUPRC over the GO terms used for annotation. The final point is the mean AUPRC score when all bacteria available in the OMA database release used (1078 bacteria) are included.
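The quantities in this caption can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the data layout (sets of (protein, GO term) pairs and a dictionary of prediction scores), and the use of average precision as the AUPRC estimate are all assumptions made for illustration.

```python
import random


def average_precision(labels, scores):
    """Average precision, used here as an assumed estimate of AUPRC."""
    # Rank predictions from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_pos = sum(labels)
    if total_pos == 0:
        return None  # undefined when a GO term has no positive proteins
    hits = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            ap += hits / rank  # precision at each recovered positive
    return ap / total_pos


def remove_annotations(annotations, fraction, rng):
    """Randomly drop `fraction` of the (protein, go_term) pairs."""
    pairs = sorted(annotations)
    keep_n = len(pairs) - round(fraction * len(pairs))
    return set(rng.sample(pairs, keep_n))


def mean_auprc(annotations, predictions, go_terms, proteins):
    """Mean of the per-GO-term AUPRC values (one dot in the figure)."""
    values = []
    for term in go_terms:
        labels = [1 if (p, term) in annotations else 0 for p in proteins]
        scores = [predictions.get((p, term), 0.0) for p in proteins]
        ap = average_precision(labels, scores)
        if ap is not None:
            values.append(ap)
    return sum(values) / len(values) if values else None


# Hypothetical toy data: four proteins, one GO term.
proteins = ["p1", "p2", "p3", "p4"]
annotations = {("p1", "GO:A"), ("p3", "GO:A")}
predictions = {("p1", "GO:A"): 0.9, ("p2", "GO:A"): 0.8,
               ("p3", "GO:A"): 0.7, ("p4", "GO:A"): 0.1}

full_score = mean_auprc(annotations, predictions, ["GO:A"], proteins)

# The "red" setting: randomly remove 60% of the annotations before scoring.
reduced = remove_annotations(annotations, 0.6, random.Random(0))
```

Under these assumptions, repeating `mean_auprc` for each sub-selected set of genomes would give one dot per genome count, once with the full annotations and once with the reduced ones.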

Topics: Biological Sciences, Gene function, input data, Open World Assumption, maximising phylogenetic diversity, gene presence, genome, accuracy, Phylogenetic Profiling, annotations need, input data size, annotation databases
Year: 2015
DOI identifier: 10.1371/journal.pone.0114701.g004
OAI identifier: oai:figshare.com:article/1322748
Provided by: FigShare