Bidirectional Best Hits Miss Many Orthologs in Duplication-Rich Clades such as Plants and Animals.
Bidirectional best hits (BBH), which entails identifying the pairs of genes in two different genomes that are more similar to each other than either is to any other gene in the other genome, is a simple and widely used method to infer orthology. A recent study has analysed the link between BBH and orthology in bacteria and archaea and concluded that, given the very high consistency in BBH they observed among triplets of neighboring genes, a high proportion of BBH are likely to be bona fide orthologs. However, limited by their analysis setup, the previous study could not easily test the reverse question: which proportion of orthologs are BBH? In this follow-up study, we consider this question in theory and answer it based on conceptual arguments, simulated data, and real biological data from all three domains of life. Our analyses corroborate the findings of the previous study, but also show that because of the high rate of gene duplication in plants and animals, as many as 60% of orthologous relations are missed by the BBH criterion.
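The BBH criterion described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy gene identifiers, and the scores are all hypothetical, and it assumes a single symmetric similarity score per gene pair (real alignment scores can differ by search direction). Ties are broken by whichever pair is seen first.

```python
from typing import Dict, Set, Tuple

# Hypothetical input: one similarity score per (gene in genome A, gene in genome B).
Scores = Dict[Tuple[str, str], float]


def best_hits(scores: Scores, forward: bool) -> Dict[str, str]:
    """Map each query gene to its highest-scoring gene in the other genome.

    forward=True queries genome A against B; forward=False queries B against A.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for (a, b), score in scores.items():
        query, target = (a, b) if forward else (b, a)
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def bidirectional_best_hits(scores: Scores) -> Set[Tuple[str, str]]:
    """Return the pairs (a, b) where a's best hit is b and b's best hit is a."""
    a_to_b = best_hits(scores, forward=True)
    b_to_a = best_hits(scores, forward=False)
    return {(a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a}


# Toy case: gene a1 in genome A; b1 and b2 in genome B arose by a duplication
# after speciation, so both are orthologs of a1 (assumed for illustration).
toy_scores: Scores = {("a1", "b1"): 90.0, ("a1", "b2"): 85.0}
bbh = bidirectional_best_hits(toy_scores)
print(bbh)
```

In this toy case only the pair (a1, b1) is reported, so the a1-b2 relation is missed. This is the kind of loss the abstract attributes to gene duplication.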
Who Watches the Watchmen? An Appraisal of Benchmarks for Multiple Sequence Alignment
Multiple sequence alignment (MSA) is a fundamental and ubiquitous technique
in bioinformatics used to infer related residues among biological sequences.
Thus alignment accuracy is crucial to a vast range of analyses, often in ways
difficult to assess in those analyses. To compare the performance of different
aligners and help detect systematic errors in alignments, a number of
benchmarking strategies have been pursued. Here we present an overview of the
main strategies--based on simulation, consistency, protein structure, and
phylogeny--and discuss their different advantages and associated risks. We
outline a set of desirable characteristics for effective benchmarking, and
evaluate each strategy in light of them. We conclude that there is currently no
universally applicable means of benchmarking MSA, and that developers and users
of alignment tools should base their choice of benchmark on the
context of application--with a keen awareness of the assumptions underlying
each benchmarking strategy.
Standardised Benchmarking in the Quest for Orthologs
The identification of evolutionarily related genes across different species, orthologs in particular, forms the backbone of many comparative, evolutionary, and functional genomic analyses. Achieving high accuracy in orthology inference is thus essential. Yet the true evolutionary history of genes, required to ascertain orthology, is generally unknown. Furthermore, orthologs are used for very different applications across different phyla, with different requirements in terms of the precision-recall trade-off. As a result, assessing the performance of orthology inference methods remains difficult for both users and method developers. Here, we present a community effort to establish standards in orthology benchmarking and facilitate orthology benchmarking through an automated web-based service (http://orthology.benchmarkservice.org). Using this new service, we characterise the performance of 15 well-established orthology inference methods and resources on a battery of 20 different benchmarks. Standardised benchmarking provides a way for users to identify the most effective methods for the problem at hand, sets a minimal requirement for new tools and resources, and guides the development of more accurate orthology inference methods.
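The precision-recall trade-off mentioned in this abstract can be made concrete with a small sketch. It is not how the benchmark service computes its scores: the function name and the toy pairs are hypothetical, and it assumes a trusted reference set of ortholog pairs, which the abstract notes is generally unavailable in practice.

```python
from typing import Iterable, Tuple

Pair = Tuple[str, str]


def precision_recall(predicted: Iterable[Pair], reference: Iterable[Pair]) -> Tuple[float, float]:
    """Compute precision and recall of predicted ortholog pairs.

    Pairs are treated as unordered, so ("a1", "b1") equals ("b1", "a1").
    """
    pred = {frozenset(p) for p in predicted}
    ref = {frozenset(p) for p in reference}
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    return precision, recall


# Toy data (assumed): three reference ortholog pairs, two of which are predicted.
reference_pairs = [("a1", "b1"), ("a1", "b2"), ("a2", "b3")]
predicted_pairs = [("a1", "b1"), ("a2", "b3")]
precision, recall = precision_recall(predicted_pairs, reference_pairs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here every predicted pair is correct, but one of the three reference pairs is missed. A conservative method can therefore score well on precision while losing recall.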