Family-based Homology Detection via Pairwise Sequence Comparison
The function of an unknown biological sequence can often be accurately inferred by identifying sequences homologous to the original sequence. Given a query set of known homologs, there exist at least three general classes of techniques for finding additional homologs: pairwise sequence comparisons, motif analysis, and hidden Markov modeling. Pairwise sequence comparisons are typically employed when only a single query sequence is known. Hidden Markov models (HMMs), on the other hand, are usually trained with sets of more than 100 sequences. Motif-based methods fall in between these two extremes. The current work compares the performance of representative examples of these three homology detection techniques (using the BLAST, MEME and HMMER software) across a wide range of protein families, using query sets of varying sizes. A linear combination of multiple pairwise sequence comparisons outperforms motif-based and HMM methods for all query set sizes. Furthermore, heuristic pairwise com..
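
To make the idea of a linear combination of pairwise comparisons concrete, the following is a minimal sketch, not the method used in the paper. It assumes each query sequence has already been searched against a database, yielding one score per target (for example, a bit score), and that the per-query scores are combined with equal weights. The function names, the equal weighting, the default score for missing hits, and the example scores are all hypothetical.

```python
from typing import Dict, List, Optional


def combine_scores(
    scores_per_query: List[Dict[str, float]],
    weights: Optional[List[float]] = None,
    missing: float = 0.0,
) -> Dict[str, float]:
    """Combine per-query pairwise scores into one score per target.

    scores_per_query[i] maps target id -> score from the i-th query.
    Targets a query did not hit receive `missing` (an assumption).
    """
    if not scores_per_query:
        return {}
    if weights is None:
        # Assumption: equal weights, i.e. a plain average over queries.
        weights = [1.0 / len(scores_per_query)] * len(scores_per_query)
    if len(weights) != len(scores_per_query):
        raise ValueError("need exactly one weight per query")

    # Every target seen by at least one query.
    targets = set().union(*scores_per_query)
    return {
        t: sum(w * s.get(t, missing) for w, s in zip(weights, scores_per_query))
        for t in targets
    }


def rank_targets(combined: Dict[str, float]) -> List[str]:
    """Order targets by descending combined score, ties broken by id."""
    return sorted(combined, key=lambda t: (-combined[t], t))


# Hypothetical scores from two query sequences against three targets.
hits = [
    {"P1": 50.0, "P2": 10.0},
    {"P1": 30.0, "P3": 40.0},
]
combined = combine_scores(hits)
ranking = rank_targets(combined)
```

With these made-up scores, a target hit by both queries (`P1`) ranks above targets hit strongly by only one. Other weightings can be passed through `weights` without changing the rest of the sketch.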