
Towards Automatic Detecting of Overlapping Genes - Clustered BLAST Analysis of Viral

By Klaus Neuhaus, Daniela Oelke, David Fürst, Siegfried Scherer and Daniel A. Keim

Abstract

Overlapping genes (encoded on the same DNA strand but in different reading frames) are thought to be rare and were therefore largely neglected in the past. In a test set of 800 viruses, we found more than 350 potential overlapping open reading frames of >500 bp that generate BLAST hits, indicating a possible biological function. Interestingly, five overlaps of more than 2000 bp were found; the largest may even contain triple overlaps. To perform the large number of BLAST searches required to test all detected open reading frames, we compared two clustering strategies (BLASTCLUST and k-means) and queried the database with only one representative per cluster. Our results show that this approach achieves a significant speed-up while retaining high-quality results (>99 % precision compared to single queries) for both clustering methods. Future wet lab experiments are needed to show whether the detected overlapping reading frames are biologically functional.

Key words: overlapping genes, clustering, BLAST analysis

Year: 2011
OAI identifier: oai:CiteSeerX.psu:10.1.1.186.1739
Provided by: CiteSeerX