TACOA – Taxonomic classification of environmental genomic fragments using a kernelized nearest neighbor approach

By Naryttza N Diaz, Lutz Krause, Alexander Goesmann, Karsten Niehaus and Tim W Nattkemper
Topics: Methodology Article
Publisher: BioMed Central
Provided by: PubMed Central

