
Generalized DNA Barcode Design Based on Hamming Codes

By Leonid V. Bystrykh


The diversity and scope of multiplex parallel sequencing applications are steadily increasing. Critically, multiplex parallel sequencing methods rely on the use of barcoded primers for sample identification, and the quality of the barcodes directly impacts the quality of the resulting sequence data. Inspection of recent publications reveals a surprisingly variable quality of the barcodes employed. Some barcodes are designed in a semi-empirical fashion, without quantitative consideration of error correction or minimal distance properties. After a systematic comparison of published barcode sets, including commercially distributed barcoded primers from Illumina and Epicentre, methods for improved, Hamming code-based sequences are suggested and illustrated. Hamming barcodes can be employed for DNA tag designs in many different ways while preserving minimal distance and error-correcting properties. In addition, Hamming barcodes remain flexible with regard to essential biological parameters such as sequence redundancy and GC content. Wider adoption of improved Hamming barcodes is encouraged in multiplex parallel sequencing applications.
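The minimal distance property mentioned in the abstract can be checked for any barcode set with a few lines of code. The following is a minimal sketch, not the procedure used in the article: the function names and the three 6-nt barcodes are hypothetical, chosen only for illustration. It computes the smallest pairwise Hamming distance of a set, the number of substitution errors that distance allows to be corrected, floor((d - 1) / 2), and the GC content of a single barcode.

```python
from itertools import combinations


def hamming_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_distance(barcodes: list[str]) -> int:
    """Smallest pairwise Hamming distance within a barcode set."""
    return min(hamming_distance(a, b) for a, b in combinations(barcodes, 2))


def correctable_substitutions(d_min: int) -> int:
    """Substitution errors correctable at minimum distance d_min."""
    return (d_min - 1) // 2


def gc_content(barcode: str) -> float:
    """Fraction of G and C bases in a barcode."""
    return sum(base in "GC" for base in barcode) / len(barcode)


# Hypothetical example set (not taken from the article).
barcodes = ["ACGTAC", "ACACGT", "CATGAC"]

d_min = min_distance(barcodes)
print("minimum distance:", d_min)
print("correctable substitutions:", correctable_substitutions(d_min))
print("GC content of first barcode:", gc_content(barcodes[0]))
```

A set with minimum distance 3 or more can correct any single substitution, and a set with minimum distance 2 can only detect one. Insertions and deletions are not covered by the Hamming distance and would need a separate check.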

Topics: Research Article
Publisher: Public Library of Science
Provided by: PubMed Central

