An audio CAPTCHA to distinguish humans from computers

By Gao Haichang, Honggang Liu, Dan Yao, Xiyang Liu and Uwe Aickelin


CAPTCHAs are employed as a security measure to differentiate human users from bots. This paper proposes a new sound-based CAPTCHA that exploits the gap between human and synthetic voices rather than relying on human auditory perception. The user is required to read aloud a given sentence, selected at random from a specified book. The resulting audio file is analyzed automatically to judge whether the user is human. The paper describes in detail the design of the new CAPTCHA, the analysis of the audio files, and the choice of window function for the audio frames. Experiments are conducted to fix the critical threshold and the coefficients of three indicators to ensure security. The proposed audio CAPTCHA is shown to be accessible to users: in the user study, the human success rate reaches approximately 97%, while the pass rate of attack software using Microsoft SDK 5.1 is only 4%. The experiments also indicate that most human users can solve it in less than 14 seconds, with an average time of only 7.8 seconds.
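The two mechanical steps named in the abstract, windowing the audio frames and comparing a weighted combination of three indicators against a threshold, might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which window function, which indicators, or which coefficient and threshold values the authors chose, so the Hamming window, the function names, and the "score at or above threshold means human" rule are all hypothetical.

```python
import math

def hamming(n):
    """Hamming window of length n (an assumed choice; the abstract does not name the window)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def frames(samples, frame_len, hop):
    """Split samples into overlapping frames and multiply each frame by the window."""
    window = hamming(frame_len)
    out = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = samples[start:start + frame_len]
        out.append([s * w for s, w in zip(chunk, window)])
    return out

def is_human(indicators, coefficients, threshold):
    """Hypothetical decision rule: weighted sum of indicator scores versus a threshold."""
    score = sum(c * x for c, x in zip(coefficients, indicators))
    return score >= threshold
```

In such a scheme the per-frame features would feed the indicator scores, and the coefficients and threshold would be tuned experimentally, as the abstract says the authors did.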

Publisher: IEEE Computer Society
Year: 2010
OAI identifier:
Provided by: Nottingham ePrints
