175 research outputs found

    Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks

    Full text link
    Recognizing arbitrary multi-character text in unconstrained natural photographs is a hard problem. In this paper, we address an equally hard sub-problem in this domain viz. recognizing arbitrary multi-digit numbers from Street View imagery. Traditional approaches to solve this problem typically separate out the localization, segmentation, and recognition steps. In this paper we propose a unified approach that integrates these three steps via the use of a deep convolutional neural network that operates directly on the image pixels. We employ the DistBelief implementation of deep neural networks in order to train large, distributed neural networks on high quality images. We find that the performance of this approach increases with the depth of the convolutional network, with the best performance occurring in the deepest architecture we trained, with eleven hidden layers. We evaluate this approach on the publicly available SVHN dataset and achieve over 96%96\% accuracy in recognizing complete street numbers. We show that on a per-digit recognition task, we improve upon the state-of-the-art, achieving 97.84%97.84\% accuracy. We also evaluate this approach on an even more challenging dataset generated from Street View imagery containing several tens of millions of street number annotations and achieve over 90%90\% accuracy. To further explore the applicability of the proposed system to broader text recognition tasks, we apply it to synthetic distorted text from reCAPTCHA. reCAPTCHA is one of the most secure reverse turing tests that uses distorted text to distinguish humans from bots. We report a 99.8%99.8\% accuracy on the hardest category of reCAPTCHA. Our evaluations on both tasks indicate that at specific operating thresholds, the performance of the proposed system is comparable to, and in some cases exceeds, that of human operators

    The robustness of animated text CAPTCHAs

    Get PDF
    PhD ThesisCAPTCHA is standard security technology that uses AI techniques to tells computer and human apart. The most widely used CAPTCHA are text-based CAPTCHA schemes. The robustness and usability of these CAPTCHAs relies mainly on the segmentation resistance mechanism that provides robustness against individual character recognition attacks. However, many CAPTCHAs have been shown to have critical flaws caused by many exploitable invariants in their design, leaving only a few CAPTCHA schemes resistant to attacks, including ReCAPTCHA and the Wikipedia CAPTCHA. Therefore, new alternative approaches to add motion to the CAPTCHA are used to add another dimension to the character cracking algorithms by animating the distorted characters and the background, which are also supported by tracking resistance mechanisms that prevent the attacks from identifying the main answer through frame-toframe attacks. These technologies are used in many of the new CAPTCHA schemes including the Yahoo CAPTCHA, CAPTCHANIM, KillBot CAPTCHAs, non-standard CAPTCHA and NuCAPTCHA. Our first question: can the animated techniques included in the new CAPTCHA schemes provide the required level of robustness against the attacks? Our examination has shown many of the CAPTCHA schemes that use the animated features can be broken through tracking attacks including the CAPTCHA schemes that uses complicated tracking resistance mechanisms. The second question: can the segmentation resistance mechanism used in the latest standard text-based CAPTCHA schemes still provide the additional required level of resistance against attacks that are not present missed in animated schemes? Our test against the latest version of ReCAPTCHA and the Wikipedia CAPTCHA exposed vulnerability problems against the novel attacks mechanisms that achieved a high success rate against them. The third question: how much space is available to design an animated text-based CAPTCHA scheme that could provide a good balance between security and usability? We designed a new animated text-based CAPTCHA using guidelines we designed based on the results of our attacks on standard and animated text-based CAPTCHAs, and we then tested its security and usability to answer this question. ii In this thesis, we put forward different approaches to examining the robustness of animated text-based CAPTCHA schemes and other standard text-based CAPTCHA schemes against segmentation and tracking attacks. Our attacks included several methodologies that required thinking skills in order to distinguish the animated text from the other animated noises, including the text distorted by highly tracking resistance mechanisms that displayed them partially as animated segments and which looked similar to noises in other CAPTCHA schemes. These attacks also include novel attack mechanisms and other mechanisms that uses a recognition engine supported by attacking methods that exploit the identified invariants to recognise the connected characters at once. Our attacks also provided a guideline for animated text-based CAPTCHAs that could provide resistance to tracking and segmentation attacks which we designed and tested in terms of security and usability, as mentioned before. Our research also contributes towards providing a toolbox for breaking CAPTCHAs in addition to a list of robustness and usability issues in the current CAPTCHA design that can be used to provide a better understanding of how to design a more resistant CAPTCHA scheme

    On the security of text-based 3D CAPTCHAs

    Get PDF
    CAPTCHAs have become a standard security mechanism that are used to deter automated abuse of online services intended for humans. However, many existing CAPTCHA schemes to date have been successfully broken. As such, a number of CAPTCHA developers have explored alternative methods of designing CAPTCHAs. 3D CAPTCHAs is a design alternative that has been proposed to overcome the limitations of traditional CAPTCHAs. These CAPTCHAs are designed to capitalize on the human visual system\u27s natural ability to perceive 3D objects from an image. The underlying security assumption is that it is difficult for a computer program to identify the 3D content. This paper investigates the robustness of text-based 3D CAPTCHAs. In particular, we examine three existing text-based 3D CAPTCHA schemes that are currently deployed on a number of websites. While the direct use of Optical Character Recognition (OCR) software is unable to correctly solve these textbased 3D CAPTCHA challenges, we highlight certain patterns in the 3D CAPTCHAs can be exploited to identify important information within the CAPTCHA. By extracting this information, this paper demonstrates that automated attacks can be used to solve these 3D CAPTCHAs with a high degree of success

    A security analysis of automated Chinese turing tests

    Get PDF
    Text-based Captchas have been widely used to deter misuse of services on the Internet. However, many designs have been broken. It is intellectually interesting and practically relevant to look for alternative designs, which are currently a topic of active research. We motivate the study of Chinese Captchas as an interesting alternative design - counterintuitively, it is possible to design Chinese Captchas that are universally usable, even to those who have never studied Chinese language. More importantly, we ask a fundamental question: is the segmentation-resistance principle established for Roman-character based Captchas applicable to Chinese based designs? With deep learning techniques, we offer the first evidence that computers do recognize individual Chinese characters well, regardless of distortion levels. This suggests that many real-world Chinese schemes are insecure, in contrast to common beliefs. Our result offers an essential guideline to the design of secure Chinese Captchas, and it is also applicable to Captchas using other large-alphabet languages such as Japanese

    A simple generic attack on text captchas

    Get PDF
    Text-based Captchas have been widely deployed across the Internet to defend against undesirable or malicious bot programs. Many attacks have been proposed; these fine prior art advanced the scientific understanding of Captcha robustness, but most of them have a limited applicability. In this paper, we report a simple, low-cost but powerful attack that effectively breaks a wide range of text Captchas with distinct design features, including those deployed by Google, Microsoft, Yahoo!, Amazon and other Internet giants. For all the schemes, our attack achieved a success rate ranging from 5% to 77%, and achieved an average speed of solving a puzzle in less than 15 seconds on a standard desktop computer (with a 3.3GHz Intel Core i3 CPU and 2 GB RAM). This is to date the simplest generic attack on text Captchas. Our attack is based on Log-Gabor filters; a famed application of Gabor filters in computer security is John Daugman’s iris recognition algorithm. Our work is the first to apply Gabor filters for breaking Captchas

    AN ALGORITHM TO ANALYZE STRENGTH OF CAPTCHA

    Get PDF
    CAPTCHA stands for Completely Automated Public Turing Tests to Tell Computers and Humans Apart. The CAPTCHAs have been widely used across the Internet to defend against undesirable and malicious bot programs. It was observed that an alarming number of CAPTCHAs could be broken by the technique of Image Processing and Artificial Neural Network. Many Researchers have tried to break a CAPTCHA so as to design robust CAPTCHA , but it is essential to generate a strong CAPTCHA that will resist bot attack. This paper has proposed algorithm to analyze the strength of CAPTCHAs using simple image processing techniques such as Preprocessing, Segmentation and Character recognition which in turn helps to improve the robustness and usability of CAPTCHA in Internet System. The experimental result shows the proposed algorithm gives 75 % accuracy to analyze the strength of CAPTCHA

    A Framework for Devanagari Script-based Captcha

    Full text link
    Human Interactive Proofs (HIPs) are automatic reverse Turing tests designed to distinguish between various groups of users. Completely Automatic Public Turing test to tell Computers and Humans Apart (CAPTCHA) is a HIP system that distinguish between humans and malicious computer programs. Many CAPTCHAs have been proposed in the literature that text-graphical based, audio-based, puzzle-based and mathematical questions-based. The design and implementation of CAPTCHAs fall in the realm of Artificial Intelligence. We aim to utilize CAPTCHAs as a tool to improve the security of Internet based applications. In this paper we present a framework for a text-based CAPTCHA based on Devanagari script which can exploit the difference in the reading proficiency between humans and computer programs. Our selection of Devanagari script-based CAPTCHA is based on the fact that it is used by a large number of Indian languages including Hindi which is the third most spoken language. There is potential for an exponential rise in the applications that are likely to be developed in that script thereby making it easy to secure Indian language based applications.Comment: 10 pages, 8 Figures, CCSEA 2011 - First International Conference, Chennai, July 15-17, 201

    Using Generative Adversarial Networks to Break and Protect Text Captchas

    Get PDF
    Text-based CAPTCHAs remains a popular scheme for distinguishing between a legitimate human user and an automated program. This article presents a novel genetic text captcha solver based on the generative adversarial network. As a departure from prior text captcha solvers that require a labor-intensive and time-consuming process to construct, our scheme needs significantly fewer real captchas but yields better performance in solving captchas. Our approach works by first learning a synthesizer to automatically generate synthetic captchas to construct a base solver. It then improves and fine-tunes the base solver using a small number of labeled real captchas. As a result, our attack requires only a small set of manually labeled captchas, which reduces the cost of launching an attack on a captcha scheme. We evaluate our scheme by applying it to 33 captcha schemes, of which 11 are currently used by 32 of the top-50 popular websites. Experimental results demonstrate that our scheme significantly outperforms four prior captcha solvers and can solve captcha schemes where others fail. As a countermeasure, we propose to add imperceptible perturbations onto a captcha image. We demonstrate that our countermeasure can greatly reduce the success rate of the attack
    corecore