220 research outputs found

    A Framework for Devanagari Script-based Captcha

    Full text link
    Human Interactive Proofs (HIPs) are automatic reverse Turing tests designed to distinguish between various groups of users. Completely Automatic Public Turing test to tell Computers and Humans Apart (CAPTCHA) is a HIP system that distinguish between humans and malicious computer programs. Many CAPTCHAs have been proposed in the literature that text-graphical based, audio-based, puzzle-based and mathematical questions-based. The design and implementation of CAPTCHAs fall in the realm of Artificial Intelligence. We aim to utilize CAPTCHAs as a tool to improve the security of Internet based applications. In this paper we present a framework for a text-based CAPTCHA based on Devanagari script which can exploit the difference in the reading proficiency between humans and computer programs. Our selection of Devanagari script-based CAPTCHA is based on the fact that it is used by a large number of Indian languages including Hindi which is the third most spoken language. There is potential for an exponential rise in the applications that are likely to be developed in that script thereby making it easy to secure Indian language based applications.Comment: 10 pages, 8 Figures, CCSEA 2011 - First International Conference, Chennai, July 15-17, 201

    Word Searching in Scene Image and Video Frame in Multi-Script Scenario using Dynamic Shape Coding

    Full text link
    Retrieval of text information from natural scene images and video frames is a challenging task due to its inherent problems like complex character shapes, low resolution, background noise, etc. Available OCR systems often fail to retrieve such information in scene/video frames. Keyword spotting, an alternative way to retrieve information, performs efficient text searching in such scenarios. However, current word spotting techniques in scene/video images are script-specific and they are mainly developed for Latin script. This paper presents a novel word spotting framework using dynamic shape coding for text retrieval in natural scene image and video frames. The framework is designed to search query keyword from multiple scripts with the help of on-the-fly script-wise keyword generation for the corresponding script. We have used a two-stage word spotting approach using Hidden Markov Model (HMM) to detect the translated keyword in a given text line by identifying the script of the line. A novel unsupervised dynamic shape coding based scheme has been used to group similar shape characters to avoid confusion and to improve text alignment. Next, the hypotheses locations are verified to improve retrieval performance. To evaluate the proposed system for searching keyword from natural scene image and video frames, we have considered two popular Indic scripts such as Bangla (Bengali) and Devanagari along with English. Inspired by the zone-wise recognition approach in Indic scripts[1], zone-wise text information has been used to improve the traditional word spotting performance in Indic scripts. For our experiment, a dataset consisting of images of different scenes and video frames of English, Bangla and Devanagari scripts were considered. The results obtained showed the effectiveness of our proposed word spotting approach.Comment: Multimedia Tools and Applications, Springe

    A Technique for Character Segmentation in Middle zone of Handwritten Hindi words using Hybrid Approach

    Get PDF
    India is a country where people talk in multilingual and write in multi-script. Devanagari is one of the most popular scripts in India, which is used to write Hindi, Sanskrit, Sindhi, Marathi and Nepali Languages. This research work is performed on Hindi language. A large number of precious and essential documents are available in handwritten form, which needs to be converted into editable form. The existence of Optical Character Recognition (OCR) makes this task easier to convert handwritten text in editable form. Character segmentation is an important phase of OCR, which segment the characters from handwritten words. This enhances the accuracy of OCR system. In this paper a hybrid approach is used to segment the characters that contain single and multiple touching characters within a word. The proposed system is tested on a dataset of various handwritten words written by different writers. The dataset of proposed system contains more than 300 handwritten words in Hindi language. Accuracy of the proposed hybrid system is evaluated to 96% which is better than that of existing techniques
    • …
    corecore