5,823 research outputs found

    Kannada Character Recognition System A Review

    Full text link
    Intensive research has been done on optical character recognition ocr and a large number of articles have been published on this topic during the last few decades. Many commercial OCR systems are now available in the market, but most of these systems work for Roman, Chinese, Japanese and Arabic characters. There are no sufficient number of works on Indian language character recognition especially Kannada script among 12 major scripts in India. This paper presents a review of existing work on printed Kannada script and their results. The characteristics of Kannada script and Kannada Character Recognition System kcr are discussed in detail. Finally fusion at the classifier level is proposed to increase the recognition accuracy.Comment: 12 pages, 8 figure

    Image Skew Detection and Correction in Regular Images and Document Images

    Get PDF
    During any Document scanning and processing of regular images in our daily life activities image skew is a very important part that should be kept in mind before processing the images. Skew is generally referred to the degree of rotation of an image in comparison with its actual position. So before proceeding to any further activity with the images we need to assure the skew of an image is correct or not. So detection of skew of an image would be the first thing to be applied to regular images some times and specially scanned documents when transforming them to appropriate format. There are different algorithms for detection of skew of an image that have been implemented in different kind of works. The basic and very commonly used one is Scan line based skew detection. In this technique several lines are passed through the image from left to right, right to left, top to bottom and bottom to top and then the number of black pixels encountered in different projection of line are counted. The projection with maximum black pixels encountered is to be taken to consider the skew of the image. There is another approaches like Hough transform, Base-point method etc. In Hough transform method the pixel value is calculated for each value of θ. The angle producing maximum variance is considered to be the skew angle of the image. These two algorithms have been implemented and the results have been represented to compare the accurac

    Algorithms for document image skew estimation

    Full text link
    A new projection profile based skew estimation algorithm was developed. This algorithm extracts fiducial points representing character elements by decoding a JBIG compressed image without reconstructing the original image. These points are projected along parallel lines into an accumulator array to determine the maximum alignment and the corresponding skew angle. Methods for characterizing the performance of skew estimation techniques were also investigated. In addition to the new skew estimator, three projection based algorithms were implemented and tested using 1,246 single column text zones extracted from a sample of 460 page images. Linear regression analyses of the experimental results indicate that our new skew estimation algorithm performs competitively with the other three techniques. These analyses also show that estimators using connected components as a fiducial representation perform worse than the others on the entire set of text zones. It is shown that all of the algorithms are sensitive to typographical features. The number of text lines in a zone significantly affects the accuracy of the connected component based methods. We also developed two aggregate measures of skew for entire pages. Experiments performed on the 460 unconstrained pages indicate the need to filter non-text features from consideration. Graphic and noise elements from page images contribute a significant amount of the error for the JBIG algorithm

    Design and Implementation Recognition System for Handwritten Hindi/Marathi Document

    Get PDF
    In the present scenario most of the importance is given for the “paperless office” there by more and more communication and storage of documents is performed digitally. Documents and files which are present in Hindi and Marathi languages that were once stored physically on paper are now being converted into electronic form in order to facilitate quicker additions, searches, and modifications, as well as to prolong the life of such records. Because of this, there is a great demand of such software, which automatically extracts, analyze, recognize and store information from physical documents for later retrieval. Skew detection is used for text line position determination in Digitized documents, automated page orientation, and skew angle detection for binary document images, skew detection in handwritten scripts, in compensation for Internet audio applications and in the correction of scanned documents

    Extraction of Projection Profile, Run-Histogram and Entropy Features Straight from Run-Length Compressed Text-Documents

    Full text link
    Document Image Analysis, like any Digital Image Analysis requires identification and extraction of proper features, which are generally extracted from uncompressed images, though in reality images are made available in compressed form for the reasons such as transmission and storage efficiency. However, this implies that the compressed image should be decompressed, which indents additional computing resources. This limitation induces the motivation to research in extracting features directly from the compressed image. In this research, we propose to extract essential features such as projection profile, run-histogram and entropy for text document analysis directly from run-length compressed text-documents. The experimentation illustrates that features are extracted directly from the compressed image without going through the stage of decompression, because of which the computing time is reduced. The feature values so extracted are exactly identical to those extracted from uncompressed images.Comment: Published by IEEE in Proceedings of ACPR-2013. arXiv admin note: text overlap with arXiv:1403.778

    Image Segmentation and Multiple skew estimation, correction in printed and handwritten documents

    Get PDF
    Analysis of handwritten document has always been a challenging task in the field of image processing. Various algorithms have been developed in finding solution to this problem. The algorithms implemented here for segmentation and skew detection works not only on printed or scanned document images but for also handwritten document images which creates an edge over other methodologies. Here Line segmentation for both printed and handwritten document image is done using two methods namely Histogram projections and Hough Transform assuming that input document image consists of no major skews. For Histogram Projection to work correct, the document must not contain even slight skews. Hough transform gives better results than the former case. Word Segmentation can be done using the connected components analysis. Here, we first identify connected components in the printed or handwritten document image. A methodology is being used here which detects multiple skews in multi handwritten documents or printed ones. Using clustering algorithms, we detect multiple skew blocks in a handwritten document image or printed document image or a combination of both. The algorithm used here also works for skewed multi handwritten text blocks
    corecore