149 research outputs found

    Page layout analysis and classification in complex scanned documents

    Get PDF
    Page layout analysis has been extensively studied since the 1980`s, particularly after computers began to be used for document storage or database units. For efficient document storage and retrieval from a database, a paper document would be transformed into its electronic version. Algorithms and methodologies are used for document image analysis in order to segment a scanned document into different regions such as text, image or line regions. To contribute a novel approach in the field of page layout analysis and classification, this algorithm is developed for both RGB space and grey-scale scanned documents without requiring any specific document types, and scanning techniques. In this thesis, a page classification algorithm is proposed which mainly applies wavelet transform, Markov random field (MRF) and Hough transform to segment text, photo and strong edge/ line regions in both color and gray-scale scanned documents. The algorithm is developed to handle both simple and complex page layout structures and contents (text only vs. book cover that includes text, lines and/or photos). The methodology consists of five modules. In the first module, called pre-processing, image enhancements techniques such as image scaling, filtering, color space conversion or gamma correction are applied in order to reduce computation time and enhance the scanned document. The techniques, used to perform the classification, are employed on the one-fourth resolution input image in the CIEL*a*b* color space. In the second module, the text detection module uses wavelet analysis to generate a text-region candidate map which is enhanced by applying a Run Length Encoding (RLE) technique for verification purposes. The third module, photo detection, initially uses block-wise segmentation which is based on basis vector projection technique. Then, MRF with maximum a-posteriori (MAP) optimization framework is utilized to generate photo map. Next, Hough transform is applied to locate lines in the fourth module. Techniques for edge detection, edge linkages, and line-segment fitting are used to detect strong-edges in the module as well. After those three classification maps are obtained, in the last module a final page layout map is generated by using K-Means. Features are extracted to classify the intersection regions and merge into one classification map with K-Means clustering. The proposed technique is tested on several hundred images and its performance is validated by utilizing Confusion Matrix (CM). It shows that the technique achieves an average of 85% classification accuracy rate in text, photo, and background regions on a variety of scanned documents like articles, magazines, business-cards, dictionaries or newsletters etc. More importantly, it performs independently from a scanning process and an input scanned document (RGB or gray-scale) with comparable classification quality

    Unsupervised Text Extraction from G-Maps

    Full text link
    This paper represents an text extraction method from Google maps, GIS maps/images. Due to an unsupervised approach there is no requirement of any prior knowledge or training set about the textual and non-textual parts. Fuzzy CMeans clustering technique is used for image segmentation and Prewitt method is used to detect the edges. Connected component analysis and gridding technique enhance the correctness of the results. The proposed method reaches 98.5% accuracy level on the basis of experimental data sets.Comment: Proc. IEEE Conf. #30853, International Conference on Human Computer Interactions (ICHCI'13), Chennai, India, 23-24 Aug., 201

    Test Segmentation of MRC Document Compression and Decompression by Using MATLAB

    Get PDF
    Abstract-The mixed raster content (MRC) standard specifies a framework for document compression which can dramatically improve the compression/ quality tradeoff as compared to traditional lossy image compression algorithms. The key to MRC compression is the separation of the document into foreground and background layers, represented as a binary mask. Therefore, the resulting quality and compression ratio of a MRC document encoder is highly dependent upon the segmentation algorithm used to compute the binary mask. The incorporated multi scale framework is used in order to improve the segmentation accuracy of text with varying size. In this paper, we propose a novel multi scale segmentation scheme for MRC document encoding based on the sequential application of two algorithms. The first algorithm, cost optimized segmentation (COS), is a block wise segmentation algorithm formulated in a global cost optimization framework. The second algorithm, connected component classification (CCC), refines the initial segmentation by classifying feature vectors of connected components using a Markov random field (MRF) model. The combined COS/CCC segmentation algorithms are then incorporated into a multi scale framework in order to improve the segmentation accuracy of text with varying size

    Iterated Classification of Document Images

    Get PDF

    Advances in Character Recognition

    Get PDF
    This book presents advances in character recognition, and it consists of 12 chapters that cover wide range of topics on different aspects of character recognition. Hopefully, this book will serve as a reference source for academic research, for professionals working in the character recognition field and for all interested in the subject

    Content-based image retrieval of museum images

    Get PDF
    Content-based image retrieval (CBIR) is becoming more and more important with the advance of multimedia and imaging technology. Among many retrieval features associated with CBIR, texture retrieval is one of the most difficult. This is mainly because no satisfactory quantitative definition of texture exists at this time, and also because of the complex nature of the texture itself. Another difficult problem in CBIR is query by low-quality images, which means attempts to retrieve images using a poor quality image as a query. Not many content-based retrieval systems have addressed the problem of query by low-quality images. Wavelet analysis is a relatively new and promising tool for signal and image analysis. Its time-scale representation provides both spatial and frequency information, thus giving extra information compared to other image representation schemes. This research aims to address some of the problems of query by texture and query by low quality images by exploiting all the advantages that wavelet analysis has to offer, particularly in the context of museum image collections. A novel query by low-quality images algorithm is presented as a solution to the problem of poor retrieval performance using conventional methods. In the query by texture problem, this thesis provides a comprehensive evaluation on wavelet-based texture method as well as comparison with other techniques. A novel automatic texture segmentation algorithm and an improved block oriented decomposition is proposed for use in query by texture. Finally all the proposed techniques are integrated in a content-based image retrieval application for museum image collections

    Pattern Recognition

    Get PDF
    Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications and others. The signals processed are commonly one, two or three dimensional, the processing is done in real- time or takes hours and days, some systems look for one narrow object class, others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms and comprehends several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. Authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition

    SHIRAZ: an automated histology image annotation system for zebrafish phenomics

    Get PDF
    Histological characterization is used in clinical and research contexts as a highly sensitive method for detecting the morphological features of disease and abnormal gene function. Histology has recently been accepted as a phenotyping method for the forthcoming Zebrafish Phenome Project, a large-scale community effort to characterize the morphological, physiological, and behavioral phenotypes resulting from the mutations in all known genes in the zebrafish genome. In support of this project, we present a novel content-based image retrieval system for the automated annotation of images containing histological abnormalities in the developing eye of the larval zebrafish

    Vision Based Assistive System for Label Detection with Voice Output

    Get PDF
    Abstract--A camera based assistive text reading framework to help blind persons read text labels and product packaging from hand-held object in their daily resides is proposed. To isolate the object from cluttered backgrounds or other surroundings objects in the camera view, we propose an efficient and effective motion based method to define a region of interest (ROI) in the video by asking the user to shake the object. In the extracted ROI, text localization and recognition are conducted to acquire text information. To automatically localize the text regions from the object ROI, we propose a novel text localization algorithm by learning gradient features of stroke orientations and distributions of edge pixels in an Adaboost model. Text characters in the localized text regions are then binarized and recognized by off-theshelf optical character recognition software. The recognized text codes are output to blind users in speech
    • …
    corecore