55 research outputs found

    Off-line Arabic Handwriting Recognition System Using Fast Wavelet Transform

    In this research, an off-line handwriting recognition system for the Arabic alphabet is introduced. The system contains three main stages: preprocessing, segmentation and recognition. In the preprocessing stage, the Radon transform was used in the design of algorithms for page, line and word skew correction as well as for word slant correction. In the segmentation stage, a Hough transform approach was used for line extraction. For line-to-word and word-to-character segmentation, a statistical method using a mathematical representation of the binary images of lines and words was used. Unlike most current handwriting recognition systems, our system simulates the human mechanism for image recognition, where images are encoded and saved in memory as groups according to their similarity to each other. Characters are decomposed into coefficient vectors using the fast wavelet transform; the vectors that represent a character in its different possible shapes are then saved as groups, with one representative for each group. Recognition is achieved by comparing the vector of the character to be recognized with the group representatives. Experiments showed that the proposed system achieves the recognition task with 90.26% accuracy. The system needs at most 3.41 seconds to recognize a single character in a text of 15 lines where each line has 10 words on average.
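
    The recognition step described in this abstract (wavelet coefficient vectors compared against one representative per group) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a 1-D Haar wavelet, a Euclidean distance, and made-up eight-sample "profiles" standing in for character images. The names `haar_fwt`, `recognize`, and the labels `alef` and `ba` are hypothetical.

```python
import math


def haar_fwt(signal):
    """Full 1-D orthonormal Haar decomposition (length must be a power of two).

    Assumption: the abstract does not name the wavelet family; Haar is used
    here only because it is the simplest fast wavelet transform.
    """
    out = [float(x) for x in signal]
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    while n > 1:
        half = n // 2
        # Pairwise averages (approximation) and differences (detail).
        approx = [(out[2 * i] + out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        out[:n] = approx + detail
        n = half  # recurse on the approximation part only
    return out


def distance(a, b):
    """Euclidean distance between two coefficient vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(sample, representatives):
    """Return the label of the group representative closest to the sample."""
    vec = haar_fwt(sample)
    return min(representatives, key=lambda label: distance(vec, representatives[label]))


# Hypothetical group representatives built from toy binary profiles.
representatives = {
    "alef": haar_fwt([0, 0, 1, 1, 1, 1, 0, 0]),
    "ba": haar_fwt([1, 1, 0, 0, 0, 0, 1, 1]),
}

print(recognize([0, 0, 1, 1, 1, 0, 0, 0], representatives))  # prints "alef"
```

    Comparing a sample with one representative per group, rather than with every stored shape, keeps the number of distance computations proportional to the number of groups.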

    Feature Extraction Methods for Character Recognition


    Geometric correction of historical Arabic documents

    Geometric deformations in historical documents significantly influence the success of both Optical Character Recognition (OCR) techniques and human readability. They may have been introduced at any time during the life cycle of a document, from when it was first printed to the time it was digitised by an imaging device. This thesis focuses on the challenging domain of geometric correction of Arabic historical documents, where background research has highlighted that existing approaches for geometric correction of Latin-script historical documents are not sensitive to the characteristics of text in Arabic documents and therefore cannot be applied successfully. Text line segmentation and baseline detection algorithms have been investigated in order to propose a new, more suitable one for warped Arabic historical document images. Advanced ideas for performing dewarping and geometric restoration on historical Arabic documents, as dictated by the specific characteristics of the problem, have been implemented. In addition to developing an algorithm to detect accurate baselines of historical printed Arabic documents, the research also contributes a new dataset consisting of historical Arabic documents with different degrees of warping severity. Overall, a new dewarping system, the first for historical Arabic documents, has been developed, taking into account both global and local features of the text image and the patterns of the smooth distortion between text lines. By using the results of the proposed line segmentation and baseline detection methods, it can cope with a variety of distortions, such as page curl, arbitrary warping and fold

    Telling Through Type: Typography and Narrative in Legal Briefs

    Most legal authors today self-publish, using basic word-processing software and letting the software’s default settings determine what their documents will look like when printed. As these settings are not optimized for legal texts, they do so at their peril. The default font Times New Roman, for example, as Chief Judge Frank Easterbrook warns, is utterly inappropriate for long documents [such as] briefs. Commentators have started urging a more deliberate approach to legal typography. Their suggestions, however, have been content-neutral, intended for all legal texts and focused on goals such as legibility and readability. Typography, however, has much greater potential. The shapes, the spacing, of letters and of words can reinforce, complement, and independently create narrative meaning. Or, intentionally or unintentionally, it can cut against it. It can do its work honestly and ethically, or inappropriately and subversively. This article explores how

    A novel approach to handwritten character recognition

    A number of new techniques and approaches for off-line handwritten character recognition are presented which individually make significant advancements in the field. First, an outline-based vectorization algorithm is described which gives improved accuracy in producing vector representations of the pen strokes used to draw characters. Later, vectorization and other types of preprocessing are criticized and an approach to recognition is suggested which avoids separate preprocessing stages by incorporating them into later stages. Apart from the increased speed of this approach, it allows more effective alteration of the character images since more is known about them at the later stages. It also allows the possibility of alterations being corrected if they are initially detrimental to recognition. A new feature measurement, the Radial Distance/Sector Area feature, is presented which is highly robust, tolerant to noise, distortion and style variation, and gives high accuracy results when used for training and testing in a statistical or neural classifier. A very powerful classifier is therefore obtained for recognizing correctly segmented characters. The segmentation task is explored in a simple system of integrated over-segmentation, character classification and approximate dictionary checking. This can be extended to a full system for handprinted word recognition. In addition to the advancements made by these methods, a powerful new approach to handwritten character recognition is proposed as a direction for future research. This proposal combines the ideas and techniques developed in this thesis in a hierarchical network of classifier modules to achieve context-sensitive, off-line recognition of handwritten text. A new type of "intelligent" feedback is used to direct the search to contextually sensible classifications. A powerful adaptive segmentation system is proposed which, when used as the bottom layer in the hierarchical network, allows initially incorrect segmentations to be adjusted according to the hypotheses of the higher level context modules.

    Science fictions, cultural facts: a digital humanities approach to a popular literature

    Human culture has a necessary influence on the content of popular literature – if only because the interests of a contemporary public determine material success or failure. Authors are products of their time, and popular writers will tend to reflect the cultural expectations and values of their readership. It follows that we should be able to find the imprint of human culture in popular literature if we employ suitable methods. Science fiction (SF) is well suited for such investigation, as it is open in scope and subject, and less restricted by content conventions than other genres. As a publishing medium, magazines, specifically, are valuable literary artefacts of popular culture. They contain fiction, editorials, advertising, reader letters, and features on matters of contemporary importance. These all contribute to building an understanding of their cultural environment. In this thesis, I begin by assessing the relevance of SF as a source of popular insights, using SF magazine content as a lens to focus on human culture and analysing the genre and its value in contemporary research and society. I review the uses of SF in academic literature, and analyse public surveys to identify the breadth and relevance of its popular appeal. I describe the phenomenological experience of developing a hybrid digital and traditional methodology from the perspective of someone with no history of digital research in the humanities, and employ a series of case studies which test the validity of the approach. The case studies provide insights into the cultural history of two topics: the foundations and subsequent development of Scientology; and the changing representations of tropical environments and peoples. An aim of this study is to devise and demonstrate a methodology that respects the human experience of literature, but also integrates the value of employing technological approaches that expand the scope of investigation.
    The primary sources comprise more than 4,000 individual magazine issues – perhaps thirty percent of issues of magazines dedicated to SF in the twentieth century – and complete, or near complete, runs of major titles. The value to the research process of having a significant number of sources is to counter the bias contained in the phenomenological bracket of the researcher. The expectations of researchers are influenced by contemporary culture and personal preferences, and this is likely to affect the perceived significance of specific historic texts. This selection bias could lead to the rejection of content that contains relevant insights. To address these issues, I devised a digital humanities methodology for selecting primary sources and complementing discussion of the results. The results of applying the methodology strongly support the proposal that SF can provide a valuable indicator of cultural values, preferences and expectations – being widespread and commonly appreciated by contemporary audiences. SF is confirmed to be a valuable and relevant source of information on the evolving history of human cultural interests.