
    Arabic Type Classification System - Qualitative Classification of Historic Arabic Writing Scripts in the Contemporary Typographic Context

    The emergence of typography shifted written language into a mechanical tool for transmitting meaning, thereby further weakening the connection between the representation of language and the language itself, a process that began with the development of writing systems. Developed from various writing systems and languages, typography is the primary mode of visual communication of language. It has become even more important in the digital world we live in today. This research examines the relationship of Arabic script conventions and classifications in the context of typographic representation, and how typographic representations of the Arabic language have been distorted by the influence of Latin typographic guidelines on the development of Arabic typefaces. This history has failed to produce Arabic typefaces that accord with the unique cultural, linguistic and contextual character of the Arabic writing system. To address this, an investigation was carried out through multiple design research methods and methodologies, incorporating typographic studies and theories of embodiment applied to the evolution of the Arabic writing system, calligraphy and typography in the Arab region. The investigation aims to better understand, and respond to, problems in the use of typefaces at the intersection of languages and cultures. By generating a typeface classificatory system that links the ground rules of calligraphic scripts with the structural influences of Arabic letterforms and adapts them to typefaces in use today, this research proposes a tool to assist designers in making typographic decisions when setting the Arabic language, and in its relationship to roman typography. Keywords: typography, classificatory attributes, Arabic language, culture, linguistics, embodiment

    Legibility in typeface design for screen interfaces

    This thesis explores the considerations involved in designing a typeface specifically for use in interface typography. The genre of interface typefaces is outlined, and the essential attributes and requirements of this category of typefaces are inspected from the viewpoints of legibility, readability and type design practices. The research is based on the analysis of interface typeface samples, interviews with type designers, and empirical findings documented by designers. These trade practices and design artefacts are contrasted with findings from cognitive psychology and legibility research. Furthermore, the author’s design of the «Silta» typeface and its creation process are used to scrutinize and validate these observations. Among the crucial factors in the design of interface typefaces, the legibility of confusable characters is extensively analysed. The rasterized on-screen rendering of outline-based fonts is identified as a major contributing factor requiring special attention in the design, technical production and testing phases of modern fonts. Additionally, the context and use of interface typography, and how users interact with interfaces, are identified as the cornerstones influencing the design decisions of a typeface for this use. Finally, the aesthetics of interface typography and the motivations for developing specific interface typefaces are touched upon. As is evident from the reviewed material, branding and visual identity often appear to be a driving force in the creation of new interface typefaces. However, the necessity for technological innovation and its demonstration equally inspire new design solutions. While technological limitations stemming from digital display media are becoming less important, changes in reading behaviour and adaptive typography drive current development.

    Deep Learning-based Recognition of Devanagari Handwritten Characters

    Numerous techniques have been used over many years to study handwriting recognition. Handwriting can be recognized in two modes: online and offline. Image recognition is the main part of the handwriting recognition process, and it must take into account the picture's dimensions, viewing angle, and image quality. Machine learning and deep learning are the two areas of focus for developers looking to increase the intelligence of computers. A person may learn to perform a task by practising it repeatedly until they recall how to do it; the neurons in their brain begin to work automatically, enabling them to carry out the task they have learned quickly. Deep learning works in a fairly similar way. It uses a variety of neural network designs to address a range of problems. The convolutional neural network (CNN) is a very effective technique for handwriting and picture detection.

    How to separate between Machine-Printed/Handwritten and Arabic/Latin Words?

    This paper gathers some contributions to the identification of script and its nature (handwritten or machine-printed). Different sets of features have been employed successfully for discriminating between handwritten and machine-printed Arabic and Latin scripts. They include some well-established features previously used in the literature, and new structural features that are intrinsic to Arabic and Latin scripts. The performance of these features is studied throughout this paper. We also compared the performance of five classifiers used to identify the script at word level: Bayes (AODEsr), k-Nearest Neighbor (k-NN), Decision Tree (J48), Support Vector Machine (SVM) and Multilayer Perceptron (MLP). These classifiers were chosen to be sufficiently different from one another to test the feature contributions. Experiments have been conducted with handwritten and machine-printed words, covering a wide range of fonts. Experimental results show the capability of the proposed features to capture differences between scripts and the effectiveness of the three classifiers. Average identification precision and recall rates of 98.72% were achieved using a set of 58 features and the AODEsr classifier, which is slightly better than the rates reported in similar works.
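    To make word-level script identification with one of the listed classifiers concrete, here is a minimal k-NN sketch. It is only an illustration under stated assumptions: the two-dimensional feature vectors, the class labels and the function name `knn_predict` are invented here and are not the paper's 58 features or its implementation.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Label a query vector by majority vote among its k nearest neighbours.

    train: list of (feature_vector, label) pairs.
    query: feature vector with the same dimensionality.
    """
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical word-level features, invented purely for illustration.
train = [
    ((0.90, 0.10), "arabic_printed"),
    ((0.80, 0.20), "arabic_printed"),
    ((0.85, 0.15), "arabic_printed"),
    ((0.10, 0.90), "latin_handwritten"),
    ((0.20, 0.80), "latin_handwritten"),
    ((0.15, 0.85), "latin_handwritten"),
]

print(knn_predict(train, (0.82, 0.18)))
```

    A real system would replace the toy vectors with features measured on segmented word images and would compare several classifiers on held-out data.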

    A framework for ancient and machine-printed manuscripts categorization

    Document image understanding (DIU) has attracted a lot of attention and has become an active field of research. Although the ultimate goal of DIU is to extract the textual information of a document image, many steps are involved in such a process, including categorization, segmentation and layout analysis. All of these steps are needed in order to obtain an accurate result from character recognition or word recognition of a document image. One of the important steps in DIU is document image categorization (DIC), which is needed in many situations, such as when a document image is written or printed in more than one script, font or language. This step provides useful information for the recognition system and helps reduce its error by allowing a category-specific Optical Character Recognition (OCR) system or word recognition (WR) system to be incorporated. This research focuses on the problem of DIC across different categories of scripts, styles and languages, and establishes a framework for flexible representation and feature extraction that can be adapted to many DIC problems. The current methods for DIC have many limitations and drawbacks that restrict their practical use. We proposed an efficient framework for the categorization of document images based on patch representation and Non-negative Matrix Factorization (NMF). This framework is flexible and can be adapted to different categorization problems. Many methods exist for script identification of document images, but few of them address the problem in handwritten manuscripts, and those that do have many limitations and drawbacks. Therefore, our first goal is to introduce a novel method for script identification of ancient manuscripts. The proposed method is based on patch representation, in which the patches are extracted using the skeleton map of a document image. This representation overcomes the restriction of current methods to a fixed level of layout. 
The proposed feature extraction scheme, based on Projective Non-negative Matrix Factorization (PNMF), is robust against noise and handwriting variation and can be used for different scripts. The proposed method performs better than state-of-the-art methods and can be applied to different levels of layout. The current methods for font (style) identification are mostly intended for machine-printed document images, and many of them can only be used for a specific level of layout. Therefore, we proposed a new method for font and style identification of printed and handwritten manuscripts based on patch representation and Non-negative Matrix Tri-Factorization (NMTF). The images are represented by overlapping patches obtained from the foreground pixels. The positions of these patches are set based on the skeleton map to reduce the number of patches. NMTF is used to learn bases from each font (style), and these bases are then used to classify a new image based on minimum representation error. The proposed method can easily be extended to new fonts, as the bases for each font are learned separately from the other fonts. This method was tested on two datasets of machine-printed and ancient manuscripts, and the results confirmed its performance compared to state-of-the-art methods. Finally, we proposed a novel method for language identification of printed and handwritten manuscripts based on patch representation and NMTF. The current methods for language identification are based either on textual data obtained by an OCR engine or on image data, through coding and comparison with textual data. The OCR-based methods need a lot of processing, and the current image-based methods are not applicable to cursive scripts such as Arabic. 
The patch representation provides the components of the Arabic script (letters) that cannot be extracted simply by segmentation methods. NMTF is then used for dictionary learning and for generating codebooks that are used to represent a document image as a histogram. The proposed method was tested on two datasets of machine-printed and handwritten manuscripts and compared to n-gram features (text-based), texture features and codebook features (image-based) to validate its performance. The proposed methods are robust against variation in handwriting, changes in font (handwriting style) and the presence of degradation, and are flexible enough to be used at various levels of layout (from a text line to a paragraph). The methods in this research have been tested on datasets of handwritten and machine-printed manuscripts and compared to state-of-the-art methods. All of the evaluations show the efficiency, robustness and flexibility of the proposed methods for the categorization of document images. The proposed strategies thus provide a framework for efficient and flexible representation and feature extraction for document image categorization. This framework can be applied to different levels of layout, the information from different levels of layout can be merged and mixed, and the framework can be extended to more complex situations and different tasks.
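    The step of representing a document image as a histogram over a codebook can be sketched as follows. This is a simplified illustration, not the thesis's method: it assumes the codebook is already given (the NMTF dictionary learning is not shown), it assigns each patch to its nearest codeword by Euclidean distance, and the function name `codebook_histogram` and the toy two-dimensional "patches" are hypothetical.

```python
from math import dist


def codebook_histogram(patches, codebook):
    """Return the normalized histogram of nearest-codeword assignments.

    patches:  list of flattened patch vectors taken from one document image.
    codebook: list of codeword vectors (assumed to be learned beforehand).
    """
    counts = [0] * len(codebook)
    for patch in patches:
        # Hard assignment: index of the closest codeword (an assumption here).
        nearest = min(range(len(codebook)), key=lambda i: dist(patch, codebook[i]))
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# Toy data, invented for illustration only.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
patches = [(0.1, 0.0), (0.9, 1.0), (1.0, 0.8), (0.1, 0.9)]

histogram = codebook_histogram(patches, codebook)
print(histogram)  # [0.25, 0.5, 0.25]
```

    The resulting fixed-length vector can be passed to any standard classifier, regardless of how many patches the image produced.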

    Extraction of textual information from image for information retrieval

    Ph.D. (Doctor of Philosophy)

    Multi-font Numerals Recognition for Urdu Script based Languages

    Handwritten character recognition for Urdu-script-based languages is one of the most difficult tasks because of the complexities of the script. Urdu-script-based languages have not received much attention, even though the script is used by more than one sixth of the population. The complexities of the script make the recognition process more complicated. The problem in handwritten numeral recognition is the shape similarity between handwritten numerals and the dual style for Urdu. This paper presents fuzzy rule-based, HMM and hybrid approaches for recognizing both Urdu and Arabic numerals in an unconstrained environment, drawing on both the online and offline domains for online input. The offline domain is used mainly for preprocessing, i.e. normalization and slant normalization. The proposed system is tested and provides accuracy of 97.1