    Persian/Arabic document Segmentation Based on Pyramidal Image Structure

    Automatic transformation of paper documents into electronic documents requires document segmentation as its first stage. However, variations in character font size, differences in text line spacing, and non-uniform document layout structures have for many years made it difficult to design a general-purpose document layout analysis algorithm. As a result, most previously reported methods inevitably depend on these parameters. The problem is especially acute for Persian/Arabic documents: because the Persian/Arabic scripts differ considerably from the English script, most methods proposed for English do not give good results for Persian. In this paper, we present a novel parameter-free method for segmenting Persian/Arabic document images that also works well for English scripts. The method segments the document image into maximal homogeneous regions and classifies them as text or non-text based on a pyramidal image structure. In other words, the proposed method can segment documents without considering character font sizes, text line spacing, or document layout structure. The algorithm was tested on 150 Arabic/Persian and English documents, and segmentation succeeded for 96 percent of them.
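
The abstract does not describe how the pyramid is built, so the following is only a minimal sketch of a generic image pyramid, assuming a binarized page (1 = ink, 0 = background) and a plain 2x2 block average at each level. The names `reduce_level` and `build_pyramid` are hypothetical and are not taken from the paper.

```python
from typing import List

Image = List[List[float]]


def reduce_level(img: Image) -> Image:
    """Halve the resolution by averaging each 2x2 block (odd edges are dropped)."""
    rows, cols = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * r][2 * c] + img[2 * r][2 * c + 1]
             + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(img: Image) -> List[Image]:
    """Return [full resolution, half, quarter, ...] until a dimension drops below 2."""
    levels = [img]
    while len(levels[-1]) >= 2 and len(levels[-1][0]) >= 2:
        levels.append(reduce_level(levels[-1]))
    return levels


# Toy 4x4 "page": left half is ink, right half is background.
page = [[1, 1, 0, 0] for _ in range(4)]
for level in build_pyramid(page):
    print(level)
```

At coarse levels each cell summarizes the ink density of a larger area of the page, which is one reason a pyramid can help compare regions without fixing a font size or line spacing in advance. How the paper actually uses its pyramid to label regions as text or non-text is not stated in the abstract.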

    Text Line Segmentation of Historical Documents: a Survey

    Libraries and national archives hold a huge number of historical documents that have not been exploited electronically. Although automatic reading of complete pages remains, in most cases, a long-term objective, tasks such as word spotting, text/image alignment, authentication and extraction of specific fields are in use today. For all these tasks, a major step is document segmentation into text lines. Because of the low quality and the complexity of these documents (background noise, artifacts due to aging, interfering lines), automatic text line segmentation remains an open research field. The objective of this paper is to present a survey of existing methods, developed during the last decade, and dedicated to documents of historical interest. Comment: 25 pages, submitted version. To appear in International Journal on Document Analysis and Recognition. Online version available at http://www.springerlink.com/content/k2813176280456k3

    Culture in the design of mHealth UI: An effort to increase acceptance among culturally specific groups

    Purpose: Designers of mobile applications have long understood the importance of users’ preferences in making the user experience easier, more convenient and therefore valuable. The cultural aspects of groups of users are among the key features of users’ design preferences, because each group’s preferences depend on various features that are culturally compatible. The process of integrating culture into the design of a system has always been an important ingredient of an effective and interactive human-computer interface. This study aims to investigate the design of a mobile health (mHealth) application user interface (UI) based on Arabic culture. It was argued that integrating certain cultural values of specific groups of users into the design of the UI would increase their acceptance of the technology. Design/methodology/approach: A total of 135 users responded to an online survey about their acceptance of a culturally designed mHealth. Findings: The findings showed that culturally based language, colours, layout and images had a significant relationship with users’ behavioural intention to use the culturally based mHealth UI. Research limitations/implications: First, the sample and the data collected in this study were restricted to Arab users and Arab culture; therefore, the results cannot be generalized to other cultures and users. Second, the adapted unified theory of acceptance and use of technology model was used in this study instead of the new version, which may expose new perceptions. Third, the cultural aspects of UI design in this study were limited to the images, colours, language and layout. Practical implications: It encourages UI designers to implement the relevant cultural aspects while developing mobile applications. Originality/value: Embedding Arab cultural aspects in designing UI for mobile applications to satisfy Arab users and enhance their acceptance toward using mobile applications, which will reflect positively on their lives.