518 research outputs found

    AutoGraff: towards a computational understanding of graffiti writing and related art forms.

    The aim of this thesis is to develop a system that generates letters and pictures in a style immediately recognizable as graffiti art or calligraphy. The proposed system can be used similarly to, and in tight integration with, conventional computer-aided geometric design tools; it can generate synthetic graffiti content for urban environments in games and movies, and can guide robotic or fabrication systems that materialise its output with physical drawing media. The thesis is divided into two main parts. The first part describes a set of stroke primitives: building blocks that can be combined to generate different designs resembling graffiti or calligraphy. These primitives mimic the process typically used to design graffiti letters and exploit well-known principles of motor control to model the way an artist moves when incrementally tracing stylised letter forms. The second part demonstrates how these stroke primitives can be automatically recovered from input geometry defined in vector form, such as the digitised traces of writing made by a user or the glyph outlines in a font. This procedure converts the input geometry into a seed that can be transformed into a variety of calligraphic and graffiti stylisations, depending on parametric variations of the strokes.
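    The stroke primitives themselves are not given in this abstract. As a hedged illustration of the kind of motor-control principle it mentions, the sketch below generates a single smooth stroke between two via points using the classical minimum-jerk profile; the function name, parameters, and the choice of this particular profile are assumptions for illustration, not the thesis's actual primitives.

```python
import numpy as np

def minimum_jerk_stroke(p0, p1, n=100):
    """Stroke from p0 to p1 following the classical minimum-jerk time profile.

    Illustrative only: a standard motor-control model of smooth point-to-point
    movement, not the stroke primitive defined in the thesis.
    """
    p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
    t = np.linspace(0.0, 1.0, n)              # normalised time
    s = 10 * t**3 - 15 * t**4 + 6 * t**5      # minimum-jerk easing curve
    return p0[None, :] + s[:, None] * (p1 - p0)[None, :]

# Chain two strokes into a crude letter-like polyline.
path = np.vstack([minimum_jerk_stroke((0, 0), (1, 2)),
                  minimum_jerk_stroke((1, 2), (2, 0))])
print(path.shape)  # (200, 2)
```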

    Information Preserving Processing of Noisy Handwritten Document Images

    Many pre-processing techniques that normalize artifacts and clean noise induce anomalies due to discretization of the document image. Important information that could be used at later stages may be lost. A proposed composite-model framework takes into account pre-printed information, user-added data, and digitization characteristics. Its benefits are demonstrated by experiments with statistically significant results. Separating pre-printed ruling lines from user-added handwriting shows how ruling lines impact people's handwriting and how they can be exploited for identifying writers. Ruling line detection based on multi-line linear regression reduces the mean error of counting them from 0.10 to 0.03, 6.70 to 0.06, and 0.13 to 0.02, compared to an HMM-based approach on three standard test datasets, thereby reducing human correction time by 50%, 83%, and 72% on average. On 61 page images from 16 rule-form templates, the precision and recall of form cell recognition are increased by 2.7% and 3.7%, compared to a cross-matrix approach. Compensating for and exploiting ruling lines during feature extraction rather than pre-processing raises the writer identification accuracy from 61.2% to 67.7% on a 61-writer noisy Arabic dataset. Similarly, counteracting page-wise skew by subtracting it or transforming contours in a continuous coordinate system during feature extraction improves the writer identification accuracy. An implementation study of contour-hinge features reveals that utilizing the full probability distribution function matrix improves the writer identification accuracy from 74.9% to 79.5%.
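    The abstract names ruling-line detection by multi-line linear regression without further detail. Below is a minimal sketch of that general idea, assuming roughly horizontal rulings on a binarised page: candidate rows are taken from the horizontal projection profile and a least-squares line is fitted to the ink pixels around each row. The thresholds and function name are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def detect_ruling_lines(binary_img, min_run=0.5, band=3):
    """Fit one least-squares line per candidate ruling row (illustrative).

    binary_img : 2-D array with 1 = ink, 0 = background.
    min_run    : fraction of the page width a row must cover to be a candidate.
    band       : half-height in pixels of the strip used for each fit.
    Returns a list of (slope, intercept) pairs for y = slope * x + intercept.
    """
    h, w = binary_img.shape
    profile = binary_img.sum(axis=1)                  # horizontal projection
    candidate_rows = np.where(profile > min_run * w)[0]

    lines, used = [], np.zeros(h, dtype=bool)
    for r in candidate_rows:
        top = max(0, r - band)
        if used[top:r + band + 1].any():
            continue                                  # strip already fitted
        ys, xs = np.nonzero(binary_img[top:r + band + 1, :])
        slope, intercept = np.polyfit(xs, ys + top, deg=1)
        lines.append((slope, intercept))
        used[top:r + band + 1] = True
    return lines
```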

    Content Recognition and Context Modeling for Document Analysis and Retrieval

    The nature and scope of available documents are changing significantly in many areas of document analysis and retrieval as complex, heterogeneous collections become accessible to virtually everyone via the web. The increasing level of diversity presents a great challenge for document image content categorization, indexing, and retrieval. Meanwhile, the processing of documents with unconstrained layouts and complex formatting often requires effective leveraging of broad contextual knowledge. In this dissertation, we first present a novel approach for document image content categorization, using a lexicon of shape features. Each lexical word corresponds to a scale and rotation invariant local shape feature that is generic enough to be detected repeatably and is segmentation free. A concise, structurally indexed shape lexicon is learned by clustering and partitioning feature types through graph cuts. Our idea finds successful application in several challenging tasks, including content recognition of diverse web images and language identification on documents composed of mixed machine printed text and handwriting. Second, we address two fundamental problems in signature-based document image retrieval. Facing continually increasing volumes of documents, detecting and recognizing unique, evidentiary visual entities (e.g., signatures and logos) provides a practical and reliable supplement to OCR of printed text. We propose a novel multi-scale framework to detect and segment signatures jointly from document images, based on the structural saliency under a signature production model. We formulate the problem of signature retrieval in the unconstrained setting of geometry-invariant deformable shape matching and demonstrate state-of-the-art performance in signature matching and verification. Third, we present a model-based approach for extracting relevant named entities from unstructured documents. In a wide range of applications that require structured information from diverse, unstructured document images, processing OCR text does not give satisfactory results due to the absence of linguistic context. Our approach enables learning of inference rules collectively based on contextual information from both page layout and text features. Finally, we demonstrate the importance of mining general web user behavior data for improving document ranking and other web search experience. The context of web user activities reveals their preferences and intents, and we emphasize the analysis of individual user sessions for creating aggregate models. We introduce a novel algorithm for estimating web page and web site importance, and discuss its theoretical foundation based on an intentional surfer model. We demonstrate that our approach significantly improves large-scale document retrieval performance.
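    The intentional surfer model for web page importance is only named here. As a hedged sketch of how such an importance score can be computed in general, the following runs a PageRank-style power iteration over a transition matrix built from observed session transitions; the behavioural weighting and all names are assumptions for illustration, not the dissertation's actual estimator.

```python
import numpy as np

def importance_scores(session_transitions, damping=0.85, iters=100, tol=1e-9):
    """PageRank-style power iteration over a behaviour-weighted graph.

    session_transitions[i, j]: how often user sessions moved from page i to
    page j (a stand-in for the behavioural signal; the dissertation's model
    may weight transitions differently).
    """
    counts = np.asarray(session_transitions, dtype=float)
    n = counts.shape[0]
    row_sums = counts.sum(axis=1, keepdims=True)
    safe = np.where(row_sums > 0, row_sums, 1.0)
    # Pages with no observed outgoing transitions jump uniformly.
    transition = np.where(row_sums > 0, counts / safe, 1.0 / n)

    scores = np.full(n, 1.0 / n)
    for _ in range(iters):
        new = (1 - damping) / n + damping * transition.T @ scores
        if np.abs(new - scores).sum() < tol:
            break
        scores = new
    return scores

print(importance_scores([[0, 3, 1], [2, 0, 5], [4, 0, 0]]))
```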

    A dialogue of forms : letter and digital font design

    Thesis (M.S.V.S.)--Massachusetts Institute of Technology, Dept. of Architecture, 1986. Microfiche copy available in Archives and Rotch. Bibliography: leaves 104-120. By Debra Anne Adams. M.S.V.S.

    Four cornered code based Chinese character recognition system.

    by Tham Yiu-Man. Thesis (M.Phil.)--Chinese University of Hong Kong, 1993. Includes bibliographical references.
    Contents: Abstract; Acknowledgements; Table of Contents.
    Chapter I, Introduction: 1.1 Introduction; 1.2 Survey on Chinese Character Recognition; 1.3 Methodology Adopted in Our System; 1.4 Contributions and Organization of the Thesis.
    Chapter II, Pre-processing and Stroke Extraction: 2.1 Introduction; 2.2 Thinning (2.2.1 Introduction to Thinning; 2.2.2 Proposed Thinning Algorithm Catering for Stroke Extraction; 2.2.3 Thinning Results); 2.3 Stroke Extraction (2.3.1 Introduction to Stroke Extraction; 2.3.2 Proposed Stroke Extraction Method, covering 2.3.2.1 fork point detection, 2.3.2.2 8-connected fork point merging, 2.3.2.3 sub-stroke extraction, 2.3.2.4 fork point merging, and 2.3.2.5 sub-stroke connection; 2.3.3 Stroke Extraction Accuracy; 2.3.4 Corner Detection, covering 2.3.4.1 introduction to corner detection and 2.3.4.2 proposed corner detection formulation); 2.4 Concluding Remarks.
    Chapter III, Four Corner Code: 3.1 Introduction; 3.2 Deletion of Hook Strokes; 3.3 Stroke Type Selection; 3.4 Probability Formulations of Stroke Types (3.4.1 Simple Strokes; 3.4.2 Square; 3.4.3 Cross; 3.4.4 Upper Right Corner; 3.4.5 Lower Left Corner); 3.5 Corner Segment Extraction Procedure (3.5.1 Corner Segment Probability; 3.5.2 Corner Segment Extraction); 3.6 4C Code Generation; 3.7 Parameter Determination; 3.8 Sensitivity Test; 3.9 Classification Rate; 3.10 Feedback by Corner Segments; 3.11 Classification Rate with Feedback by Corner Segments; 3.12 Reasons for Mis-classification; 3.13 Suggested Solution to the Mis-interpretation of Stroke Types; 3.14 Reducing the Candidate Set Size by the Number of Input Segments; 3.15 Extension to Higher-Order Codes; 3.16 Concluding Remarks.
    Chapter IV, Relaxation: 4.1 Introduction (4.1.1 Introduction to Relaxation; 4.1.2 Formulation of Relaxation; 4.1.3 Survey on Chinese Character Recognition Using Relaxation); 4.2 Relaxation Formulations (4.2.1 Definition of Neighbour Segments; 4.2.2 Initial Probability Assignment; 4.2.3 Compatibility Function; 4.2.4 Support from Neighbours; 4.2.5 Stopping Criteria; 4.2.6 Distance Measures; 4.2.7 Parameter Determination); 4.3 Recognition Rate; 4.4 Reasons for Mis-recognition in Relaxation; 4.5 Introduction of a No-label Class (4.5.1 No-label Initial Probability; 4.5.2 No-label Compatibility Function; 4.5.3 Improvement by the No-label Class); 4.6 Rate of Convergence (4.6.1 Updating Formulae in Exponential Form); 4.7 Comparison with Yamamoto et al.'s Relaxation Method (4.7.1 Formulations in Yamamoto et al.'s Relaxation Method; 4.7.2 Modifications in [YAMAM82]; 4.7.3 Performance Comparison with [YAMAM82]); 4.8 System Overall Recognition Rate; 4.9 Concluding Remarks.
    Chapter V, Concluding Remarks: 5.1 Recapitulation and Conclusions; 5.2 Limitations of the System; 5.3 Suggestions for Further Development.
    References.
    Appendix, User's Guide: A.1 System Functions; A.2 Platform and Compiler; A.3 File List; A.4 Directory; A.5 Description of Sub-routines; A.6 Data Structures and Header Files; A.7 Character File (charfile) Structure; A.8 Suggested Program to Implement the System.
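    The contents list a relaxation formulation (initial probabilities, a compatibility function, support from neighbours, stopping criteria) but not the update rule itself. The sketch below shows the classical relaxation-labelling update that such formulations typically build on, purely as an illustration; it is not claimed to be the thesis's exact scheme.

```python
import numpy as np

def relaxation_step(p, compat, neighbours):
    """One classical relaxation-labelling update (illustrative).

    p[i, l]       : current probability that segment i carries label l
    compat[l, m]  : compatibility of label l with label m on a neighbour,
                    values in [-1, 1]
    neighbours[i] : list of indices of the segments neighbouring segment i
    """
    n, _ = p.shape
    q = np.zeros_like(p)
    for i in range(n):
        for j in neighbours[i]:
            q[i] += compat @ p[j]          # support from neighbour j
        if neighbours[i]:
            q[i] /= len(neighbours[i])
    updated = p * (1.0 + q)                # standard multiplicative update
    return updated / updated.sum(axis=1, keepdims=True)
```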

    Text-detection and -recognition from natural images

    Text detection and recognition from images could have numerous functional applications for document analysis, such as assistance for visually impaired people; recognition of vehicle license plates; evaluation of articles containing tables, street signs, maps, and diagrams; keyword-based image exploration; document retrieval; recognition of parts within industrial automation; content-based extraction; object recognition; address block location; and text-based video indexing. This research exploited the advantages of artificial intelligence (AI) to detect and recognise text from natural images, using machine learning and deep learning to accomplish this task. We conducted an in-depth literature review of current detection and recognition methods to identify the existing challenges: differences in text alignment, style, size, and orientation, combined with low image contrast and complex backgrounds, make automatic text extraction a considerably challenging task. As a result, state-of-the-art approaches obtain low detection rates (often less than 80%) and recognition rates (often less than 60%), which has led to the development of new approaches. The aim of the study was to develop a robust text detection and recognition method for natural images with high accuracy and recall, used as the target of the experiments. This method should detect all the text in scene images, despite specific features of the text pattern. Furthermore, we aimed to address the two main problems of detecting and recognising arbitrarily shaped text (horizontal, multi-oriented, and curved) in low-resolution scenes and at various scales and sizes. We propose a methodology that handles text detection by using novel feature combination and selection to drive the classification of text/non-text regions. Text-region candidates were extracted from the grey-scale images using the MSER technique, and a machine learning-based method was then applied to refine and validate the initial detection. The effectiveness of features based on the aspect ratio, GLCM, LBP, and HOG descriptors was investigated. Text-region classifiers based on MLP, SVM, and RF were trained using selections of these features and their combinations. The publicly available datasets ICDAR 2003 and ICDAR 2011 were used to evaluate the proposed method, which achieved state-of-the-art performance on both databases, with significant improvements in Precision, Recall, and F-measure; the F-measure for ICDAR 2003 and ICDAR 2011 was 81% and 84%, respectively. The results showed that a suitable feature combination and selection approach can significantly increase the accuracy of the algorithms. A new dataset has been proposed to fill the gap in character-level annotation and in the availability of multi-oriented and curved text. The dataset was created particularly for deep learning methods, which require a massive, complete, and varied range of training data; it includes 2,100 images annotated at the character and word levels, yielding 38,500 samples of English characters and 12,500 words. Furthermore, an augmentation tool has been proposed to support the proposed dataset.
    The lack of an augmentation tool for object detection motivated the proposed tool, which updates the positions of bounding boxes after transformations are applied to the images. This technique increases the number of samples in the dataset and reduces annotation time, since no re-annotation is required. The final part of the thesis presents a novel approach for text spotting: a framework for an end-to-end character detection and recognition system designed using an improved SSD convolutional neural network, wherein layers are added to the SSD network and the aspect ratio of characters is taken into account because it differs from that of other objects. Compared with the other methods considered, the proposed method can detect and recognise characters by training the end-to-end model completely. The method performed best on the proposed dataset, where it reached 90.34. Furthermore, its F-measure on ICDAR 2015, ICDAR 2013, and SVT was 84.5, 91.9, and 54.8, respectively; on ICDAR 2013, the method achieved the second-best accuracy. The proposed method can spot arbitrarily shaped (horizontal, oriented, and curved) scene text.
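    The detection stage is described as MSER candidate extraction followed by feature-based text/non-text classification. A minimal sketch of that kind of pipeline is given below, using OpenCV's MSER and HOG together with a scikit-learn SVM; the patch size, HOG layout, and training data are placeholders, not the thesis's settings.

```python
import cv2
import numpy as np
from sklearn.svm import SVC

# HOG over fixed-size candidate patches (window and block sizes are placeholders).
hog = cv2.HOGDescriptor((32, 32), (16, 16), (8, 8), (8, 8), 9)

def candidate_features(gray):
    """Extract MSER candidate boxes and one HOG descriptor per box."""
    mser = cv2.MSER_create()
    _, boxes = mser.detectRegions(gray)
    feats = [hog.compute(cv2.resize(gray[y:y + h, x:x + w], (32, 32))).ravel()
             for (x, y, w, h) in boxes]
    return boxes, np.array(feats)

def classify_text_regions(gray, X_train, y_train):
    """Keep only the candidates an SVM labels as text (label 1).

    X_train, y_train: HOG features and text/non-text labels from annotated
    patches; assumed to exist, not provided by the thesis.
    """
    clf = SVC(kernel="rbf").fit(X_train, y_train)
    boxes, feats = candidate_features(gray)
    return boxes[clf.predict(feats) == 1]
```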

    Pattern Recognition

    A wealth of advanced pattern recognition algorithms is emerging at the intersection of effective visual-feature technologies and the study of the human-brain cognition process. Effective visual features are made possible by rapid developments in appropriate sensor equipment, novel filter designs, and viable information processing architectures, while a better understanding of human-brain cognition broadens the ways in which computers can perform pattern recognition tasks. The present book collects representative research from around the globe focusing on low-level vision, filter design, features and image descriptors, data mining and analysis, and biologically inspired algorithms. The 27 chapters covered in this book disclose recent advances and new ideas in promoting the techniques, technology, and applications of pattern recognition.

    Framework of hierarchy for neural theory
