
    Easily Read, Easily Forgotten: Reassessing the Effects of Visual Difficulties and Multi-Modality in Educational Text Design

    The graphic design of a book affects the way the reader receives and processes information. However, design is often guided by aesthetic principles and traditional wisdom rather than by how design choices affect cognitive processes and educational outcomes. This thesis examines the effects of page design elements on educational outcomes, specifically disfluent fonts, handwritten fonts, and multi-modal design. The traditional wisdom of typography has held that the faster the human eye can read a text, the better suited that text is for reading materials. However, recent research suggests that disfluent, or difficult-to-read, fonts significantly improve reading comprehension and retention (Chen and Lin 553; Diemand-Yauman et al. 114; Faber et al. 914; French et al. 301; Geller et al. 1109; Halin et al. 31; Oppenheimer and Frank 1178). This body of research suggests that certain visual disfluencies enhance educational outcomes, improving retention and comprehension by encouraging the reader to process material more slowly and deeply. What if texts that are easily read are easily forgotten? Medieval manuscript design encouraged a reading culture nurtured by deep, contemplative, and slow reading methods, enhanced by semiotic images, text, and design. The modern book designer, drawing on medieval manuscripts and their modern incarnation, the graphic novel, can enhance educational outcomes through design that elicits deep cognitive processing. The aim of this thesis is to present evidence that this inspiration, combined with difficult-to-read fonts and multi-modal design, can enhance educational outcomes, specifically in the American high school literature classroom.

    Optical character recognition with neural networks and post-correction with finite state methods

    The optical character recognition (OCR) quality of the historical part of the Finnish newspaper and journal corpus is too low for reliable search and scientific research on the OCRed data. The estimated character error rate (CER) of the corpus, produced with commercial software, is between 8 and 13%. There have been earlier attempts to train high-quality OCR models with open-source software such as Ocropy (https://github.com/tmbdev/ocropy) and Tesseract (https://github.com/tesseract-ocr/tesseract), but so far none of these methods has managed to train a mixed model that recognizes all of the data in the corpus, which would be essential for efficient re-OCRing. The difficulty lies in the fact that the corpus is printed in the two main languages of Finland (Finnish and Swedish) and in two font families (Blackletter and Antiqua). In this paper, we explore the training of a variety of OCR models with deep neural networks (DNN). First, we find an optimal DNN for our data and, with additional training data, successfully train high-quality mixed-language models. Furthermore, we revisit the effect of confidence voting on the OCR results with different model combinations. Finally, we perform post-correction on the new OCR results and analyze the remaining errors. The results show a significant boost in accuracy: 1.7% CER on the Finnish and 2.7% CER on the Swedish test set. The greatest accomplishment of the study is the successful training of one mixed-language model for the entire corpus and finding a voting setup that further improves the results.

    Peer reviewed
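
    The CER figures above follow the standard definition: the edit distance between the OCR hypothesis and the ground-truth transcription, divided by the length of the reference. A minimal Python sketch of this metric (function names and the example word are illustrative, not taken from the paper's code):

        def levenshtein(a: str, b: str) -> int:
            # Edit distance: minimum insertions, deletions, and substitutions
            # needed to turn string a into string b.
            prev = list(range(len(b) + 1))
            for i, ca in enumerate(a, start=1):
                curr = [i]
                for j, cb in enumerate(b, start=1):
                    curr.append(min(prev[j] + 1,                # delete ca
                                    curr[j - 1] + 1,            # insert cb
                                    prev[j - 1] + (ca != cb)))  # substitute
                prev = curr
            return prev[-1]

        def cer(reference: str, hypothesis: str) -> float:
            # Character error rate: edit operations per reference character.
            return levenshtein(reference, hypothesis) / max(len(reference), 1)

        # One OCR confusion ('h' misread as 'b') in a ten-character word: CER 0.1.
        print(cer("painovirhe", "painovirbe"))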

    Assessment of OCR Quality and Font Identification in Historical Documents

    Mass digitization of historical documents is a challenging problem for optical character recognition (OCR) tools. Issues include noisy backgrounds and faded text due to aging, border/marginal noise, bleed-through, skewing, warping, and irregular fonts and page layouts. As a result, OCR tools often produce a large number of spurious bounding boxes (BBs) in addition to those that correspond to words in the document. To improve the OCR output, in this thesis we develop machine-learning methods to assess the quality of historical documents and label/tag documents (with the page problems) in the EEBO/ECCO collections (45 million pages available through the Early Modern OCR Project at Texas A&M University). We present an iterative classification algorithm to automatically label BBs (i.e., as text or noise) based on their spatial distribution and geometry. The approach uses a rule-based classifier to generate initial text/noise labels for each BB, followed by an iterative classifier that refines the initial labels using local information about each BB: its spatial location, shape, and size. When evaluated on a dataset containing over 72,000 manually labeled BBs from 159 historical documents, the algorithm classifies BBs with 0.95 precision and 0.96 recall. Further evaluation on a collection of 6,775 documents with ground-truth transcriptions shows that the algorithm can also be used to predict document quality (0.7 correlation) and improves OCR transcriptions in 85% of the cases.

    This thesis also aims at generating font metadata for historical documents. Knowledge of the font can help an OCR system produce very accurate text transcriptions, but obtaining font information for 45 million documents is a daunting task. We present an active-learning-based font identification system that classifies document images by font. In active learning, a learner queries the human for labels on examples it finds most informative. We capture the characteristics of the fonts using word-image features related to character width, angled strokes, and Zernike moments. To extract page-level features, we use a bag-of-words feature (BoF) model. A font classification model trained using BoF and active learning requires only 443 labeled instances to achieve 89.3% test accuracy.
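
    The active learning setup described above ("a learner queries the human for labels on examples it finds most informative") is commonly implemented as pool-based uncertainty sampling. A minimal sketch under that assumption, with synthetic stand-ins for the BoF page features and font labels (all names and data here are illustrative, not the thesis's code):

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(0)
        X_pool = rng.normal(size=(2000, 32))                    # stand-in for page-level BoF vectors
        y_pool = (X_pool[:, 0] + X_pool[:, 1] > 0).astype(int)  # stand-in font labels

        # Seed with one labeled example per class so the first fit is well-posed.
        labeled = [int(np.argmax(y_pool == 0)), int(np.argmax(y_pool == 1))]
        unlabeled = [i for i in range(len(X_pool)) if i not in labeled]

        model = LogisticRegression(max_iter=1000)
        for _ in range(50):
            model.fit(X_pool[labeled], y_pool[labeled])
            proba = model.predict_proba(X_pool[unlabeled])
            # Uncertainty sampling: query the pool example whose top-class
            # probability is lowest, i.e. the one the model is least sure about.
            query = unlabeled[int(np.argmin(proba.max(axis=1)))]
            labeled.append(query)       # in practice a human supplies this label
            unlabeled.remove(query)

        print(f"labeled: {len(labeled)}, "
              f"pool accuracy: {model.score(X_pool[unlabeled], y_pool[unlabeled]):.3f}")

    The appeal of this strategy is exactly the label efficiency the abstract reports: by spending human labels only where the model is uncertain, the thesis reaches 89.3% test accuracy with just 443 labeled instances, far fewer than labeling pages at random would require.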