
    READ-BAD: A New Dataset and Evaluation Scheme for Baseline Detection in Archival Documents

    Text line detection is crucial for any application associated with Automatic Text Recognition or Keyword Spotting. Modern algorithms perform well on well-established datasets, since these datasets comprise either clean data or simple, homogeneous page layouts. We have collected and annotated 2036 archival document images from different locations and time periods. The dataset contains varying page layouts and degradations that challenge text line segmentation methods. Well-established text line segmentation evaluation schemes such as the Detection Rate or Recognition Accuracy require binarized data that is annotated at the pixel level. Producing ground truth by these means is laborious and not needed to determine a method's quality. In this paper we propose a new evaluation scheme that is based on baselines. The proposed scheme has no need for binarization and it can handle skewed as well as rotated text lines. The ICDAR 2017 Competition on Baseline Detection and the ICDAR 2017 Competition on Layout Analysis for Challenging Medieval Manuscripts used this evaluation scheme. Finally, we present results achieved by a recently published text line detection algorithm. Comment: Submitted to DAS201

    A survey of comics research in computer science

    Graphic novels such as comics and manga are well known all over the world. The digital transition has started to change the way people read comics: more and more on smartphones and tablets, and less and less on paper. In recent years, a wide variety of research about comics has been proposed, which might change the way comics are created, distributed and read in future years. Early work focuses on low-level document image analysis, since comic books are complex: they contain text, drawings, balloons, panels, onomatopoeia, etc. Different fields of computer science, such as multimedia, artificial intelligence and human-computer interaction, have covered research about user interaction and content generation, each with different sets of values. In this paper we review the previous research about comics in computer science, state what has been done, and give some insights into the main outlooks.

    Putting the Text back into Context: A Codicological Approach to Manuscript Transcription

    Textual scholars have tended to produce editions which present the text without its manuscript context. Even though digital editions now often present single-witness editions with facsimiles of the manuscripts, the text itself is still transcribed and represented as a linguistic object rather than a physical one. Indeed, this is explicitly stated as the theoretical basis for the de facto standard of markup for digital texts: the Guidelines of the Text Encoding Initiative (TEI). These explicitly treat texts as semantic units such as paragraphs, sentences, verses and so on, rather than physical elements such as pages, openings, or surfaces, and some scholars have argued that this is the only viable model for representing texts. In contrast, this chapter presents arguments for considering the document as a physical object in the markup of texts. The theoretical arguments about what constitutes a text are first reviewed, with emphasis on those used by the TEI and other theoreticians of digital markup. A series of cases is then given in which a document-centric approach may be desirable, with both modern and medieval examples. Finally, a step forward in this direction is presented, namely the results of the Genetic Edition Working Group in the Manuscript Special Interest Group of the TEI: these include a proposed standard for documentary markup, whereby aspects of codicology and mise en page can be included in digital editions, putting the text back into its manuscript context.

    SCUdent Books: A University-Focused Bookselling Platform

    As each university semester or quarter begins, so does the rush to acquire books for classes. The search for school books is a busy and important task for many students. However, a whole slew of problems and frustrations emerges with this academic race to gather books. To begin, students have to deal with the traditional frustration of expensive textbooks sold at the university bookstore, which is especially troublesome for those on a tight budget. Additionally, required textbooks for classes may not be available at the bookstore or may require restocking, which can take an unknown amount of time. Because of this, students turn to cheaper, faster, and more efficient alternatives for acquiring school books, including online retailers such as Amazon or Barnes and Noble. While the Internet makes book shopping appear easier, it comes with its own issues. Students have to put in more effort ordering online, pay extra for shipping, and wait for their books to arrive. Also, online shopping for books is highly decentralized, with no convenient platform that caters to students' needs. Students must first spend time finding out which books are required for each class and then spend even more time comparing prices from multiple online retailers. In addition, once a student completes a class he or she may no longer need the book. As a result, the student has no convenient method of disposing of the book and must now sell it, throw it away, or keep it. Overall, the process of acquiring books in university is disorganized, stressful, and inconvenient for students.

    Designing the printed book as an interactive environment

    Reading a book demands a certain level of interaction from the reader. The cover must be opened and pages turned to navigate the information inside. Conventions have been developed over the life of the book to assist the reader in this navigation and provide orientation. The evolution of electronic reading material has given readers greater opportunities for interacting with their reading material, but many readers still prefer reading from a printed book. This paper investigates how the interactive organizational paradigm of hypertext can be implemented in a printed book to give the reader the opportunity for greater interaction and to benefit from some of the advantages that electronic reading environments provide. The investigation follows an iterative design process in consultation with a panel of four experts. Through four rounds of consultation and refinement, two potential solutions were developed for the incorporation of hypertext methods in a printed book.

    Text spacing considerations for children’s on-screen reading

    This investigation seeks to uncover the insights of three integral and inter-related participants in the creation and use of on-screen reading material for children’s learning. This is an effort to discover what factors are perceived to influence children’s comprehension. Through a design-analyse-refine methodology, this researcher discusses a series of typographical considerations relating to space which bear further empirical investigation in the literature. This methodology involved discussion of ideas garnered from four experts. The results of each iteration of the experiment influenced further refinement of the ideas until the writer could develop suitable conclusions. Testing materials in this experiment adjusted variables for visual separation, including margins, separation of image and type, as well as line spacing, letter spacing and word spacing.

    Deliberate Multimodality of the Newspaper Text (Front Pages – Case Study)

    The focus of the present study is the notion that texts are multimodal and that language is realised through various semiotic modes. Kress and van Leeuwen’s theory is tested by comparing two front pages of the British daily The Times and two front pages of the Bulgarian daily Trud.

    Mapping the East End 'Labyrinth'


    Reviews

    Sally Brown, Steve Armstrong and Gail Thompson (eds.), Motivating Students, London: Kogan Page, 1998. ISBN: 0-7494-2494-X. Paperback, 214 pages. £18.99