READ-BAD: A New Dataset and Evaluation Scheme for Baseline Detection in Archival Documents
Text line detection is crucial for any application associated with Automatic
Text Recognition or Keyword Spotting. Modern algorithms perform well on
well-established datasets since they either comprise clean data or
simple/homogeneous page layouts. We have collected and annotated 2036 archival
document images from different locations and time periods. The dataset contains
varying page layouts and degradations that challenge text line segmentation
methods. Well established text line segmentation evaluation schemes such as the
Detection Rate or Recognition Accuracy demand binarized data that is
annotated on a pixel level. Producing ground truth by these means is laborious
and not needed to determine a method's quality. In this paper we propose a new
evaluation scheme that is based on baselines. The proposed scheme has no need
for binarization and it can handle skewed as well as rotated text lines. The
ICDAR 2017 Competition on Baseline Detection and the ICDAR 2017 Competition on
Layout Analysis for Challenging Medieval Manuscripts used this evaluation
scheme. Finally, we present results achieved by a recently published text line
detection algorithm. Comment: Submitted to DAS201
A survey of comics research in computer science
Graphic novels such as comics and manga are well known all over the world.
The digital transition started to change the way people are reading comics,
more and more on smartphones and tablets and less and less on paper. In
recent years, a wide variety of research about comics has been proposed and
might change the way comics are created, distributed and read in future years.
Early work focuses on low level document image analysis: indeed comic books are
complex: they contain text, drawings, balloons, panels, onomatopoeia, etc.
Different fields of computer science covered research about user interaction
and content generation such as multimedia, artificial intelligence,
human-computer interaction, etc. with different sets of values. We propose in
this paper to review the previous research about comics in computer science, to
state what has been done and to give some insights about the main outlooks
Putting the Text back into Context: A Codicological Approach to Manuscript Transcription
Textual scholars have tended to produce editions which present the text without its
manuscript context. Even though digital editions now often present single-witness
editions with facsimiles of the manuscripts, nevertheless the text itself is still transcribed
and represented as a linguistic object rather than a physical one. Indeed, this is explicitly
stated as the theoretical basis for the de facto standard of markup for digital texts: the
Guidelines of the Text Encoding Initiative (TEI). These explicitly treat texts as semantic
units such as paragraphs, sentences, verses and so on, rather than physical elements
such as pages, openings, or surfaces, and some scholars have argued that this is the only
viable model for representing texts. In contrast, this chapter presents arguments for
considering the document as a physical object in the markup of texts. The theoretical
arguments of what constitutes a text are first reviewed, with emphasis on those used
by the TEI and other theoreticians of digital markup. A series of cases is then given in
which a document-centric approach may be desirable, with both modern and medieval
examples. Finally a step forward in this direction is raised, namely the results of
the Genetic Edition Working Group in the Manuscript Special Interest Group of the
TEI: this includes a proposed standard for documentary markup, whereby aspects of
codicology and mise en page can be included in digital editions, putting the text back
into its manuscript context
SCUdent Books: A University-Focused Bookselling Platform
As the beginning of each university semester or quarter commences, so does the rush to acquire books for classes. The search for school books is a busy and important task for many students. However, an entire slew of problems and frustrations emerge with this academic race to gather books. To begin, students have to deal with the traditional frustrations of expensive textbooks sold at the university bookstore which is especially troublesome for those on a tight budget. Additionally, required textbooks for classes may not be available at the bookstore or require restocking which can take an unknown amount of time. Because of this, students turn to cheaper, faster, and more efficient alternatives for acquiring school books including online retailers such as Amazon or Barnes and Noble.
While the Internet makes book shopping appear easier, there exist issues that come with it. Students have to put in more effort ordering online, pay for extra shipping, and wait for their books to arrive. Also, online shopping for books is incredibly decentralized with no convenient platform to cater to students' needs. Students must first spend time finding out which books are required for each class and then spend even more time comparing prices from multiple online retailers. In addition, once a student completes a class he or she may no longer need the book. As a result, the student has no convenient method of disposing of the book and must now sell it, throw it away, or keep it. Overall, the process of acquiring books in university is disorganized, stressful, and inconvenient for students
Designing the printed book as an interactive environment
Reading a book demands a certain level of interaction from the reader. The cover must be opened and pages turned to navigate the information inside. Conventions have been developed over the life of the book to assist the reader in this navigation and provide orientation. The evolution of electronic reading material has given readers greater opportunities for interacting with their reading material, but many readers still prefer reading from a printed book. This paper investigates how the interactive organizational paradigm of hypertext can be implemented in a printed book to give the reader the opportunity for greater interaction and benefit from some of the advantages that electronic reading environments provide. The investigation in this paper follows an iterative design process in consultation with a panel of four experts. Through four rounds of consultation and refinement two potential solutions were developed for the incorporation of hypertext methods in a printed book
Text spacing considerations for children's on-screen reading
This investigation seeks to uncover the insights of three integral and inter-related participants in the creation and use of on-screen reading material for children's learning. This is an effort to discover what factors are perceived to influence children's comprehension. Through a design-analyse-refine methodology this researcher discusses a series of typographical considerations relating to space which bear further empirical investigation in the literature. This methodology involved discussion of ideas garnered from four experts. The results of each iteration of the experiment influenced further refinement of the ideas until suitable conclusions were able to be developed by the writer. Testing materials in this experiment adjusted variables for visual separation, including margins, separation of image and type, as well as line spacing, letter spacing and word spacing
Deliberate Multimodality of the Newspaper Text (Front Pages - Case Study)
The focus of the present study is the notion that texts are multimodal and that language is realised through various semiotic modes. Kress and Leeuwen's theory is tested comparing two front pages of the British daily - The Times and two front pages of the Bulgarian daily - Trud
Reviews
Sally Brown, Steve Armstrong and Gail Thompson (eds.), Motivating Students, London: Kogan Page, 1998. ISBN: 0-7494-2494-X. Paperback, 214 pages. £18.99