Human interaction with digital ink : legibility measurement and structural analysis
Literature suggests that it is possible to design and implement pen-based computer
interfaces that resemble the use of pen and paper. These interfaces appear to
allow users freedom in expressing ideas and seem to be familiar and easy to use.
Different ideas have been put forward concerning this type of interface; however,
despite the commonality of aims and problems faced, there does not appear to be
a common approach to their design and implementation.
This thesis aims to progress the development of pen-based computer interfaces
that resemble the use of pen and paper. To do this, a conceptual model is proposed
for interfaces that enable interaction with "digital ink". This conceptual model is
used to organize and analyse the broad range of literature related to pen-based
interfaces, and to identify topics that are not sufficiently addressed by published
research. Two issues highlighted by the model, digital ink legibility and digital
ink structuring, are then investigated.
In the first investigation, methods are devised to objectively and subjectively
measure the legibility of handwritten script. These methods are then piloted in
experiments that vary the horizontal rendering resolution of handwritten script
displayed on a computer screen. Script legibility is shown to decrease with rendering
resolution once the resolution drops below a threshold value.
In the second investigation, the clustering of digital ink strokes into words is
addressed. A method of rating the accuracy of clustering algorithms is proposed:
the percentage of words spoiled. For a clustering algorithm that uses the geometric
features of both the ink strokes and the gaps between them, the clustering error rate
is found to vary among different writers.
The work contributes a conceptual interface model, methods of measuring
digital ink legibility, and techniques for investigating stroke clustering features to
the field of digital ink interaction research.
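The abstract above names a gap-based stroke clustering approach and a "percentage of words spoiled" accuracy measure without defining either. The following is a minimal sketch under stated assumptions, not the thesis's method: `cluster_by_gap` is a hypothetical clusterer that uses only horizontal gaps, and a word is assumed to count as spoiled when no predicted cluster contains exactly its strokes.

```python
def cluster_by_gap(strokes, gap_threshold):
    """Group strokes into words using horizontal gaps only (assumed rule).

    strokes: list of (x_min, x_max) extents, given in writing order.
    Returns a list of clusters, each a list of stroke indices.
    """
    clusters = []
    right_edge = None  # right-most x of the cluster being built
    for index, (x_min, x_max) in enumerate(strokes):
        if right_edge is None or x_min - right_edge > gap_threshold:
            clusters.append([index])  # gap is wide enough: start a new word
            right_edge = x_max
        else:
            clusters[-1].append(index)
            right_edge = max(right_edge, x_max)
    return clusters


def percent_words_spoiled(true_words, predicted_clusters):
    """Percentage of true words with no exactly matching predicted cluster.

    This definition of "spoiled" is an assumption made for illustration.
    """
    if not true_words:
        raise ValueError("true_words must not be empty")
    predicted = {frozenset(cluster) for cluster in predicted_clusters}
    spoiled = sum(1 for word in true_words if frozenset(word) not in predicted)
    return 100.0 * spoiled / len(true_words)


# Hypothetical data: five strokes forming three words.
strokes = [(0, 5), (6, 10), (30, 35), (36, 40), (60, 62)]
true_words = [[0, 1], [2, 3], [4]]
good = cluster_by_gap(strokes, gap_threshold=10)
merged = cluster_by_gap(strokes, gap_threshold=30)
```

With the small threshold the clusters match the true words; with the large threshold every stroke merges into one cluster, so every word is spoiled. A per-writer comparison would repeat the measurement on each writer's strokes.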
Content Detection in Handwritten Documents
Handwritten documents have gained popularity in various domains including education and business. A key task in analyzing a complex document is to distinguish between various content types such as text, math, graphics, tables and so on. For example, one such aspect could be a region on the document with a mathematical expression; in this case, the label would be math. This differentiation facilitates the performance of specific recognition tasks depending on the content type. We hypothesize that the recognition accuracy of the subsequent tasks such as textual, math, and shape recognition will increase, further leading to a better analysis of the document.
Content detection on handwritten documents assigns a particular class to a homogeneous portion of the document. To complete this task, a set of handwritten solutions was digitally collected from middle school students located in two different geographical regions in 2017 and 2018. This research discusses the methods to collect, pre-process and detect content type in the collected handwritten documents. A total of 4049 documents were extracted as images and in JSON format, and were labelled using object labelling software with the tags text, math, diagram, cross out, table, graph, tick mark, arrow, and doodle. The labelled images were fed to the TensorFlow Object Detection API to train a neural network model. We show our results from two neural network models, the Faster Region-based Convolutional Neural Network (Faster R-CNN) and the Single Shot Detection model (SSD). Dissertation/Thesis, Masters Thesis, Computer Science 201
VATr++: Choose Your Words Wisely for Handwritten Text Generation
Styled Handwritten Text Generation (HTG) has received significant attention
in recent years, propelled by the success of learning-based solutions employing
GANs, Transformers, and, preliminarily, Diffusion Models. Despite this surge in
interest, there remains a critical yet understudied aspect - the impact of the
input, both visual and textual, on the HTG model training and its subsequent
influence on performance. This study delves deeper into a cutting-edge
Styled-HTG approach, proposing strategies for input preparation and training
regularization that allow the model to achieve better performance and
generalize better. These aspects are validated through extensive analysis on
several different settings and datasets. Moreover, in this work, we go beyond
performance optimization and address a significant hurdle in HTG research - the
lack of a standardized evaluation protocol. In particular, we propose a
standardization of the evaluation protocol for HTG and conduct a comprehensive
benchmarking of existing approaches. By doing so, we aim to establish a
foundation for fair and meaningful comparisons between HTG strategies,
fostering progress in the field.
Character Queries: A Transformer-based Approach to On-Line Handwritten Character Segmentation
On-line handwritten character segmentation is often associated with
handwriting recognition, and even though recognition models include mechanisms
to locate relevant positions during the recognition process, these are typically
insufficient to produce a precise segmentation. Decoupling the segmentation
from the recognition unlocks the potential to further utilize the result of the
recognition. We specifically focus on the scenario where the transcription is
known beforehand, in which case the character segmentation becomes an
assignment problem between sampling points of the stylus trajectory and
characters in the text. Inspired by the k-means clustering algorithm, we view
it from the perspective of cluster assignment and present a Transformer-based
architecture where each cluster is formed based on a learned character query in
the Transformer decoder block. In order to assess the quality of our approach,
we create character segmentation ground truths for two popular on-line
handwriting datasets, IAM-OnDB and HANDS-VNOnDB, and evaluate multiple methods
on them, demonstrating that our approach achieves the overall best results. Comment: ICDAR 2023 Best Student Paper Award. Code available at
https://github.com/jungomi/character-querie
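The cluster-assignment view described in the abstract above can be illustrated with the assignment step of k-means. This is only a sketch under stated assumptions: the paper learns its character queries in a Transformer decoder, whereas here the query vectors are fixed, hand-picked stand-ins, and `assign_points_to_characters` is a hypothetical helper name.

```python
def assign_points_to_characters(point_features, character_queries):
    """Assign each sampling point to its nearest character query.

    point_features: list of feature vectors, one per stylus sampling point.
    character_queries: list of vectors, one per character of the known
        transcription (fixed stand-ins here, not learned queries).
    Returns one character index per point, using squared Euclidean distance.
    """
    assignments = []
    for point in point_features:
        distances = [
            sum((p - q) ** 2 for p, q in zip(point, query))
            for query in character_queries
        ]
        assignments.append(distances.index(min(distances)))
    return assignments


# Hypothetical 2-D features for four sampling points and two characters.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.1), (4.9, 5.0)]
queries = [(0.0, 0.0), (5.0, 5.0)]
segmentation = assign_points_to_characters(points, queries)
```

Each entry of `segmentation` names the character that the corresponding sampling point is assigned to, which is the form of output a character segmentation with a known transcription needs.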
Disentangling Writer and Character Styles for Handwriting Generation
Training machines to synthesize diverse handwritings is an intriguing task.
Recently, RNN-based methods have been proposed to generate stylized online
Chinese characters. However, these methods mainly focus on capturing a person's
overall writing style, neglecting subtle style inconsistencies between
characters written by the same person. For example, while a person's
handwriting typically exhibits general uniformity (e.g., glyph slant and aspect
ratios), there are still small style variations in finer details (e.g., stroke
length and curvature) of characters. In light of this, we propose to
disentangle the style representations at both writer and character levels from
individual handwritings to synthesize realistic stylized online handwritten
characters. Specifically, we present the style-disentangled Transformer (SDT),
which employs two complementary contrastive objectives to extract the style
commonalities of reference samples and capture the detailed style patterns of
each sample, respectively. Extensive experiments on various language scripts
demonstrate the effectiveness of SDT. Notably, our empirical findings reveal
that the two learned style representations provide information at different
frequency magnitudes, underscoring the importance of separate style extraction.
Our source code is public at: https://github.com/dailenson/SDT. Comment: accepted by CVPR 2023.
A video-based text and equation editor for LaTeX
In this paper we present a video-based text and equation editor for LaTeX. The system recognizes what is written on paper and generates the LaTeX code. Text and equations are written on regular paper using a board marker, and a USB camera attached to a computer is used to capture and record the pen-tip positions in each consecutive image frame. Characters and symbols are represented as separate finite state machines (FSMs). They are written in an isolated manner and are recognized on-line using the FSMs. In the last step, LaTeX code corresponding to the recognized characters and symbols is generated. (c) 2007 Elsevier Ltd. All rights reserved.
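The abstract above says each symbol is represented as a finite state machine driven by recorded pen-tip positions, but gives no details. The sketch below is one possible reading, not the paper's design: quantizing pen movement into four direction codes, the two example machines, and the names `direction_codes`, `run_fsm` and `recognize` are all assumptions made for illustration.

```python
def direction_codes(points):
    """Quantize consecutive pen-tip displacements into R, L, U or D.

    points: list of (x, y) image coordinates, with y growing downward.
    """
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # pen did not move between these two frames
        if abs(dx) >= abs(dy):
            codes.append("R" if dx > 0 else "L")
        else:
            codes.append("D" if dy > 0 else "U")
    return codes


# One hypothetical FSM per symbol: (transition table, accepting states).
# Self-loops let a stroke segment span any number of frames.
SYMBOL_FSMS = {
    "L": ({(0, "D"): 1, (1, "D"): 1, (1, "R"): 2, (2, "R"): 2}, {2}),
    "7": ({(0, "R"): 1, (1, "R"): 1, (1, "D"): 2, (2, "D"): 2}, {2}),
}


def run_fsm(transitions, accepting, codes):
    """Return True if the code sequence drives the FSM into an accepting state."""
    state = 0
    for code in codes:
        key = (state, code)
        if key not in transitions:
            return False  # no transition defined: this symbol is rejected
        state = transitions[key]
    return state in accepting


def recognize(points):
    """Return the first symbol whose FSM accepts the trajectory, else None."""
    codes = direction_codes(points)
    for symbol, (transitions, accepting) in SYMBOL_FSMS.items():
        if run_fsm(transitions, accepting, codes):
            return symbol
    return None


# Hypothetical pen-tip trajectories: down then right, and right then down.
ell = recognize([(0, 0), (0, 5), (0, 10), (5, 10), (10, 10)])
seven = recognize([(0, 0), (5, 0), (10, 0), (10, 5), (10, 10)])
```

A final step, as the abstract describes, would map each recognized symbol to its LaTeX code; that mapping is omitted here.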
A Sketch-Based Educational System for Learning Chinese Handwriting
Learning Chinese as a Second Language (CSL) is a difficult task for students in English-speaking countries due to the large symbol set and complicated writing techniques. Traditional classroom methods of teaching Chinese handwriting have major drawbacks due to human experts’ bias and the lack of assessment of writing techniques. In this work, we propose a sketch-based educational system to help CSL students learn Chinese handwriting faster and better in a novel way. Our system allows students to draw freehand symbols to answer questions, and uses sketch recognition and AI techniques to recognize, assess, and provide feedback in real time. Results show that the system reaches a recognition accuracy of 86% on novice learners’ inputs, a detection rate above 95% for mistakes in writing techniques, and an F-measure of 80.3% for classifying expert versus novice handwriting inputs.