2,493 research outputs found
Representation, Recognition and Collaboration with Digital Ink
Pen input for computing devices is now widespread, providing a promising interaction mechanism for many purposes. Nevertheless, the diverse nature of digital ink and varied application domains still present many challenges. First, the sampling rate and resolution of pen-based devices keep improving, making input data more costly to process and store. At the same time, existing applications typically record digital ink either in proprietary formats, which are restricted to single platforms and consequently lack portability, or simply as images, which lose important information. Moreover, in certain domains such as mathematics, current systems are now achieving good recognition rates on individual symbols, in general recognition of complete expressions remains a problem due to the absence of an effective method that can reliably identify the spatial relationships among symbols. Last, but not least, existing digital ink collaboration tools are platform-dependent and typically allow only one input method to be used at a time. Together with the absence of recognition, this has placed significant limitations on what can be done.
In this thesis, we investigate these issues and make contributions to each. We first present an algorithm that can accurately approximate a digital ink curve by selecting a certain subset of points from the original trace. This allows a compact representation of digital ink for efficient processing and storage. We then describe an algorithm that can automatically identify certain important features in handwritten symbols. Identifying the features can help us solve a number of problems such as improving two-dimensional mathematical recognition. Last, we present a framework for multi-user online collaboration in a pen-based and graphical environment. This framework is portable across multiple platforms and allows multimodal interactions in collaborative sessions. To demonstrate our ideas, we present InkChat, a whiteboard application, which can be used to conduct collaborative sessions on a shared canvas. It allows participants to use voice and digital ink independently and simultaneously, which has been found useful in remote collaboration
Off-line Arabic Handwriting Recognition System Using Fast Wavelet Transform
In this research, off-line handwriting recognition system for Arabic alphabet is
introduced. The system contains three main stages: preprocessing, segmentation and
recognition stage. In the preprocessing stage, Radon transform was used in the design
of algorithms for page, line and word skew correction as well as for word slant
correction. In the segmentation stage, Hough transform approach was used for line
extraction. For line to words and word to characters segmentation, a statistical method
using mathematic representation of the lines and words binary image was used.
Unlike most of current handwriting recognition system, our system simulates the
human mechanism for image recognition, where images are encoded and saved in
memory as groups according to their similarity to each other. Characters are
decomposed into a coefficient vectors, using fast wavelet transform, then, vectors,
that represent a character in different possible shapes, are saved as groups with one
representative for each group. The recognition is achieved by comparing a vector of
the character to be recognized with group representatives.
Experiments showed that the proposed system is able to achieve the recognition task
with 90.26% of accuracy. The system needs only 3.41 seconds a most to recognize a
single character in a text of 15 lines where each line has 10 words on average
Dysgraphia detection based on convolutional neural networks and child-robot interaction
Dysgraphia is a disorder of expression with the writing of letters, words, and numbers. Dysgraphia is one of the learning disabilities attributed to the educational sector, which has a strong impact on the academic, motor, and emotional aspects of the individual. The purpose of this study is to identify dysgraphia in children by creating an engaging robot-mediated activity, to collect a new dataset of Latin digits written exclusively by children aged 6 to 12 years. An interactive scenario that explains and demonstrates the steps involved in handwriting digits is created using the verbal and non-verbal behaviors of the social humanoid robot Nao. Therefore, we have collected a dataset that contains 11,347 characters written by 174 participants with and without dysgraphia. And through the advent of deep learning technologies and their success in various fields, we have developed an approach based on these methods. The proposed approach was tested on the generated database. We performed a classification with a convolutional neural network (CNN) to identify dysgraphia in children. The results show that the performance of our model is promising, reaching an accuracy of 91%
Anna Maria Van Schurman’s Chinese Calligraphy
Calligraphy is an understudied aspect of the reception of Chinese art in early modern Europe. Chinese visitors to Middelburg (1601) and Amsterdam (1654) first demonstrated it as a cultural practice. Other written samples circulated in the Dutch Republic, an emporium for Chinese goods. This article focuses on a previously unknown participant in this exchange: Anna Maria van Schurman, Europe’s first female university student, who had mastered various Asian scripts and was expected to try her hand at Chinese and Japanese. In 1637 Andreas Colvius sent her samples of East Asian writing to copy ‘by her own hand’. This exchange makes possible a transcultural study of the calligraphic gift. Via the popular writings of Matteo Ricci, Van Schurman’s correspondents may have learned about the role of calligraphy in fostering social relationships in late Ming China. Some of the visual and material qualities of East Asian writing must have made it look like a fitting tribute to a female European scholar of high profile and, in being exchanged as a gift, calligraphy acquired new meanings even while remaining illegible. In seventeenth-century China and Europe, the friendly exchange of calligraphy expressed new forms of sociability
Embedding and learning with signatures
Sequential and temporal data arise in many fields of research, such as quantitative finance, medicine, or computer vision. The present article is concerned with a novel approach for sequential learning, called the signature method, and rooted in rough path theory. Its basic principle is to represent multidimensional paths by a graded feature set of their iterated integrals, called the signature. This approach relies critically on an embedding principle, which consists in representing discretely sampled data as paths, i.e., functions from [0,1] to R^d. After a survey of machine learning methodologies for signatures, we investigate the influence of embeddings on prediction accuracy with an in-depth study of three recent and challenging datasets. We show that a specific embedding, called lead-lag, is systematically better, whatever the dataset or algorithm used. Moreover, we emphasize through an empirical study that computing signatures over the whole path domain does not lead to a loss of local information. We conclude that, with a good embedding, the signature combined with a simple algorithm achieves results competitive with state-of-the-art, domain-specific approaches
Bernoulli HMMs for Handwritten Text Recognition
In last years Hidden Markov Models (HMMs) have received significant attention in the
task off-line handwritten text recognition (HTR). As in automatic speech recognition (ASR),
HMMs are used to model the probability of an observation sequence, given its corresponding
text transcription. However, in contrast to what happens in ASR, in HTR there is no standard
set of local features being used by most of the proposed systems. In this thesis we propose the
use of raw binary pixels as features, in conjunction with models that deal more directly with
the binary data. In particular, we propose the use of Bernoulli HMMs (BHMMs), that is, conventional
HMMs in which Gaussian (mixture) distributions have been replaced by Bernoulli
(mixture) probability functions. The objective is twofold: on the one hand, this allows us
to better modeling the binary nature of text images (foreground/background) using BHMMs.
On the other hand, this guarantees that no discriminative information is filtered out during
feature extraction (most HTR available datasets can be easily binarized without a relevant
loss of information).
In this thesis, all the HMM theory required to develop a HMM based HTR toolkit is
reviewed and adapted to the case of BHMMs. Specifically, we begin by defining a simple
classifier based on BHMMs with Bernoulli probability functions at the states, and we end
with an embedded Bernoulli mixture HMM recognizer for continuous HTR. Regarding the
binary features, we propose a simple binary feature extraction process without significant
loss of information. All input images are scaled and binarized, in order to easily reinterpret
them as sequences of binary feature vectors. Two extensions are proposed to this basic feature
extraction method: the use of a sliding window in order to better capture the context,
and a repositioning method in order to better deal with vertical distortions. Competitive results
were obtained when BHMMs and proposed methods were applied to well-known HTR
databases. In particular, we ranked first at the Arabic Handwriting Recognition Competition
organized during the 12th International Conference on Frontiers in Handwriting Recognition
(ICFHR 2010), and at the Arabic Recognition Competition: Multi-font Multi-size Digitally
Represented Text organized during the 11th International Conference on Document Analysis
and Recognition (ICDAR 2011).
In the last part of this thesis we propose a method for training BHMM classifiers using In last years Hidden Markov Models (HMMs) have received significant attention in the
task off-line handwritten text recognition (HTR). As in automatic speech recognition (ASR),
HMMs are used to model the probability of an observation sequence, given its corresponding
text transcription. However, in contrast to what happens in ASR, in HTR there is no standard
set of local features being used by most of the proposed systems. In this thesis we propose the
use of raw binary pixels as features, in conjunction with models that deal more directly with
the binary data. In particular, we propose the use of Bernoulli HMMs (BHMMs), that is, conventional
HMMs in which Gaussian (mixture) distributions have been replaced by Bernoulli
(mixture) probability functions. The objective is twofold: on the one hand, this allows us
to better modeling the binary nature of text images (foreground/background) using BHMMs.
On the other hand, this guarantees that no discriminative information is filtered out during
feature extraction (most HTR available datasets can be easily binarized without a relevant
loss of information).
In this thesis, all the HMM theory required to develop a HMM based HTR toolkit is
reviewed and adapted to the case of BHMMs. Specifically, we begin by defining a simple
classifier based on BHMMs with Bernoulli probability functions at the states, and we end
with an embedded Bernoulli mixture HMM recognizer for continuous HTR. Regarding the
binary features, we propose a simple binary feature extraction process without significant
loss of information. All input images are scaled and binarized, in order to easily reinterpret
them as sequences of binary feature vectors. Two extensions are proposed to this basic feature
extraction method: the use of a sliding window in order to better capture the context,
and a repositioning method in order to better deal with vertical distortions. Competitive results
were obtained when BHMMs and proposed methods were applied to well-known HTR
databases. In particular, we ranked first at the Arabic Handwriting Recognition Competition
organized during the 12th International Conference on Frontiers in Handwriting Recognition
(ICFHR 2010), and at the Arabic Recognition Competition: Multi-font Multi-size Digitally
Represented Text organized during the 11th International Conference on Document Analysis
and Recognition (ICDAR 2011).
In the last part of this thesis we propose a method for training BHMM classifiers using In last years Hidden Markov Models (HMMs) have received significant attention in the
task off-line handwritten text recognition (HTR). As in automatic speech recognition (ASR),
HMMs are used to model the probability of an observation sequence, given its corresponding
text transcription. However, in contrast to what happens in ASR, in HTR there is no standard
set of local features being used by most of the proposed systems. In this thesis we propose the
use of raw binary pixels as features, in conjunction with models that deal more directly with
the binary data. In particular, we propose the use of Bernoulli HMMs (BHMMs), that is, conventional
HMMs in which Gaussian (mixture) distributions have been replaced by Bernoulli
(mixture) probability functions. The objective is twofold: on the one hand, this allows us
to better modeling the binary nature of text images (foreground/background) using BHMMs.
On the other hand, this guarantees that no discriminative information is filtered out during
feature extraction (most HTR available datasets can be easily binarized without a relevant
loss of information).
In this thesis, all the HMM theory required to develop a HMM based HTR toolkit is
reviewed and adapted to the case of BHMMs. Specifically, we begin by defining a simple
classifier based on BHMMs with Bernoulli probability functions at the states, and we end
with an embedded Bernoulli mixture HMM recognizer for continuous HTR. Regarding the
binary features, we propose a simple binary feature extraction process without significant
loss of information. All input images are scaled and binarized, in order to easily reinterpret
them as sequences of binary feature vectors. Two extensions are proposed to this basic feature
extraction method: the use of a sliding window in order to better capture the context,
and a repositioning method in order to better deal with vertical distortions. Competitive results
were obtained when BHMMs and proposed methods were applied to well-known HTR
databases. In particular, we ranked first at the Arabic Handwriting Recognition Competition
organized during the 12th International Conference on Frontiers in Handwriting Recognition
(ICFHR 2010), and at the Arabic Recognition Competition: Multi-font Multi-size Digitally
Represented Text organized during the 11th International Conference on Document Analysis
and Recognition (ICDAR 2011).
In the last part of this thesis we propose a method for training BHMM classifiers using discriminative training criteria, instead of the conventionalMaximum Likelihood Estimation
(MLE). Specifically, we propose a log-linear classifier for binary data based on the BHMM
classifier. Parameter estimation of this model can be carried out using discriminative training
criteria for log-linear models. In particular, we show the formulae for several MMI based
criteria. Finally, we prove the equivalence between both classifiers, hence, discriminative
training of a BHMM classifier can be carried out by obtaining its equivalent log-linear classifier.
Reported results show that discriminative BHMMs clearly outperform conventional
generative BHMMs.Giménez Pastor, A. (2014). Bernoulli HMMs for Handwritten Text Recognition [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/37978TESI
- …