2,493 research outputs found

    Representation, Recognition and Collaboration with Digital Ink

    Pen input for computing devices is now widespread, providing a promising interaction mechanism for many purposes. Nevertheless, the diverse nature of digital ink and varied application domains still present many challenges. First, the sampling rate and resolution of pen-based devices keep improving, making input data more costly to process and store. At the same time, existing applications typically record digital ink either in proprietary formats, which are restricted to single platforms and consequently lack portability, or simply as images, which lose important information. Moreover, in certain domains such as mathematics, although current systems now achieve good recognition rates on individual symbols, recognition of complete expressions generally remains a problem because no effective method reliably identifies the spatial relationships among symbols. Last, but not least, existing digital ink collaboration tools are platform-dependent and typically allow only one input method to be used at a time. Together with the absence of recognition, this has placed significant limitations on what can be done. In this thesis, we investigate these issues and make contributions to each. We first present an algorithm that can accurately approximate a digital ink curve by selecting a certain subset of points from the original trace. This allows a compact representation of digital ink for efficient processing and storage. We then describe an algorithm that can automatically identify certain important features in handwritten symbols. Identifying these features can help solve a number of problems, such as improving two-dimensional mathematical recognition. Last, we present a framework for multi-user online collaboration in a pen-based and graphical environment. This framework is portable across multiple platforms and allows multimodal interactions in collaborative sessions.
To demonstrate our ideas, we present InkChat, a whiteboard application that can be used to conduct collaborative sessions on a shared canvas. It allows participants to use voice and digital ink independently and simultaneously, which has been found useful in remote collaboration.
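The abstract above does not describe how the thesis selects its subset of points. As a rough illustration of approximating a trace by a subset of its own samples, the sketch below uses the well-known Ramer-Douglas-Peucker simplification as a stand-in, not the thesis's algorithm. The names `simplify_trace` and `epsilon` (the maximum allowed deviation, in the same units as the coordinates) are assumptions made for this example.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify_trace(points, epsilon):
    """Ramer-Douglas-Peucker: keep a subset of the sampled points such that
    every dropped point lies within epsilon of the simplified polyline."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= epsilon:
        # Every interior point is close enough to the chord.
        return [first, last]
    # Otherwise keep the farthest point and recurse on both halves.
    left = simplify_trace(points[: index + 1], epsilon)
    right = simplify_trace(points[index:], epsilon)
    return left[:-1] + right


trace = [(0, 0), (1, 0.01), (2, 0), (3, 2), (4, 0)]
compact = simplify_trace(trace, epsilon=0.1)
```

With this input, the nearly collinear sample `(1, 0.01)` is dropped while the sharp corner at `(3, 2)` is kept. A larger `epsilon` trades fidelity for a more compact representation.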

    Off-line Arabic Handwriting Recognition System Using Fast Wavelet Transform

    In this research, an off-line handwriting recognition system for the Arabic alphabet is introduced. The system contains three main stages: preprocessing, segmentation, and recognition. In the preprocessing stage, the Radon transform was used in the design of algorithms for page, line, and word skew correction as well as for word slant correction. In the segmentation stage, a Hough transform approach was used for line extraction. For line-to-word and word-to-character segmentation, a statistical method using a mathematical representation of the binary images of lines and words was used. Unlike most current handwriting recognition systems, our system simulates the human mechanism for image recognition, where images are encoded and saved in memory as groups according to their similarity to each other. Characters are decomposed into coefficient vectors using the fast wavelet transform; the vectors that represent a character in its different possible shapes are then saved as groups, with one representative for each group. Recognition is achieved by comparing the vector of the character to be recognized with the group representatives. Experiments showed that the proposed system achieves the recognition task with 90.26% accuracy. The system needs at most 3.41 seconds to recognize a single character in a text of 15 lines where each line has 10 words on average.
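The recognition step described above (wavelet coefficients compared against one representative per group) can be sketched as follows. This is a minimal illustration under several assumptions that the abstract does not confirm: the Haar wavelet, a 1-D transform over a flattened pixel vector, and Euclidean distance for the comparison. The group labels are hypothetical.

```python
import math


def haar_fwt(signal):
    """Full 1-D orthonormal Haar fast wavelet transform.

    len(signal) must be a power of two. Returns the coarsest approximation
    coefficient followed by detail coefficients from coarse to fine."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    coeffs = [float(v) for v in signal]
    scale = 1.0 / math.sqrt(2.0)
    length = n
    while length > 1:
        half = length // 2
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) * scale for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) * scale for i in range(half)]
        coeffs[:length] = approx + detail
        length = half
    return coeffs


def nearest_group(vector, representatives):
    """Return the label whose representative vector is closest to `vector`."""
    return min(representatives, key=lambda label: math.dist(vector, representatives[label]))


# Hypothetical group representatives built from tiny 4-pixel "images".
representatives = {
    "group_a": haar_fwt([0, 1, 1, 0]),
    "group_b": haar_fwt([1, 0, 0, 1]),
}
label = nearest_group(haar_fwt([0, 1, 0.9, 0]), representatives)
```

Because this transform is orthonormal, distances between coefficient vectors equal distances between the original vectors; a practical system would more likely compare only a truncated set of coarse coefficients to gain speed and tolerance to small shape variations.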

    Dysgraphia detection based on convolutional neural networks and child-robot interaction

    Dysgraphia is a disorder of written expression that affects the writing of letters, words, and numbers. It is one of the learning disabilities encountered in the educational sector and has a strong impact on the academic, motor, and emotional aspects of the individual. The purpose of this study is to identify dysgraphia in children by creating an engaging robot-mediated activity and using it to collect a new dataset of Latin digits written exclusively by children aged 6 to 12 years. An interactive scenario that explains and demonstrates the steps involved in handwriting digits is created using the verbal and non-verbal behaviors of the social humanoid robot Nao. In this way, we collected a dataset that contains 11,347 characters written by 174 participants with and without dysgraphia. Given the success of deep learning technologies in various fields, we developed an approach based on these methods and tested it on the collected dataset. We performed a classification with a convolutional neural network (CNN) to identify dysgraphia in children. The results show that the performance of our model is promising, reaching an accuracy of 91%.

    Anna Maria Van Schurman’s Chinese Calligraphy

    Calligraphy is an understudied aspect of the reception of Chinese art in early modern Europe. Chinese visitors to Middelburg (1601) and Amsterdam (1654) first demonstrated it as a cultural practice. Other written samples circulated in the Dutch Republic, an emporium for Chinese goods. This article focuses on a previously unknown participant in this exchange: Anna Maria van Schurman, Europe’s first female university student, who had mastered various Asian scripts and was expected to try her hand at Chinese and Japanese. In 1637 Andreas Colvius sent her samples of East Asian writing to copy ‘by her own hand’. This exchange makes possible a transcultural study of the calligraphic gift. Via the popular writings of Matteo Ricci, Van Schurman’s correspondents may have learned about the role of calligraphy in fostering social relationships in late Ming China. Some of the visual and material qualities of East Asian writing must have made it look like a fitting tribute to a female European scholar of high profile and, in being exchanged as a gift, calligraphy acquired new meanings even while remaining illegible. In seventeenth-century China and Europe, the friendly exchange of calligraphy expressed new forms of sociability

    Embedding and learning with signatures

    Sequential and temporal data arise in many fields of research, such as quantitative finance, medicine, or computer vision. The present article is concerned with a novel approach for sequential learning, called the signature method and rooted in rough path theory. Its basic principle is to represent multidimensional paths by a graded feature set of their iterated integrals, called the signature. This approach relies critically on an embedding principle, which consists in representing discretely sampled data as paths, i.e., functions from [0,1] to R^d. After a survey of machine learning methodologies for signatures, we investigate the influence of embeddings on prediction accuracy with an in-depth study of three recent and challenging datasets. We show that a specific embedding, called lead-lag, is systematically better, whatever the dataset or algorithm used. Moreover, we emphasize through an empirical study that computing signatures over the whole path domain does not lead to a loss of local information. We conclude that, with a good embedding, the signature combined with a simple algorithm achieves results competitive with state-of-the-art, domain-specific approaches.
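To make the two ingredients named above concrete, the sketch below builds a lead-lag embedding of a scalar sequence and computes the signature of the resulting piecewise-linear path, truncated at depth 2, by accumulating segments with Chen's identity. The function names are assumptions, and the lead-lag convention used here (the lead coordinate moves first, then the lag catches up) is one common choice; the article may use a different variant.

```python
def lead_lag(values):
    """Embed a scalar sequence as a 2-D piecewise-linear path (lead, lag)."""
    path = [(values[0], values[0])]
    for prev, cur in zip(values, values[1:]):
        path.append((cur, prev))  # lead coordinate moves first
        path.append((cur, cur))   # lag coordinate catches up
    return path


def signature_depth2(path):
    """Signature of a piecewise-linear path, truncated at depth 2.

    Returns (level1, level2), where level1[i] is the total increment in
    coordinate i and level2[i][j] is the iterated integral of coordinate i
    against coordinate j."""
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for a, b in zip(path, path[1:]):
        delta = [b[k] - a[k] for k in range(d)]
        # Chen's identity for appending one linear segment: the cross term
        # level1[i] * delta[j] plus the segment's own term delta[i] * delta[j] / 2.
        for i in range(d):
            for j in range(d):
                level2[i][j] += level1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            level1[i] += delta[i]
    return level1, level2


path = lead_lag([1, 3, 2])
level1, level2 = signature_depth2(path)
```

For the sequence `[1, 3, 2]`, the antisymmetric part `level2[0][1] - level2[1][0]` equals 5, which is the sum of squared increments (4 + 1) of the original sequence. This suggests why the lead-lag embedding can expose information that the raw one-dimensional path does not.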

    Bernoulli HMMs for Handwritten Text Recognition

    In recent years, Hidden Markov Models (HMMs) have received significant attention in the task of off-line handwritten text recognition (HTR). As in automatic speech recognition (ASR), HMMs are used to model the probability of an observation sequence, given its corresponding text transcription. However, in contrast to what happens in ASR, in HTR there is no standard set of local features used by most of the proposed systems. In this thesis we propose the use of raw binary pixels as features, in conjunction with models that deal more directly with the binary data. In particular, we propose the use of Bernoulli HMMs (BHMMs), that is, conventional HMMs in which Gaussian (mixture) distributions have been replaced by Bernoulli (mixture) probability functions. The objective is twofold: on the one hand, this allows us to better model the binary nature of text images (foreground/background) using BHMMs. On the other hand, this guarantees that no discriminative information is filtered out during feature extraction (most available HTR datasets can be easily binarized without a relevant loss of information). In this thesis, all the HMM theory required to develop an HMM-based HTR toolkit is reviewed and adapted to the case of BHMMs. Specifically, we begin by defining a simple classifier based on BHMMs with Bernoulli probability functions at the states, and we end with an embedded Bernoulli mixture HMM recognizer for continuous HTR. Regarding the binary features, we propose a simple binary feature extraction process without significant loss of information. All input images are scaled and binarized, in order to easily reinterpret them as sequences of binary feature vectors. Two extensions are proposed to this basic feature extraction method: the use of a sliding window in order to better capture the context, and a repositioning method in order to better deal with vertical distortions.
Competitive results were obtained when BHMMs and the proposed methods were applied to well-known HTR databases. In particular, we ranked first at the Arabic Handwriting Recognition Competition organized during the 12th International Conference on Frontiers in Handwriting Recognition (ICFHR 2010), and at the Arabic Recognition Competition: Multi-font Multi-size Digitally Represented Text organized during the 11th International Conference on Document Analysis and Recognition (ICDAR 2011).
In the last part of this thesis we propose a method for training BHMM classifiers using discriminative training criteria, instead of the conventional Maximum Likelihood Estimation (MLE). Specifically, we propose a log-linear classifier for binary data based on the BHMM classifier. Parameter estimation of this model can be carried out using discriminative training criteria for log-linear models. In particular, we show the formulae for several MMI-based criteria. Finally, we prove the equivalence between both classifiers; hence, discriminative training of a BHMM classifier can be carried out by obtaining its equivalent log-linear classifier. Reported results show that discriminative BHMMs clearly outperform conventional generative BHMMs.
Giménez Pastor, A. (2014). Bernoulli HMMs for Handwritten Text Recognition [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/37978
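As an illustration of the simplest model mentioned in the BHMM abstract above (an HMM with a single Bernoulli probability function per state), the sketch below scores a sequence of binary column vectors with the standard forward algorithm. It is not the thesis's toolkit: the function names, the toy parameters, and the absence of mixtures, sliding windows, and repositioning are all simplifications assumed for this example. Prototype probabilities are assumed to lie strictly between 0 and 1 so that the logarithms are defined.

```python
import math


def bernoulli_log_prob(x, prototype):
    """log P(x | prototype) for a binary vector x with independent pixels,
    where prototype[k] is the probability that pixel k equals 1."""
    return sum(math.log(p if bit else 1.0 - p) for bit, p in zip(x, prototype))


def forward_log_likelihood(observations, initial, transitions, prototypes):
    """log P(observations) under an HMM with one Bernoulli prototype per state.

    Plain probability-space forward recursion; long sequences would need
    scaling or log-space arithmetic to avoid underflow."""
    n = len(initial)
    alpha = [
        initial[s] * math.exp(bernoulli_log_prob(observations[0], prototypes[s]))
        for s in range(n)
    ]
    for x in observations[1:]:
        alpha = [
            sum(alpha[r] * transitions[r][s] for r in range(n))
            * math.exp(bernoulli_log_prob(x, prototypes[s]))
            for s in range(n)
        ]
    return math.log(sum(alpha))


# Toy two-state left-to-right model over one-pixel "columns" (hypothetical values).
initial = [1.0, 0.0]
transitions = [[0.5, 0.5], [0.0, 1.0]]
prototypes = [[0.9], [0.2]]
log_lik = forward_log_likelihood([[1], [0]], initial, transitions, prototypes)
```

A classifier of the kind the abstract describes would train one such model per class and assign an image to the class whose model (combined with a class prior) gives the highest score.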