
    Math Search for the Masses: Multimodal Search Interfaces and Appearance-Based Retrieval

    We summarize math search engines and search interfaces produced by the Document and Pattern Recognition Lab in recent years, in particular the min math search interface and the Tangent search engine. Source code for both systems is publicly available. "The Masses" refers to our emphasis on creating systems for mathematical non-experts, who may be looking to define unfamiliar notation, or to browse documents based on the visual appearance of formulae rather than their mathematical semantics. Comment: Paper for Invited Talk at 2015 Conference on Intelligent Computer Mathematics (July, Washington DC)

    Dysarthric Speech Recognition and Offline Handwriting Recognition using Deep Neural Networks

    Millions of people around the world are diagnosed with neurological disorders such as Parkinson’s, Cerebral Palsy or Amyotrophic Lateral Sclerosis. As the disease progresses, neurological damage causes patients to lose control of their muscles, and their speech deteriorates. This deterioration is due to a neuromotor condition that limits manipulation of the articulators of the vocal tract, collectively called dysarthria. Even though dysarthric speech is grammatically and syntactically correct, it is difficult for humans to understand and for Automatic Speech Recognition (ASR) systems to decipher. With the emergence of deep learning, speech recognition systems have improved considerably compared to traditional speech recognition systems, which use sophisticated preprocessing techniques to extract speech features. In this digital era there are still many handwritten documents, many of which need to be digitized. Offline handwriting recognition involves recognizing handwritten characters from images of handwritten text (i.e., scanned documents). This is an interesting task as it combines sequence learning with computer vision. The task is more difficult than Optical Character Recognition (OCR), because handwritten letters can be written in a virtually unlimited variety of styles. This thesis proposes exploiting deep learning techniques like Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) for offline handwriting recognition. For speech recognition, we compare traditional methods with recent deep learning methods. We also apply speaker adaptation methods at both the feature level and the parameter level to improve recognition of dysarthric speech.

    Introducing Handwriting into a Multimodal LaTeX Formula Editor

    Handwriting has been shown to be a useful input modality for math. However, math recognizers are imperfect, especially when recognizing complex expressions. Instead of improving the recognizer itself, we explore ways to best visualize the recognizer's output to help the user fix recognition mistakes more efficiently. To do this, we propose changes to the visual editing operations in MathDeck, a math-aware search engine and formula editor, as well as the addition of an n-best list of results for each symbol in the recognizer's output. We present two experiments to help us find good ways to help users fix errors in the recognizer, and to test whether these changes help novices input formulas more efficiently than they would if they did not have handwriting as an input modality. In the first experiment, users had the option to fix errors with an in-place drop-down menu of alternate symbols, a side symbol correction panel, or by typing the symbols themselves or dragging them from a symbol palette. In our experiment, most users preferred to fix the errors manually by typing the correct symbols or using the symbol palette. In the second experiment, participants entered formulas using handwriting and/or LaTeX. We found evidence that suggests that novices can input formulas faster when they have access to handwriting, but experts still do better when they can just type LaTeX.

    Writer adaptation for offline text recognition: An exploration of neural network-based methods

    Handwriting recognition has seen significant success with the use of deep learning. However, a persistent shortcoming of neural networks is that they are not well-equipped to deal with shifting data distributions. In the field of handwritten text recognition (HTR), this shows itself in poor recognition accuracy for writers that are not similar to those seen during training. An ideal HTR model should be adaptive to new writing styles in order to handle the vast number of possible writing styles. In this paper, we explore how HTR models can be made writer adaptive by using only a handful of examples from a new writer (e.g., 16 examples) for adaptation. Two HTR architectures are used as base models, using a ResNet backbone along with either an LSTM or Transformer sequence decoder. Using these base models, two methods are considered to make them writer adaptive: 1) model-agnostic meta-learning (MAML), an algorithm commonly used for tasks such as few-shot classification, and 2) writer codes, an idea originating from automatic speech recognition. Results show that an HTR-specific version of MAML known as MetaHTR improves performance compared to the baseline with a 1.4 to 2.0 improvement in word error rate (WER). The improvement due to writer adaptation is between 0.2 and 0.7 WER, where a deeper model seems to lend itself better to adaptation using MetaHTR than a shallower model. However, applying MetaHTR to larger HTR models or sentence-level HTR may become prohibitive due to its high computational and memory requirements. Lastly, writer codes based on learned features or Hinge statistical features did not lead to improved recognition performance. Comment: 21 pages including appendices, 6 figures, 10 tables
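
    The word error rate quoted above is conventionally the word-level edit distance between the recognized text and the reference, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration, and the abstract does not say how the paper tokenizes or normalizes text.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev_row[j] = edit distance between the reference words consumed
    # so far and the first j hypothesis words.
    prev_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr_row = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr_row.append(
                min(
                    prev_row[j] + 1,                      # deletion
                    curr_row[j - 1] + 1,                  # insertion
                    prev_row[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev_row = curr_row

    return prev_row[-1] / len(ref_words)
```

    For instance, `word_error_rate("a b c d", "a c d")` counts one deleted word out of four reference words and returns 0.25.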

    Embedding and learning with signatures

    Sequential and temporal data arise in many fields of research, such as quantitative finance, medicine, or computer vision. The present article is concerned with a novel approach for sequential learning, called the signature method, and rooted in rough path theory. Its basic principle is to represent multidimensional paths by a graded feature set of their iterated integrals, called the signature. This approach relies critically on an embedding principle, which consists in representing discretely sampled data as paths, i.e., functions from [0,1] to R^d. After a survey of machine learning methodologies for signatures, we investigate the influence of embeddings on prediction accuracy with an in-depth study of three recent and challenging datasets. We show that a specific embedding, called lead-lag, is systematically better, whatever the dataset or algorithm used. Moreover, we emphasize through an empirical study that computing signatures over the whole path domain does not lead to a loss of local information. We conclude that, with a good embedding, the signature combined with a simple algorithm achieves results competitive with state-of-the-art, domain-specific approaches.
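
    To make the embedding and signature ideas concrete, here is a small sketch assuming one common form of the lead-lag embedding for a one-dimensional series and a signature truncated at depth 2 (the truncation depth is chosen here only for brevity). The function names `lead_lag` and `signature_depth2` are hypothetical, and the article may use a different lead-lag variant or an existing signature library; the abstract does not say.

```python
from typing import List, Sequence, Tuple


def lead_lag(series: Sequence[float]) -> List[Tuple[float, float]]:
    """Embed x_0..x_n as a piecewise-linear path in R^2 with 2n+1 points.

    The first coordinate (lead) moves to the next sample one step
    before the second coordinate (lag) follows it.
    """
    doubled = [x for x in series for _ in range(2)]
    return list(zip(doubled[1:], doubled[:-1]))


def signature_depth2(path: Sequence[Sequence[float]]):
    """Signature terms of levels 1 and 2 for a piecewise-linear path.

    level1[i]    = total increment of coordinate i
    level2[i][j] = iterated integral of coordinate i against coordinate j
    Segments are appended one at a time using Chen's relation.
    """
    dim = len(path[0])
    level1 = [0.0] * dim
    level2 = [[0.0] * dim for _ in range(dim)]
    for start, end in zip(path, path[1:]):
        step = [e - s for s, e in zip(start, end)]
        for i in range(dim):
            for j in range(dim):
                # level1 still holds the increment accumulated before this segment.
                level2[i][j] += level1[i] * step[j] + 0.5 * step[i] * step[j]
        for i in range(dim):
            level1[i] += step[i]
    return level1, level2


path = lead_lag([0, 1, 3])
level1, level2 = signature_depth2(path)
```

    In this example the difference `level2[0][1] - level2[1][0]` equals 5, the sum of squared increments of the series (1 + 4), which illustrates how a lead-lag path exposes information that the raw increments alone do not.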

    Discovering real-world usage scenarios for a multimodal math search interface

    To use math expressions in search, current search engines require knowing expression names or using a structure editor or string encoding (e.g., LaTeX) to enter expressions. This is unfortunate for people who are not math experts, as this can lead to an intention gap between the math query they wish to express and what the interface will allow. min is a search interface that supports drawing expressions on a canvas using a mouse/touch, keyboard and images. We designed a user study to examine how the multimodal interface of min changes search behavior for mathematical non-experts, and to discover real-world usage scenarios. Participants demonstrated increased use of math expressions in queries when using min. There was little difference in task success reported by participants using min vs. text-based search, but the majority of participants appreciated the multimodal input, and identified real-world scenarios in which they would like to use systems like min.

    The adaptation of handwriting recognition system user interface in preschool literacy learning courseware

    Educational software, or simply courseware, has become the heart of computer usage in classrooms. With a variety of courseware introduced, the discussion of which courseware assists students the most continues. This paper describes our experience in developing a literacy courseware named handwriting leaning number (HLN) for preschool children. HLN adapts the handwriting recognition system interface, with a stylus pen as its input modality. We believed that, by using the stylus pen, the children could perform better in writing exercises. To test this assumption, we conducted a user satisfaction test using a modified questionnaire for user interface satisfaction (QUIS) version 5.5. Results from the test indicated that the children tended to be satisfied with the user interface designed for HLN, as the mean score obtained was 3.864 with a standard deviation of 0.6879. On the other hand, problems regarding the precision and usability of HLN are also reported, and solutions to these problems are recommended.