
    Curriculum Learning for Handwritten Text Line Recognition

    Recurrent Neural Networks (RNNs) have recently achieved the best performance in off-line Handwritten Text Recognition. At the same time, training RNNs by gradient descent leads to slow convergence, and training times are particularly long when the training database consists of full lines of text. In this paper, we propose an easy way to accelerate stochastic gradient descent in this set-up, and in the general context of learning to recognize sequences. The principle is called Curriculum Learning, or shaping. The idea is to first learn to recognize short sequences before training on all available training sequences. Experiments on three different handwritten text databases (Rimes, IAM, OpenHaRT) show that a simple implementation of this strategy can significantly speed up the training of RNNs for Text Recognition, and even significantly improve performance in some cases.
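The curriculum idea above can be sketched as a batching scheme that starts with the shortest training lines and gradually admits longer ones. This is a minimal illustrative sketch, not the paper's implementation; the widening schedule (`epochs // 2`) is an assumption chosen for the example.

```python
def curriculum_batches(samples, lengths, epochs, batch_size):
    """Yield (epoch, batch) pairs over a length-sorted curriculum.

    Early epochs see only the shortest sequences; the visible pool
    widens linearly until the full data set is used.
    """
    order = sorted(range(len(samples)), key=lambda i: lengths[i])
    for epoch in range(epochs):
        # Fraction of the length-sorted data visible this epoch
        # (assumed schedule: full data reached halfway through training).
        frac = min(1.0, (epoch + 1) / max(1, epochs // 2))
        cutoff = max(batch_size, int(frac * len(samples)))
        pool = [samples[i] for i in order[:cutoff]]
        for start in range(0, len(pool), batch_size):
            yield epoch, pool[start:start + batch_size]
```

In a real setup the batches would feed a stochastic-gradient training loop for the RNN; only the sampling order changes, not the model or loss.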

    A Biologically Plausible SOM Representation of the Orthographic Form of 50,000 French Words

    Recently, an important aspect of human visual word recognition has been characterized: the letter position is encoded in our brain using an explicit representation of order based on letter pairs, the open-bigram coding [15]. We hypothesize that spelling has evolved in order to minimize reading errors; therefore, word recognition using bigrams instead of letters should be more efficient. First, we study the influence of the size of the neighborhood, which defines the number of bigrams per word, on the performance of the matching between bigrams and words. Our tests are conducted against one of the best recognition solutions used today by the industry, which matches letters to words. Second, we build a cortical map representation of the words in the bigram space, which requires numerous experiments in order to achieve a satisfactory projection. Third, we develop an ultra-fast version of the self-organizing map in order to achieve learning in minutes instead of months.
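Open-bigram coding as described above can be sketched in a few lines: a word is represented by the set of ordered letter pairs whose positions differ by at most the neighborhood size. The function name and default gap are illustrative assumptions, not from the paper.

```python
def open_bigrams(word, max_gap=2):
    """Return the set of open bigrams of `word`: ordered letter pairs
    (word[i], word[j]) with i < j and j - i <= max_gap, where max_gap
    is the neighborhood size that fixes how many bigrams a word yields."""
    return {word[i] + word[j]
            for i in range(len(word))
            for j in range(i + 1, min(len(word), i + max_gap + 1))}
```

For example, with a neighborhood size of 2, "word" yields the bigrams wo, wr, or, od, rd; a larger neighborhood produces more bigrams per word and a denser representation.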

    SIMARA: a database for key-value information extraction from full pages

    We propose a new database for information extraction from historical handwritten documents. The corpus includes 5,393 finding aids from six different series, dating from the 18th-20th centuries. Finding aids are handwritten documents that contain metadata describing older archives. They are stored in the National Archives of France and are used by archivists to identify and find archival documents. Each document is annotated at page level and contains seven fields to retrieve. The localization of each field is not provided, so this dataset encourages research on segmentation-free systems for information extraction. We propose a model based on the Transformer architecture trained for end-to-end information extraction, and provide three sets for training, validation and testing to ensure fair comparison with future works. The database is freely accessible at https://zenodo.org/record/7868059

    Visual analytics for the interpretation of fluency tests during Alzheimer evaluation

    A possible way to evaluate the progress of Alzheimer disease is to conduct the Isaac set test [13, 14]. In this activity, patients are asked to cite the largest possible number of city names within a minute. Since the city names are handwritten very quickly by a medical practitioner, some cities are abbreviated or poorly written. In order to analyze such data, medical practitioners need to first digitize the notes and clean the dataset. Because these tasks are intricate and error-prone, we propose a novel set of tools, involving interactive visualization techniques, to help medical practitioners in the digitization and data-cleaning process. This system will be tested as part of an ongoing longitudinal study involving 9,500 patients.

    A high-performance word recognition system for the biological fieldnotes of the Natuurkundige Commissie

    In this research, a high word-recognition accuracy was achieved using an e-Science-friendly deep learning method on a highly multilingual data set. Deep learning requires large training sets; therefore, we use an auxiliary data set in addition to the target data set, which is derived from the collection Natuurkundige Commissie, years 1820-1850. The auxiliary historical data set is from another writer (van Oort). The method concerns a compact ensemble of Convolutional Bidirectional Long Short-Term Memory neural networks. A dual-state word-beam search combined with an adequate label-coding scheme is used for decoding the connectionist temporal classification layer. Our approach increased the recognition accuracy on words that the recognizer has never seen, i.e., out-of-vocabulary (OOV) words, by 3.5 percentage points. The use of extraneous training data increased the performance on in-vocabulary words by 1 percentage point. The network architectures in an ensemble are generated randomly and autonomously, such that our system can be deployed on an e-Science server. The OOV capability allows scholars to search for words that did not exist in the original training set.
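The decoding step described above maps per-frame CTC label predictions to a word. As a much simpler stand-in for the paper's dual-state word-beam search, the standard best-path CTC decoding rule (collapse repeated labels, drop blanks) can be sketched as follows; the blank index is an assumption.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Best-path CTC decoding: collapse consecutive repeated labels,
    then remove blank symbols. A simplified illustration of the kind
    of post-processing applied to a CTC output layer, not the
    dual-state word-beam search used in the paper."""
    out, prev = [], None
    for lab in frame_labels:
        if lab != prev and lab != blank:
            out.append(lab)
        prev = lab
    return out
```

Word-beam search improves on this greedy rule by constraining (or softly guiding) the decoded sequence with a dictionary, which is what makes in-vocabulary recognition stronger while the label-coding scheme preserves OOV capability.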