35 research outputs found

    On the Use of Attention Mechanism in a Seq2Seq based Approach for Off-line Handwritten Digit String Recognition

    Get PDF
    International audienceIn this work, we investigate the use of the attention mechanism in deep learning for a better reading of handwritten digit strings in digitized images. The proposed recognition system built upon a CNN (Convolutional Neural Network) and two RNNs (Recurrent Neural Networks), acting as Encoder and Decoder and using the attention mechanism. We used a 1D mechanism for attention location with a "soft" alignment attention which has the peculiarity of having an easily calculable gradient and thus to integrate well with the network. Experimental results on data from ORAND-CAR A, ORAND-CAR B and CVL HDS databases compare favorably to other published methods

    An End-to-End Approach for Recognition of Modern and Historical Handwritten Numeral Strings

    Full text link
    An end-to-end solution for handwritten numeral string recognition is proposed, in which the numeral string is considered as composed of objects automatically detected and recognized by a YoLo-based model. The main contribution of this paper is to avoid heuristic-based methods for string preprocessing and segmentation, the need for task-oriented classifiers, and also the use of specific constraints related to the string length. A robust experimental protocol based on several numeral string datasets, including one composed of historical documents, has shown that the proposed method is a feasible end-to-end solution for numeral string recognition. Besides, it reduces the complexity of the string recognition task considerably since it drops out classical steps, in special preprocessing, segmentation, and a set of classifiers devoted to strings with a specific length

    A sequential handwriting recognition model based on a dynamically configurable CRNN

    Get PDF
    Handwriting recognition refers to recognizing a handwritten input that includes character(s) or digit(s) based on an image. Because most applications of handwriting recognition in real life contain sequential text in various languages, there is a need to develop a dynamic handwriting recognition system. Inspired by the neuroevolutionary technique, this paper proposes a Dynamically Configurable Convolutional Recurrent Neural Network (DC-CRNN) for the handwriting recognition sequence modeling task. The proposed DC-CRNN is based on the Salp Swarm Optimization Algorithm (SSA), which generates the optimal structure and hyperparameters for Convolutional Recurrent Neural Networks (CRNNs). In addition, we investigate two types of encoding techniques used to translate the output of optimization to a CRNN recognizer. Finally, we proposed a novel hybridized SSA with Late Acceptance Hill-Climbing (LAHC) to improve the exploitation process. We conducted our experiments on two well-known datasets, IAM and IFN/ENIT, which include both the Arabic and English languages. The experimental results have shown that LAHC significantly improves the SSA search process. Therefore, the proposed DC-CRNN outperforms the handcrafted CRNN methods

    Towards robust real-world historical handwriting recognition

    Get PDF
    In this thesis, we make a bridge from the past to the future by using artificial-intelligence methods for text recognition in a historical Dutch collection of the Natuurkundige Commissie that explored Indonesia (1820-1850). In spite of the successes of systems like 'ChatGPT', reading historical handwriting is still quite challenging for AI. Whereas GPT-like methods work on digital texts, historical manuscripts are only available as an extremely diverse collections of (pixel) images. Despite the great results, current DL methods are very data greedy, time consuming, heavily dependent on the human expert from the humanities for labeling and require machine-learning experts for designing the models. Ideally, the use of deep learning methods should require minimal human effort, have an algorithm observe the evolution of the training process, and avoid inefficient use of the already sparse amount of labeled data. We present several approaches towards dealing with these problems, aiming to improve the robustness of current methods and to improve the autonomy in training. We applied our novel word and line text recognition approaches on nine data sets differing in time period, language, and difficulty: three locally collected historical Latin-based data sets from Naturalis, Leiden; four public Latin-based benchmark data sets for comparability with other approaches; and two Arabic data sets. Using ensemble voting of just five neural networks, a level of accuracy was achieved which required hundreds of neural networks in earlier studies. Moreover, we increased the speed of evaluation of each training epoch without the need of labeled data
    corecore