85 research outputs found

    Convolutional Bidirectional Variational Autoencoder for Image Domain Translation of Dotted Arabic Expiration

    Full text link
    This paper proposes a Ladder Bottom-up Convolutional Bidirectional Variational Autoencoder (LCBVAE) architecture for the encoder and decoder, trained on image translation of dotted Arabic expiration dates by reconstructing them as filled-in expiration dates. We employed a customized and adapted Convolutional Recurrent Neural Network (CRNN) model to meet our specific requirements and enhance its performance in our context, and then trained the custom CRNN model on filled-in images from the years 2019 to 2027 to extract the expiration dates and assess the performance of LCBVAE on expiration date recognition. The pipeline (LCBVAE+CRNN) can then be integrated into automated sorting systems to extract expiry dates and sort products accordingly during the manufacturing stage. Additionally, it can replace the manual entry of expiration dates, which can be time-consuming and inefficient for merchants. Because dotted Arabic expiration date images were not available, we created an Arabic dot-matrix TrueType Font (TTF) to generate synthetic images. We trained the model on 59,902 unrealistic synthetic date images and tested it on 3,287 realistic synthetic date images from the years 2019 to 2027, represented as yyyy/mm/dd. In our study, we demonstrated the significance of the latent bottleneck layer in improving generalization when its size is increased up to 1024 in downstream transfer learning tasks such as image translation. The proposed approach achieved an accuracy of 97% on image translation using the LCBVAE architecture, which can be generalized to other downstream learning tasks such as image translation and reconstruction. Comment: 15 pages, 10 figures
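
    To illustrate the general idea of a convolutional VAE with a large latent bottleneck used for image-to-image translation (this is a minimal sketch, not the authors' exact ladder/bidirectional LCBVAE), the snippet below maps a dotted date image to a filled-in target; the 1024-dimensional latent follows the abstract, while the 64x256 grayscale input shape and layer widths are assumptions.

```python
# Minimal convolutional VAE sketch for dotted -> filled-in date image translation.
# Assumptions: 1 x 64 x 256 grayscale inputs, 1024-d latent bottleneck, MSE + KL loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvVAE(nn.Module):
    def __init__(self, latent_dim=1024):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),   # 32 x 32 x 128
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 64 x 16 x 64
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(), # 128 x 8 x 32
            nn.Flatten(),
        )
        feat = 128 * 8 * 32
        self.fc_mu = nn.Linear(feat, latent_dim)
        self.fc_logvar = nn.Linear(feat, latent_dim)
        self.fc_dec = nn.Linear(latent_dim, feat)
        self.decoder = nn.Sequential(
            nn.Unflatten(1, (128, 8, 32)),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.decoder(self.fc_dec(z)), mu, logvar

def vae_loss(recon, target, mu, logvar, beta=1.0):
    # Reconstruction against the *filled-in* target image plus KL regularization.
    rec = F.mse_loss(recon, target, reduction="sum") / target.size(0)
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp()) / target.size(0)
    return rec + beta * kld

# Usage sketch: dotted input batch x, filled-in target batch y, both [B, 1, 64, 256].
model = ConvVAE()
x, y = torch.rand(4, 1, 64, 256), torch.rand(4, 1, 64, 256)
recon, mu, logvar = model(x)
loss = vae_loss(recon, y, mu, logvar)
```

    The translated outputs would then be fed to the CRNN stage for date recognition.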

    Unsupervised feature learning for writer identification

    Get PDF
    Our work presents research on unsupervised feature learning methods for writer identification and retrieval. We want to study the impact of deep learning alternatives in this field by proposing methodologies that explore different uses of autoencoder networks. Taking a patch extraction algorithm as a starting point, we aim to obtain characteristics from patches of handwritten documents in an unsupervised way, meaning no label information is used for the task. To verify whether the extracted features are valid for writer identification, the approaches we propose are evaluated and compared with state-of-the-art methods on the ICDAR2013 and ICDAR2017 writer identification datasets.
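
    One possible use of an autoencoder in this setting (a sketch under assumptions, not the paper's specific variants) is to train a convolutional autoencoder on handwriting patches and treat the encoder output as an unsupervised patch descriptor; the 32x32 patch size, layer widths and code dimension below are illustrative.

```python
# Sketch: convolutional autoencoder whose encoder output serves as an unsupervised
# patch descriptor for writer identification/retrieval. Patch size (32x32 grayscale)
# and layer widths are assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn

class PatchAutoencoder(nn.Module):
    def __init__(self, code_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),   # 32 x 16 x 16
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),  # 64 x 8 x 8
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, code_dim),                       # patch descriptor
        )
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, 64 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (64, 8, 8)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, patches):
        code = self.encoder(patches)
        return self.decoder(code), code

# Training uses only reconstruction error (no writer labels); at retrieval time the
# per-patch codes of a document could be aggregated (e.g. averaged) into a global
# descriptor and compared across documents with, say, cosine distance.
model = PatchAutoencoder()
patches = torch.rand(16, 1, 32, 32)
recon, codes = model(patches)
loss = nn.functional.mse_loss(recon, patches)
```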

    Recurrent Deep Divergence-based Clustering for simultaneous feature learning and clustering of variable length time series

    Full text link
    The task of clustering unlabeled time series and sequences entails a particular set of challenges, namely adequately modeling temporal relations and variable sequence lengths. If these challenges are not properly handled, the resulting clusters may be of suboptimal quality. As a key solution, we present a joint clustering and feature learning framework for time series based on deep learning. For a given set of time series, we train a recurrent network to represent, or embed, each time series in a vector space such that a divergence-based clustering loss function can discover the underlying cluster structure in an end-to-end manner. Unlike previous approaches, our model inherently handles multivariate time series of variable lengths and does not require specification of a distance measure in the input space. On a diverse set of benchmark datasets we illustrate that our proposed Recurrent Deep Divergence-based Clustering approach outperforms, or performs comparably to, previous approaches.
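
    The abstract does not spell out the network or the divergence loss, so the sketch below only illustrates the embedding side: a GRU consumes variable-length multivariate series via packed sequences and a soft cluster-assignment head sits on top. The Student's-t soft assignment is a simple stand-in for the paper's divergence-based clustering loss, and the input dimension, hidden size and cluster count are assumptions.

```python
# Sketch: recurrent embedding of variable-length multivariate time series plus a soft
# cluster-assignment head. The Student's-t assignment is a stand-in, not the paper's
# divergence-based loss; input dim, hidden size and cluster count are assumptions.
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_sequence

class RecurrentClusterer(nn.Module):
    def __init__(self, input_dim=3, hidden_dim=64, n_clusters=4):
        super().__init__()
        self.rnn = nn.GRU(input_dim, hidden_dim, batch_first=True)
        self.centroids = nn.Parameter(torch.randn(n_clusters, hidden_dim))

    def forward(self, padded, lengths):
        packed = pack_padded_sequence(padded, lengths, batch_first=True,
                                      enforce_sorted=False)
        _, h = self.rnn(packed)                      # h: [1, B, hidden_dim]
        emb = h.squeeze(0)                           # one vector per series
        # Soft assignment of each embedding to each centroid (Student's t kernel).
        d2 = torch.cdist(emb, self.centroids) ** 2
        q = 1.0 / (1.0 + d2)
        q = q / q.sum(dim=1, keepdim=True)
        return emb, q

# Usage: three multivariate series of different lengths, zero-padded into one batch.
series = [torch.randn(L, 3) for L in (50, 80, 65)]
lengths = torch.tensor([s.size(0) for s in series])
padded = pad_sequence(series, batch_first=True)
model = RecurrentClusterer()
embeddings, assignments = model(padded, lengths)     # [3, 64], [3, 4]
```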

    Robust recognition technique for handwritten Kannada character recognition using capsule networks

    Get PDF
    Automated reading of handwritten Kannada documents is highly challenging due to the presence of vowels, consonants and their modifiers. The variable nature of handwriting styles aggravates the complexity of machine reading of handwritten vowels and consonants. In this paper, our investigation is directed towards the design of a deep convolutional network with capsule and routing layers to efficiently recognize handwritten Kannada characters. The capsule network architecture is built of an input layer, two convolution layers, a primary capsule layer and routing capsule layers, followed by a tri-level dense convolution layer and an output layer. For experimentation, datasets are collected from more than 100 users, creating about 7769 training samples spanning 49 classes. Test samples of all 49 classes are collected separately from 3 to 5 users, creating a total of 245 samples of novel patterns. From the performance evaluation, a loss of 0.66% is obtained in the classification process; for 43 classes a precision of 100% is achieved with an accuracy of 99%, and for the remaining 6 classes an average accuracy of 95% is achieved with an average precision of 89%.
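
    To make the layer stack concrete, below is a minimal capsule-network sketch with dynamic routing-by-agreement. It omits the abstract's tri-level dense convolution stage, and all layer sizes, including the assumed 28x28 grayscale input, are illustrative assumptions rather than the authors' exact configuration.

```python
# Minimal capsule-network sketch with dynamic routing-by-agreement for 49-class
# Kannada character recognition. Layer sizes and the 28x28 grayscale input are
# assumptions; the paper's tri-level dense convolution stage is omitted.
import torch
import torch.nn as nn

def squash(s, dim=-1, eps=1e-8):
    n2 = (s ** 2).sum(dim=dim, keepdim=True)
    return (n2 / (1.0 + n2)) * s / torch.sqrt(n2 + eps)

class RoutingCaps(nn.Module):
    def __init__(self, in_caps, in_dim, out_caps, out_dim, iters=3):
        super().__init__()
        self.iters = iters
        self.W = nn.Parameter(0.01 * torch.randn(in_caps, out_caps, out_dim, in_dim))

    def forward(self, u):                                   # u: [B, in_caps, in_dim]
        u_hat = torch.einsum('iodk,bik->biod', self.W, u)   # predicted output poses
        b = torch.zeros(u.size(0), u.size(1), self.W.size(1), device=u.device)
        for _ in range(self.iters):                         # routing by agreement
            c = b.softmax(dim=2)
            v = squash((c.unsqueeze(-1) * u_hat).sum(dim=1))
            b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)
        return v                                            # [B, out_caps, out_dim]

class CapsNet(nn.Module):
    def __init__(self, n_classes=49):
        super().__init__()
        self.conv1 = nn.Sequential(nn.Conv2d(1, 64, 5, padding=2), nn.ReLU())
        self.conv2 = nn.Sequential(nn.Conv2d(64, 128, 5, padding=2), nn.ReLU())
        # Primary capsules: 16 capsule types of dimension 8 on a 10x10 grid.
        self.primary = nn.Conv2d(128, 16 * 8, 9, stride=2)   # 28 -> 10
        self.routing = RoutingCaps(16 * 10 * 10, 8, n_classes, 16)

    def forward(self, x):                                    # x: [B, 1, 28, 28]
        h = self.primary(self.conv2(self.conv1(x)))
        u = squash(h.view(x.size(0), -1, 8))                 # [B, 1600, 8]
        v = self.routing(u)
        return v.norm(dim=-1)                                # class score = capsule length

model = CapsNet()
scores = model(torch.rand(2, 1, 28, 28))                     # [2, 49]
```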

    Deep Sparse Auto-Encoder Features Learning for Arabic Text Recognition

    Get PDF
    Arabic text recognition is one of the most challenging issues in pattern recognition and artificial intelligence. It remains a largely unaddressed research field because of several factors: complications arise due to the cursive nature of Arabic writing, character similarities, an unlimited vocabulary, the use of multiple sizes and mixed fonts, etc. To handle these challenges, automatic Arabic text recognition requires building a robust system that computes discriminative features and applies a rigorous classifier to achieve improved performance. In this work, we introduce a new deep learning based system that recognizes Arabic text contained in images. We propose a novel hybrid network combining a Bag-of-Features (BoF) framework for feature extraction based on a deep Sparse Auto-Encoder (SAE), and Hidden Markov Models (HMMs) for sequence recognition. Our proposed system, termed BoF-deep SAE-HMM, is tested on four datasets: the printed Arabic line images of Printed KHATT (P-KHATT), the benchmark printed word images of Arabic Printed Text Image (APTI), the benchmark handwritten Arabic word images of IFN/ENIT, and the benchmark handwritten digit images of the Modified National Institute of Standards and Technology (MNIST) dataset.
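
    The feature-extraction stage can be illustrated with a minimal sparse auto-encoder sketch: an L1 penalty on the hidden activations enforces sparsity (a simple stand-in, since the paper does not specify its sparsity constraint here), and the learned codes would then be quantized into Bag-of-Features histograms and decoded with HMMs, steps only indicated in comments. The window size, layer widths and penalty weight are assumptions.

```python
# Sketch: sparse auto-encoder for frame-level feature extraction from Arabic text-line
# images. An L1 penalty on hidden activations enforces sparsity; window size (32x32),
# layer widths and the penalty weight are assumptions, not the paper's settings.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, in_dim=32 * 32, hidden_dim=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.Sigmoid())
        self.decoder = nn.Sequential(nn.Linear(hidden_dim, in_dim), nn.Sigmoid())

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code), code

def sae_loss(recon, x, code, sparsity_weight=1e-3):
    # Reconstruction error plus an L1 sparsity penalty on the hidden code.
    return nn.functional.mse_loss(recon, x) + sparsity_weight * code.abs().mean()

# Usage: sliding 32x32 windows taken along a text line, flattened to vectors.
model = SparseAutoencoder()
windows = torch.rand(64, 32 * 32)
recon, codes = model(windows)
loss = sae_loss(recon, windows, codes)
# Downstream (not shown): cluster the codes into a visual vocabulary, build
# Bag-of-Features histograms per frame, and decode the frame sequence with HMMs.
```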