4,308 research outputs found

    NeuroWrite: Predictive Handwritten Digit Classification using Deep Neural Networks

    The rapid evolution of deep neural networks has revolutionized the field of machine learning, enabling remarkable advancements in various domains. In this article, we introduce NeuroWrite, a unique method for predicting the categorization of handwritten digits using deep neural networks. Our model exhibits outstanding accuracy in identifying and categorising handwritten digits by utilising the strength of convolutional neural networks (CNNs) and recurrent neural networks (RNNs). In this article, we give a thorough examination of the data preparation methods, network design, and training methods used in NeuroWrite. By implementing state-of-the-art techniques, we showcase how NeuroWrite can achieve high classification accuracy and robust generalization on handwritten digit datasets, such as MNIST. Furthermore, we explore the model's potential for real-world applications, including digit recognition in digitized documents, signature verification, and automated postal code recognition. NeuroWrite is a useful tool for computer vision and pattern recognition because of its performance and adaptability. The architecture, training procedure, and evaluation metrics of NeuroWrite are covered in detail in this study, illustrating how it can improve a number of applications that call for handwritten digit classification. The outcomes show that NeuroWrite is a promising method for raising the bar for deep neural network-based handwritten digit recognition.
    Comment: 6 pages, 10 figures

    Using generative models for handwritten digit recognition

    We describe a method of recognizing handwritten digits by fitting generative models that are built from deformable B-splines with Gaussian "ink generators" spaced along the length of the spline. The splines are adjusted using a novel elastic matching procedure based on the Expectation Maximization (EM) algorithm that maximizes the likelihood of the model generating the data. This approach has many advantages. (1) After identifying the model most likely to have generated the data, the system not only produces a classification of the digit but also a rich description of the instantiation parameters, which can yield information such as the writing style. (2) During the process of explaining the image, generative models can perform recognition-driven segmentation. (3) The method involves a relatively small number of parameters, and hence training is relatively easy and fast. (4) Unlike many other recognition schemes, it does not rely on some form of pre-normalization of input images, but can handle arbitrary scalings, translations and a limited degree of image rotation. We have demonstrated that our method of fitting models to images does not get trapped in poor local minima. The main disadvantage of the method is that it requires much more computation than more standard OCR techniques.
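As a rough illustration of the EM-based fitting described in this abstract, the sketch below fits the centres of equal-variance Gaussian "ink generators" to inked pixel coordinates. It is a simplified stand-in, not the authors' procedure: the generators move freely instead of being tied to a deformable B-spline, there is no deformation penalty or noise model, and the names `em_fit_beads`, `log_likelihood`, and `sigma` are hypothetical.

```python
import numpy as np


def log_likelihood(ink_xy, beads, sigma=1.0):
    """Log-likelihood of inked pixels under an equal-weight 2D Gaussian mixture."""
    ink_xy = np.asarray(ink_xy, dtype=float)
    beads = np.asarray(beads, dtype=float)
    # Squared distance from every pixel to every bead centre.
    d2 = ((ink_xy[:, None, :] - beads[None, :, :]) ** 2).sum(axis=2)
    log_p = -d2 / (2.0 * sigma ** 2) - np.log(2.0 * np.pi * sigma ** 2)
    # Log-sum-exp over beads, with equal mixing proportions 1/K.
    m = log_p.max(axis=1, keepdims=True)
    log_mix = m[:, 0] + np.log(np.exp(log_p - m).sum(axis=1)) - np.log(len(beads))
    return float(log_mix.sum())


def em_fit_beads(ink_xy, beads, sigma=1.0, n_iter=20):
    """Move Gaussian bead centres toward the ink with EM (assumption: free beads).

    The method in the abstract constrains the beads to lie along a B-spline;
    that constraint is omitted here to keep the sketch short.
    """
    ink_xy = np.asarray(ink_xy, dtype=float)
    beads = np.asarray(beads, dtype=float).copy()
    for _ in range(n_iter):
        # E-step: responsibility of each bead for each inked pixel.
        d2 = ((ink_xy[:, None, :] - beads[None, :, :]) ** 2).sum(axis=2)
        log_r = -d2 / (2.0 * sigma ** 2)
        log_r -= log_r.max(axis=1, keepdims=True)  # numerical stability
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: each bead moves to its responsibility-weighted mean.
        weight = r.sum(axis=0)
        new_beads = (r.T @ ink_xy) / np.maximum(weight, 1e-12)[:, None]
        # Beads that explain no ink stay where they are.
        beads = np.where(weight[:, None] > 1e-12, new_beads, beads)
    return beads


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ink = np.vstack([
        rng.normal([0.0, 0.0], 0.3, size=(50, 2)),
        rng.normal([5.0, 5.0], 0.3, size=(50, 2)),
    ])
    start = np.array([[1.0, 1.0], [4.0, 4.0]])
    fitted = em_fit_beads(ink, start)
    print(log_likelihood(ink, start), log_likelihood(ink, fitted))
```

In a classifier built along these lines, one model per digit class would be fitted to the same image and the class with the highest likelihood chosen, which is the sense in which the abstract speaks of "the model most likely to have generated the data".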

    Human Reading Based Strategies for off-line Arabic Word Recognition

    This paper summarizes some techniques proposed for off-line Arabic word recognition. The point of view developed here concerns human reading, which favors an interactive mechanism between global memorization and local checking that makes it easier to recognize complex scripts such as Arabic. With this in mind, some specific papers are analyzed and their strategies commented.

    Face image super-resolution using 2D CCA

    In this paper, a face super-resolution method using two-dimensional canonical correlation analysis (2D CCA) is presented. A detail compensation step follows to add high-frequency components to the reconstructed high-resolution face. Unlike most previous research on face super-resolution algorithms, which first transforms the images into vectors, in our approach the relationship between the high-resolution and the low-resolution face images is maintained in their original 2D representation. In addition, rather than approximating the entire face, different parts of a face image are super-resolved separately to better preserve the local structure. The proposed method is compared with various state-of-the-art super-resolution algorithms using multiple evaluation criteria, including face recognition performance. Results on publicly available datasets show that the proposed method super-resolves high-quality face images that are very close to the ground truth, and that the performance gain is not dataset dependent. The method is very efficient in both the training and testing phases compared to the other approaches. © 2013 Elsevier B.V.

    Concurrent evolution of feature extractors and modular artificial neural networks

    Artificial Neural Networks (ANNs) are commonly used in both academia and industry as a solution to challenges in the pattern recognition domain. However, there are two problems that must be addressed before an ANN can be successfully applied to a given recognition task: ANN customization and data pre-processing. First, ANNs require customization for each specific application. Although the underlying mathematics of ANNs is well understood, customization based on theoretical analysis is impractical because of the complex interrelationship between ANN behavior and the problem domain. On the other hand, an empirical approach to the task of customization can be successful with the selection of an appropriate test domain. However, this latter approach is computationally intensive, especially due to the many variables that can be adjusted within the system. Additionally, it is subject to the limitations of the selected search algorithm used to find the optimal solution. Second, data pre-processing (feature extraction) is almost always necessary in order to organize and minimize the input data, thereby optimizing ANN performance. Not only is it difficult to know what and how many features to extract from the data, but it is also challenging to find the right balance between the computational requirements for the preprocessing algorithm versus the ANN itself. Furthermore, the task of developing an appropriate pre-processing algorithm usually requires expert knowledge of the problem domain, which may not always be available. This paper contends that the concurrent evolution of ANNs and data pre-processors allows the design of highly accurate recognition networks without the need for expert knowledge in the application domain. To this end, a novel method for evolving customized ANNs with correlated feature extractors was designed and tested. This method involves the use of concurrent evolutionary processes (CEPs) as a mechanism to search the space of recognition networks. 
    In a series of controlled experiments the CEP was applied to the digit recognition domain to show that the efficacy of this method is in line with results seen in other digit recognition research, but without the need for expert knowledge in image processing techniques for digit recognition.

    Real-time Arabic scene text detection using fully convolutional neural networks

    The aim of this research is to propose a fully convolutional approach to the problem of real-time scene text detection for the Arabic language. Text detection is performed using a two-step multi-scale approach. The first step uses a lightweight fully convolutional network, TextBlockDetector FCN, an adaptation of VGG-16, to eliminate non-textual elements, localize wide-scale text and give a text scale estimation. The second step determines the narrow scale range of text using a fully convolutional network for maximum performance. To evaluate the system, we compare the results of the framework with those obtained with a single VGG-16 fully deployed for one-shot text detection, in addition to previous state-of-the-art results. For training and testing, we build a dataset of 575 manually processed images, along with data augmentation to enrich the training process. The system scores a precision of 0.651 vs 0.64 in the state-of-the-art and an FPS of 24.3 vs 31.7 for a fully deployed VGG-16.

    ChatABL: Abductive Learning via Natural Language Interaction with ChatGPT

    Large language models (LLMs) such as ChatGPT have recently demonstrated significant potential in mathematical abilities, providing a valuable reasoning paradigm consistent with human natural language. However, LLMs currently have difficulty in bridging perception, language understanding and reasoning capabilities due to incompatibility of the underlying information flow among them, making it challenging to accomplish tasks autonomously. On the other hand, abductive learning (ABL) frameworks for integrating the two abilities of perception and reasoning have seen significant success in inverse decipherment of incomplete facts, but they are limited by the lack of semantic understanding of logical reasoning rules and the dependence on complicated domain knowledge representation. This paper presents a novel method (ChatABL) for integrating LLMs into the ABL framework, aiming at unifying the three abilities in a more user-friendly and understandable manner. The proposed method uses the strengths of LLMs' understanding and logical reasoning to correct the incomplete logical facts and so optimize the performance of the perceptual module, by summarizing and reorganizing reasoning rules represented in natural language format. Similarly, the perceptual module provides necessary reasoning examples for LLMs in natural language format. The variable-length handwritten equation deciphering task, an abstract expression of the Mayan calendar decoding, is used as a testbed to demonstrate that ChatABL has reasoning ability beyond most existing state-of-the-art methods, which is well supported by comparative studies. To the best of our knowledge, the proposed ChatABL is the first attempt to explore a new pattern for further approaching human-level cognitive ability via natural language interaction with ChatGPT.