NeuroWrite: Predictive Handwritten Digit Classification using Deep Neural Networks

Author: Asish Kottakota
Chander R. Kishan
Hema Dr. D. Deva
Teja P. Sarath
Publication date: 02/11/2023

The rapid evolution of deep neural networks has revolutionized the field of machine learning, enabling remarkable advancements in various domains. In this article, we introduce NeuroWrite, a unique method for predicting the categorization of handwritten digits using deep neural networks. Our model exhibits outstanding accuracy in identifying and categorising handwritten digits by utilising the strength of convolutional neural networks (CNNs) and recurrent neural networks (RNNs). In this article, we give a thorough examination of the data preparation methods, network design, and training methods used in NeuroWrite. By implementing state-of-the-art techniques, we showcase how NeuroWrite can achieve high classification accuracy and robust generalization on handwritten digit datasets, such as MNIST. Furthermore, we explore the model's potential for real-world applications, including digit recognition in digitized documents, signature verification, and automated postal code recognition. NeuroWrite is a useful tool for computer vision and pattern recognition because of its performance and adaptability. The architecture, training procedure, and evaluation metrics of NeuroWrite are covered in detail in this study, illustrating how it can improve a number of applications that call for handwritten digit classification. The outcomes show that NeuroWrite is a promising method for raising the bar for deep neural network-based handwritten digit recognition.
Comment: 6 pages, 10 figures

arXiv.org e-Print Archive

Using generative models for handwritten digit recognition

Author: Hinton G. E.
Revow M.
Williams C. K. I.
Publication date: 01/01/1996

We describe a method of recognizing handwritten digits by fitting generative models that are built from deformable B-splines with Gaussian "ink generators" spaced along the length of the spline. The splines are adjusted using a novel elastic matching procedure based on the Expectation Maximization (EM) algorithm that maximizes the likelihood of the model generating the data. This approach has many advantages. (1) After identifying the model most likely to have generated the data, the system not only produces a classification of the digit but also a rich description of the instantiation parameters which can yield information such as the writing style. (2) During the process of explaining the image, generative models can perform recognition-driven segmentation. (3) The method involves a relatively small number of parameters and hence training is relatively easy and fast. (4) Unlike many other recognition schemes it does not rely on some form of pre-normalization of input images, but can handle arbitrary scalings, translations and a limited degree of image rotation. We have demonstrated our method of fitting models to images does not get trapped in poor local minima. The main disadvantage of the method is it requires much more computation than more standard OCR techniques.

CiteSeerX

Aston Publications Explorer

Human Reading Based Strategies for off-line Arabic Word Recognition

Author: Belaïd Abdel
Choisy Christophe
Publication venue: HAL CCSD
Publication date: 27/09/2006

This paper summarizes some techniques proposed for off-line Arabic word recognition. The point of view developed here concerns human reading, which favors an interactive mechanism between global memorization and local checking and so eases the recognition of complex scripts such as Arabic. In light of this, some specific papers are analyzed and their strategies commented.

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Face image super-resolution using 2D CCA

Author: An L
Bhanu B
Publication venue: eScholarship, University of California
Publication date: 01/01/2014

In this paper a face super-resolution method using two-dimensional canonical correlation analysis (2D CCA) is presented. A detail compensation step follows to add high-frequency components to the reconstructed high-resolution face. Unlike most previous research on face super-resolution algorithms, which first transforms the images into vectors, in our approach the relationship between the high-resolution and the low-resolution face images is maintained in their original 2D representation. In addition, rather than approximating the entire face, different parts of a face image are super-resolved separately to better preserve the local structure. The proposed method is compared with various state-of-the-art super-resolution algorithms using multiple evaluation criteria including face recognition performance. Results on publicly available datasets show that the proposed method super-resolves high-quality face images which are very close to the ground truth and that the performance gain is not dataset dependent. The method is very efficient in both the training and testing phases compared to the other approaches. © 2013 Elsevier B.V.

Crossref

eScholarship - University of California

Concurrent evolution of feature extractors and modular artificial neural networks

Author: Hannak Victor
Publication venue: RIT Scholar Works
Publication date: 01/01/2004

Artificial Neural Networks (ANNs) are commonly used in both academia and industry as a solution to challenges in the pattern recognition domain. However, there are two problems that must be addressed before an ANN can be successfully applied to a given recognition task: ANN customization and data pre-processing. First, ANNs require customization for each specific application. Although the underlying mathematics of ANNs is well understood, customization based on theoretical analysis is impractical because of the complex interrelationship between ANN behavior and the problem domain. On the other hand, an empirical approach to the task of customization can be successful with the selection of an appropriate test domain. However, this latter approach is computationally intensive, especially due to the many variables that can be adjusted within the system. Additionally, it is subject to the limitations of the selected search algorithm used to find the optimal solution. Second, data pre-processing (feature extraction) is almost always necessary in order to organize and minimize the input data, thereby optimizing ANN performance. Not only is it difficult to know what and how many features to extract from the data, but it is also challenging to find the right balance between the computational requirements for the preprocessing algorithm versus the ANN itself. Furthermore, the task of developing an appropriate pre-processing algorithm usually requires expert knowledge of the problem domain, which may not always be available. This paper contends that the concurrent evolution of ANNs and data pre-processors allows the design of highly accurate recognition networks without the need for expert knowledge in the application domain. To this end, a novel method for evolving customized ANNs with correlated feature extractors was designed and tested. This method involves the use of concurrent evolutionary processes (CEPs) as a mechanism to search the space of recognition networks. 
In a series of controlled experiments the CEP was applied to the digit recognition domain to show that the efficacy of this method is in line with results seen in other digit recognition research, but without the need for expert knowledge in image processing techniques for digit recognition.

RIT Scholar Works

Real-time Arabic scene text detection using fully convolutional neural networks

Author: Chiheb Raddouane
Faizi Rdouan
Moumen Rajae
Publication venue: Institute of Advanced Engineering and Science
Publication date: 01/04/2021

The aim of this research is to propose a fully convolutional approach to the problem of real-time scene text detection for the Arabic language. Text detection is performed using a two-step multi-scale approach. The first step uses a lightweight fully convolutional network, TextBlockDetector FCN, an adaptation of VGG-16, to eliminate non-textual elements, localize wide-scale text and give a text scale estimation. The second step determines the narrow scale range of the text using a fully convolutional network for maximum performance. To evaluate the system, we compare the results of the framework with those obtained with a single VGG-16 fully deployed for text detection in one shot, in addition to previous state-of-the-art results. For training and testing, we built a dataset of 575 manually processed images, along with data augmentation to enrich the training process. The system scores a precision of 0.651 vs 0.64 in the state of the art, and a frame rate of 24.3 FPS vs 31.7 FPS for a fully deployed VGG-16.

ZENODO

Institute of Advanced Engineering and Science

ChatABL: Abductive Learning via Natural Language Interaction with ChatGPT

Author: Han Junwei
Jiang Xi
Li Wenjun
Li Xiang
Liu Tianming
Liu Zhengliang
Ma Chong
Shen Dinggang
Wei Xiaozheng
Wei Yaonai
Wu Zihao
Yang Li
Yao Junjie
Zhang Tuo
Zhong Tianyang
Zhu Dajiang
Publication date: 21/04/2023

Large language models (LLMs) such as ChatGPT have recently demonstrated significant potential in mathematical abilities, providing a valuable reasoning paradigm consistent with human natural language. However, LLMs currently have difficulty in bridging perception, language understanding and reasoning capabilities due to incompatibility of the underlying information flow among them, making it challenging to accomplish tasks autonomously. On the other hand, abductive learning (ABL) frameworks for integrating the two abilities of perception and reasoning have seen significant success in inverse decipherment of incomplete facts, but they are limited by the lack of semantic understanding of logical reasoning rules and the dependence on complicated domain knowledge representation. This paper presents a novel method (ChatABL) for integrating LLMs into the ABL framework, aiming at unifying the three abilities in a more user-friendly and understandable manner. The proposed method uses the strengths of LLMs' understanding and logical reasoning to correct the incomplete logical facts for optimizing the performance of the perceptual module, by summarizing and reorganizing reasoning rules represented in natural language format. Similarly, the perceptual module provides necessary reasoning examples for LLMs in natural language format. The variable-length handwritten equation deciphering task, an abstract expression of the Mayan calendar decoding, is used as a testbed to demonstrate that ChatABL has reasoning ability beyond most existing state-of-the-art methods, which has been well supported by comparative studies. To the best of our knowledge, the proposed ChatABL is the first attempt to explore a new pattern for further approaching human-level cognitive ability via natural language interaction with ChatGPT.

arXiv.org e-Print Archive