
    English character recognition algorithm by improving the weights of MLP neural network with dragonfly algorithm

    Character Recognition (CR) has been studied for years, and neural networks play an important role in recognizing handwritten characters. Many character-recognition systems have been published for English, yet achieving both minimal training time and high accuracy for handwritten English symbols and characters with neural network methods remains an open problem. Building character recognition systems, whether manual or automatic, is therefore an important task. In this research, an attempt has been made to develop an automatic recognition system for English symbols and characters with minimal training time and very high recognition and classification accuracy. In the proposed method, the dragonfly optimization algorithm is used to improve the weights of the MLP neural network during training. The novelty of the proposed system is that by combining the dragonfly optimization technique with MLP neural networks, the accuracy of the system is improved and the computing time is reduced. The approach used in this study to identify English characters achieves high accuracy with minimal training time.
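    The core idea, searching MLP weight space with a population-based optimizer rather than backpropagation alone, can be sketched briefly. This is a minimal illustration using a generic swarm-style update (attraction toward the best candidate plus random exploration), not the authors' exact dragonfly algorithm, and a tiny XOR task stands in for character data:

```python
import math
import random

random.seed(0)

def mlp_forward(weights, x):
    """Tiny 2-2-1 MLP: weights is a flat list of 9 values
    (4 hidden weights + 2 hidden biases + 2 output weights + 1 output bias)."""
    w = weights
    h1 = math.tanh(w[0] * x[0] + w[1] * x[1] + w[2])
    h2 = math.tanh(w[3] * x[0] + w[4] * x[1] + w[5])
    return math.tanh(w[6] * h1 + w[7] * h2 + w[8])

# XOR with targets in {-1, +1}, a stand-in for a real character dataset.
XOR = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), -1)]

def loss(weights):
    return sum((mlp_forward(weights, x) - y) ** 2 for x, y in XOR)

def swarm_optimize(pop_size=30, iters=300, dim=9):
    """Each candidate weight vector drifts toward the best solution found
    so far, with Gaussian exploration; moves are kept only if they improve."""
    pop = [[random.uniform(-2, 2) for _ in range(dim)] for _ in range(pop_size)]
    best = min(pop, key=loss)
    for _ in range(iters):
        for i, ind in enumerate(pop):
            cand = [w + 0.5 * (b - w) + random.gauss(0, 0.3)
                    for w, b in zip(ind, best)]
            if loss(cand) < loss(ind):
                pop[i] = cand
        best = min(pop, key=loss)
    return best

best = swarm_optimize()
```

    In the paper's setting, the fitness function would be the network's classification error on character images rather than this toy XOR loss.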

    Query-Driven Global Graph Attention Model for Visual Parsing: Recognizing Handwritten and Typeset Math Formulas

    We present a new visual parsing method based on standard Convolutional Neural Networks (CNNs) for handwritten and typeset mathematical formulas. The Query-Driven Global Graph Attention (QD-GGA) parser employs multi-task learning, using a single feature representation for locating, classifying, and relating symbols. QD-GGA parses formulas by first constructing a Line-Of-Sight (LOS) graph over the input primitives (e.g., handwritten strokes or connected components in images). Second, class distributions for LOS nodes and edges are obtained using query-specific feature filters (i.e., attention) in a single feed-forward pass. This allows end-to-end structure learning using a joint loss over primitive node and edge class distributions. Finally, a Maximum Spanning Tree (MST) is extracted from the weighted graph using Edmonds' Arborescence Algorithm. The model may be run recurrently over the input graph, updating attention to focus on symbols detected in the previous iteration. QD-GGA does not require additional grammar rules: the language model is learned from the sets of symbols and relationships, and the statistics over them, in the training set. We benchmark our system against both handwritten and typeset state-of-the-art math recognition systems. Our preliminary results show that this is a promising new approach for visual parsing of math formulas. Using recurrent execution, symbol detection is near perfect for both handwritten and typeset formulas: we obtain a symbol f-measure of over 99.4% for both the CROHME (handwritten) and INFTYMCCDB-2 (typeset formula image) datasets. Our method is also much faster in both training and execution than state-of-the-art RNN-based formula parsers. The unlabeled structure detection of QD-GGA is competitive with encoder-decoder models, but QD-GGA symbol and relationship classification is weaker. We believe this may be addressed through increased use of spatial features and global context.
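    The final tree-extraction step can be illustrated with a simplified sketch. The full Edmonds' algorithm also contracts any cycles among the chosen edges; this toy version keeps only its first phase (pick the highest-weight incoming edge per non-root node), and the graph, node names, and relation scores are invented for illustration:

```python
def greedy_arborescence(edges, root):
    """First phase of Edmonds' algorithm: for every non-root node, keep
    the highest-weight incoming edge. The full algorithm additionally
    contracts cycles that may result, which is omitted here."""
    best_in = {}
    for u, v, w in edges:
        if v != root and (v not in best_in or w > best_in[v][2]):
            best_in[v] = (u, v, w)
    return sorted(best_in.values())

# Toy "symbol graph": nodes are primitives for the expression x^2,
# directed edge weights play the role of learned relation scores.
edges = [
    ("x", "sup", 0.9), ("x", "2", 0.4),
    ("sup", "2", 0.8), ("2", "sup", 0.1),
]
tree = greedy_arborescence(edges, root="x")
# → [('sup', '2', 0.8), ('x', 'sup', 0.9)]
```

    Here the surviving edges form the rooted tree x → sup → 2, which is the kind of structure the parser reads off as the formula's layout.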

    Visual pattern recognition using neural networks

    Neural networks have been widely studied in a number of fields, such as neural architectures, neurobiology, statistics of neural networks and pattern classification. In the field of pattern classification, neural network models are applied to numerous applications, for instance, character recognition, speech recognition, and object recognition. Among these, character recognition is commonly used to illustrate the feature and classification characteristics of neural networks. In this dissertation, the theoretical foundations of artificial neural networks are first reviewed and existing neural models are studied. The Adaptive Resonance Theory (ART) model is improved to achieve more reasonable classification results. Experiments in applying the improved model to image enhancement and printed character recognition are discussed and analyzed. We also study the theoretical foundation of the Neocognitron in terms of feature extraction, convergence in training, and shift invariance. We investigate the use of multilayered perceptrons with recurrent connections as general-purpose modules for image operations in parallel architectures. The networks are trained to carry out classification rules in image transformation. The training patterns can be derived from user-defined transformations, or from loading a pair consisting of a sample image and its target image when prior knowledge of the transformation is unavailable. Applications of our model include image smoothing, enhancement, edge detection, noise removal, morphological operations, image filtering, etc. With a number of stages stacked together, we are able to apply a series of operations on the image. That is, by providing various sets of training patterns, the system can adapt itself to the concatenated transformation. We also discuss experiments in applying existing neural models, such as the multilayered perceptron, to realize morphological operations and other commonly used imaging operations. Some new neural architectures and training algorithms for the implementation of morphological operations are designed and analyzed. The algorithms are proven correct and efficient. The proposed morphological neural architectures are applied to construct the feature extraction module of a personal handwritten character recognition system. The system was trained and tested with scanned images of handwritten characters. The feasibility and efficiency are discussed along with the experimental results.
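    The claim that thresholded neural units can realize morphological operations can be made concrete: binary dilation with a 3×3 structuring element is a perceptron per pixel with unit weights over the neighborhood and a firing threshold of 1 (erosion would use a threshold equal to the neighborhood size). A minimal sketch of that correspondence, not the dissertation's actual architecture:

```python
def dilate(img):
    """Binary dilation with a 3x3 structuring element, expressed as one
    perceptron per pixel: unit weights over the neighborhood, output 1
    if the weighted sum reaches the threshold (any neighbor set)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = sum(img[i + di][j + dj]
                    for di in (-1, 0, 1) for dj in (-1, 0, 1)
                    if 0 <= i + di < h and 0 <= j + dj < w)
            out[i][j] = 1 if s >= 1 else 0  # threshold 1 → dilation
    return out

img = [[0, 0, 0],
       [0, 1, 0],
       [0, 0, 0]]
dilated = dilate(img)  # the single set pixel grows to fill all nine cells
```

    Swapping the threshold turns the same unit into erosion, which is why a stack of such trained stages can express sequences of morphological operations.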

    Abmash: Mashing Up Legacy Web Applications by Automated Imitation of Human Actions

    Many business web-based applications do not offer application programming interfaces (APIs) to enable other applications to access their data and functions in a programmatic manner. This makes their composition difficult (for instance, synchronizing data between two applications). To address this challenge, this paper presents Abmash, an approach to facilitate the integration of such legacy web applications by automatically imitating human interactions with them. By automatically interacting with the graphical user interface (GUI) of web applications, the system supports all forms of integration, including bi-directional interactions, and is able to interact with AJAX-based applications. Furthermore, the integration programs are easy to write since they deal with end-user, visual user-interface elements. The integration code is simple enough to be called a "mashup". (Published in Software: Practice and Experience, 2013.)

    Advances in Character Recognition

    This book presents advances in character recognition. It consists of 12 chapters that cover a wide range of topics on different aspects of character recognition. Hopefully, this book will serve as a reference source for academic research, for professionals working in the character recognition field, and for all those interested in the subject.

    A Computational Theory of Contextual Knowledge in Machine Reading

    Machine recognition of off-line handwriting can be achieved either by recognising words as individual symbols (word-level recognition) or by segmenting a word into parts, usually letters, and classifying those parts (letter-level recognition). Whichever method is used, current handwriting recognition systems cannot overcome the inherent ambiguity in writing without recourse to contextual information. This thesis presents a set of experiments that use Hidden Markov Models of language to resolve ambiguity in the classification process. It goes on to describe an algorithm designed to recognise a document written by a single author, and to improve recognition by adapting to the writing style and learning new words. Learning and adaptation are achieved by reading the document over several iterations. The algorithm is designed to incorporate contextual processing, adaptation to modify the shape of known words, and learning of new words within a constrained dictionary. Adaptation occurs when a word that has previously been trained in the classifier is recognised at either the word or letter level, and the word image is used to modify the classifier. Learning occurs when a new word that was not in the training set is recognised at the letter level and is subsequently added to the classifier. Words and letters are recognised using a nearest-neighbour classifier with features based on the two-dimensional Fourier transform. By incorporating a measure of confidence based on the distribution of training points around an exemplar, adaptation and learning are constrained to occur only when a word is confidently classified. The algorithm was implemented and tested with a dictionary of 1000 words. Results show that adaptation of the letter classifier improved recognition on average by 3.9%, compared with only 1.6% at the whole-word level. Two experiments were carried out to evaluate the learning in the system. It was found that learning accounted for little improvement in the classification results, and that learning new words was prone to propagating misclassifications.
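    The use of a Markov language model to resolve letter-level ambiguity can be sketched with a toy Viterbi decoder: per-letter shape scores are combined with bigram transition probabilities, so the language model breaks ties the shape classifier cannot. The observation scores, bigram table, and the ambiguous word are all invented for illustration; the thesis's actual classifier uses Fourier-transform features and a 1000-word dictionary:

```python
import math

# Hypothetical per-position letter likelihoods from a shape classifier;
# the middle letter is ambiguous between 'a' and 'o'.
obs = [{"c": 0.9, "e": 0.1},
       {"a": 0.5, "o": 0.5},
       {"t": 0.9, "l": 0.1}]

# Toy bigram letter-transition probabilities (the language model).
trans = {("c", "a"): 0.6, ("c", "o"): 0.4,
         ("e", "a"): 0.5, ("e", "o"): 0.5,
         ("a", "t"): 0.7, ("o", "t"): 0.3,
         ("a", "l"): 0.3, ("o", "l"): 0.7}

def viterbi(obs, trans):
    """Return the most probable letter sequence under shape scores
    combined with bigram transitions (log-space dynamic programming)."""
    delta = {c: (math.log(p), [c]) for c, p in obs[0].items()}
    for frame in obs[1:]:
        new = {}
        for c, p in frame.items():
            best_score, best_path = max(
                (prev_score + math.log(trans.get((prev, c), 1e-9)) + math.log(p),
                 prev_path)
                for prev, (prev_score, prev_path) in delta.items())
            new[c] = (best_score, best_path + [c])
        delta = new
    return max(delta.values())[1]

word = "".join(viterbi(obs, trans))
# → "cat": the c→a bigram outweighs c→o even though the shapes tie
```

    The same machinery scales to a full dictionary by restricting transitions to letter sequences that can still complete a dictionary word.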

    Perceptual strategies of experts and novices in a fast ball sport

    This thesis examined the perceptual strategies of expert and novice badminton players in an attempt to test notions of visual selective attention within applied, ecologically valid sport settings. In keeping with established premises from information-processing theory, it was hypothesized that the expert players would be characterized by a greater ability to extract advance information from the display (to facilitate anticipation), by the allocation of attention to the most pertinent cues available in the display (to promote search efficiency and to avoid distractions), and by the utilization of a relatively low visual search rate (as indicative of processing efficiency). In Experiment 1 the perceptual strategies of 20 elite and 35 novice badminton players were compared using a series of tasks in which the perceptual display of a badminton player was simulated using film. When the film display was manipulated using variable temporal occlusion points, it was found that experts showed a consistently greater ability than novices to predict the landing position of the shuttle from early advance cues, with the time period between 170 and 85 msec prior to racquet-shuttle contact being a critical one for the establishment of skill-group differences. For both skill groups the greatest improvements in prediction accuracy arose in the subsequent time period, from 85 msec prior to contact to 85 msec after contact, implying the criticality of cues arising in this period to the normal decision-making process. When specific spatial cues were selectively occluded from the film display, the racquet and the playing-side arm were found to be the principal cues upon which experts based their anticipatory prediction of shuttle direction, whereas novices appeared to rely only upon racquet cues. These proficiency-related differences in cue usage were capable of explaining, in part, the differences in anticipatory performance observed on the temporal occlusion task. Eye movements recorded during the performance of the film task (Experiment 2) were consistent with the notion of the racquet region containing the anticipatory cues of highest informational content, with over 70% of all fixations occurring on that section of the display. The visual search sequence was found to progress normally from an early orientation of fixations upon gross bodily features of the opponent (such as trunk, head or lower body) to a later, more precise orientation to the region of the racquet, with this apparent proximal-to-distal shift of the fixation distributions matching closely the emergent biomechanical characteristics of the stroke. Both the location and sequence of the fixations, however, appeared relatively uninfluenced by the task conditions, suggesting that the search patterns adopted were relatively inflexible, as if pre-determined by some over-riding perceptual framework. Contrary to some earlier sport-specific investigations of the visual search process, no significant differences in fixation location, duration or sequence were observed between experts and novices, suggesting that the differences in anticipatory performance observed on the film task were not a consequence of differences in overt visual search characteristics. Advantages of the film-task approach over the eye-movement-recording approach, in terms of assessing actual information extraction rather than merely visual orientation, were therefore apparent. Experiments 3 to 7 sought to establish the validity and reliability of the paradigm for the assessment of individual differences in perceptual strategy used in Experiments 1 and 2. The film task was shown, using dual-task methods, to provide attention demands comparable to actually playing, and it was shown that concurrent eye movement recording could take place without interfering with the subject's response to the film task. Prediction error measures derived from the film task were found to have high reliability, with identical conclusions being reached regarding individual subjects' perceptual strategies on each occasion the test was administered. Visual search parameters appeared somewhat less reliable, with the same anticipatory performance being apparently possible through the use of different search rates, although fixation location and order characteristics remained consistent over time. When the skill group distinction was reduced and an alternative form of error analysis was adopted, the characteristic earlier extraction of information and greater utilization of arm cues by experts again emerged, suggesting that the proficiency-related differences observed in Experiment 1 were robust ones.