929 research outputs found

    Unconstrained Scene Text and Video Text Recognition for Arabic Script

    Building robust recognizers for Arabic has always been challenging. We demonstrate the effectiveness of an end-to-end trainable CNN-RNN hybrid architecture in recognizing Arabic text in videos and natural scenes. We outperform the previous state of the art on two publicly available video text datasets, ALIF and ACTIV. For the scene text recognition task, we introduce a new Arabic scene text dataset and establish baseline results. For scripts like Arabic, a major challenge in developing robust recognizers is the lack of large quantities of annotated data. We overcome this by synthesising millions of Arabic text images from a large vocabulary of Arabic words and phrases. Our implementation is built on top of the model introduced in [37], which has proven quite effective for English scene text recognition. The model follows a segmentation-free, sequence-to-sequence transcription approach. The network transcribes a sequence of convolutional features from the input image to a sequence of target labels. This does away with the need to segment the input image into constituent characters or glyphs, which is often difficult for Arabic script. Further, the ability of RNNs to model contextual dependencies yields superior recognition results. Comment: 5 page
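
    As an illustration of segmentation-free transcription, the sketch below collapses per-frame label scores into a label sequence without ever locating character boundaries. It assumes a CTC-style output layer with a blank label, which the abstract does not state; the names `ctc_greedy_decode`, `BLANK` and `alphabet` are hypothetical.

```python
from itertools import groupby

BLANK = 0  # assumed index of the blank label (not specified in the abstract)


def ctc_greedy_decode(frame_scores, alphabet):
    """Greedy decoding: pick the best label per frame, merge repeats, drop blanks.

    frame_scores: list of per-frame score lists, one score per label index.
    alphabet: list mapping label index -> character (index BLANK is unused).
    """
    # Best label for each frame (argmax over the label scores).
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    # Merge consecutive repeats, then remove blanks.
    collapsed = [label for label, _ in groupby(best_path)]
    return "".join(alphabet[label] for label in collapsed if label != BLANK)


if __name__ == "__main__":
    alphabet = ["", "a", "b"]
    frames = [
        [0.1, 0.8, 0.1],    # a
        [0.1, 0.7, 0.2],    # a (repeat, merged)
        [0.9, 0.05, 0.05],  # blank separates the two a's
        [0.1, 0.8, 0.1],    # a
        [0.1, 0.1, 0.8],    # b
    ]
    print(ctc_greedy_decode(frames, alphabet))
```

    The blank between the two runs of `a` is what lets a repeated character survive the merge step.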

    Determination of baseflow quantity by using unmanned aerial vehicle (UAV) and Google Earth

    Baseflow is the most important low-flow hydrological feature [1]. It is a function of a large number of variables, including topography, geology, soil, vegetation, and climate. In many catchments, baseflow is an important component of streamflow; baseflow separation has therefore been widely studied and has a long history in hydrology. Baseflow separation methods fall into two main groups: non-tracer-based and tracer-based methods. In addition, baseflow can be determined by fitting a unit hydrograph model to the recession limbs of the hydrograph and extrapolating it backward.

    Improving time efficiency of feedforward neural network learning

    Feedforward neural networks have been widely studied and used in many applications in science and engineering. Networks of this type are mainly trained with the well-known backpropagation-based learning algorithms. One major problem with these algorithms is their slow training convergence, which hinders their application. Many researchers have developed improvements and enhancements to speed up convergence, but the problem has not been fully addressed. This thesis makes several contributions by proposing new backpropagation learning algorithms, based on the terminal attractor concept, that improve existing backpropagation learning algorithms such as gradient descent and Levenberg-Marquardt. The new algorithms converge quickly both far from and close to the ideal weights. In particular, a new fast convergence mechanism is proposed, based on the fast terminal attractor concept. Comprehensive simulation studies demonstrate the effectiveness of the proposed backpropagation algorithms with terminal attractors. Finally, three practical applications (time series forecasting, character recognition and image interpolation) are used to show the practicality and usefulness of the proposed learning algorithms in comprehensive comparative studies against existing algorithms.

    Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

    This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them. Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118 pages, 8 figures, 1 tabl

    A Computational Theory of Contextual Knowledge in Machine Reading

    Machine recognition of off-line handwriting can be achieved either by recognising words as individual symbols (word level recognition) or by segmenting a word into parts, usually letters, and classifying those parts (letter level recognition). Whichever method is used, current handwriting recognition systems cannot overcome the inherent ambiguity in writing without recourse to contextual information. This thesis presents a set of experiments that use Hidden Markov Models of language to resolve ambiguity in the classification process. It goes on to describe an algorithm designed to recognise a document written by a single author and to improve recognition by adapting to the writing style and learning new words. Learning and adaptation are achieved by reading the document over several iterations. The algorithm is designed to incorporate contextual processing, adaptation to modify the shape of known words, and learning of new words within a constrained dictionary. Adaptation occurs when a word that has previously been trained in the classifier is recognised at either the word or letter level and the word image is used to modify the classifier. Learning occurs when a new word that was not in the training set is recognised at the letter level and is subsequently added to the classifier. Words and letters are recognised using a nearest neighbour classifier with features based on the two-dimensional Fourier transform. By incorporating a measure of confidence based on the distribution of training points around an exemplar, adaptation and learning are constrained to occur only when a word is confidently classified. The algorithm was implemented and tested with a dictionary of 1000 words. Results show that adaptation of the letter classifier improved recognition on average by 3.9% with only 1.6% at the whole word level. Two experiments were carried out to evaluate the learning in the system. It was found that learning accounted for little improvement in the classification results and also that learning new words was prone to propagating misclassifications.

    Optimisation of a weightless neural network using particle swarms

    Among the numerous pattern recognition methods, the neural network approach has been the subject of much research because of its ability to learn from a given collection of representative examples. This thesis is concerned with the design of weightless neural networks, which decompose a given pattern into several sets of n points, termed n-tuples. Considerable research has shown that optimising the input connection mapping of such n-tuple networks can improve classification performance significantly. This thesis explores the application of a population-based stochastic optimisation technique, known as Particle Swarm Optimisation (PSO), to the optimisation of the connectivity pattern of such n-tuple classifiers. The research aimed to improve the discriminating power of the classifier in recognising handwritten characters by exploiting more efficient learning strategies. The proposed learning scheme searches the solution space for good input connections of the n-tuples and shrinks the search area step by step. It refines its search iteratively by attracting the particles to positions with good solutions. At every iteration the performance, or fitness, of each input connection is evaluated, so a reward-and-punishment-based fitness function was modelled for the task. The original PSO was refined by combining it with other bio-inspired approaches such as Self-Organized Criticality and Nearest Neighbour Interactions. The hybrid algorithms were adapted for the n-tuple system, and their performance in selecting better connectivity patterns was measured. The Genetic Algorithm (GA) has been shown to accomplish the same goals as the PSO, so the performance and convergence properties of the GA in optimising input connections were compared against those of the PSO. Experiments were conducted to evaluate the proposed methods by applying the trained classifiers to recognise handprinted digits from a widely used database. Results revealed the superiority of the particle swarm optimised training for the n-tuples over other algorithms including the GA. Low particle velocity in PSO was favourable for exploring more areas in the solution space and resulted in better recognition rates. Hybridisation was helpful, and one version of the hybrid PSO was found to be the best-performing algorithm for finding the optimum set of input maps for the n-tuple network.
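
    The iterative attraction of particles towards good positions can be sketched with the textbook global-best PSO on a continuous test function. This is not the thesis's n-tuple adaptation, its fitness function, or its hybrid variants; the function name `pso_minimise`, the parameter values and the sphere objective are all assumptions chosen for illustration.

```python
import random


def pso_minimise(fitness, dim, n_particles=20, iters=100,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Textbook global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                # each particle's best position so far
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm-wide best so far

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull towards personal best + pull towards global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


if __name__ == "__main__":
    sphere = lambda x: sum(v * v for v in x)  # toy objective, minimum at the origin
    best, value = pso_minimise(sphere, dim=2)
    print(best, value)
```

    Optimising a discrete connection map, as the abstract describes, would need a different position encoding and fitness function than this continuous version.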

    Training Methods for Shunting Inhibitory Artificial Neural Networks

    This project investigates a new class of high-order neural networks called shunting inhibitory artificial neural networks (SIANNs) and their training methods. SIANNs are biologically inspired neural networks whose dynamics are governed by a set of coupled nonlinear differential equations. The interactions among neurons are mediated via a nonlinear mechanism called shunting inhibition, which allows the neurons to operate as adaptive nonlinear filters. The project's main objective is to devise training methods, based on error-backpropagation-type algorithms, that allow SIANNs to be trained to perform feature extraction for classification and nonlinear regression tasks. The training algorithms developed will simplify the task of designing complex, powerful neural networks for applications in pattern recognition, image processing, signal processing, machine vision and control. The five training methods adapted in this project for SIANNs are error backpropagation based on gradient descent (GD), gradient descent with variable learning rate (GDV), gradient descent with momentum (GDM), gradient descent with direct solution step (GDD) and the APOLEX algorithm. SIANNs and these training methods are implemented in MATLAB. Testing on several benchmarks, including the parity problems, classification of 2-D patterns, and function approximation, shows that SIANNs trained using these methods yield performance comparable to or better than that of multilayer perceptrons (MLPs).
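
    Of the training methods listed, gradient descent with momentum (GDM) is the simplest to illustrate. The sketch below applies the generic momentum update to a toy quadratic rather than to a SIANN, and it is written in Python rather than the project's MATLAB; `gd_momentum`, the learning rate and the momentum coefficient are assumed values, not taken from the project.

```python
def gd_momentum(grad, w0, lr=0.1, momentum=0.9, steps=300):
    """Generic gradient descent with momentum on a weight vector."""
    w = list(w0)
    v = [0.0] * len(w)  # velocity: decaying sum of past gradient steps
    for _ in range(steps):
        g = grad(w)
        # New velocity keeps a fraction of the old one and adds the current step.
        v = [momentum * vi - lr * gi for vi, gi in zip(v, g)]
        w = [wi + vi for wi, vi in zip(w, v)]
    return w


if __name__ == "__main__":
    # Toy loss (w0 - 3)^2 + (w1 + 1)^2 with its analytic gradient.
    grad = lambda w: [2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)]
    print(gd_momentum(grad, [0.0, 0.0]))
```

    In a network, `grad` would be supplied by error backpropagation; here it is written out by hand to keep the example self-contained.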

    The design of a neural network compiler

    Computer simulation is a flexible and economical way to do rapid prototyping and concept evaluation with Neural Network (NN) models. Increasing research on NNs has led to the development of several simulation programs. Not all simulators have the same scope: some allow only a fixed network model, while others are more general. Designing a simulation program for general-purpose NN models has become a trend because of its flexibility and efficiency. A programming language designed specifically for NN models is preferred, since existing high-level languages such as C suit NN designers with a strong computing background. Programs in NN languages are translated by an interpreter, a compiler, or a combination of the two. There are also various styles of programming language, such as procedural, functional, descriptive and object-oriented. The main focus of this thesis is to study the feasibility of using a compiler method to develop a general-purpose simulator, NEUCOMP, that compiles a program written as a list of mathematical specifications of a particular NN model and translates it into a chosen target program. The language supported by NEUCOMP is based on a procedural style. The mathematical statements required by the NN model are written in the program as scalar, vector and matrix assignments, and NEUCOMP translates these expressions into actual program loops. NEUCOMP compiles a simulation program written in the NEUCOMP language for any NN model, provides graphical facilities such as portraying the NN architecture and displaying a graph of the results during training, and produces a program that can run on a parallel shared-memory multiprocessor system.
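
    To illustrate what translating a matrix assignment into program loops means, the sketch below writes out by hand the loops that a statement such as y = W x expands to. It does not reproduce NEUCOMP's syntax or its generated code, neither of which the abstract shows; the function name `matvec_loops` is hypothetical.

```python
def matvec_loops(W, x):
    """Explicit-loop form of the matrix-vector assignment y = W x."""
    rows, cols = len(W), len(x)
    y = [0.0] * rows
    for i in range(rows):          # one output element per matrix row
        acc = 0.0
        for j in range(cols):      # accumulate the dot product of row i with x
            acc += W[i][j] * x[j]
        y[i] = acc
    return y


if __name__ == "__main__":
    W = [[1.0, 2.0], [3.0, 4.0]]
    x = [1.0, 1.0]
    print(matvec_loops(W, x))
```

    The outer loop's iterations are independent of one another, which is the kind of structure a shared-memory multiprocessor target can exploit.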