25 research outputs found

Stable Encoding of Large Finite-State Automata in Recurrent Neural Networks with Sigmoid Discriminants

Authors: Giles C. Lee; Omlin Christian W.
Publication date: 15/10/1998

We propose an algorithm for encoding deterministic finite-state automata (DFAs) in second-order recurrent neural networks with sigmoidal discriminant functions, and we prove that the languages accepted by the constructed network and the DFA are identical. The desired finite-state network dynamics is achieved by programming a small subset of all weights. A worst-case analysis reveals a relationship between the weight strength and the maximum allowed network size which guarantees finite-state behavior of the constructed network. We illustrate the method by encoding random DFAs with 10, 100, and 1,000 states. While the theory predicts that the weight strength scales with the DFA size, we find the weight strength to be almost constant for all the experiments. These results can be explained by noting that the generated DFAs represent average cases. We empirically demonstrate the existence of extreme DFAs for which the weight strength scales with DFA size. (Also cross-referenced as UMIACS-TR-94-101.)

Digital Repository at the University of Maryland

Provably Stable Interpretable Encodings of Context Free Grammars in RNNs with a Differentiable Stack

Authors: Giles C Lee; Mali Ankur; Stogin John
Publication date: 10/06/2020

Given a collection of strings belonging to a context free grammar (CFG) and another collection of strings not belonging to the CFG, how might one infer the grammar? This is the problem of grammatical inference. Since the languages generated by CFGs are exactly those recognized by pushdown automata (PDAs), it suffices to determine the state transition rules and stack action rules of the corresponding PDA. One approach would be to train a recurrent neural network (RNN) to classify the sample data and attempt to extract these PDA rules. But neural networks are not a priori aware of the structure of a PDA and would likely require many samples to infer this structure. Furthermore, extracting the PDA rules from the RNN is nontrivial. We build an RNN specifically structured like a PDA, where weights correspond directly to the PDA rules. This requires a stack architecture that is differentiable (to enable gradient-based learning) and stable (an unstable stack will show deteriorating performance with longer strings). We propose a stack architecture that is differentiable and that provably exhibits orbital stability. Using this stack, we construct a neural network that provably approximates a PDA for strings of arbitrary length. Moreover, our model and method of proof can easily be generalized to other state machines, such as a Turing machine. Comment: 20 pages, 2 figures

arXiv.org e-Print Archive

Constructing Deterministic Finite-State Automata in Recurrent Neural Networks

Authors: Giles C. Lee; Omlin Christian W.
Publication date: 15/10/1998

Recurrent neural networks that are trained to behave like deterministic finite-state automata (DFAs) can show deteriorating performance when tested on long strings. This deteriorating performance can be attributed to the instability of the internal representation of the learned DFA states. The use of a sigmoidal discriminant function together with the recurrent structure contributes to this instability. We prove that a simple algorithm can construct second-order recurrent neural networks with a sparse interconnection topology and sigmoidal discriminant function such that the internal DFA state representations are stable, i.e. the constructed network correctly classifies strings of arbitrary length. The algorithm is based on encoding strengths of weights directly into the neural network. We derive a relationship between the weight strength and the number of DFA states for robust string classification. For a DFA with n states and m input alphabet symbols, the constructive algorithm generates a "programmed" neural network with O(n) neurons and O(mn) weights. We compare our algorithm to other methods proposed in the literature. Revised in February 1996. (Also cross-referenced as UMIACS-TR-95-50.)

Digital Repository at the University of Maryland
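
The construction summarized in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the specific weight rule (+H into the target-state neuron, -H on the neuron of a state being left, bias -H/2, and a separate response neuron), the value H = 10, all function names, and the example DFA are assumptions made for this sketch. It programs O(mn) weights for a DFA with n states and m symbols and leaves every other weight at zero.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def encode_dfa(delta, n_states, n_symbols, accepting, H=10.0):
    """Program second-order weights W[i][j][k] from a DFA transition table.

    Neuron 0 is a response (accept/reject) neuron; neuron q + 1 stands for
    DFA state q. delta[q][k] is the successor of state q on symbol k.
    H is an assumed weight strength, not a value taken from the abstract.
    """
    n = n_states + 1
    W = [[[0.0] * n_symbols for _ in range(n)] for _ in range(n)]
    b = [-H / 2.0] * n
    for q in range(n_states):
        for k in range(n_symbols):
            target = delta[q][k]
            if target != q:
                W[q + 1][q + 1][k] = -H      # switch off the state being left
            W[target + 1][q + 1][k] = H      # switch on the successor state
            W[0][q + 1][k] = H if target in accepting else -H
    return W, b


def run_network(W, b, symbols, start, accepting):
    """Run the programmed network; return True if the response neuron fires."""
    n = len(b)
    S = [0.0] * n
    S[start + 1] = 1.0
    S[0] = 1.0 if start in accepting else 0.0
    for k in symbols:
        S = [sigmoid(b[i] + sum(W[i][j][k] * S[j] for j in range(n)))
             for i in range(n)]
    return S[0] > 0.5


def run_dfa(delta, symbols, start, accepting):
    """Reference DFA simulation used to check the network."""
    q = start
    for k in symbols:
        q = delta[q][k]
    return q in accepting


if __name__ == "__main__":
    # Hypothetical example DFA: count the 1s modulo 3, accept when the count is 0.
    delta = [[0, 1], [1, 2], [2, 0]]
    W, b = encode_dfa(delta, n_states=3, n_symbols=2, accepting={0})
    s = [1, 0, 1, 1, 0]
    print(run_network(W, b, s, 0, {0}), run_dfa(delta, s, 0, {0}))
```

Whether a given H keeps the state neurons close to 0 and 1 over long strings depends on the DFA; the abstracts above state that the required weight strength is related to the number of states, so the fixed H here should be read as a convenience for small examples only.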

Symbolic and connectionist learning techniques for grammatical inference

Author: Alquézar Mancho René
Publication venue: Universitat Politècnica de Catalunya
Publication date: 13/02/2008

This thesis is structured in four parts for a total of ten chapters. The first part, introduction and review (Chapters 1 to 4), presents an extensive state-of-the-art review of both symbolic and connectionist GI methods, which also serves to state most of the basic material needed to describe the contributions of the thesis later. These contributions constitute the contents of the remaining parts (Chapters 5 to 10). The second part, contributions on symbolic and connectionist techniques for regular grammatical inference (Chapters 5 to 7), describes the contributions related to the theory and methods for regular GI, which include other lateral subjects such as the representation of finite-state machines (FSMs) in recurrent neural networks (RNNs). The third part of the thesis, augmented regular expressions and their inductive inference, comprises Chapters 8 and 9. The augmented regular expressions (or AREs) are defined and proposed as a new representation for a subclass of CSLs that does not contain all the context-free languages but a large class of languages capable of describing patterns with symmetries and other (context-sensitive) structures of interest in pattern recognition problems. The fourth part of the thesis just includes Chapter 10: conclusions and future research. Chapter 10 summarizes the main results obtained and points out the lines of further research that should be followed both to deepen some of the theoretical aspects raised and to facilitate the application of the developed GI tools to real-world problems in the area of computer vision.

Tesis Doctorals en Xarxa

Combined optimization algorithms applied to pattern classification

Author: Lappas Georgios
Publication date: 01/01/2006

Accurate classification by minimizing the error on test samples is the main goal in pattern classification. Combinatorial optimization is a well-known method for solving minimization problems; however, only a few examples of classifiers are described in the literature where combinatorial optimization is used in pattern classification. Recently, there has been a growing interest in combining classifiers and improving the consensus of results for greater accuracy. In the light of the "No Free Lunch Theorems", we analyse the combination of simulated annealing, a powerful combinatorial optimization method that produces high quality results, with the classical perceptron algorithm. This combination is called the LSA machine. Our analysis aims at finding paradigms for problem-dependent parameter settings that ensure high classification results. Our computational experiments on a large number of benchmark problems lead to results that either outperform or are at least competitive with results published in the literature. Apart from parameter settings, our analysis focuses on a difficult problem in computation theory, namely the network complexity problem. The depth vs size problem of neural networks is one of the hardest problems in theoretical computing, with very little progress over the past decades. In order to investigate this problem, we introduce a new recursive learning method for training hidden layers in constant depth circuits. Our findings make contributions to a) the field of Machine Learning, as the proposed method is applicable in training feedforward neural networks, and to b) the field of circuit complexity by proposing an upper bound for the number of hidden units sufficient to achieve a high classification rate.
One of the major findings of our research is that the size of the network can be bounded by the input size of the problem and an approximate upper bound of 8 × √(2^n / n) threshold gates as being sufficient for a small error rate, where n := log |S_L| and S_L is the training set.

University of Hertfordshire Research Archive

The application of artificial neural networks to interpret acoustic emissions from submerged arc welding

Author: McCardle John Richard
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/1997

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Automated fusion welding processes play a fundamental role in modern manufacturing industries. The proliferation of joint geometries together with the large permutation of associated process variable configurations has given rise to research into complex system modelling and control strategies. Many of these techniques have involved monitoring of not only the electrical characteristics of the process but visual and acoustic information. Acoustic information derived from certain welding processes is well documented as it is an established fact that skilled manual welders utilise such information as an aid to creating an optimum weld. The experimental investigation presented in this thesis is dedicated to the feasibility of monitoring airborne acoustic emissions of Submerged Arc Welding (SAW) for diagnostic and real time control purposes. The experimental method adopted for this research takes a cybernetic approach to data processing and interpretation in an attempt to replicate the robustness of human biological functions. A custom designed audio hardware system was used to analyse signals obtained from bead on mild steel plate fusion welds. Time and frequency domains were used in an attempt to establish salient characteristics or identify the signatures associated with changes of the process variables. The featured parameters were voltage/current and weld travel speed, due to their ease of validation. However, consideration has also been given to weld defect prediction due to process instabilities. As the data proved to be highly correlated and erratic when subjected to off line statistical analysis, extensive investigation was given to the application of artificial neural networks to signal processing and real time control scenarios.
As a consequence, a dedicated neural based software system was developed, utilising supervised and unsupervised neural techniques to monitor the process. The research was aimed at proving the feasibility of monitoring the electrical process parameters and stability of the welding process in real time. It was shown to be possible, by the exploitation of artificial neural networks, to generate a number of monitoring parameters indicative of the welding process state. The limitations of the present neural method and proposed developments are discussed, together with an overview of applied neural network technology and its impact on artificial intelligence and robotic control. Further developments are considered together with recommendations for future areas of research

Brunel University Research Archive

Mining a Small Medical Data Set by Integrating the Decision Tree and t-test

Author: Chang Ming-Yang
Publication venue: Academy Publisher

Although several researchers have used statistical methods to prove that aspiration followed by the injection of 95% ethanol left in situ (retention) is an effective treatment for ovarian endometriomas, very few discuss the different conditions that could generate different recovery rates for the patients. Therefore, this study adopts the statistical method and decision tree techniques together to analyze the postoperative status of ovarian endometriosis patients under different conditions. Since our collected data set is small, containing only 212 records, we use all of these data as the training data. Therefore, instead of using a resultant tree to generate rules directly, we use the value of each node as a cut point to generate all possible rules from the tree first. Then, using the t-test, we verify the rules to discover some useful description rules after all possible rules from the tree have been generated. Experimental results show that our approach can find some new interesting knowledge about recurrent ovarian endometriomas under different conditions.

Tamkang University Institutional Repository

State of the Art in Face Recognition

Publication venue: IntechOpen
Publication date: 20/04/2021

Notwithstanding the tremendous effort to solve the face recognition problem, it is not yet possible to design a face recognition system with a potential close to human performance. New computer vision and pattern recognition approaches need to be investigated. Even new knowledge and perspectives from different fields like psychology and neuroscience must be incorporated into the current field of face recognition to design a robust face recognition system. Indeed, many more efforts are required to end up with a human-like face recognition system. This book tries to reduce the gap between the previous face recognition research state and the future state.

Directory of Open Access Books (DOAB)

A novel approach to handwritten character recognition

Author: Clarke Eddie
Publication date: 01/01/1995

A number of new techniques and approaches for off-line handwritten character recognition are presented which individually make significant advancements in the field. First, an outline-based vectorization algorithm is described which gives improved accuracy in producing vector representations of the pen strokes used to draw characters. Later, vectorization and other types of preprocessing are criticized and an approach to recognition is suggested which avoids separate preprocessing stages by incorporating them into later stages. Apart from the increased speed of this approach, it allows more effective alteration of the character images since more is known about them at the later stages. It also allows the possibility of alterations being corrected if they are initially detrimental to recognition. A new feature measurement, the Radial Distance/Sector Area feature, is presented which is highly robust, tolerant to noise, distortion and style variation, and gives high accuracy results when used for training and testing in a statistical or neural classifier. A very powerful classifier is therefore obtained for recognizing correctly segmented characters. The segmentation task is explored in a simple system of integrated over-segmentation, character classification and approximate dictionary checking. This can be extended to a full system for handprinted word recognition. In addition to the advancements made by these methods, a powerful new approach to handwritten character recognition is proposed as a direction for future research. This proposal combines the ideas and techniques developed in this thesis in a hierarchical network of classifier modules to achieve context-sensitive, off-line recognition of handwritten text. A new type of "intelligent" feedback is used to direct the search to contextually sensible classifications. A powerful adaptive segmentation system is proposed which, when used as the bottom layer in the hierarchical network, allows initially incorrect segmentations to be adjusted according to the hypotheses of the higher level context modules.

Nottingham eTheses