127 research outputs found

    A New Constructive Method for the One-Letter Context-Free Grammars

    Constructive methods for obtaining regular-grammar counterparts of some subclasses of the context-free grammars (CFGs) have been investigated by many researchers. An important class of grammars for which this is always possible is the one-letter CFGs. In this paper we present a new constructive method for transforming an arbitrary one-letter CFG into an equivalent regular expression of star height 0 or 1. Our construction is considerably simpler than a previous one by Leiss, and we also propose a new normal form for regular expressions with a single star occurrence. Through an alphabet factorization theorem, we show how to go beyond the one-letter CFGs in a straightforward way.
    Singapore-MIT Alliance (SMA)
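
    To make the one-letter case concrete, here is a small self-contained check built on an example chosen for this note rather than taken from the paper: the unary CFG S -> a S a | a generates exactly the odd-length strings a^(2n+1), which the star-height-1 regular expression a(aa)* also describes. The script below verifies that the two agree up to a bounded length using a fixed-point computation over string lengths.

        import re

        # Example one-letter CFG (illustrative only, not the paper's construction):
        #   S -> a S a | a
        # Over the alphabet {a} it generates a^(2n+1), i.e. the odd-length strings,
        # matching the star-height-1 regular expression a(aa)*.

        MAX_LEN = 200

        def cfg_lengths(max_len):
            """Lengths of strings derivable from S, computed by fixed-point iteration."""
            lengths = set()
            while True:
                candidates = {1} | {n + 2 for n in lengths}   # S -> a  and  S -> a S a
                new = {n for n in candidates if n <= max_len} - lengths
                if not new:
                    return lengths
                lengths |= new

        regex = re.compile(r"a(aa)*\Z")
        regex_lengths = {n for n in range(1, MAX_LEN + 1) if regex.match("a" * n)}

        assert cfg_lengths(MAX_LEN) == regex_lengths
        print("CFG and regular expression agree on all strings up to length", MAX_LEN)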

    On Complexity Classes of Spiking Neural P Systems

    A number of papers have recently been published pointing out various intractable problems that can be solved, in certain fashions, within the framework of spiking neural (SN) P systems. On the other hand, there are also results demonstrating the limitations of SN P systems. In this paper we define recognizer SN P systems, which provide a general platform for this type of result. We intend to give a more systematic characterization of the computational power of variants of SN P systems and to establish their relation to standard complexity classes.

    Regular Languages: To Finite Automata and Beyond - Succinct Descriptions and Optimal Simulations

    It is well known that regular (type 3) languages are equivalent to finite automata. Nevertheless, many other characterizations of this class of languages in terms of computational devices and generative models are present in the literature. For example, by suitably restricting more general models, such as context-free grammars, pushdown automata, and Turing machines, which characterize wider classes of languages, it is possible to obtain formal models that generate or recognize regular languages only. The resulting formalisms provide alternative representations of type 3 languages that may be significantly more concise than other models sharing the same expressive power. The goal of this work is to investigate these formal systems from a descriptional complexity perspective, that is, to study the relationships between their sizes, namely the number of symbols used to write down their descriptions. We also present some results related to the famous question, still open, posed by Sakoda and Sipser in 1978, concerning the cost, in terms of the number of states, of eliminating nondeterminism from finite automata by exploiting the ability of two-way deterministic finite automata to move their head back and forth on the input tape.
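
    As a concrete illustration of such size blowups (an example chosen for this note, not taken from the thesis), consider the classic family of languages "the k-th symbol from the end is 1" over {0,1}: a nondeterministic finite automaton needs only k+1 states, while the standard subset construction yields a one-way deterministic automaton with 2^k reachable states. The sketch below counts those reachable subsets.

        # Illustrative sketch, not from the thesis: determinization blowup for the
        # language "the k-th symbol from the end is 1" over the alphabet {0, 1}.
        # The NFA has states 0..k (0 = start, k = accepting); the subset
        # construction below counts the DFA states reachable from {0}.

        def reachable_dfa_states(k):
            def nfa_step(subset, symbol):
                nxt = set()
                for q in subset:
                    if q == 0:
                        nxt.add(0)                # loop in the start state on 0 and 1
                        if symbol == "1":
                            nxt.add(1)            # guess: this 1 is k-th from the end
                    elif q < k:
                        nxt.add(q + 1)            # count the remaining symbols
                return frozenset(nxt)

            start = frozenset({0})
            seen, stack = {start}, [start]
            while stack:
                subset = stack.pop()
                for symbol in "01":
                    nxt = nfa_step(subset, symbol)
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
            return len(seen)

        for k in range(1, 8):
            print(f"k={k}: NFA states = {k + 1}, reachable DFA states = {reachable_dfa_states(k)}")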

    Modeling Dependencies in Natural Languages with Latent Variables

    In this thesis, we investigate the use of latent variables to model complex dependencies in natural languages. Traditional models, which have a fixed parameterization, often make strong independence assumptions that lead to poor performance. This problem is often addressed by incorporating additional dependencies into the model (e.g., using higher-order N-grams for language modeling). These added dependencies can increase data sparsity and/or require expert knowledge, together with trial and error, to identify and incorporate the most important dependencies (as in lexicalized parsing models). Traditional models, when developed for a particular genre, domain, or language, are also often difficult to adapt to another. In contrast, previous work has shown that latent variable models, which automatically learn dependencies in a data-driven way, are able to flexibly adjust the number of parameters based on the type and amount of training data available.

    We have created several different types of latent variable models for a diverse set of natural language processing applications, including novel models for part-of-speech tagging, language modeling, and machine translation, and an improved model for parsing. These models perform significantly better than traditional models.

    We have also created and evaluated three different methods for improving the performance of latent variable models. While these methods can be applied to any of our applications, we focus our experiments on parsing. The first method involves self-training: we train models using a combination of gold-standard training data and a large amount of automatically labeled training data. We conclude from a series of experiments that latent variable models benefit much more from self-training than conventional models, apparently because of their flexibility to adjust their parameterization and learn more accurate models from the additional automatically labeled training data. The second method takes advantage of the variability among latent variable models to combine multiple models for enhanced performance. We investigate several different training protocols for combining self-training with model combination and conclude that the two techniques are complementary and can be combined effectively to train very high quality parsing models. The third method replaces the generative multinomial lexical model of latent variable grammars with a feature-rich log-linear lexical model, providing a principled way to address data sparsity, handle out-of-vocabulary words, and exploit overlapping features during model induction. Experiments show that the resulting grammars are able to parse three different languages effectively.

    This work contributes to natural language processing by creating flexible and effective latent variable models for several different languages. Our investigation of self-training, model combination, and log-linear models also provides insights into the effective application of these machine learning techniques to other disciplines.
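
    To illustrate the general shape of the third method's lexical model, here is a minimal sketch of a log-linear (maximum-entropy) lexical distribution, P(word | latent tag) proportional to exp(theta . f(word, tag)). The vocabulary, latent tags, features, and weights below are invented for illustration; they are not the thesis's grammar, feature set, or training procedure.

        import math
        from collections import defaultdict

        # Minimal log-linear lexical model sketch (illustrative assumptions only):
        #   P(word | tag) = exp(theta . f(word, tag)) / sum over w' of exp(theta . f(w', tag))
        # Overlapping suffix/shape features let the same machinery score words
        # never seen with a given latent tag, including out-of-vocabulary forms.

        VOCAB = ["the", "dog", "dogs", "barked", "unseenword"]
        TAGS = ["DT_0", "NN_1", "VB_2"]        # hypothetical latent-annotated tags

        def features(word, tag):
            feats = [f"word={word}|tag={tag}", f"suffix2={word[-2:]}|tag={tag}"]
            if word.endswith("s"):
                feats.append(f"plural|tag={tag}")
            return feats

        # Hypothetical weights; a real model would learn these during grammar induction.
        theta = defaultdict(float, {
            "word=the|tag=DT_0": 2.0,
            "suffix2=gs|tag=NN_1": 1.5,
            "plural|tag=NN_1": 1.0,
            "suffix2=ed|tag=VB_2": 1.8,
        })

        def p_word_given_tag(word, tag):
            def score(w):
                return math.exp(sum(theta[f] for f in features(w, tag)))
            return score(word) / sum(score(w) for w in VOCAB)

        for tag in TAGS:
            print(tag, {w: round(p_word_given_tag(w, tag), 3) for w in VOCAB})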