Streaming algorithms for language recognition problems
We study the complexity of the following problems in the streaming model.
Membership testing for \DLIN: We show that every language in \DLIN\ can be
recognised by a randomized one-pass space algorithm with inverse
polynomial one-sided error, and by a deterministic p-pass space
algorithm. We show that these algorithms are optimal.
Membership testing for \LL(k): For languages generated by \LL(k) grammars
with a bound of on the number of nonterminals at any stage in the left-most
derivation, we show that membership can be tested by a randomized one-pass
space algorithm with inverse polynomial (in ) one-sided error.
Membership testing for \DCFL: We show that randomized algorithms as efficient
as the ones described above for \DLIN\ and \LL(k) (which are subclasses of
\DCFL) cannot exist for all of \DCFL: there is a language in \VPL\ (a subclass
of \DCFL) for which any randomized p-pass algorithm with error bounded by
must use space.
Degree sequence problem: We study the problem of determining, given a sequence
and a graph , whether the degree sequence of is
precisely . We give a randomized one-pass space
algorithm with inverse polynomial one-sided error probability. We show that our
algorithms are optimal.
Our randomized algorithms are based on the recent work of Magniez et al.
\cite{MMN09}; our lower bounds are obtained by considering related
communication complexity problems.
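The degree sequence check described above can be illustrated with a standard polynomial fingerprint. This is a minimal sketch and not necessarily the authors' algorithm: the stream encoding (`("deg", i, d)` and `("edge", u, v)` tuples with 1-based vertex indices), the function name `degree_sequence_matches`, and the choice of the Mersenne prime 2^61 - 1 as modulus are all assumptions made for illustration. The stream is read once and only a single residue is stored. A correct degree sequence is always accepted, so the error is one-sided; an incorrect one is wrongly accepted with probability at most n/P over the random choice of `r`.

```python
import random

# Assumed modulus: the Mersenne prime 2^61 - 1. Any prime much larger than
# the number of vertices would do for this sketch.
P = (1 << 61) - 1


def degree_sequence_matches(stream, r=None):
    """One-pass fingerprint test for the degree sequence problem.

    `stream` yields ("deg", i, d) for a claimed degree d of vertex i and
    ("edge", u, v) for an edge of the graph, in any order. The function keeps
    the value of the polynomial sum_i (d_i - deg(i)) * r^i modulo P, which is
    identically zero exactly when the claimed sequence is correct.
    """
    if r is None:
        r = random.randrange(1, P)  # random evaluation point
    fingerprint = 0
    for kind, a, b in stream:
        if kind == "deg":
            # Claimed degree b for vertex a contributes +b * r^a.
            fingerprint = (fingerprint + b * pow(r, a, P)) % P
        elif kind == "edge":
            # Each edge adds one to the degree of both endpoints.
            fingerprint = (fingerprint - pow(r, a, P) - pow(r, b, P)) % P
        else:
            raise ValueError(f"unknown stream item kind: {kind!r}")
    return fingerprint == 0
```

For example, streaming the claimed degrees 2, 2, 2 followed by the three edges of a triangle is accepted, while the same claim followed by the two edges of a path on three vertices is rejected for almost every choice of `r`.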
Accepting grammars and systems
We investigate several kinds of regulated rewriting (programmed,
matrix, with regular control, ordered, and variants thereof) and
of parallel rewriting mechanisms (Lindenmayer systems, uniformly
limited Lindenmayer systems, limited Lindenmayer systems and
scattered context grammars) as accepting devices, in contrast
with the usual generating mode.
In some cases, accepting mode turns out to be just as powerful as
generating mode, e.g. within the grammars of the Chomsky
hierarchy, within random context, regular control, L systems,
uniformly limited L systems, scattered context. Most of these
equivalences can be proved using a metatheorem on so-called
context condition grammars. In the case of matrix grammars and
programmed grammars without appearance checking, a straightforward
construction leads to the desired equivalence result.
Interestingly, accepting devices are (strictly) more powerful than
their generating counterparts in the case of ordered grammars,
programmed and matrix grammars with appearance checking (even
programmed grammars with unconditional transfer), and 1lET0L
systems. More precisely, if we admit erasing productions, we
arrive at new characterizations of the recursively enumerable
languages, and if we do not admit them, we get new
characterizations of the context-sensitive languages.
Moreover, we supplement the published literature showing:
- The emptiness and membership problems are recursively solvable
for generating ordered grammars, even if we admit erasing
productions.
- Uniformly limited propagating systems can be simulated by
programmed grammars without erasing and without appearance
checking, hence the emptiness and membership problems are
recursively solvable for such systems.
- We briefly discuss the degree of nondeterminism and the
degree of synchronization for devices with limited parallelism.
Regulated Formal Models and Their Reduction
Department of Theoretical Computer Science and Mathematical Logic, Faculty of Mathematics and Physics
Complexity and modeling power of insertion-deletion systems
The central objects of this thesis are insertion-deletion systems and their computational
power. More specifically, we study language-generating models that use two string
rewriting operations, contextual insertion and contextual deletion, and their
extensions. We also consider a distributed variant of insertion-deletion systems in the
sense that the rules are distributed among a finite number of nodes of a graph. Such
systems are referred to as graph-controlled systems. These systems appear in many
areas of Computer Science and they play an important role in formal languages,
linguistics, and bioinformatics. We vary the parameters in the size vector of
insertion-deletion systems and study the decidability/universality of the resulting models.
More precisely, we answer the most important question regarding the expressiveness
of the computational model: whether our model is Turing equivalent or not. We
systematically approach the questions about the minimal sizes of insertion-deletion
systems with and without graph control.
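The two operations named above can be sketched in a few lines. The sketch assumes the usual textbook form of a rule as a triple (u, x, v): an insertion rule places x between an adjacent u and v, and a deletion rule erases x when it occurs between u and v. The function names `insert_step` and `delete_step` are hypothetical, and the thesis's extensions and graph control are not modeled.

```python
def insert_step(word, rule):
    """Return all words obtained by one application of insertion rule (u, x, v).

    The rule rewrites w1 u v w2 into w1 u x v w2. Empty contexts u and v
    give context-free insertion at every position.
    """
    u, x, v = rule
    context = u + v
    results = set()
    for i in range(len(word) - len(context) + 1):
        if word.startswith(context, i):
            cut = i + len(u)  # position between u and v
            results.add(word[:cut] + x + word[cut:])
    return results


def delete_step(word, rule):
    """Return all words obtained by one application of deletion rule (u, x, v).

    The rule rewrites w1 u x v w2 into w1 u v w2.
    """
    u, x, v = rule
    pattern = u + x + v
    results = set()
    for i in range(len(word) - len(pattern) + 1):
        if word.startswith(pattern, i):
            start = i + len(u)  # where the erased factor x begins
            results.add(word[:start] + word[start + len(x):])
    return results
```

With the rule ("a", "c", "b"), `insert_step("ab", rule)` yields the single word "acb", and `delete_step("acb", rule)` returns to "ab". The lengths of u, x, and v in the insertion and deletion rules are the kind of size parameters whose minimal values the abstract refers to.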
Syntax-based machine translation using dependency grammars and discriminative machine learning
Machine translation has improved enormously since the groundbreaking
introduction of statistical methods in the early 2000s, going from very
domain-specific systems that still performed relatively poorly despite the
painstaking crafting of thousands of ad hoc rules, to general-purpose
systems automatically trained on large collections of bilingual texts which
manage to deliver understandable translations that convey the general
meaning of the original input.
These approaches, however, still perform well below the level of human
translators, typically failing to convey detailed meaning and register, and
producing translations that, while readable, are often ungrammatical and
unidiomatic.
This quality gap, which is considerably larger than in most other
natural language processing tasks, has been the focus of research in
recent years, with the development of increasingly sophisticated models that
attempt to exploit the syntactic structure of human languages, leveraging
the technology of statistical parsers, as well as advanced machine learning
methods such as margin-based structured prediction algorithms and neural
networks.
The translation software itself has become more complex in order to accommodate
the sophistication of these advanced models: the main translation
engine (the decoder) is now often combined with a pre-processor which
reorders the words of the source sentences into a target-language word order, or
with a post-processor that ranks and selects a translation, according
to a fine model, from a list of candidate translations generated by a coarse
model.
In this thesis we investigate the statistical machine translation problem
from various angles, focusing on translation from non-analytic languages
whose syntax is best described by fluid non-projective dependency grammars
rather than the relatively strict phrase-structure grammars or projective dependency
grammars which are most commonly used in the literature.
We propose a framework for modeling word reordering phenomena
between language pairs as transitions on non-projective source dependency
parse graphs. We quantitatively characterize reordering phenomena for the
German-to-English language pair as captured by this framework, specifically
investigating the incidence and effects of the non-projectivity of source
syntax and the non-locality of word movement w.r.t. the graph structure.
We evaluate several variants of hand-coded pre-ordering rules in order to
assess the impact of these phenomena on translation quality.
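Non-projectivity, as used above, has a simple operational definition: an arc from a head to a dependent is projective when every word lying between them is a descendant of the head. The sketch below lists the arcs that violate this condition. It is an illustration of the standard definition and not the thesis's framework: the `heads` encoding (1-based head index per word, 0 for the artificial root) and the function name `non_projective_arcs` are assumptions, and the input is assumed to be a well-formed tree without cycles.

```python
def non_projective_arcs(heads):
    """Return the (head, dependent) arcs that are non-projective.

    heads[i] is the 1-based index of the head of word i + 1, with 0 standing
    for the artificial root. The input is assumed to encode a tree.
    """

    def is_descendant(node, ancestor):
        # Walk up the head chain from `node` until the root is reached.
        while node != 0:
            if node == ancestor:
                return True
            node = heads[node - 1]
        return ancestor == 0  # every word descends from the artificial root

    arcs = []
    for dep in range(1, len(heads) + 1):
        head = heads[dep - 1]
        lo, hi = sorted((head, dep))
        # The arc is non-projective if some word strictly between head and
        # dependent is not dominated by the head.
        if any(not is_descendant(k, head) for k in range(lo + 1, hi)):
            arcs.append((head, dep))
    return arcs
```

For `heads = [0, 4, 1, 1]` the arc from word 4 to word 2 crosses word 3, which is not dominated by word 4, so the function returns `[(4, 2)]`. Counting such arcs over a parsed corpus is one simple way to measure the incidence of non-projectivity.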
We propose a class of dependency-based source pre-ordering approaches
that reorder sentences based on flexible models trained by SVMs and
several recurrent neural network architectures.
We also propose a class of translation reranking models, both syntax-free
and source dependency-based, which make use of a type of neural network
known as the graph echo state network, which is highly flexible and requires
very few training resources, overcoming one of the main limitations
of neural network models for natural language processing tasks.