A non-learnable class of E-pattern languages
We investigate the inferrability of E-pattern languages (also known as extended
or erasing pattern languages) from positive data in Gold’s learning model. As the
main result, our analysis yields a negative outcome for the full class of E-pattern
languages – and even for the subclass of terminal-free E-pattern languages – if the
corresponding terminal alphabet consists of exactly two distinct letters. Furthermore,
we present a positive result for a manifest subclass of terminal-free E-pattern
languages. We point out that the considered problems are closely related to fundamental
questions concerning the nondeterminism of E-pattern languages.
Reflective inductive inference of recursive functions
In this paper, we investigate reflective inductive inference of recursive functions. A reflective inductive inference machine (IIM) is a learning machine that is additionally able to assess its own competence. First, we formalize reflective learning from arbitrary, and from canonical, example sequences. Here, we arrive at four different types of reflection: reflection in the limit, optimistic, pessimistic and exact reflection. Then, we compare the learning power of reflective IIMs with each other as well as with that of standard IIMs for learning in the limit, for consistent learning of three different types, and for finite learning.
Discontinuities in pattern inference
This paper deals with the inferrability of classes of E-pattern languages—also referred
to as extended or erasing pattern languages—from positive data in Gold’s
model of identification in the limit. The first main part of the paper shows that
the recently presented negative result on terminal-free E-pattern languages over binary
alphabets does not hold for other alphabet sizes, so that the full class of these
languages is inferrable from positive data if and only if the corresponding terminal
alphabet does not consist of exactly two distinct letters. The second main part yields
the insight that the positive result on terminal-free E-pattern languages over alphabets
with three or four letters cannot be extended to the class of general E-pattern
languages. With regard to larger alphabets, the extensibility remains open.
The proof methods developed for these main results do not directly discuss the
(non-)existence of appropriate learning strategies, but they deal with structural
properties of classes of E-pattern languages, and, in particular, with the problem
of finding telltales for these languages. It is shown that the inferrability of classes
of E-pattern languages is closely connected to some problems on the ambiguity
of morphisms so that the technical contributions of the paper largely consist of
combinatorial insights into morphisms in word monoids.
Local Patterns
A pattern is a word consisting of constants from an alphabet Sigma of terminal symbols and variables from a set X. Given a pattern alpha, the decision problem of whether a given word w can be obtained by substituting words over Sigma for the variables in alpha is called the matching problem. While this problem is, in general, NP-complete, several classes of patterns for which it can be solved efficiently are already known. We present two new classes of patterns, called k-local and strongly nested, and show that the respective matching problems, as well as membership, can be solved efficiently for any fixed k.
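The matching problem described in this abstract can be illustrated with a small backtracking sketch. This is not the k-local or strongly nested algorithm from the paper; it is the naive exponential-time search, included only to make the problem statement concrete. The encoding is an assumption made for illustration: uppercase letters stand for variables, all other characters are terminals, and the hypothetical `erasing` flag switches between non-erasing substitutions and erasing ones (as in E-pattern languages).

```python
def matches(pattern: str, word: str, erasing: bool = False) -> bool:
    """Naive backtracking test for the pattern matching problem.

    Assumed encoding (illustrative only): uppercase letters are variables,
    every other character is a terminal. Each variable must be replaced by
    the same word at all of its occurrences. With erasing=True a variable
    may be replaced by the empty word.
    """

    def solve(i: int, j: int, binding: dict) -> bool:
        # i: position in the pattern, j: position in the word
        if i == len(pattern):
            return j == len(word)
        sym = pattern[i]
        if not sym.isupper():
            # Terminal symbol: must match the next letter of the word.
            return j < len(word) and word[j] == sym and solve(i + 1, j + 1, binding)
        if sym in binding:
            # Variable seen before: its image is already fixed.
            image = binding[sym]
            return word.startswith(image, j) and solve(i + 1, j + len(image), binding)
        # New variable: try every admissible image starting at position j.
        min_len = 0 if erasing else 1
        for end in range(j + min_len, len(word) + 1):
            binding[sym] = word[j:end]
            if solve(i + 1, end, binding):
                return True
            del binding[sym]
        return False

    return solve(0, 0, {})


if __name__ == "__main__":
    print(matches("XaX", "bab"))              # X -> "b"
    print(matches("XX", "aba"))               # no consistent substitution
    print(matches("XaY", "a", erasing=True))  # X and Y both erased
```

The loop over candidate images is where the exponential blow-up comes from in the general case; restricted pattern classes such as those named in the abstract are of interest precisely because they avoid it.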
Revisiting Shinohara's algorithm for computing descriptive patterns
A pattern α is a word consisting of constants and variables and it describes the pattern language L(α) of all words that can be obtained by uniformly replacing the variables with constant words. In 1982, Shinohara presented an algorithm that computes a pattern that is descriptive for a finite set S of words, i.e., its pattern language contains S in the closest possible way among all pattern languages. We generalise Shinohara’s algorithm to subclasses of patterns and characterise those subclasses for which it is applicable. Furthermore, within this set of pattern classes, we characterise those for which Shinohara’s algorithm has a polynomial running time (under the assumption P ≠ NP). Moreover, we also investigate the complexity of the consistency problem of patterns, i.e., finding a pattern that separates two given finite sets of words.
When Children Chat with Machine Translated Text: Problems, Possibilities, Potential
Two cross-lingual (Nepalese and English) letter exchanges took place between school children from Nepal and England, using Digipal, an Android chatting application. Digipal uses Google Translate to enable children to read and reply in their native language. In two studies we analysed the errors made and the effect of errors on children’s understanding and on the flow of conversation. We found that errors of input negatively affected translation, although this can be reduced through initial grammar cleaning. We highlight features of children’s text that cause errors in translation whilst showing how children worked with and around these errors. Errors sometimes added humour and contributed to continuing the conversations.
Memoization Attacks and Copy Protection in Partitioned Applications
Application source code protection is a major concern for software architects today. Secure platforms have been proposed that protect the secrecy of application algorithms and enforce copy protection assurances. Unfortunately, these capabilities incur a sizeable performance overhead. Partitioning an application into secure and insecure regions can help diminish these overheads but invalidates guarantees of code secrecy and copy protection. This work examines one of the problems of partitioning an application into public and private regions, the ability of an adversary to recreate those private regions. To our knowledge, it is the first to analyze this problem when considering application operation as a whole. Looking at the fundamentals of the issue, we analyze one of the simplest attacks possible, a "Memoization Attack." We implement an efficient Memoization Attack and discuss necessary techniques that limit storage and computation consumption. Experimentation reveals that certain classes of real-world applications are vulnerable to Memoization Attacks. To protect against such an attack, we propose a set of indicator tests that enable an application designer to identify susceptible application code regions.
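The abstract does not spell out how the attack works, so the following is only a rough sketch under a stated assumption: that a memoization attack amounts to recording the input/output behaviour of a private region while a legitimate copy runs, then replaying the recorded outputs without the private code. The class `MemoizationAttack`, its methods, and the `secret` function are all hypothetical names introduced for illustration and are not taken from the paper.

```python
class MemoizationAttack:
    """Illustrative table-based replay of a private function (assumed model)."""

    def __init__(self):
        # Maps observed argument tuples to the results the private code returned.
        self.table = {}

    def observe(self, private_fn, args: tuple):
        """Run the private function once and record its input/output pair."""
        result = private_fn(*args)
        self.table[args] = result
        return result

    def replay(self, args: tuple):
        """Answer from the table alone; returns None for inputs never observed."""
        return self.table.get(args)

    def coverage(self, queries) -> float:
        """Fraction of the given queries that the table can answer."""
        queries = list(queries)
        if not queries:
            return 0.0
        hits = sum(1 for q in queries if q in self.table)
        return hits / len(queries)


if __name__ == "__main__":
    # Hypothetical stand-in for code running in the protected region.
    def secret(x, y):
        return x * 31 + y

    attack = MemoizationAttack()
    attack.observe(secret, (1, 2))
    print(attack.replay((1, 2)))                 # answered from the table
    print(attack.replay((2, 2)))                 # never observed
    print(attack.coverage([(1, 2), (2, 2)]))     # share of answerable queries
```

In this simplified model, a private region is susceptible when a small table answers most of the queries the public region makes in practice, which is the kind of property a coverage measure like the one above would expose.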