12 research outputs found
Implementation of Code Properties via Transducers
The FAdo system is a symbolic manipulator of formal language objects, implemented in Python. In this work, we extend its capabilities by implementing methods to manipulate transducers, and we go one level higher than existing formal language systems by implementing methods to manipulate objects representing classes of independent languages (widely known as code properties). Our methods allow users to define their own code properties and combine them with one another or with fixed properties such as prefix codes, suffix codes, error-detecting codes, etc. The satisfaction and maximality decision questions are solvable for any of the definable properties. The new online system LaSer allows one to query about a code property and obtain the answer in batch mode. Our work is founded on independence theory as well as the theory of rational relations and transducers, and contributes improved algorithms on these objects.
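As a rough illustration of the satisfaction question for finite languages, the sketch below models a code property as a relation that maps each word to the words it must not coexist with, and a language satisfies the property when no word of the language is related to another word of the language. This is plain Python written for this note and is not the FAdo or LaSer API; the names `proper_prefixes`, `proper_suffixes`, `satisfies` and `combine` are hypothetical, and the relation is assumed never to relate a word to itself.

```python
def proper_prefixes(w):
    """Relation for the prefix-code property: all proper prefixes of w."""
    return {w[:i] for i in range(len(w))}


def proper_suffixes(w):
    """Relation for the suffix-code property: all proper suffixes of w."""
    return {w[i:] for i in range(1, len(w) + 1)}


def satisfies(language, relation):
    """True if no word of the finite language is related to another word of it."""
    words = set(language)
    return all(words.isdisjoint(relation(w)) for w in words)


def combine(*relations):
    """Combine properties: a language must satisfy every given relation."""
    return lambda w: set().union(*(r(w) for r in relations))


# Hypothetical usage: a bifix property built from the two fixed ones.
bifix = combine(proper_prefixes, proper_suffixes)
print(satisfies({"ab", "ba"}, bifix))
print(satisfies({"ab", "b"}, bifix))
```

Representing a property as a word-to-set function only works for finite languages; for regular languages the relation would be realized by a transducer, as the abstract describes.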
Ideal regular languages and strongly connected synchronizing automata
We introduce the notion of a reset left regular decomposition of an ideal regular language, and we prove that the category formed by these decompositions with the adequate set of morphisms is equivalent to the category of strongly connected synchronizing automata. We show that every ideal regular language has at least one reset left regular decomposition. As a consequence, every ideal regular language is the set of synchronizing words of some strongly connected synchronizing automaton. Furthermore, this one-to-one correspondence allows us to introduce the notion of reset decomposition complexity of an ideal, from which follows a reformulation of Černý's conjecture in purely language-theoretic terms. Finally, we present and characterize a subclass of ideal regular languages for which a better upper bound on the reset decomposition complexity holds than in the general case.
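To make "synchronizing word" concrete, here is a minimal sketch that searches for a shortest reset word by breadth-first search over subsets of states. It is a standard textbook procedure, not the decomposition technique of the paper, and it takes exponential time in the worst case. The function names and the dictionary encoding of the transition function are assumptions made for this example; `cerny(n)` builds the well-known Černý automaton with a cyclic letter `a` and a letter `b` that moves only the last state.

```python
from collections import deque


def shortest_reset_word(states, alphabet, delta):
    """Return a shortest word sending every state to one state, or None."""
    start = frozenset(states)
    word_to = {start: ""}          # subset -> shortest word reaching it
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        word = word_to[subset]
        if len(subset) == 1:       # all states have been merged
            return word
        for a in alphabet:
            image = frozenset(delta[q, a] for q in subset)
            if image not in word_to:
                word_to[image] = word + a
                queue.append(image)
    return None                    # the automaton is not synchronizing


def cerny(n):
    """Transition table of the n-state Cerny automaton over {a, b}."""
    delta = {}
    for q in range(n):
        delta[q, "a"] = (q + 1) % n              # cyclic shift
        delta[q, "b"] = 0 if q == n - 1 else q   # moves only state n-1
    return delta


print(shortest_reset_word(range(3), "ab", cerny(3)))  # prints baab
```

For three states the word found has length 4, which equals (n-1)^2, the bound that Černý's conjecture claims is never exceeded.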
Quotient Complexity of Ideal Languages
The final publication is available at Elsevier via http://dx.doi.org/10.1016/j.tcs.2012.10.055 © 2013. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/ A language L over an alphabet Σ is a right (left) ideal if it satisfies L=LΣ∗ (L=Σ∗L). It is a two-sided ideal if L=Σ∗LΣ∗, and an all-sided ideal if L=Σ∗ ⧢ L, the shuffle of Σ∗ with L. Ideal languages are not only of interest from the theoretical point of view, but also have applications to pattern matching. We study the state complexity of common operations in the class of regular ideal languages, but prefer to use the equivalent term “quotient complexity”, which is the number of distinct left quotients of a language. We find tight upper bounds on the complexity of each type of ideal language in terms of the complexity of an arbitrary generator and of the minimal generator, and also on the complexity of the minimal generator in terms of the complexity of the language. Moreover, tight upper bounds on the complexity of union, intersection, set difference, symmetric difference, concatenation, star, and reversal of ideal languages are derived. Natural Sciences and Engineering Research Council of Canada grant [OGP0000871]; VEGA grant 2/0111/0
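The quotient complexity of a regular language equals the number of states of its minimal complete DFA, so it can be computed by merging equivalent states. The sketch below does this with Moore-style partition refinement and also tests the right-ideal condition L=LΣ∗, which holds exactly when every reachable accepting state moves only to accepting states. The function names and the dictionary encoding of the DFA are assumptions for this example, and the code is not taken from the paper.

```python
def reachable_states(alphabet, delta, start):
    """States of a complete DFA reachable from the start state."""
    seen, stack = {start}, [start]
    while stack:
        q = stack.pop()
        for a in alphabet:
            r = delta[q, a]
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def quotient_complexity(alphabet, delta, start, accepting):
    """Number of distinct left quotients (= states of the minimal DFA)."""
    states = reachable_states(alphabet, delta, start)
    block = {q: int(q in accepting) for q in states}
    while True:
        ids, refined = {}, {}
        for q in states:
            # A state's signature: its block plus the blocks of its successors.
            sig = (block[q],) + tuple(block[delta[q, a]] for a in alphabet)
            refined[q] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(block.values())):   # partition is stable
            return len(ids)
        block = refined


def is_right_ideal(alphabet, delta, start, accepting):
    """True if the accepted language L satisfies L = L Sigma*."""
    states = reachable_states(alphabet, delta, start)
    return all(delta[q, a] in accepting
               for q in states & set(accepting) for a in alphabet)


# Hypothetical DFA for L = ab(a|b)* with a redundant accepting state 4.
delta = {(0, "a"): 1, (0, "b"): 3, (1, "a"): 3, (1, "b"): 2,
         (2, "a"): 4, (2, "b"): 4, (3, "a"): 3, (3, "b"): 3,
         (4, "a"): 2, (4, "b"): 2}
print(quotient_complexity("ab", delta, 0, {2, 4}))  # prints 4
print(is_right_ideal("ab", delta, 0, {2, 4}))       # prints True
```

The four quotients in the example correspond to the start state, the state after reading `a`, the accepting sink, and the rejecting sink.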
Static Program Analysis for String Manipulation Languages
In recent years, dynamic languages, such as JavaScript or Python, have been increasingly used in a wide range of fields and applications. Their tricky and misunderstood behaviors pose a hard challenge for static analysis of these programming languages. A key aspect of any dynamic language program is the many uses of strings, since they can be implicitly converted to values of another type, transformed by string-to-code primitives, or used to access an object property. Unfortunately, string analyses for dynamic languages still lack precision and do not take into account some important string features. Moreover, string obfuscation is very popular in malicious code written in dynamic languages, for example, to hide code information inside strings and then to dynamically transform strings into executable code. In this scenario, more precise string analyses become a necessity. This paper is placed in the context of static string analysis by abstract interpretation and proposes a new semantics for string analysis, taking a first step toward handling the string features of dynamic languages. Comment: In Proceedings VPT 2019, arXiv:1908.0672
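For readers unfamiliar with string analysis by abstract interpretation, the following sketch shows a deliberately simple abstract domain in which a value records only a prefix that every possible concrete string is known to share. This is a generic textbook-style prefix domain chosen for illustration and is not the semantics proposed in the paper; the class and function names are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Prefix:
    """Abstract value: the set of all strings that start with `text`."""
    text: str


def abstract(strings):
    """Abstraction of a non-empty finite set of concrete strings."""
    strings = list(strings)
    if not strings:
        raise ValueError("need at least one concrete string")
    return Prefix(os.path.commonprefix(strings))


def join(x, y):
    """Least upper bound: used where two control-flow branches merge."""
    return Prefix(os.path.commonprefix([x.text, y.text]))


def concat(x, y):
    """Abstract concatenation: only the left prefix is still guaranteed."""
    return x


def may_start_with(x, literal):
    """False only if no string described by x can start with `literal`."""
    k = min(len(x.text), len(literal))
    return x.text[:k] == literal[:k]


# Hypothetical analysis of: url = cond ? "http://a.example/x" : "http://b.example/y"
url = join(abstract(["http://a.example/x"]), abstract(["http://b.example/y"]))
target = concat(url, Prefix("?q="))
print(target.text)
print(may_start_with(target, "javascript:"))
```

The domain is sound but coarse: it can rule out a dangerous prefix, yet it loses everything about the rest of the string, which is the kind of imprecision the abstract refers to.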