
    DNA ANALYSIS USING GRAMMATICAL INFERENCE

    An accurate language definition capable of distinguishing between coding and non-coding DNA has important applications and analytical significance to the field of computational biology. The method proposed here uses positive sample grammatical inference and statistical information to infer languages for coding DNA. An algorithm is proposed that searches for an optimal subset of input sequences for the inference of regular grammars by optimizing a relevant accuracy metric. The algorithm does not guarantee finding the optimal subset; however, testing shows improvement in accuracy and performance over the basis algorithm. Testing also shows that the inferred languages for components of DNA are consistently accurate. Using the proposed algorithm, languages are inferred for coding DNA with average conditional probability over 80%. This reveals that languages for components of DNA can be inferred and are useful independent of the process that created them. These languages can then be analyzed or used for other tasks in computational biology. To illustrate potential applications of regular grammars for DNA components, an inferred language for exon sequences is applied as post-processing to Hidden Markov exon prediction to reduce the number of wrong exons detected and improve the specificity of the model significantly.
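
The abstract does not say which inference procedure is used, so the following is only an illustrative sketch of a common starting point for positive-sample inference of regular languages: a prefix tree acceptor (PTA) that accepts exactly the given samples. The function names `build_pta` and `accepts` and the toy DNA strings are assumptions made for this example, not taken from the paper.

```python
def build_pta(samples):
    """Build a prefix tree acceptor from positive sample strings.

    Returns (transitions, accepting), where transitions maps
    state -> {symbol: next_state} and state 0 is the start state.
    """
    transitions = {0: {}}
    accepting = set()
    for sample in samples:
        state = 0
        for symbol in sample:
            nxt = transitions[state].get(symbol)
            if nxt is None:
                # Unseen prefix: allocate a fresh state for it.
                nxt = len(transitions)
                transitions[state][symbol] = nxt
                transitions[nxt] = {}
            state = nxt
        accepting.add(state)
    return transitions, accepting


def accepts(pta, string):
    """Return True if the acceptor reaches an accepting state on `string`."""
    transitions, accepting = pta
    state = 0
    for symbol in string:
        if symbol not in transitions[state]:
            return False
        state = transitions[state][symbol]
    return state in accepting


# Hypothetical toy samples; real input would be coding-DNA sequences.
pta = build_pta(["ATG", "ATGA", "ACG"])
```

A PTA does not generalize beyond its samples; inference algorithms typically merge its states to obtain a smaller automaton, and the choice of which input sequences to include changes the result, which is the subset search the abstract describes.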

    Human in the Loop: Interactive Passive Automata Learning via Evidence-Driven State-Merging Algorithms

    We present an interactive version of an evidence-driven state-merging (EDSM) algorithm for learning variants of finite state automata. Learning these automata often amounts to recovering or reverse engineering the model generating the data despite noisy, incomplete, or imperfectly sampled data sources rather than optimizing a purely numeric target function. Domain expertise and human knowledge about the target domain can guide this process, and is typically captured in parameter settings. Often, domain expertise is subconscious and not expressed explicitly. Directly interacting with the learning algorithm makes it easier to utilize this knowledge effectively. Comment: 4 pages, presented at the Human in the Loop workshop at ICML 201

    Universal neural field computation

    Turing machines and Gödel numbers are important pillars of the theory of computation. Thus, any computational architecture needs to show how it could relate to Turing machines and how stable implementations of Turing computation are possible. In this chapter, we implement universal Turing computation in a neural field environment. To this end, we employ the canonical symbologram representation of a Turing machine obtained from a Gödel encoding of its symbolic repertoire and generalized shifts. The resulting nonlinear dynamical automaton (NDA) is a piecewise affine-linear map acting on the unit square that is partitioned into rectangular domains. Instead of looking at point dynamics in phase space, we then consider functional dynamics of probability distribution functions (p.d.f.s) over phase space. This is generally described by a Frobenius-Perron integral transformation that can be regarded as a neural field equation over the unit square as feature space of a dynamic field theory (DFT). Solving the Frobenius-Perron equation shows that uniform p.d.f.s with rectangular support are again mapped onto uniform p.d.f.s with rectangular support. We call the resulting representation a dynamic field automaton. Comment: 21 pages; 6 figures. arXiv admin note: text overlap with arXiv:1204.546
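
To make the idea of a Gödel encoding concrete, here is a minimal one-sided sketch: a finite symbol string is mapped to a rational number in the unit interval through its base-N expansion, and dropping the first symbol becomes an affine map on each subinterval. The chapter works with two-sided sequences on the unit square; this one-dimensional simplification, and the names `godel_encode` and `shift`, are assumptions made for illustration only.

```python
from fractions import Fraction


def godel_encode(symbols, alphabet):
    """Map a finite symbol string to a point in [0, 1).

    The k-th symbol (counting from 0) contributes index / N**(k + 1),
    where N is the alphabet size and index is the symbol's position
    in `alphabet`.
    """
    n = len(alphabet)
    index = {a: i for i, a in enumerate(alphabet)}
    x = Fraction(0)
    for k, a in enumerate(symbols):
        x += Fraction(index[a], n ** (k + 1))
    return x


def shift(x, n):
    """Drop the leading symbol of the encoded string.

    On each interval [i/n, (i+1)/n) this is the affine map x -> n*x - i.
    """
    y = x * n
    return y - (y.numerator // y.denominator)


# Encoding "101" over the alphabet "01", then shifting once.
x = godel_encode("101", "01")
shifted = shift(x, 2)
```

Exact rationals are used here so that the shifted point can be compared with the encoding of the shortened string without floating-point error.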

    Mutation of Directed Graphs -- Corresponding Regular Expressions and Complexity of Their Generation

    Directed graphs (DG), interpreted as state transition diagrams, are traditionally used to represent finite-state automata (FSA). In the context of formal languages, both FSA and regular expressions (RE) are equivalent in that they accept and generate, respectively, type-3 (regular) languages. Based on our previous work, this paper analyzes effects of graph manipulations on corresponding RE. At this initial stage, we assume that the DG under consideration contains no cycles. Graph manipulation is performed by deleting or inserting nodes or arcs. Combined and/or multiple application of these basic operators enables a great variety of transformations of DG (and corresponding RE) that can be seen as mutants of the original DG (and corresponding RE). DG are popular for modeling complex systems; however, they easily become intractable if the system under consideration is complex and/or large. In such situations, we propose to switch to corresponding RE in order to benefit from their compact format for modeling and algebraic operations for analysis. The results of the study are of great potential interest to mutation testing.
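
For an acyclic DG the accepted language is finite, so one simple (if not compact) corresponding RE is the union of the label sequences along all start-to-final paths. The sketch below assumes labels sit on arcs and uses a hypothetical helper `paths_to_regex`; it is not the paper's construction, only an illustration of how deleting an arc yields a mutant RE.

```python
def paths_to_regex(arcs, start, finals):
    """Return a union-of-paths regular expression for an acyclic graph.

    arcs   : list of (source, label, target) triples
    start  : start node
    finals : set of accepting nodes
    The graph must be acyclic, otherwise the recursion does not terminate.
    """
    outgoing = {}
    for src, label, dst in arcs:
        outgoing.setdefault(src, []).append((label, dst))

    def words(node):
        # All label strings readable from `node` to some final node.
        result = [""] if node in finals else []
        for label, dst in outgoing.get(node, []):
            result.extend(label + w for w in words(dst))
        return result

    return "|".join(sorted(words(start)))


# Original graph and a mutant obtained by deleting the arc labeled "b".
original = [(0, "a", 1), (0, "b", 1), (1, "c", 2)]
mutant = [arc for arc in original if arc[1] != "b"]

original_re = paths_to_regex(original, 0, {2})
mutant_re = paths_to_regex(mutant, 0, {2})
```

Enumerating paths can blow up exponentially on larger graphs, and an empty result string does not distinguish the empty language from the language containing only the empty word, so this form is suitable only for small illustrative cases.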

    Treo: Textual Syntax for Reo Connectors

    Reo is an interaction-centric model of concurrency for compositional specification of communication and coordination protocols. Formal verification tools exist to ensure correctness and compliance of protocols specified in Reo, which can readily be (re)used in different applications, or composed into more complex protocols. Recent benchmarks show that compiling such high-level Reo specifications produces executable code that can compete with or even beat the performance of hand-crafted programs written in languages such as C or Java using conventional concurrency constructs. The original declarative graphical syntax of Reo does not support intuitive constructs for parameter passing, iteration, recursion, or conditional specification. This shortcoming hinders Reo's uptake in large-scale practical applications. Although a number of Reo-inspired syntax alternatives have appeared in the past, none of them follows the primary design principles of Reo: a) declarative specification; b) all channel types and their sorts are user-defined; and c) channels compose via shared nodes. In this paper, we offer a textual syntax for Reo that respects these principles and supports flexible parameter passing, iteration, recursion, and conditional specification. In ongoing work, we use this textual syntax to compile Reo into target languages such as Java, Promela, and Maude. Comment: In Proceedings MeTRiD 2018, arXiv:1806.0933

    Factory of realities: on the emergence of virtual spatiotemporal structures

    The ubiquitous nature of modern Information Retrieval and Virtual World gives rise to new realities. To what extent are these "realities" real? Which "physics" should be applied to quantitatively describe them? In this essay I dwell on a few examples. The first is Adaptive neural networks, which are not networks and not neural, but still provide service similar to classical ANNs in extended fashion. The second is the emergence of objects looking like Einsteinian spacetime, which describe the behavior of an Internet surfer like geodesic motion. The third is the demonstration of nonclassical and even stronger-than-quantum probabilities in Information Retrieval, and their use. Immense operable datasets provide new operationalistic environments, which become, to an ever greater extent, "realities". In this essay, I consider the overall Information Retrieval process as an objective physical process, representing it according to Melucci's metaphor in terms of physical-like experiments. Various semantic environments are treated as analogs of various realities. The readers' attention is drawn to the topos approach to physical theories, which provides a natural conceptual and technical framework to cope with the new emerging realities. Comment: 21 p

    Regular Expression Matching and Operational Semantics

    Many programming languages and tools, ranging from grep to the Java String library, contain regular expression matchers. Rather than first translating a regular expression into a deterministic finite automaton, such implementations typically match the regular expression on the fly. Thus they can be seen as virtual machines interpreting the regular expression much as if it were a program with some non-deterministic constructs such as the Kleene star. We formalize this implementation technique for regular expression matching using operational semantics. Specifically, we derive a series of abstract machines, moving from the abstract definition of matching to increasingly realistic machines. First a continuation is added to the operational semantics to describe what remains to be matched after the current expression. Next, we represent the expression as a data structure using pointers, which enables redundant searches to be eliminated via testing for pointer equality. From there, we arrive both at Thompson's lockstep construction and a machine that performs some operations in parallel, suitable for implementation on a large number of cores, such as a GPU. We formalize the parallel machine using process algebra and report some preliminary experiments with an implementation on a graphics processor using CUDA. Comment: In Proceedings SOS 2011, arXiv:1108.279
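
The first refinement step mentioned above, adding a continuation that describes what remains to be matched, can be illustrated with a small backtracking matcher. This is a sketch under assumptions: the tuple-based expression format and the function names `match` and `full_match` are invented for this example, and the code is neither the paper's abstract machines nor Thompson's lockstep construction.

```python
def match(regex, s, i, k):
    """Match `regex` against s starting at index i, then call continuation k.

    regex is a nested tuple:
      ("eps",)          empty string
      ("char", c)       single character c
      ("seq", r1, r2)   r1 followed by r2
      ("alt", r1, r2)   r1 or r2
      ("star", r)       zero or more repetitions of r
    k takes the index reached so far and returns True or False.
    """
    tag = regex[0]
    if tag == "eps":
        return k(i)
    if tag == "char":
        return i < len(s) and s[i] == regex[1] and k(i + 1)
    if tag == "seq":
        # The continuation of r1 is "match r2, then do k".
        return match(regex[1], s, i, lambda j: match(regex[2], s, j, k))
    if tag == "alt":
        return match(regex[1], s, i, k) or match(regex[2], s, i, k)
    if tag == "star":
        # Either stop here, or match the body once and loop.
        # Requiring j > i avoids looping forever on bodies that match "".
        return k(i) or match(
            regex[1], s, i, lambda j: j > i and match(regex, s, j, k)
        )
    raise ValueError(f"unknown regex node: {tag!r}")


def full_match(regex, s):
    """Return True if the whole string s matches regex."""
    return match(regex, s, 0, lambda j: j == len(s))


# (a|b)*c
pattern = ("seq", ("star", ("alt", ("char", "a"), ("char", "b"))), ("char", "c"))
```

Because it backtracks, this matcher can take exponential time on some inputs; avoiding that redundant search is the motivation the abstract gives for the pointer-based and lockstep machines.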

    Weighted Logics for Nested Words and Algebraic Formal Power Series

    Nested words, a model for recursive programs proposed by Alur and Madhusudan, have recently gained much interest. In this paper we introduce quantitative extensions and study nested word series, which assign to nested words elements of a semiring. We show that regular nested word series coincide with series definable in weighted logics as introduced by Droste and Gastin. For this we establish a connection between nested words and the free bisemigroup. Applying our result, we obtain characterizations of algebraic formal power series in terms of weighted logics. This generalizes results of Lautemann, Schwentick and Therien on context-free languages.