5,909 research outputs found

DNA ANALYSIS USING GRAMMATICAL INFERENCE

Author: Cook Cory
Publication venue: SJSU ScholarWorks
Publication date: 14/06/2016

An accurate language definition capable of distinguishing between coding and non-coding DNA has important applications and analytical significance in computational biology. The method proposed here uses positive-sample grammatical inference and statistical information to infer languages for coding DNA. An algorithm is proposed that searches for an optimal subset of input sequences for the inference of regular grammars by optimizing a relevant accuracy metric. The algorithm does not guarantee finding the optimal subset; however, testing shows improvement in accuracy and performance over the basis algorithm. Testing also shows that the inferred languages for components of DNA are consistently accurate. Using the proposed algorithm, languages are inferred for coding DNA with an average conditional probability over 80%. This reveals that languages for components of DNA can be inferred and are useful independent of the process that created them. These languages can then be analyzed or used for other tasks in computational biology. To illustrate potential applications of regular grammars for DNA components, an inferred language for exon sequences is applied as post-processing to Hidden Markov exon prediction to reduce the number of wrong exons detected and significantly improve the specificity of the model.
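
As a rough illustration of inferring a regular language from positive samples only, the sketch below builds a prefix tree acceptor, a common starting point for regular grammatical inference. It is not the algorithm from this work, which the abstract does not specify; the function names and the toy sequences are assumptions made for illustration.

```python
def build_pta(samples):
    """Build a prefix tree acceptor: one state per distinct prefix of the samples."""
    transitions = {0: {}}  # state -> {symbol: next_state}; state 0 is the root
    accepting = set()
    for seq in samples:
        state = 0
        for symbol in seq:
            if symbol not in transitions[state]:
                new_state = len(transitions)
                transitions[state][symbol] = new_state
                transitions[new_state] = {}
            state = transitions[state][symbol]
        accepting.add(state)  # the state reached by a whole sample accepts
    return transitions, accepting


def accepts(pta, seq):
    """Return True if the acceptor recognizes seq."""
    transitions, accepting = pta
    state = 0
    for symbol in seq:
        state = transitions[state].get(symbol)
        if state is None:  # no transition: seq leaves the tree
            return False
    return state in accepting


# Hypothetical toy samples, not real coding sequences.
pta = build_pta(["ATG", "ATGA", "ATC"])
print(accepts(pta, "ATGA"), accepts(pta, "AT"))
```

A prefix tree acceptor recognizes exactly the training samples; inference methods then generalize it, typically by merging states.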

SJSU ScholarWorks

Human in the Loop: Interactive Passive Automata Learning via Evidence-Driven State-Merging Algorithms

Author: Hammerschmidt Christian A.
State Radu
Verwer Sicco
Publication date: 28/07/2017

We present an interactive version of an evidence-driven state-merging (EDSM) algorithm for learning variants of finite state automata. Learning these automata often amounts to recovering or reverse engineering the model generating the data despite noisy, incomplete, or imperfectly sampled data sources, rather than optimizing a purely numeric target function. Domain expertise and human knowledge about the target domain can guide this process, and are typically captured in parameter settings. Often, domain expertise is subconscious and not expressed explicitly. Directly interacting with the learning algorithm makes it easier to utilize this knowledge effectively. Comment: 4 pages, presented at the Human in the Loop workshop at ICML 201

arXiv.org e-Print Archive

Open Repository and Bibliography - Luxembourg

Universal neural field computation

Author: B. McMillan
B.O. Koopman
C. Moore
D. Lind
D.R. Hofstadter
E. Ott
E.D. Sontag
G. Schöner
H.T. Siegelmann
J. Hertz
J.E. Hopcroft
J.J. Hopfield
K. Gödel
M. Budišić
P. Cvitanović
P. Smolensky
R.D. Tennent
R.W. Gayler
S.I. Amari
T. Fukai
T.A. Plate
V.K. Jirsa
V.S. Afraimovich
W. Erlhagen
W. Pitts
W. Tabor
W.S. McCulloch
Y. Sandamirskaya
Publication date: 12/12/2013

Turing machines and Gödel numbers are important pillars of the theory of computation. Thus, any computational architecture needs to show how it could relate to Turing machines and how stable implementations of Turing computation are possible. In this chapter, we implement universal Turing computation in a neural field environment. To this end, we employ the canonical symbologram representation of a Turing machine obtained from a Gödel encoding of its symbolic repertoire and generalized shifts. The resulting nonlinear dynamical automaton (NDA) is a piecewise affine-linear map acting on the unit square that is partitioned into rectangular domains. Instead of looking at point dynamics in phase space, we then consider functional dynamics of probability distribution functions (p.d.f.s) over phase space. This is generally described by a Frobenius-Perron integral transformation that can be regarded as a neural field equation over the unit square as feature space of a dynamic field theory (DFT). Solving the Frobenius-Perron equation yields that uniform p.d.f.s with rectangular support are mapped onto uniform p.d.f.s with rectangular support, again. We call the resulting representation dynamic field automaton. Comment: 21 pages; 6 figures. arXiv admin note: text overlap with arXiv:1204.546
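
To make the link between a Gödel encoding and an affine map concrete, here is a minimal sketch for one side of the tape only. It encodes a symbol string as a base-N fraction in the unit interval and shows that dropping the first symbol is an affine map on that number. The alphabet, the symbol numbering, and the function names are assumptions; the chapter's own construction works on the unit square and is not reproduced here.

```python
from fractions import Fraction

ALPHABET = "ab#"  # hypothetical tape alphabet
CODE = {s: i for i, s in enumerate(ALPHABET)}  # assumed symbol numbering
N = len(ALPHABET)


def godel(seq):
    """Map a finite symbol string to a rational in [0, 1) via its base-N expansion."""
    return sum(Fraction(CODE[s], N ** (k + 1)) for k, s in enumerate(seq))


def shift_left(x, first_symbol):
    """Affine map x -> N*x - code(first_symbol): drops the first encoded symbol."""
    return N * x - CODE[first_symbol]


x = godel("ab#")
print(x, shift_left(x, "a"), godel("b#"))
```

Exact rationals from `fractions.Fraction` are used so that the shifted value can be compared with the encoding of the shortened string without rounding error.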

arXiv.org e-Print Archive

Crossref

Mutation of Directed Graphs -- Corresponding Regular Expressions and Complexity of Their Generation

Author: A. C. Shaw
A. Gill
A. Salomaa
B. Beizer
Bianca Truthe
E. F. Moore
F. Belli
Fevzi Belli
G. H. Mealy
Giovanni Pighizzini
J. A. Brzozowski
J. E. Hopcroft
J. Hromkovic
J. Myhill
Jürgen Dassow
Mutlu Beyazit
R. A. DeMillo
R. E. Stearns
R. V. Binder
S. Gossens
V. Geffert
Y. Han
Publication venue: 'Open Publishing Association'
Publication date: 01/07/2009

Directed graphs (DG), interpreted as state transition diagrams, are traditionally used to represent finite-state automata (FSA). In the context of formal languages, FSA and regular expressions (RE) are equivalent in that they accept and generate, respectively, type-3 (regular) languages. Based on our previous work, this paper analyzes the effects of graph manipulations on the corresponding RE. At this initial stage, we assume that the DG under consideration contains no cycles. Graph manipulation is performed by deleting or inserting nodes or arcs. Combined and/or multiple application of these basic operators enables a great variety of transformations of DG (and corresponding RE) that can be seen as mutants of the original DG (and corresponding RE). DG are popular for modeling complex systems; however, they easily become intractable if the system under consideration is complex and/or large. In such situations, we propose to switch to the corresponding RE in order to benefit from their compact format for modeling and algebraic operations for analysis. The results of the study are of great potential interest to mutation testing.
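
The relationship between a cycle-free DG and its RE, and the effect of a single mutation, can be sketched as follows. The graph representation (arcs labeled with single plain characters), the function names, and the example graph are all assumptions for illustration; the paper's own notation and operators may differ.

```python
def dag_to_regex(arcs, node, final):
    """Regex for all label sequences from node to final in an acyclic graph.

    arcs maps a node to a list of (label, successor) pairs. Because the graph
    has no cycles, the recursion terminates and no Kleene star is needed.
    """
    if node == final:
        return ""
    alternatives = []
    for label, succ in arcs.get(node, []):
        tail = dag_to_regex(arcs, succ, final)
        if tail is not None:
            alternatives.append(label + tail)
    if not alternatives:
        return None  # dead end: final is unreachable from this node
    if len(alternatives) == 1:
        return alternatives[0]
    return "(" + "|".join(alternatives) + ")"


def delete_arc(arcs, source, label, target):
    """Return a mutant graph with one arc removed; the original is untouched."""
    mutant = {n: list(out) for n, out in arcs.items()}
    mutant[source].remove((label, target))
    return mutant


# Hypothetical example graph with start node "s" and final node "t".
graph = {"s": [("a", "p"), ("b", "q")], "p": [("c", "t")], "q": [("d", "t")]}
print(dag_to_regex(graph, "s", "t"))
print(dag_to_regex(delete_arc(graph, "s", "b", "q"), "s", "t"))
```

Deleting one arc removes one alternative from the expression, which is the kind of DG-to-RE correspondence the abstract describes for mutants.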

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

Treo: Textual Syntax for Reo Connectors

Author: Arbab Farhad
Dokter Kasper
Publication venue: 'Open Publishing Association'
Publication date: 01/01/2018

Reo is an interaction-centric model of concurrency for compositional specification of communication and coordination protocols. Formal verification tools exist to ensure correctness and compliance of protocols specified in Reo, which can readily be (re)used in different applications, or composed into more complex protocols. Recent benchmarks show that compiling such high-level Reo specifications produces executable code that can compete with or even beat the performance of hand-crafted programs written in languages such as C or Java using conventional concurrency constructs. The original declarative graphical syntax of Reo does not support intuitive constructs for parameter passing, iteration, recursion, or conditional specification. This shortcoming hinders Reo's uptake in large-scale practical applications. Although a number of Reo-inspired syntax alternatives have appeared in the past, none of them follows the primary design principles of Reo: a) declarative specification; b) all channel types and their sorts are user-defined; and c) channels compose via shared nodes. In this paper, we offer a textual syntax for Reo that respects these principles and supports flexible parameter passing, iteration, recursion, and conditional specification. In on-going work, we use this textual syntax to compile Reo into target languages such as Java, Promela, and Maude. Comment: In Proceedings MeTRiD 2018, arXiv:1806.0933

arXiv.org e-Print Archive

CWI's Institutional Repository

Leiden University Scholary Publications

Factory of realities: on the emergence of virtual spatiotemporal structures

Author: Zapatrin Roman
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 30/11/2016

The ubiquitous nature of modern Information Retrieval and Virtual World gives rise to new realities. To what extent are these "realities" real? Which "physics" should be applied to quantitatively describe them? In this essay I dwell on a few examples. The first is Adaptive neural networks, which are not networks and not neural, but still provide service similar to classical ANNs in extended fashion. The second is the emergence of objects looking like Einsteinian spacetime, which describe the behavior of an Internet surfer like geodesic motion. The third is the demonstration of nonclassical and even stronger-than-quantum probabilities in Information Retrieval, and their use. Immense operable datasets provide new operationalistic environments, which become "realities" to a greater and greater extent. In this essay, I consider the overall Information Retrieval process as an objective physical process, representing it according to the Melucci metaphor in terms of physical-like experiments. Various semantic environments are treated as analogs of various realities. The reader's attention is drawn to the topos approach to physical theories, which provides a natural conceptual and technical framework to cope with the new emerging realities. Comment: 21 p

arXiv.org e-Print Archive

Crossref

Regular Expression Matching and Operational Semantics

Author: Alan P. Sexton
Andrew Appel
Asiri Rathnayake
Hayo Thielecke
John C. Reynolds
Josh Berdine
Ken Thompson
M.A. Reniers
Matthias Felleisen
P. Sobocinski
Peter J. Landin
Robin Milner
Stanley Tzeng
Publication venue: 'Open Publishing Association'
Publication date: 01/08/2011

Many programming languages and tools, ranging from grep to the Java String library, contain regular expression matchers. Rather than first translating a regular expression into a deterministic finite automaton, such implementations typically match the regular expression on the fly. Thus they can be seen as virtual machines interpreting the regular expression much as if it were a program with some non-deterministic constructs such as the Kleene star. We formalize this implementation technique for regular expression matching using operational semantics. Specifically, we derive a series of abstract machines, moving from the abstract definition of matching to increasingly realistic machines. First a continuation is added to the operational semantics to describe what remains to be matched after the current expression. Next, we represent the expression as a data structure using pointers, which enables redundant searches to be eliminated via testing for pointer equality. From there, we arrive both at Thompson's lockstep construction and a machine that performs some operations in parallel, suitable for implementation on a large number of cores, such as a GPU. We formalize the parallel machine using process algebra and report some preliminary experiments with an implementation on a graphics processor using CUDA. Comment: In Proceedings SOS 2011, arXiv:1108.279
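
The idea of a continuation that describes "what remains to be matched" can be sketched with a small backtracking matcher. This is an illustrative interpreter only, not one of the paper's abstract machines, and it is neither Thompson's lockstep construction nor the parallel machine. The tuple-based expression format and the function names are assumptions.

```python
def match(expr, text, i, cont):
    """Match expr against text starting at index i, then call cont(j) on the rest.

    expr is a nested tuple: ("chr", c), ("seq", e1, e2), ("alt", e1, e2),
    or ("star", e). cont receives the index reached and returns True or False.
    """
    kind = expr[0]
    if kind == "chr":
        return i < len(text) and text[i] == expr[1] and cont(i + 1)
    if kind == "seq":
        # After e1 matches up to j, what remains is e2 followed by cont.
        return match(expr[1], text, i, lambda j: match(expr[2], text, j, cont))
    if kind == "alt":
        # Non-deterministic choice, resolved here by backtracking.
        return match(expr[1], text, i, cont) or match(expr[2], text, i, cont)
    if kind == "star":
        # Iterate once more (requiring progress, to avoid looping) or stop.
        return match(expr[1], text, i,
                     lambda j: j > i and match(expr, text, j, cont)) or cont(i)
    raise ValueError(f"unknown node {kind!r}")


def full_match(expr, text):
    """True if expr matches the whole of text."""
    return bool(match(expr, text, 0, lambda j: j == len(text)))


# (a|b)*c written in the assumed tuple format.
regex = ("seq", ("star", ("alt", ("chr", "a"), ("chr", "b"))), ("chr", "c"))
print(full_match(regex, "abac"), full_match(regex, "abca"))
```

Backtracking of this kind can take exponential time on some inputs, which is one motivation for lockstep approaches such as the one the abstract mentions.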

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

Weighted Logics for Nested Words and Algebraic Formal Power Series

Author: Mathissen Christian
Publication venue: 'Logical Methods in Computer Science e.V.'
Publication date: 19/02/2010

Nested words, a model for recursive programs proposed by Alur and Madhusudan, have recently gained much interest. In this paper we introduce quantitative extensions and study nested word series which assign to nested words elements of a semiring. We show that regular nested word series coincide with series definable in weighted logics as introduced by Droste and Gastin. For this we establish a connection between nested words and the free bisemigroup. Applying our result, we obtain characterizations of algebraic formal power series in terms of weighted logics. This generalizes results of Lautemann, Schwentick and Therien on context-free languages.

arXiv.org e-Print Archive

Episciences.org