8,029 research outputs found

    Logics for Unranked Trees: An Overview

    Get PDF
    Labeled unranked trees are used as a model of XML documents, and logical languages for them have been studied actively over the past several years. Such logics have different purposes: some are better suited for extracting data, some for expressing navigational properties, and some make it easy to relate complex properties of trees to the existence of tree automata for those properties. Furthermore, logics differ significantly in their model-checking properties, their automata models, and their behavior on ordered and unordered trees. In this paper we present a survey of logics for unranked trees

    Error-tolerant Finite State Recognition with Applications to Morphological Analysis and Spelling Correction

    Get PDF
    Error-tolerant recognition enables the recognition of strings that deviate mildly from any string in the regular set recognized by the underlying finite state recognizer. Such recognition has applications in error-tolerant morphological processing, spelling correction, and approximate string matching in information retrieval. After a description of the concepts and algorithms involved, we give examples from two applications: In the context of morphological analysis, error-tolerant recognition allows misspelled input word forms to be corrected, and morphologically analyzed concurrently. We present an application of this to error-tolerant analysis of agglutinative morphology of Turkish words. The algorithm can be applied to morphological analysis of any language whose morphology is fully captured by a single (and possibly very large) finite state transducer, regardless of the word formation processes and morphographemic phenomena involved. In the context of spelling correction, error-tolerant recognition can be used to enumerate correct candidate forms from a given misspelled string within a certain edit distance. Again, it can be applied to any language with a word list comprising all inflected forms, or whose morphology is fully described by a finite state transducer. We present experimental results for spelling correction for a number of languages. These results indicate that such recognition works very efficiently for candidate generation in spelling correction for many European languages such as English, Dutch, French, German, Italian (and others) with very large word lists of root and inflected forms (some containing well over 200,000 forms), generating all candidate solutions within 10 to 45 milliseconds (with edit distance 1) on a SparcStation 10/41. For spelling correction in Turkish, error-tolerantComment: Replaces 9504031. gzipped, uuencoded postscript file. To appear in Computational Linguistics Volume 22 No:1, 1996, Also available as ftp://ftp.cs.bilkent.edu.tr/pub/ko/clpaper9512.ps.

    Combining Static and Dynamic Analysis for Vulnerability Detection

    Full text link
    In this paper, we present a hybrid approach for buffer overflow detection in C code. The approach makes use of static and dynamic analysis of the application under investigation. The static part consists in calculating taint dependency sequences (TDS) between user controlled inputs and vulnerable statements. This process is akin to program slice of interest to calculate tainted data- and control-flow path which exhibits the dependence between tainted program inputs and vulnerable statements in the code. The dynamic part consists of executing the program along TDSs to trigger the vulnerability by generating suitable inputs. We use genetic algorithm to generate inputs. We propose a fitness function that approximates the program behavior (control flow) based on the frequencies of the statements along TDSs. This runtime aspect makes the approach faster and accurate. We provide experimental results on the Verisec benchmark to validate our approach.Comment: There are 15 pages with 1 figur

    Propagating Regular Counting Constraints

    Full text link
    Constraints over finite sequences of variables are ubiquitous in sequencing and timetabling. Moreover, the wide variety of such constraints in practical applications led to general modelling techniques and generic propagation algorithms, often based on deterministic finite automata (DFA) and their extensions. We consider counter-DFAs (cDFA), which provide concise models for regular counting constraints, that is constraints over the number of times a regular-language pattern occurs in a sequence. We show how to enforce domain consistency in polynomial time for atmost and atleast regular counting constraints based on the frequent case of a cDFA with only accepting states and a single counter that can be incremented by transitions. We also prove that the satisfaction of exact regular counting constraints is NP-hard and indicate that an incomplete algorithm for exact regular counting constraints is faster and provides more pruning than the existing propagator from [3]. Regular counting constraints are closely related to the CostRegular constraint but contribute both a natural abstraction and some computational advantages.Comment: Includes a SICStus Prolog source file with the propagato

    Pseudorandomness for Approximate Counting and Sampling

    Get PDF
    We study computational procedures that use both randomness and nondeterminism. The goal of this paper is to derandomize such procedures under the weakest possible assumptions. Our main technical contribution allows one to “boost” a given hardness assumption: We show that if there is a problem in EXP that cannot be computed by poly-size nondeterministic circuits then there is one which cannot be computed by poly-size circuits that make non-adaptive NP oracle queries. This in particular shows that the various assumptions used over the last few years by several authors to derandomize Arthur-Merlin games (i.e., show AM = NP) are in fact all equivalent. We also define two new primitives that we regard as the natural pseudorandom objects associated with approximate counting and sampling of NP-witnesses. We use the “boosting” theorem and hashing techniques to construct these primitives using an assumption that is no stronger than that used to derandomize AM. We observe that Cai's proof that S_2^P ⊆ PP⊆(NP) and the learning algorithm of Bshouty et al. can be seen as reductions to sampling that are not probabilistic. As a consequence they can be derandomized under an assumption which is weaker than the assumption that was previously known to suffice
    • 

    corecore