5 research outputs found

    Unambiguity of SGML Content Models - Pushdown Automata Revisited

    No full text
    Trierer Forschungsberichte, Fachbereich IV - Mathematik / Informatik, Universität Trier, D-54286 Trier, ISSN 0944-0488. Forschungsbericht Nr. 97-05. Reports available via WWW: http://www.informatik.uni-trier.de/Reports/List. Andreas Neumann, Abteilung Informatik, Universität Trier, 54286 Trier, Germany. March 24, 1997. Abstract: We consider the property of unambiguity for regular expressions, extended by an additional operator &. It denotes concatenation in any order, and must have arbitrary arity since it is not associative. This extension gives us high succinctness in expressing equivalent regular expressions wit..

    Unambiguity of SGML Content Models - Pushdown Automata Revisited

    No full text
    We consider the property of unambiguity for regular expressions, extended by an additional operator &. This denotes concatenation in any order, and must have arbitrary arity since it is not associative. This extension gives us high succinctness in expressing equivalent regular expressions without &. The property of unambiguity means for a regular expression e, that a symbol in a word from its language must not match two different occurrences of that symbol in e without lookahead. We extend this notion to &-unambiguity, which helps us deal with the operator &. We then give a first method for deciding in polynomial time whether a regular expression with & is &-unambiguous, and if it is, whether it is unambiguous. Our method is constructive: it provides a deterministic automaton with polynomial representation that accepts the language of the expression, if it is unambiguous and &-unambiguous. If it is only unambiguous then the automaton accepts a subset of the language.

    Unambiguity of SGML Content Models -- Pushdown Automata Revisited

    No full text
    We consider the property of unambiguity for regular expressions, extended by an additional operator &. It denotes concatenation in any order, and must have arbitrary arity since it is not associative. This extension gives us high succinctness in expressing equivalent regular expressions wit..

    Unsupervised grammar induction with Combinatory Categorial Grammars

    Language is a highly structured medium for communication. An idea starts in the speaker's mind (semantics) and is transformed into a well-formed, intelligible sentence via the specific syntactic rules of a language. We aim to discover the fingerprints of this process in the choice and location of words used in the final utterance. What is unclear is how much of this latent process can be discovered from the linguistic signal alone and how much requires shared non-linguistic context, knowledge, or cues. Unsupervised grammar induction is the task of analyzing strings in a language to discover the latent syntactic structure of the language without access to labeled training data. Successes in unsupervised grammar induction shed light on the amount of syntactic structure that is discoverable from raw or part-of-speech tagged text. In this thesis, we present a state-of-the-art grammar induction system based on Combinatory Categorial Grammars. Our choice of syntactic formalism enables the first labeled evaluation of an unsupervised system. This allows us to perform an in-depth analysis of the system’s linguistic strengths and weaknesses. In order to completely eliminate reliance on any supervised systems, we also examine how performance is affected when we use induced word clusters instead of gold-standard POS tags. Finally, we perform a semantic evaluation of induced grammars, providing unique insights into future directions for unsupervised grammar induction systems.