735 research outputs found

    A grammatical specification of human-computer dialogue

    The Seeheim Model of human-computer interaction partitions an interactive application into a user-interface, a dialogue controller and the application itself. One of the formal techniques of implementing the dialogue controller is based on context-free grammars and automata. In this work, we modify an off-the-shelf compiler generator (YACC) to generate the dialogue controller. The dialogue controller is then integrated into the popular X-window system, to create an interactive-application generator. The actions of the user drive the automaton, which in turn controls the application

    Probabilistic parsing

    Postprin

    An investigation of grammar design in natural-language speech-recognition.

    With growing interest in and demand for human-machine interaction, much work on speech recognition has been carried out over the past three decades. Although a variety of approaches have been proposed to address speech-recognition issues, such as stochastic (statistical) techniques, grammar-based techniques, techniques integrated with linguistic features, and other approaches, recognition accuracy and robustness remain among the major problems that need to be addressed. At the state of the art, most commercial speech products are constructed using grammar-based speech-recognition technology. In this thesis, we investigate a number of features involved in grammar design in natural-language speech-recognition technology. We hypothesize that, within the same domain, a semantic grammar, which directly encodes some semantic constraints into the recognition grammar, achieves better accuracy but less robustness; a syntactic grammar defines a larger language and therefore has better robustness but less accuracy; and a word-sequence grammar, which includes neither semantics nor syntax, defines the largest language and is therefore the most robust but has very poor recognition accuracy. In this Master's thesis, we claim that proper grammar design can achieve an appropriate compromise between recognition accuracy and robustness. This claim is supported by experiments using the IBM Voice-Server SDK, which consists of a VoiceXML browser, IBM ViaVoice Speech Recognition and Text-To-Speech (TTS) engines, sample applications, and other tools for developing and testing VoiceXML applications. The experimental grammars are written in the Java Speech Grammar Format (JSGF), and the testing applications are written in VoiceXML. The tentative experimental results suggest that grammar design is a good area for further study. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2003 .S555.
Source: Masters Abstracts International, Volume: 43-01, page: 0244. Adviser: Richard A. Frost. Thesis (M.Sc.)--University of Windsor (Canada), 2004

    An investigation of the electrolytic plasma oxidation process for corrosion protection of pure magnesium and magnesium alloy AM50.

    In this study, silicate and phosphate EPO coatings were produced on pure magnesium using an AC power source. It was found that the silicate coatings possess good wear resistance, while the phosphate coatings provide better corrosion protection. A Design of Experiment (DOE) technique, the Taguchi method, was used to systematically investigate the effect of the EPO process parameters on the corrosion protection properties of a coated magnesium alloy AM50 using a DC power source. The experimental design consisted of four factors (treatment time, current density, and KOH and NaAlO2 concentrations), with three levels of each factor. Potentiodynamic polarization measurements were conducted to determine the corrosion resistance of the coated samples. The optimized processing parameters are 12 minutes, 12 mA/cm² current density, 0.9 g/l KOH, and 15.0 g/l NaAlO2. The percentage contribution of each factor, determined by analysis of variance (ANOVA), implies that the KOH concentration is the most significant factor affecting the corrosion resistance of the coatings, while treatment time is a major factor affecting the thickness of the coatings. (Abstract shortened by UMI.) Dept. of Electrical and Computer Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .M323. Source: Masters Abstracts International, Volume: 44-03, page: 1479. Thesis (M.A.Sc.)--University of Windsor (Canada), 2005

    Detecting grammatical errors with treebank-induced, probabilistic parsers

    Today's grammar checkers often use hand-crafted rule systems that define acceptable language. The development of such rule systems is labour-intensive and has to be repeated for each language. At the same time, grammars automatically induced from syntactically annotated corpora (treebanks) are successfully employed in other applications, for example text understanding and machine translation. At first glance, treebank-induced grammars seem to be unsuitable for grammar checking as they massively over-generate and fail to reject ungrammatical input due to their high robustness. We present three new methods for judging the grammaticality of a sentence with probabilistic, treebank-induced grammars, demonstrating that such grammars can be successfully applied to automatically judge the grammaticality of an input string. Our best-performing method exploits the differences between parse results for grammars trained on grammatical and ungrammatical treebanks. The second approach builds an estimator of the probability of the most likely parse using grammatical training data that has previously been parsed and annotated with parse probabilities. If the estimated probability of an input sentence (whose grammaticality is to be judged by the system) is higher by a certain amount than the actual parse probability, the sentence is flagged as ungrammatical. The third approach extracts discriminative parse tree fragments in the form of CFG rules from parsed grammatical and ungrammatical corpora and trains a binary classifier to distinguish grammatical from ungrammatical sentences. The three approaches are evaluated on a large test set of grammatical and ungrammatical sentences. The ungrammatical test set is generated automatically by inserting common grammatical errors into the British National Corpus. 
The results are compared to two traditional approaches, one that uses a hand-crafted, discriminative grammar, the XLE ParGram English LFG, and one based on part-of-speech n-grams. In addition, the baseline methods and the new methods are combined in a machine learning-based framework, yielding further improvements
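The decision rule of the second approach above can be illustrated with a minimal sketch. It assumes probabilities are handled as log-probabilities and that "higher by a certain amount" means exceeding a fixed margin; the function name `flag_ungrammatical` and the `margin` parameter are hypothetical and not taken from the thesis.

```python
def flag_ungrammatical(estimated_logprob: float,
                       actual_logprob: float,
                       margin: float = 1.0) -> bool:
    """Flag a sentence when the estimated log-probability of its best parse
    exceeds the actual parse log-probability by more than `margin`.

    Both inputs are assumed to be log-probabilities (hypothetical setup):
    `estimated_logprob` would come from an estimator trained on grammatical
    data, `actual_logprob` from the parser's most likely parse.
    """
    return (estimated_logprob - actual_logprob) > margin


# A large gap between the estimate and the actual score flags the sentence;
# a small gap does not.
suspicious = flag_ungrammatical(-40.0, -55.0, margin=10.0)
acceptable = flag_ungrammatical(-40.0, -42.0, margin=10.0)
```

In practice the margin would have to be tuned on held-out data; the value used here is arbitrary.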

    Adapting and developing linguistic resources for question answering

    As information retrieval becomes more focussed, so too must the techniques involved in the retrieval process. More precise responses to queries require more precise linguistic analysis of both the queries and the factual documents from which the information is being retrieved. In this thesis, I present research into using existing linguistic tools to analyse questions. These tools, as supplied, often underperform on question analysis. I present my work on adapting these tools, and creating new resources for use in developing new tools tailored to question analysis. My work has shown that in order to adapt the treebank- and f-structure annotation algorithm-based wide-coverage LFG parsing resources of Cahill et al. (2004) to analyse questions from the ATIS corpus, only the c-structure parser needs to be retrained; the annotation algorithm remains unchanged. The retrained c-structure parser needs only a small amount of appropriate training data added to its training corpus to gain a significant improvement in both c-structure parsing and f-structure annotation. Given the improvements made with a relatively small amount of question data, I developed QuestionBank, a question treebank, to determine what further gains can be made using a larger amount of question data. My question treebank is a corpus of 4000 parse-annotated questions. The questions were taken from a number of sources and the question treebank was “bootstrapped” in an incremental parsing, hand correction and retraining approach from raw data using existing probabilistic parsing resources. Experiments with QuestionBank show that it is an effective resource for training parsers to analyse questions, with an improvement of over 10% on the baseline parsing results. In further experiments I show that a parser retrained with QuestionBank can also parse newspaper text (Penn-II Treebank Section 23) with state-of-the-art accuracy.
Long distance dependencies (LDDs) are a vital part of question analysis in determining semantic roles and question focus. I have designed and implemented a novel method to recover WH-traces and coindexed antecedents in c-structure trees from parser output which uses the f-structure LDD resolution method of Cahill et al. (2004) to resolve the dependencies and then “reverse engineers” the corresponding syntactic components in the c-structure tree

    Pattern Matching and Discourse Processing in Information Extraction from Japanese Text

    Information extraction is the task of automatically picking up information of interest from an unconstrained text. Information of interest is usually extracted in two steps. First, sentence level processing locates relevant pieces of information scattered throughout the text; second, discourse processing merges coreferential information to generate the output. In the first step, pieces of information are locally identified without recognizing any relationships among them. A key word search or simple pattern search can achieve this purpose. The second step requires deeper knowledge in order to understand relationships among separately identified pieces of information. Previous information extraction systems focused on the first step, partly because they were not required to link up each piece of information with other pieces. To link the extracted pieces of information and map them onto a structured output format, complex discourse processing is essential. This paper reports on a Japanese information extraction system that merges information using a pattern matcher and discourse processor. Evaluation results show a high level of system performance which approaches human performance. Comment: See http://www.jair.org/ for any accompanying file

    Neural Combinatory Constituency Parsing

    Tokyo Metropolitan University, PhD (Information Science), doctoral thesis