7 research outputs found

    Effective weakly supervised semantic frame induction using expression sharing in hierarchical hidden Markov models

    Get PDF
    We present a framework for the induction of semantic frames from utterances in the context of an adaptive command-and-control interface. The system is trained on an individual user's utterances and the corresponding semantic frames representing controls. During training, no prior information on the alignment between utterance segments and frame slots and values is available. In addition, semantic frames in the training data can contain information that is not expressed in the utterances. To tackle this weakly supervised classification task, we propose a framework based on Hidden Markov Models (HMMs). Structural modifications, resulting in a hierarchical HMM, and an extension called expression sharing are introduced to minimize the amount of training time and effort required for the user. The dataset used for the present study is PATCOR, which contains commands uttered in the context of a vocally guided card game, Patience. Experiments were carried out on orthographic and phonetic transcriptions of commands, segmented on different levels of n-gram granularity. The experimental results show positive effects of all the studied system extensions, with some effect differences between the different input representations. Moreover, evaluation experiments on held-out data with the optimal system configuration show that the extended system is able to achieve high accuracies with relatively small amounts of training data

    A Generative Statistical Approach for Data Classification in a Biologically Inspired Design Tool

    Get PDF
    The objective of the research this thesis describes is to find a way to classify text-based descriptions of biological adaption to support Biologically Inspired design. Biologically inspired design is a fairly new field with ongoing research. There are different tools to assist designers and biologists in bio-inspired design. Some of the most common are BioTRIZ and AskNature. In recent years, more tools have been proposed to aid and make research in the field easier, for example, the Biologically Inspired Adaptive System Design (BIASD) tool. This tool was designed with the goal of helping designers in early design stages generate more robust and innovative designs. Even though this tool offers a vast database of biological examples, many limitations have been encountered in the tool. The most noticeable is the order in which the biological examples are distributed within the tool. The process used to classify them was very subjective and does not follow a pattern. Another challenge is the way in which the user of the tool reaches the biological examples. By addressing these issues, we provide a more objective way to classify the biological adaptive strategies. To do this, we needed a meta classification in order for the questions to be rationally organized within the tool. Then, approaches such as k-means and Latent Dirichlet Allocation (LDA) techniques from machine learning were employed to minimize the randomness and increase the objectivity of the tool. Out of the two, the LDA model provided a more useful classification. A validation of the LDA model was needed, so we used perplexity, which is used in statistical models to measure the accuracy of a language model and better understand datasets. At the end, a rational classification for the analogues of the BIASD tool was generated

    Aplicação para recuperação de informações baseadas em intenções, integrado a um sistema de gestão

    Get PDF
    Os sistemas de gestão são capazes de produzir muita informação, de diversas áreas e com perspectivas diferentes, desta maneira, a simplicidade e a objetividade dão espaço a um sistema flexível, porém, complexo. Tal complexidade resulta em dificuldade e onerosidade na busca por informações do sistema. O presente trabalho visa avaliar a eficácia na utilização de Natural Language Understanding (NLU) como um middleware responsável por interpretar as intenções dos usuários. Esse middleware foi integrado a um sistema de gestão, em uma nova aplicação responsável por repassar os inputs do usuário para o processamento e traduzir as saídas geradas. Assim, permitindo que o usuário se comunique de forma natural com o sistema, facilitando a busca por elementos estratégicos em uma única aplicação. Por meio de uma pesquisa qualitativa, exploratória, de caráter experimental, de procedimento bibliográfico e documental, foi possível comprovar, através de testes, que o middleware foi eficaz na recuperação de informação e elementos estratégicos no sistema gestão.Management systems can produce a lot of information from different areas and perspectives, thus simplicity and objectivity give way to a flexible yet complex system. Such complexity results in difficulty and higher cost when searching for information within the system. The present work aims to evaluate the effectiveness in the use of Natural Language Understanding (NLU), as a middleware responsible for interpreting the users' intentions. This middleware was integrated with a management system within a new application, responsible for passing on user's inputs for processing and translating the outputs generated, therefore allowing the user to communicate naturally with the system and simplifying the search for strategic elements with one single application. Through a qualitative, exploratory, experimental research in a bibliographic and documental procedure, it was possible to prove, through tests, that the middleware was effective in recovering information and strategic elements in the management system

    Semi-automatic acquisition of domain-specific semantic structures.

    Get PDF
    Siu, Kai-Chung.Thesis (M.Phil.)--Chinese University of Hong Kong, 2000.Includes bibliographical references (leaves 99-106).Abstracts in English and Chinese.Chapter 1 --- Introduction --- p.1Chapter 1.1 --- Thesis Outline --- p.5Chapter 2 --- Background --- p.6Chapter 2.1 --- Natural Language Understanding --- p.6Chapter 2.1.1 --- Rule-based Approaches --- p.7Chapter 2.1.2 --- Stochastic Approaches --- p.8Chapter 2.1.3 --- Phrase-Spotting Approaches --- p.9Chapter 2.2 --- Grammar Induction --- p.10Chapter 2.2.1 --- Semantic Classification Trees --- p.11Chapter 2.2.2 --- Simulated Annealing --- p.12Chapter 2.2.3 --- Bayesian Grammar Induction --- p.12Chapter 2.2.4 --- Statistical Grammar Induction --- p.13Chapter 2.3 --- Machine Translation --- p.14Chapter 2.3.1 --- Rule-based Approach --- p.15Chapter 2.3.2 --- Statistical Approach --- p.15Chapter 2.3.3 --- Example-based Approach --- p.16Chapter 2.3.4 --- Knowledge-based Approach --- p.16Chapter 2.3.5 --- Evaluation Method --- p.19Chapter 3 --- Semi-Automatic Grammar Induction --- p.20Chapter 3.1 --- Agglomerative Clustering --- p.20Chapter 3.1.1 --- Spatial Clustering --- p.21Chapter 3.1.2 --- Temporal Clustering --- p.24Chapter 3.1.3 --- Free Parameters --- p.26Chapter 3.2 --- Post-processing --- p.27Chapter 3.3 --- Chapter Summary --- p.29Chapter 4 --- Application to the ATIS Domain --- p.30Chapter 4.1 --- The ATIS Domain --- p.30Chapter 4.2 --- Parameters Selection --- p.32Chapter 4.3 --- Unsupervised Grammar Induction --- p.35Chapter 4.4 --- Prior Knowledge Injection --- p.40Chapter 4.5 --- Evaluation --- p.43Chapter 4.5.1 --- Parse Coverage in Understanding --- p.45Chapter 4.5.2 --- Parse Errors --- p.46Chapter 4.5.3 --- Analysis --- p.47Chapter 4.6 --- Chapter Summary --- p.49Chapter 5 --- Portability to Chinese --- p.50Chapter 5.1 --- Corpus Preparation --- p.50Chapter 5.1.1 --- Tokenization --- p.51Chapter 5.2 --- Experiments --- p.52Chapter 5.2.1 --- Unsupervised Grammar Induction --- p.52Chapter 5.2.2 --- Prior Knowledge Injection --- p.56Chapter 5.3 --- Evaluation --- p.58Chapter 5.3.1 --- Parse Coverage in Understanding --- p.59Chapter 5.3.2 --- Parse Errors --- p.60Chapter 5.4 --- Grammar Comparison Across Languages --- p.60Chapter 5.5 --- Chapter Summary --- p.64Chapter 6 --- Bi-directional Machine Translation --- p.65Chapter 6.1 --- Bilingual Dictionary --- p.67Chapter 6.2 --- Concept Alignments --- p.68Chapter 6.3 --- Translation Procedures --- p.73Chapter 6.3.1 --- The Matching Process --- p.74Chapter 6.3.2 --- The Searching Process --- p.76Chapter 6.3.3 --- Heuristics to Aid Translation --- p.81Chapter 6.4 --- Evaluation --- p.82Chapter 6.4.1 --- Coverage --- p.83Chapter 6.4.2 --- Performance --- p.86Chapter 6.5 --- Chapter Summary --- p.89Chapter 7 --- Conclusions --- p.90Chapter 7.1 --- Summary --- p.90Chapter 7.2 --- Future Work --- p.92Chapter 7.2.1 --- Suggested Improvements on Grammar Induction Process --- p.92Chapter 7.2.2 --- Suggested Improvements on Bi-directional Machine Trans- lation --- p.96Chapter 7.2.3 --- Domain Portability --- p.97Chapter 7.3 --- Contributions --- p.97Bibliography --- p.99Chapter A --- Original SQL Queries --- p.107Chapter B --- Induced Grammar --- p.109Chapter C --- Seeded Categories --- p.11

    Parsing Inside-Out

    Full text link
    The inside-outside probabilities are typically used for reestimating Probabilistic Context Free Grammars (PCFGs), just as the forward-backward probabilities are typically used for reestimating HMMs. I show several novel uses, including improving parser accuracy by matching parsing algorithms to evaluation criteria; speeding up DOP parsing by 500 times; and 30 times faster PCFG thresholding at a given accuracy level. I also give an elegant, state-of-the-art grammar formalism, which can be used to compute inside-outside probabilities; and a parser description formalism, which makes it easy to derive inside-outside formulas and many others.Comment: Ph.D. Thesis, 257 pages, 40 postscript figure
    corecore