
    Semi-automatic grammar induction for bidirectional machine translation.

    Wong, Chin Chung.
    Thesis (M.Phil.)--Chinese University of Hong Kong, 2002.
    Includes bibliographical references (leaves 137-143).
    Abstracts in English and Chinese.

    Chapter 1 --- Introduction
    Chapter 1.1 --- Objectives
    Chapter 1.2 --- Thesis Outline
    Chapter 2 --- Background in Natural Language Understanding
    Chapter 2.1 --- Rule-based Approaches
    Chapter 2.2 --- Corpus-based Approaches
    Chapter 2.2.1 --- Stochastic Approaches
    Chapter 2.2.2 --- Phrase-spotting Approaches
    Chapter 2.3 --- The ATIS Domain
    Chapter 2.3.1 --- Chinese Corpus Preparation
    Chapter 3 --- Semi-automatic Grammar Induction - Baseline Approach
    Chapter 3.1 --- Background in Grammar Induction
    Chapter 3.1.1 --- Simulated Annealing
    Chapter 3.1.2 --- Bayesian Grammar Induction
    Chapter 3.1.3 --- Probabilistic Grammar Acquisition
    Chapter 3.2 --- Semi-automatic Grammar Induction - Baseline Approach
    Chapter 3.2.1 --- Spatial Clustering
    Chapter 3.2.2 --- Temporal Clustering
    Chapter 3.2.3 --- Post-processing
    Chapter 3.2.4 --- Four Aspects for Enhancements
    Chapter 3.3 --- Chapter Summary
    Chapter 4 --- Semi-automatic Grammar Induction - Enhanced Approach
    Chapter 4.1 --- Evaluating Induced Grammars
    Chapter 4.2 --- Stopping Criterion
    Chapter 4.2.1 --- Cross-checking with Recall Values
    Chapter 4.3 --- Improvements on Temporal Clustering
    Chapter 4.3.1 --- Evaluation
    Chapter 4.4 --- Improvements on Spatial Clustering
    Chapter 4.4.1 --- Distance Measures
    Chapter 4.4.2 --- Evaluation
    Chapter 4.5 --- Enhancements based on Intelligent Selection
    Chapter 4.5.1 --- Informed Selection between Spatial Clustering and Temporal Clustering
    Chapter 4.5.2 --- Selecting the Number of Clusters Per Iteration
    Chapter 4.5.3 --- An Example for Intelligent Selection
    Chapter 4.5.4 --- Evaluation
    Chapter 4.6 --- Chapter Summary
    Chapter 5 --- Bidirectional Machine Translation using Induced Grammars - Baseline Approach
    Chapter 5.1 --- Background in Machine Translation
    Chapter 5.1.1 --- Rule-based Machine Translation
    Chapter 5.1.2 --- Statistical Machine Translation
    Chapter 5.1.3 --- Knowledge-based Machine Translation
    Chapter 5.1.4 --- Example-based Machine Translation
    Chapter 5.1.5 --- Evaluation
    Chapter 5.2 --- Baseline Configuration on Bidirectional Machine Translation System
    Chapter 5.2.1 --- Bilingual Dictionary
    Chapter 5.2.2 --- Concept Alignments
    Chapter 5.2.3 --- Translation Process
    Chapter 5.2.4 --- Two Aspects for Enhancements
    Chapter 5.3 --- Chapter Summary
    Chapter 6 --- Bidirectional Machine Translation - Enhanced Approach
    Chapter 6.1 --- Concept Alignments
    Chapter 6.1.1 --- Enhanced Alignment Scheme
    Chapter 6.1.2 --- Experiment
    Chapter 6.2 --- Grammar Checker
    Chapter 6.2.1 --- Components for Grammar Checking
    Chapter 6.3 --- Evaluation
    Chapter 6.3.1 --- Bleu Score Performance
    Chapter 6.3.2 --- Modified Bleu Score
    Chapter 6.4 --- Chapter Summary
    Chapter 7 --- Conclusions
    Chapter 7.1 --- Summary
    Chapter 7.2 --- Contributions
    Chapter 7.3 --- Future work
    Bibliography
    Chapter A --- Original SQL Queries
    Chapter B --- Seeded Categories
    Chapter C --- 3 Alignment Categories
    Chapter D --- Labels of Syntactic Structures in Grammar Checker

    Analysis and Design of Speech-Recognition Grammars

    Currently, most commercial speech-enabled products are constructed using grammar-based technology. Grammar design is critical to good recognition accuracy. Two methods are commonly used for creating grammars: 1) to generate them automatically from a large corpus of input data, which is very costly to acquire, or 2) to construct them using an iterative process involving manual design followed by testing with end-user speech input. This is a time-consuming and very expensive process requiring expert knowledge of language design as well as of the application area. Another hurdle to the creation and use of speech-enabled applications is that expertise is also required to integrate the speech capability with the application code and to deploy the application for wide-scale use. An alternative approach, which we propose, is 1) to construct grammars using the iterative process described above, but to replace end-user testing with analysis of the recognition grammars using a set of grammar metrics that have been shown to be good indicators of recognition accuracy, 2) to improve recognition accuracy in the design process by encoding semantic constraints in the syntax rules of the grammar, 3) to augment the above process by generating recognition grammars automatically from specifications of the application, and 4) to use tools for creating speech-enabled applications, together with an architecture for their deployment, which enable expert users as well as users without expertise in language processing to easily build speech applications and add them to the web.