thesis

A robust unification-based parser for Chinese natural language processing.

Abstract

Chan Shuen-ti Roy.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2001.
Includes bibliographical references (leaves 168-175).
Abstracts in English and Chinese.

Chapter 1. --- Introduction
Chapter 1.1. --- The nature of natural language processing
Chapter 1.2. --- Applications of natural language processing
Chapter 1.3. --- Purpose of study
Chapter 1.4. --- Organization of this thesis
Chapter 2. --- Organization and methods in natural language processing
Chapter 2.1. --- Organization of natural language processing system
Chapter 2.2. --- Methods employed
Chapter 2.3. --- Unification-based grammar processing
Chapter 2.3.1. --- Generalized Phrase Structure Grammar (GPSG)
Chapter 2.3.2. --- Head-driven Phrase Structure Grammar (HPSG)
Chapter 2.3.3. --- Common drawbacks of UBGs
Chapter 2.4. --- Corpus-based processing
Chapter 2.4.1. --- Drawback of corpus-based processing
Chapter 3. --- Difficulties in Chinese language processing and its related works
Chapter 3.1. --- A glance at the history
Chapter 3.2. --- Difficulties in syntactic analysis of Chinese
Chapter 3.2.1. --- Writing system of Chinese causes segmentation problem
Chapter 3.2.2. --- Words serving multiple grammatical functions without inflection
Chapter 3.2.3. --- Word order of Chinese
Chapter 3.2.4. --- The Chinese grammatical word
Chapter 3.3. --- Related works
Chapter 3.3.1. --- Unification grammar processing approach
Chapter 3.3.2. --- Corpus-based processing approach
Chapter 3.4. --- Restatement of goal
Chapter 4. --- SERUP: Statistical-Enhanced Robust Unification Parser
Chapter 5. --- Step One: automatic preprocessing
Chapter 5.1. --- Segmentation of lexical tokens
Chapter 5.2. --- Conversion of date, time and numerals
Chapter 5.3. --- Identification of new words
Chapter 5.3.1. --- Proper nouns: Chinese names
Chapter 5.3.2. --- Other proper nouns and multi-syllabic words
Chapter 5.4. --- Defining smallest parsing unit
Chapter 5.4.1. --- The Chinese sentence
Chapter 5.4.2. --- Breaking down the paragraphs
Chapter 5.4.3. --- Implementation
Chapter 6. --- Step Two: grammar construction
Chapter 6.1. --- Criteria in choosing a UBG model
Chapter 6.2. --- The grammar in details
Chapter 6.2.1. --- The PHON feature
Chapter 6.2.2. --- The SYN feature
Chapter 6.2.3. --- The SEM feature
Chapter 6.2.4. --- Grammar rules and features principles
Chapter 6.2.5. --- Verb phrases
Chapter 6.2.6. --- Noun phrases
Chapter 6.2.7. --- Prepositional phrases
Chapter 6.2.8. --- "Ba2" and "Bei4" constructions
Chapter 6.2.9. --- The terminal node S
Chapter 6.2.10. --- Summary of phrasal rules
Chapter 6.2.11. --- Morphological rules
Chapter 7. --- Step Three: resolving structural ambiguities
Chapter 7.1. --- Sources of ambiguities
Chapter 7.2. --- The traditional practices: an illustration
Chapter 7.3. --- Deficiency of current practices
Chapter 7.4. --- A new point of view: Wu (1999)
Chapter 7.5. --- Improvement over Wu (1999)
Chapter 7.6. --- Conclusion on semantic features
Chapter 8. --- Implementation, performance and evaluation
Chapter 8.1. --- Implementation
Chapter 8.2. --- Performance and evaluation
Chapter 8.2.1. --- The test set
Chapter 8.2.2. --- Segmentation of lexical tokens
Chapter 8.2.3. --- New word identification
Chapter 8.2.4. --- Parsing unit segmentation
Chapter 8.2.5. --- The grammar
Chapter 8.3. --- Overall performance of SERUP
Chapter 9. --- Conclusion
Chapter 9.1. --- Summary of this thesis
Chapter 9.2. --- Contribution of this thesis
Chapter 9.3. --- Future work
References
Appendix I
Appendix II
Appendix III
