592 research outputs found

    Nodalida 2005 - proceedings of the 15th NODALIDA conference

    Natural language response generation in mixed-initiative dialogs.

    Yip Wing Lin Winnie. Thesis (M.Phil.)--Chinese University of Hong Kong, 2004. Includes bibliographical references (leaves 102-105). Abstracts in English and Chinese.

    Contents:
    Chapter 1. Introduction
        1.1 Overview
        1.2 Thesis Goals
        1.3 Thesis Outline
    Chapter 2. Background
        2.1 Natural Language Generation
            2.1.1 Template-based Approach
            2.1.2 Rule-based Approach
            2.1.3 Statistical Approach
            2.1.4 Hybrid Approach
            2.1.5 Machine Learning Approach
        2.2 Evaluation Method
            2.2.1 Cooperative Principles
        2.3 Chapter Summary
    Chapter 3. Natural Language Understanding
        3.1 The CUHK Restaurant Domain
        3.2 Task Goals, Dialog Acts, Concept Categories and Annotation
            3.2.1 Task Goals (TGs) and Dialog Acts (DAs)
            3.2.2 Concept Categories (CTG/CDA)
            3.2.3 Utterance Segmentation and Annotation
        3.3 Task Goal and Dialog Act Identification
            3.3.1 Belief Networks Development
            3.3.2 Task Goal and Dialog Act Inference
            3.3.3 Network Dimensions
        3.4 Chapter Summary
    Chapter 4. Automatic Utterance Segmentation
        4.1 Utterance Definition
        4.2 Segmentation Procedure
            4.2.1 Tokenization
            4.2.2 POS Tagging
            4.2.3 Multi-Parser Architecture (MPA) Language Parsing
            4.2.4 Top-down Generalized Representation
        4.3 Evaluation
            4.3.1 Results
            4.3.2 Analysis
        4.4 Chapter Summary
    Chapter 5. Natural Language Response Generation
        5.1 System Overview
        5.2 Corpus-derived Dialog State Transition Rules
        5.3 Hand-designed Text Generation Templates
        5.4 Performance Evaluation
            5.4.1 Task Completion Rate
            5.4.2 Grice's Maxims and Perceived User Satisfaction
            5.4.3 Error Analysis
        5.5 Chapter Summary
    Chapter 6. Bilingual Response Generation using Semi-Automatically-Induced Response Templates
        6.1 Response Data
        6.2 Semi-Automatic Grammar Induction
            6.2.1 Agglomerative Clustering
            6.2.2 Parameters Selection
        6.3 Application to Response Grammar Induction
            6.3.1 Parameters Selection
            6.3.2 Unsupervised Grammar Induction
            6.3.3 Post-processing
            6.3.4 Prior Knowledge Injection
        6.4 Response Templates Generation
            6.4.1 Induced Response Grammar
            6.4.2 Template Formation
            6.4.3 Bilingual Response Templates
        6.5 Evaluation
            6.5.1 Task Completion Rate, Grice's Maxims and User Satisfaction
        6.6 Chapter Summary
    Chapter 7. Conclusion
        7.1 Summary
        7.2 Contributions
        7.3 Future Work
    Bibliography
    Chapter A. Domain-Specific Task Goals in the CUHK Restaurants Domain
    Chapter B. Full List of VERBMOBIL-2 Dialog Acts
    Chapter C. Dialog Acts for Customer Requests and Waiter Responses in the CUHK Restaurants Domain
    Chapter D. Grammar for Task Goal and Dialog Act Identification
    Chapter E. Utterance Definition
    Chapter F. Dialog State Transition Rules
    Chapter G. Full List of Templates Selection Conditions
    Chapter H. Hand-designed Text Generation Templates
    Chapter I. Evaluation Test Questionnaire for Dialog System in the CUHK Restaurant Domain
    Chapter J. POS Tags
    Chapter K. Full List of Lexicon and Contextual Rule Modifications
    Chapter L. Top-down Generalized Representations
    Chapter M. Sample Outputs for Automatic Utterance Segmentation
    Chapter N. Induced Grammar
    Chapter O. Seeded Categories
    Chapter P. Semi-Automatically-Induced Response Templates
    Chapter Q. Details of the Statistical Testing Regarding Grice's Maxims and User Satisfaction

    Satellite Workshop On Language, Artificial Intelligence and Computer Science for Natural Language Processing Applications (LAICS-NLP): Discovery of Meaning from Text

    This paper proposes a novel method to disambiguate important words from a collection of documents. The underlying hypothesis is that a minimal set of senses is significant in characterizing a context. We extend Yarowsky's one-sense-per-discourse hypothesis [13] from a single document to a collection of related documents. We perform distributed clustering on a set of features representing each of the top ten categories of documents in the Reuters-21578 dataset, identifying groups of terms that have a similar distributional pattern across documents. A WordNet-based similarity measure was then computed for the terms within each cluster. Aggregating the WordNet associations used to ascertain term similarity within clusters provided a means of identifying the clusters' root senses.

    Proceedings of the COLING 2004 Post Conference Workshop on Multilingual Linguistic Ressources MLR2004

    In an ever-expanding information society, most information systems are now facing the "multilingual challenge". Multilingual language resources play an essential role in modern information systems. Such resources need to provide information on many languages in a common framework and should be (re)usable in many applications (for automatic or human use). Many centres have been involved in national and international projects dedicated to building harmonised language resources and creating expertise in the maintenance and further development of standardised linguistic data. These resources include dictionaries, lexicons, thesauri, word-nets, and annotated corpora developed along the lines of best practices and recommendations. However, since the late 1990s, most efforts in scaling up these resources have remained the responsibility of local authorities, usually with very low funding (if any) and few opportunities for academic recognition of this work. Hence, it is not surprising that many of the resource holders and developers have become reluctant to give free access to the latest versions of their resources, and their actual status is therefore currently rather unclear. The goal of this workshop is to study problems involved in the development, management and reuse of lexical resources in a multilingual context. Moreover, this workshop provides a forum for reviewing the present state of language resources. The workshop is meant to bring to the international community qualitative and quantitative information about the most recent developments in the area of linguistic resources and their use in applications. The impressive number of submissions (38) to this workshop and to other workshops and conferences dedicated to similar topics shows that dealing with multilingual linguistic resources has become a very active problem in the Natural Language Processing community. 
To cope with the number of submissions, the workshop organising committee decided to accept 16 papers from 10 countries based on the reviewers' recommendations. Six of these papers will be presented in a poster session. The papers constitute a representative selection of current trends in research on Multilingual Language Resources, such as multilingual aligned corpora, bilingual and multilingual lexicons, and multilingual speech resources. The papers also represent a characteristic set of approaches to the development of multilingual language resources, such as automatic extraction of information from corpora, combination and re-use of existing resources, online collaborative development of multilingual lexicons, and use of the Web as a multilingual language resource. The development and management of multilingual language resources is a long-term activity in which collaboration among researchers is essential. We hope that this workshop will gather many researchers involved in such developments and will give them the opportunity to discuss, exchange, and compare their approaches and strengthen their collaborations in the field. The organisation of this workshop would have been impossible without the hard work of the program committee, who managed to provide accurate reviews on time, on a rather tight schedule. We would also like to thank the Coling 2004 organising committee that made this workshop possible. Finally, we hope that this workshop will yield fruitful results for all participants.

    Natural Language Processing: Emerging Neural Approaches and Applications

    This Special Issue highlights the most recent research being carried out in the NLP field and discusses related open issues. It focuses on emerging approaches for language learning, understanding, production, and grounding, whether interactive or autonomous, from data in cognitive and neural systems, as well as on their potential or real applications in different domains.