thesis

Automatic bilingual text document summarization.

Abstract

Lo Sau-Han Silvia.Thesis (M.Phil.)--Chinese University of Hong Kong, 2002.Includes bibliographical references (leaves 137-143).Abstracts in English and Chinese.Chapter 1 --- Introduction --- p.1Chapter 1.1 --- Definition of a summary --- p.2Chapter 1.2 --- Definition of text summarization --- p.3Chapter 1.3 --- Previous work --- p.4Chapter 1.3.1 --- Extract-based text summarization --- p.5Chapter 1.3.2 --- Abstract-based text summarization --- p.8Chapter 1.3.3 --- Sophisticated text summarization --- p.9Chapter 1.4 --- Summarization evaluation methods --- p.10Chapter 1.4.1 --- Intrinsic evaluation --- p.10Chapter 1.4.2 --- Extrinsic evaluation --- p.11Chapter 1.4.3 --- The TIPSTER SUMMAC text summarization evaluation --- p.11Chapter 1.4.4 --- Text Summarization Challenge (TSC) --- p.13Chapter 1.5 --- Research contributions --- p.14Chapter 1.5.1 --- Text summarization based on thematic term approach --- p.14Chapter 1.5.2 --- Bilingual news summarization based on an event-driven approach --- p.15Chapter 1.6 --- Thesis organization --- p.16Chapter 2 --- Text Summarization based on a Thematic Term Approach --- p.17Chapter 2.1 --- System overview --- p.18Chapter 2.2 --- Document preprocessor --- p.20Chapter 2.2.1 --- English corpus --- p.20Chapter 2.2.2 --- English corpus preprocessor --- p.22Chapter 2.2.3 --- Chinese corpus --- p.23Chapter 2.2.4 --- Chinese corpus preprocessor --- p.24Chapter 2.3 --- Corpus thematic term extractor --- p.24Chapter 2.4 --- Article thematic term extractor --- p.26Chapter 2.5 --- Sentence score generator --- p.29Chapter 2.6 --- Chapter summary --- p.30Chapter 3 --- Evaluation for Summarization using the Thematic Term Ap- proach --- p.32Chapter 3.1 --- Content-based similarity measure --- p.33Chapter 3.2 --- Experiments using content-based similarity measure --- p.36Chapter 3.2.1 --- English corpus and parameter training --- p.36Chapter 3.2.2 --- Experimental results using content-based similarity mea- sure --- p.38Chapter 3.3 --- Average inverse rank (AIR) method --- p.59Chapter 3.4 --- Experiments using average inverse rank method --- p.60Chapter 3.4.1 --- Corpora and parameter training --- p.61Chapter 3.4.2 --- Experimental results using AIR method --- p.62Chapter 3.5 --- Comparison between the content-based similarity measure and the average inverse rank method --- p.69Chapter 3.6 --- Chapter summary --- p.73Chapter 4 --- Bilingual Event-Driven News Summarization --- p.74Chapter 4.1 --- Corpora --- p.75Chapter 4.2 --- Topic and event definitions --- p.76Chapter 4.3 --- Architecture of bilingual event-driven news summarization sys- tem --- p.77Chapter 4.4 --- Bilingual event-driven approach summarization --- p.80Chapter 4.4.1 --- Dictionary-based term translation applying on English news articles --- p.80Chapter 4.4.2 --- Preprocessing for Chinese news articles --- p.89Chapter 4.4.3 --- Event clusters generation --- p.89Chapter 4.4.4 --- Cluster selection and summary generation --- p.96Chapter 4.5 --- Evaluation for summarization based on event-driven approach --- p.101Chapter 4.6 --- Experimental results on event-driven summarization --- p.103Chapter 4.6.1 --- Experimental settings --- p.103Chapter 4.6.2 --- Results and analysis --- p.105Chapter 4.7 --- Chapter summary --- p.113Chapter 5 --- Applying Event-Driven Summarization to a Parallel Corpus --- p.114Chapter 5.1 --- Parallel corpus --- p.115Chapter 5.2 --- Parallel documents preparation --- p.116Chapter 5.3 --- Evaluation methods for the event-driven summaries generated from the parallel corpus --- p.118Chapter 5.4 --- Experimental results and analysis --- p.121Chapter 5.4.1 --- Experimental settings --- p.121Chapter 5.4.2 --- Results and analysis --- p.123Chapter 5.5 --- Chapter summary --- p.132Chapter 6 --- Conclusions and Future Work --- p.133Chapter 6.1 --- Conclusions --- p.133Chapter 6.2 --- Future work --- p.135Bibliography --- p.137Chapter A --- English Stop Word List --- p.144Chapter B --- Chinese Stop Word List --- p.149Chapter C --- Event List Items on the Corpora --- p.151Chapter C.1 --- "Event list items for the topic ""Upcoming Philippine election""" --- p.151Chapter C.2 --- "Event list items for the topic ""German train derail"" " --- p.153Chapter C.3 --- "Event list items for the topic ""Electronic service delivery (ESD) scheme"" " --- p.154Chapter D --- The sample of an English article (9505001.xml). --- p.15

    Similar works