A Comparative Evaluation of a New Unsupervised Sentence Boundary Detection Approach on Documents in English and Portuguese

By Jan Strunk, Carlos N. Silla Jr and Celso A.A. Kaestner

Abstract

In this paper, we describe a new unsupervised sentence boundary detection system and present a comparative study evaluating its performance against different systems found in the literature that have been used to perform the task of automatic text segmentation into sentences for English and Portuguese documents. The results achieved by this new approach were as good as those of the previous systems, especially considering that the method does not require any additional training resources.

Topics: QA76
Publisher: Springer
Year: 2006
OAI identifier: oai:kar.kent.ac.uk:24093
