This paper explores the impact of light filtering on automatic key phrase
extraction (AKE) applied to Broadcast News (BN). Key phrases are words and
expressions that best characterize the content of a document. Key phrases are
often used to index the document or as features in further processing. This
makes improvements in AKE accuracy particularly important. We hypothesized that
filtering out marginally relevant sentences from a document would improve AKE
accuracy. Our experiments confirmed this hypothesis. Elimination of as little
as 10% of the document sentences lead to a 2% improvement in AKE precision and
recall. AKE is built over MAUI toolkit that follows a supervised learning
approach. We trained and tested our AKE method on a gold standard made of 8 BN
programs containing 110 manually annotated news stories. The experiments were
conducted within a Multimedia Monitoring Solution (MMS) system for TV and radio
news/programs, running daily, and monitoring 12 TV and 4 radio channels.Comment: In 15th International Conference on Text, Speech and Dialogue (TSD
2012