Automatic extraction of knowledge from web documents

By Harith Alani, Sanghee Kim, David E. Millard, Mark J. Weal, Paul H. Lewis, Wendy Hall and Nigel R. Shadbolt

Abstract

A large amount of the digital information available is written as text documents in the form of web pages, reports, papers, emails, etc. Extracting the knowledge of interest from such documents, across multiple sources and in a timely fashion, is therefore crucial. This paper provides an update on the Artequakt system, which uses natural language tools to automatically extract knowledge about artists from multiple documents based on a predefined ontology. The ontology represents the type and form of knowledge to extract. The extracted knowledge is then used to generate tailored biographies. The information extraction process of Artequakt is detailed and evaluated in this paper.

Year: 2003
OAI identifier: oai:oro.open.ac.uk:20050
Provided by: Open Research Online
