Automatic extraction of knowledge from web documents

Alani, Harith; Kim, Sanghee; Millard, David E.; Weal, Mark J.; Lewis, Paul H.; Hall, Wendy; Shadbolt, Nigel R.

oai:oro.open.ac.uk:20050

Authors: Harith Alani, Sanghee Kim, David E. Millard, Mark J. Weal, Paul H. Lewis, Wendy Hall, Nigel R. Shadbolt
Publication date: 1 January 2003

Abstract

Much of the digital information available today is written as text: web pages, reports, papers, emails, and so on. Extracting the knowledge of interest from multiple such sources in a timely fashion is therefore crucial. This paper provides an update on the Artequakt system, which uses natural language tools to automatically extract knowledge about artists from multiple documents, guided by a predefined ontology. The ontology represents the type and form of knowledge to extract. The extracted knowledge is then used to generate tailored biographies. The paper details and evaluates Artequakt's information extraction process.


This paper was published in Open Research Online (The Open University).
