Weaving Entities into Relations: From Page Retrieval to Relation Mining on the Web

Chang, Kevin Chen-Chuan; Cheng, Tao; Chuang, Shui-Lung; Davis, William; Kelley, Joseph M.

Publication date: 1 November 2004

Abstract

With its sheer amount of information, the Web is clearly an important frontier for data mining. While Web mining must start with content on the Web, there is no effective "search-based" mechanism to help sift through that information. Our goal is to provide such an online search-based facility supporting query primitives upon which Web mining applications can be built. As a first step, this paper targets entity-relation discovery, or E-R discovery, as a useful function for weaving scattered entities on the Web into coherent relations. First, as our proposal, we formalize the concept of E-R discovery. Second, to realize E-R discovery, as our main thesis, we abstract tuple ranking, the essential challenge of E-R discovery, as pattern-based cooccurrence analysis. Finally, as our key insight, we observe that such relation mining shares the same core functions as traditional page-retrieval systems, which enables us to build the new E-R discovery upon today's search engines, almost for free. We report our system prototype and testbed, WISDM-ER, with a real Web corpus. Our case studies show high promise, achieving 83%-91% accuracy on real benchmark queries, and thus demonstrate the real possibility of enabling ad hoc Web mining tasks with online E-R discovery.
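
To make the idea of ranking tuples by cooccurrence concrete, here is a minimal toy sketch. It is not the WISDM-ER implementation, which the abstract does not describe. It assumes single-token entities, a tiny in-memory list of pages, and a simple token-distance window standing in for the paper's patterns. The function name `rank_tuples`, its parameters, and the sample pages are all hypothetical.

```python
import re
from collections import Counter
from itertools import product


def rank_tuples(pages, left_entities, right_entities, window=5):
    """Rank (left, right) entity pairs by windowed cooccurrence count.

    Assumption: a pair "matches the pattern" when both entities appear
    within `window` tokens of each other on the same page.
    """
    counts = Counter()
    for page in pages:
        tokens = re.findall(r"\w+", page.lower())
        # Token positions of every occurrence of each entity type.
        left_pos = [(i, t) for i, t in enumerate(tokens) if t in left_entities]
        right_pos = [(i, t) for i, t in enumerate(tokens) if t in right_entities]
        for (i, a), (j, b) in product(left_pos, right_pos):
            if abs(i - j) <= window:
                counts[(a, b)] += 1
    # Highest-count candidate tuples first.
    return counts.most_common()


# Hypothetical toy corpus: which city goes with which country?
pages = [
    "Paris is the capital of France.",
    "France: capital Paris.",
    "Flights from Paris to Berlin, Germany.",
    "Berlin is the capital of Germany.",
]
ranking = rank_tuples(pages, {"paris", "berlin"}, {"france", "germany"})
for pair, count in ranking:
    print(pair, count)
```

In this toy corpus the pairs that truly belong together cooccur on more pages than the accidental pair, so they rank higher. A real system would need entity extraction, richer patterns, and statistical scoring rather than raw counts.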

Available Versions

Illinois Digital Environment for Access to Learning and Scholarship Repository (oai:www.ideals.illinois.edu:21...), last updated on 22/06/2012
IDEALS @ Illinois (oai:www.ideals.illinois.edu:21...), last updated on 05/04/2020