A Framework for Text Mining Using a Learned Information Extraction System

Abstract

Text mining is an active research area that aims to discover knowledge from unstructured text, which can be found on computer desktops, intranets, and the Internet. This paper gives an overview of text mining in terms of its techniques, its application domains, and its most challenging issue. Learned Information Extraction (LIE) is concerned with locating specific items in natural-language documents. The paper presents a framework for text mining, called DTEX (Discovery Text Extraction), that uses a learned information extraction system to transform text into more structured data, which is then mined for interesting relationships. The initial version of DTEX integrates an LIE module, acquired by an LIE learning system, with a standard rule induction module. In addition, rules mined from a database extracted from a corpus of texts are used to predict additional information to extract from future documents, thereby improving the recall of the underlying extraction system. The best results obtained by applying these techniques to a corpus of computer job announcement postings from an Internet newsgroup are presented.
