thesis

Automatic classification of documents with an in-depth analysis of information extraction and automatic summarization

Abstract

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Civil and Environmental Engineering, 2004.

Includes bibliographical references (leaves 78-80).

Today, annual information fabrication per capita exceeds two hundred and fifty megabytes. As the amount of data increases, classification and retrieval methods become more necessary to find relevant information. This thesis describes a .Net application (named I-Document) that establishes an automatic classification scheme in a peer-to-peer environment that allows free sharing of academic, business, and personal documents. A Web service architecture for metadata extraction, Information Extraction, Information Retrieval, and text summarization is depicted. Specific details regarding the coding process, competition, business model, and technology employed in the project are also discussed.

by Joseph Brandon Hohm.

M.Eng.
