slides

Document Clustering and Social Networks

Abstract

Text mining has become a specialized offshoot of data mining, information retrieval, and natural language processing. One of the major tools of this area is the vector space representation of documents. Social network analysis, on the other hand, has found its mathematical underpinnings primarily in graph theory. A graph has a dual representation as an adjacency matrix. So-called two-mode social networks have actors of two different types, frequently individuals and organizations. The adjacency matrix for these two-mode social networks has the same structure as the so-called term-document matrices used in text mining. In this talk we discuss these connections and show how these ideas can be exploited in both fields. In particular, methods for block modeling in social network analysis can be used for document clustering.

Army Research Office, Contract W911NF-04-1-0447
Army Research Laboratory, Contract W911-NF-07-1-0059
National Institute on Alcohol Abuse and Alcoholism, Grant Number F32AA015876
Isaac Newton Institute
unpublished
not peer reviewed
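
The structural correspondence described in the abstract can be illustrated with a small sketch. The toy corpus, the function names, and the grouping step below are all assumptions made for illustration; none of them come from the talk. The term-document matrix is treated as the adjacency matrix of a two-mode network (terms and documents as the two actor types), projected onto documents, and then split into groups of documents that share terms. Grouping by connected components is only a crude stand-in for block modeling, not the method the talk presents.

```python
from collections import Counter

# Hypothetical toy corpus; document names and texts are illustrative only.
docs = {
    "d1": "graph theory network actors",
    "d2": "network actors organizations",
    "d3": "vector space retrieval",
    "d4": "retrieval vector documents",
}


def term_document_matrix(docs):
    """Return (terms, names, matrix) with one row per term, one column per document."""
    names = sorted(docs)
    counts = {name: Counter(docs[name].split()) for name in names}
    terms = sorted({term for c in counts.values() for term in c})
    matrix = [[counts[name][term] for name in names] for term in terms]
    return terms, names, matrix


def document_projection(matrix):
    """One-mode projection onto documents: entry (i, j) is the dot product
    of document columns i and j, i.e. the matrix product A^T A."""
    n_docs = len(matrix[0])
    return [
        [sum(row[i] * row[j] for row in matrix) for j in range(n_docs)]
        for i in range(n_docs)
    ]


def connected_blocks(names, projection):
    """Group documents linked by at least one shared term.
    Assumption: connected components stand in for a real block model."""
    seen = set()
    blocks = []
    for start in range(len(names)):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        block = []
        while stack:
            i = stack.pop()
            block.append(names[i])
            for j, weight in enumerate(projection[i]):
                if weight > 0 and j not in seen:
                    seen.add(j)
                    stack.append(j)
        blocks.append(sorted(block))
    return blocks


terms, names, A = term_document_matrix(docs)
P = document_projection(A)
blocks = connected_blocks(names, P)
print(blocks)
```

Read as a two-mode network, each nonzero entry of `A` is an edge between a term and a document, in the same way that an entry of a two-mode social network matrix links an individual to an organization. On realistic corpora nearly all documents share some term, so a proper block modeling or clustering method would be needed in place of the connected-components step.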
