    Mining Topics on Participations for Community Discovery

    Community discovery on large-scale linked document corpora has been a hot research topic for decades. There are two types of links. The first, which we call d2d-link, indicates connectivity among different documents, such as blog references and research paper citations. The other, which we call u2u-link, represents co-occurrences or simultaneous participations of different users in one document; typically, each document in a u2u-link corpus has more than one user/author. Examples of u2u-link data include email archives and research paper co-authorship networks. Community discovery in d2d-link data has achieved much success, while methods for u2u-link data either make no use of the textual content of the documents or make oversimplified assumptions about the users and the textual content. In this paper we propose a general approach to community discovery for u2u-link data, i.e., multiple-user data, by placing topical variables on multiple authors' participations in documents. Experiments on a research proceeding co-authorship corpus and a New York Times news corpus show the effectiveness of our model.