    Authorship Identification with Modality Specific Meta Features: Notebook for PAN at CLEF 2011

    Abstract: This paper presents the approach we used in the PAN '11 authorship identification competition. Our method extracts meta features from several clustering solutions generated independently on the training set. Each clustering solution uses a disjoint set of features that represents a specific linguistic modality, so the different solutions encode similarities in authors' writing styles along specific dimensions. The final classifier is trained on a combination of the meta features and the first-level features. Our approach has outperformed a more syntactically oriented state-of-the-art method on web forum data. We achieved moderately successful results in this PAN competition, with better results on the test set with a smaller number of authors. However, considering that our system was not fine-tuned for the PAN evaluation data, we found our results very encouraging.
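The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm, meta-feature definition, or final classifier was used, so everything concrete here is an assumption made for illustration: k-means as the clusterer, distances to cluster centroids as the meta features, and a nearest-class-mean rule standing in for the final classifier. All function names and the toy data are hypothetical.

```python
import math
import random


def kmeans(rows, k, iters=20, seed=0):
    """Plain k-means (an assumed clusterer); returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for r in rows:
            nearest = min(range(k), key=lambda j: math.dist(r, centroids[j]))
            groups[nearest].append(r)
        for j, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster empties out
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids


def meta_features(row, modalities, models):
    """Assumed meta features: distance to every centroid of every modality."""
    feats = []
    for cols, centroids in zip(modalities, models):
        sub = [row[c] for c in cols]
        feats.extend(math.dist(sub, c) for c in centroids)
    return feats


def fit(X, y, modalities, k=2):
    # One independent clustering per disjoint feature group (modality).
    models = [kmeans([[r[c] for c in cols] for r in X], k) for cols in modalities]
    # Final representation: first-level features plus meta features.
    Z = [list(r) + meta_features(r, modalities, models) for r in X]
    # Stand-in final classifier (assumption): one mean vector per author.
    by_author = {}
    for z, author in zip(Z, y):
        by_author.setdefault(author, []).append(z)
    means = {a: [sum(col) / len(zs) for col in zip(*zs)]
             for a, zs in by_author.items()}
    return models, means


def predict(row, modalities, models, means):
    z = list(row) + meta_features(row, modalities, models)
    return min(means, key=lambda a: math.dist(z, means[a]))


# Toy data: columns 0-1 and 2-3 play the role of two linguistic modalities.
X = [[0.1, 0.2, 5.0, 5.1], [0.2, 0.1, 5.2, 4.9], [0.0, 0.3, 4.8, 5.0],
     [3.0, 3.1, 1.0, 1.2], [3.2, 2.9, 0.9, 1.0], [2.9, 3.0, 1.1, 0.8]]
y = ["A", "A", "A", "B", "B", "B"]
modalities = [[0, 1], [2, 3]]

models, means = fit(X, y, modalities)
guess_a = predict([0.1, 0.1, 5.0, 5.0], modalities, models, means)
guess_b = predict([3.1, 3.0, 1.0, 1.0], modalities, models, means)
```

With two modalities and two clusters each, every document gains four meta features on top of its four first-level features. In a real setting the feature groups would be drawn from the linguistic modalities the paper uses, which the abstract does not enumerate.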