Using Conjunctions and Adverbs for Author Verification

Abstract

Linguistics and stylistics have long been investigated for author identification, but recently we have witnessed an impressive growth in the frequency with which lawyers and courts call upon the expertise of linguists in cases of disputed authorship. This motivates computer science researchers to look at the problem of author identification from a different perspective. In this work, we propose a stylometric feature set based on conjunctions and adverbs of the Portuguese language to address the problem of author identification. Two different classification approaches were considered. The first, called writer-independent, reduces the pattern recognition problem to a single model and two classes, and hence makes it possible to build a robust system even when few genuine samples per writer are available. The second, called the personal model or writer-dependent, very often performs better but needs a larger number of samples per writer. Experiments on a database composed of short articles from 30 different authors, with a Support Vector Machine (SVM) as the classifier, demonstrate that the proposed strategy can produce results comparable to the literature.
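As a rough illustration of the writer-independent idea, the sketch below turns pairs of texts into two-class training examples ("same author" versus "different authors"). It is a minimal sketch under stated assumptions, not the method of this work: the word list `FUNCTION_WORDS` is a small hypothetical sample, the relative-frequency features and the absolute-difference dissimilarity vector are one common way to obtain a two-class problem, and the function names are invented for this example. The resulting vectors and labels could then be passed to an SVM or any other binary classifier.

```python
import re
from collections import Counter

# Hypothetical, tiny sample of Portuguese conjunctions and adverbs.
# The actual inventory used in the work is not given in the abstract.
FUNCTION_WORDS = ["e", "mas", "porém", "porque", "quando",
                  "também", "muito", "ainda"]


def features(text):
    """Relative frequency of each listed word in the text."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty text
    return [counts[w] / total for w in FUNCTION_WORDS]


def dissimilarity(text_a, text_b):
    """Element-wise absolute difference between two feature vectors
    (an assumed dissimilarity measure, chosen for simplicity)."""
    fa, fb = features(text_a), features(text_b)
    return [abs(x - y) for x, y in zip(fa, fb)]


def make_pairs(samples):
    """Build the two-class training set from (author, text) samples.

    Label 1 means both texts share an author, 0 means they do not.
    """
    vectors, labels = [], []
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            (author_a, text_a), (author_b, text_b) = samples[i], samples[j]
            vectors.append(dissimilarity(text_a, text_b))
            labels.append(1 if author_a == author_b else 0)
    return vectors, labels
```

Because every pair of texts yields one example regardless of who wrote them, a single model can be trained for all writers, which is why this formulation copes with few samples per author. A writer-dependent variant would instead train one classifier per author directly on the output of `features`.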
