14 research outputs found

A Profile-Based Method for Authorship Verification

Author: D.I. Holmes
E. Stamatatos
M. Koppel
P. Juola
Publication date: 01/01/2014

Abstract. Authorship verification is one of the most challenging tasks in style-based text categorization. Given a set of documents, all by the same author, and another document of unknown authorship, the question is whether or not the latter is also by that author. Recently, in the framework of the PAN-2013 evaluation lab, a competition in authorship verification was organized, and the vast majority of submitted approaches, including the best-performing models, followed the instance-based paradigm, where each text sample by one author is treated separately. In this paper, we show that the profile-based paradigm (where all samples by one author are treated cumulatively) can be very effective, surpassing the performance of the PAN-2013 winners without using any information from external sources. The proposed approach is fully trainable, and we demonstrate an appropriate tuning of parameter settings for the PAN-2013 corpora, achieving accurate answers especially when the cost of false negatives is high.

CiteSeerX

Crossref

Cross-domain authorship attribution combining instance-based and profile-based features: Notebook for PAN at CLEF 2019

Author: Bacciu A.
La Morgia M.
Mei A.
Nemmi E. N.
Neri V.
Stefa J.
Publication venue: CEUR-WS
Publication date: 01/01/2019

Being able to identify the author of an unknown text is crucial. Although it is a well-studied field, it is still an open problem, since a standard approach has yet to be found. In this notebook, we propose our model for the Authorship Attribution task of PAN 2019, which focuses on a cross-domain setting covering four languages: French, Italian, English, and Spanish. We use n-grams of characters, words, stemmed words, and distorted text. Our model has an SVM for each feature and an ensemble architecture. Our final results outperform the baseline given by PAN in almost every problem. With this model, we reach second place in the task with an F1-score of 68%.

Archivio della ricerca- Università di Roma La Sapienza

Exploring the Potential of Bootstrap Consensus Networks for Large-scale Authorship Attribution in Luxdorph’s Freedom of the Press Writings

Author: Larsen Birger
Meier Florian
Stjernfelt Frederik
Publication date: 01/01/2020

VBN

Authenticating the writings of Julius Caesar

Author: Argamon
Aronoff
Barthes
Binongo
Burrows
Cha
Chaski
Cronin
Daelemans
Eder
Efstathios
Folgert Karsdorp
Foucault
Gaertner
Grieve
Halteren
Hermann
Holmes
Hoover
Jockers
Juola
Justin Stover
Kestemont
Kešelj
Khonji
Kjell
Koppel
Love
Luyckx
Manning
Maurer
Mayer
Mike Kestemont
Moshe Koppel
Mosteller
Peng
Peñas
Potha
Puig
Ružička
Rybicki
Salton
Sapkota
Schubert
Sebastiani
Seidman
Seroussi
Sidorov
Smith
Stamatatos
Stein
Stover
Trauth
Walter Daelemans
Publication venue: Elsevier BV
Publication date: 01/01/2016

In this paper, we shed new light on the authenticity of the Corpus Caesarianum, a group of five commentaries describing the campaigns of Julius Caesar (100–44 BC), the founder of the Roman empire. While Caesar himself authored at least part of these commentaries, the authorship of the rest of the texts remains a puzzle that has persisted for nineteen centuries. In particular, the role of Caesar’s general Aulus Hirtius, who claimed a role in shaping the corpus, has remained in contention. Determining the authorship of documents is an increasingly important authentication problem in information and computer science, with valuable applications ranging from art history to counter-terrorism research. We describe two state-of-the-art authorship verification systems and benchmark them on six present-day evaluation corpora, as well as a Latin benchmark dataset. Regarding Caesar’s writings, our analyses allow us to establish that Hirtius’s claims to part of the corpus must be considered legitimate. We thus demonstrate how computational methods constitute a valuable methodological complement to traditional, expert-based approaches to document authentication.

Crossref

Edinburgh Research Explorer

Oxford University Research Archive

Institutional Repository Universiteit Antwerpen

Le corpus hagiographique de Trèves au début du XVIe siècle: Enquête stylométrique et premiers résultats

Author: Dubuisson Bastien
Publication venue: OpenEdition
Publication date: 01/01/2022

Repository of the University of Namur