11 research outputs found

Stylometry and Immigration: A Case Study

Author: Juola Ph.D., Patrick
Publication venue: BrooklynWorks
Publication date: 01/01/2013

Brooklyn Law School: BrooklynWorks

bepress Legal Repository

Style Obfuscation by Invariance

Authors: Chrupała Grzegorz; Emmery Chris; Manjavacas Enrique
Publication date: 01/01/2018

The task of obfuscating writing style using sequence models has previously been investigated under the framework of obfuscation-by-transfer, where the input text is explicitly rewritten in another style. These approaches also often lead to major alterations to the semantic content of the input. In this work, we propose obfuscation-by-invariance, and investigate to what extent models trained to be explicitly style-invariant preserve semantics. We evaluate our architectures on parallel and non-parallel corpora, and compare automatic and human evaluations on the obfuscated sentences. Our experiments show that style classifier performance can be reduced to chance level, whilst the automatic evaluation of the output is seemingly equal to models applying style-transfer. However, based on human evaluation we demonstrate a trade-off between the level of obfuscation and the observed quality of the output in terms of meaning preservation and grammaticality. Comment: Accepted for presentation at COLING1

arXiv.org e-Print Archive

Institutional Repository Universiteit Antwerpen

Tilburg University Repository

Adversarial Stylometry in the Wild: Transferable Lexical Substitution Attacks on Author Profiling

Authors: Chrupała Grzegorz; Emmery Chris; Kádár Ákos
Publication date: 27/01/2021

Written language contains stylistic cues that can be exploited to automatically infer a variety of potentially sensitive author information. Adversarial stylometry intends to attack such models by rewriting an author's text. Our research proposes several components to facilitate deployment of these adversarial attacks in the wild, where neither data nor target models are accessible. We introduce a transformer-based extension of a lexical replacement attack, and show it achieves high transferability when trained on a weakly labeled corpus -- decreasing target model performance below chance. While not completely inconspicuous, our more successful attacks also prove notably less detectable by humans. Our framework therefore provides a promising direction for future privacy-preserving adversarial attacks. Comment: Accepted to EACL 202

arXiv.org e-Print Archive

Tilburg University Repository

Authorship Attribution Through Words Surrounding Named Entities

Author: Jacovino Julia Maureen
Publication venue: Duquesne Scholarship Collection
Publication date: 01/01/2013

In text analysis, authorship attribution occurs in a variety of ways. The field of computational linguistics becomes more important as the need for authorship attribution and text analysis becomes more widespread. For this research, pre-existing authorship attribution software, the Java Graphical Authorship Attribution Program (JGAAP), implements a named entity recognizer, specifically the Stanford Named Entity Recognizer, to probe texts of similar genre and to aid in identifying the correct author. This research specifically examines the words authors use around named entities in order to test the ability of these words at attributing authorshi

Duquesne University: Digital Commons

Aplicación de estilometría para la atribución autorías en e-mails y documentos informáticos

Author: Maldonado Galiano Jorge Roberto
Publication venue: Quito: USFQ, 2015
Publication date: 01/10/2015

Stylometry is the analysis by which the authorship of a written text can be determined, by examining distinctive features that a writer unconsciously leaves in their texts. This integrative research paper presents an application that extracts various features of writing. These features are compared against another feature profile to obtain a similarity percentage between two styles of composition, which may come from the same or from different authors. This was achieved by incorporating several features considered relevant to a stylometric analysis, including statistical observations about the patterns present in the document (lexical, syntactic, semantic, and structural components), applied to the Spanish language.

BIBLIOTECA USFQ

Analyzing Stylometric Approaches to Author Obfuscation

Authors: Juola Patrick; Vescovi Darren
Publication venue: Springer Science and Business Media LLC
Publication date: 31/01/2011

Part 2: FORENSIC TECHNIQUES. Authorship attribution is an important and emerging security tool. However, just as criminals may wear gloves to hide their fingerprints, so too may criminal authors mask their writing styles to escape detection. Most authorship studies have focused on cooperative and/or unaware authors who do not take such precautions. This paper analyzes the methods implemented in the Java Graphical Authorship Attribution Program (JGAAP) against essays in the Brennan-Greenstadt obfuscation corpus that were written in deliberate attempts to mask style. The results demonstrate that many of the more robust and accurate methods implemented in JGAAP are effective in the presence of active deception.

AIUCD2017 - Book of Abstracts

Publication date: 01/01/2017

This volume collects the abstracts of the papers presented at the AIUCD 2017 conference. AIUCD 2017 took place from 26 to 28 January 2017 in Rome and was organized by Digilab, Università Sapienza, in cooperation with the ITN DiXiT network (Digital Scholarly Editions Initial Training Network). AIUCD 2017 also hosted the third edition of the EADH Day, held on 25 January 2017. The abstracts published in this volume were approved by expert reviewers in the field through an anonymous review process under the responsibility of the AIUCD 2017 International Programme Committee.

AMS Acta

Humanidades Digitales: Construcciones locales en contextos globales

Author: del Rio Riande Gimena
Publication venue: Facultad de Filosofía y Letras de la Universidad de Buenos Aires
Publication date: 01/01/2018

Proceedings of the II International Conference of the Argentine Association of Digital Humanities / Asociación Argentina de Humanidades Digitales (AAHD), "Humanidades Digitales: Construcciones locales en contextos globales". 47 articles in Spanish and Portuguese.

E-LIS