    ASAP: A Source Code Authorship Program

    Source code authorship attribution is the task of determining who wrote a computer program, based on its source code, usually when the author is either unknown or under dispute. Areas where this can be applied include software forensics, cases of software copyright infringement, and plagiarism detection. Numerous methods of source code authorship attribution have been proposed and studied. However, there are no known easily accessible and user-friendly programs that perform this task. Instead, researchers typically develop software in an ad hoc manner for use in their studies, and the software is rarely made publicly available. In this paper, we present a software tool called A Source Code Authorship Program (ASAP), which is suitable for use by either the layperson or the expert. An author can be attributed to individual documents one at a time, or complex authorship attribution experiments can easily be performed on large datasets. The interface and implementation of the ASAP tool are presented, and the tool is validated by using it to replicate previously published authorship attribution experiments.

    Drawing Elena Ferrante's Profile. Workshop Proceedings, Padova, 7 September 2017

    Elena Ferrante is an internationally acclaimed Italian novelist whose real identity has been kept secret by E/O publishing house for more than 25 years. Owing to her popularity, major Italian and foreign newspapers have long tried to discover her real identity. However, only a few attempts have been made to foster a scientific debate on her work. In 2016, Arjuna Tuzzi and Michele Cortelazzo led an Italian research team that conducted a preliminary study and collected a well-founded, large corpus of Italian novels comprising 150 works published in the last 30 years by 40 different authors. Moreover, they shared their data with a select group of international experts on authorship attribution, profiling, and analysis of textual data: Maciej Eder and Jan Rybicki (Poland), Patrick Juola (United States), Vittorio Loreto and his research team, Margherita Lalli and Francesca Tria (Italy), George Mikros (Greece), Pierre Ratinaud (France), and Jacques Savoy (Switzerland). The chapters of this volume report the results of this endeavour that were first presented during the international workshop Drawing Elena Ferrante's Profile in Padua on 7 September 2017 as part of the 3rd IQLA-GIAT Summer School in Quantitative Analysis of Textual Data. The fascinating research findings suggest that Elena Ferrante’s work definitely deserves “many hands” as well as an extensive effort to understand her distinct writing style and the reasons for her worldwide success.

    The Poems of Ch : Taxonomizing Literary Tradition

    Artificial Intelligence and Pattern Evidence: A Legal Application for AI

    Artificial intelligence changes everything, and almost no jobs will be immune. The application of AI to the practice of law is well known and well understood. In this paper, we present some aspects of the related disciplines of forensic science and specifically the development and analysis of “pattern [and impression] evidence.” We show that pattern evidence has a great need for AI. We discuss several applications in detail but focus mostly on the application of AI-based text analysis technology to forensic linguistics. Sociedad Argentina de Informática e Investigación Operativ

    Approaching Questions of Text Reuse in Ancient Greek Using Computational Syntactic Stylometry

    We are investigating methods by which data from dependency syntax treebanks of ancient Greek can be applied to questions of authorship in ancient Greek historiography. Syntax words (sWords) were constructed from the Ancient Greek Dependency Treebank by tracing the shortest path from each leaf node to the root of each sentence tree. This paper presents the results of a preliminary test of the usefulness of the sWord as a stylometric discriminator. The sWord data was subjected to clustering analysis. The resultant groupings were in accord with traditional classifications. The use of sWords also allows a more fine-grained heuristic exploration of difficult questions of text reuse. A comparison of relative frequencies of sWords in the directly transmitted Polybius book 1 and the excerpted books 9–10 indicates that the measurements of the two texts are generally very close, but when frequencies do vary, the differences are surprisingly large. These differences reveal that a certain syntactic simplification is a salient characteristic of Polybius’ excerptor, who leaves conspicuous syntactic indicators of his modifications.
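
The leaf-to-root construction described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the input format (tuples of token id, head id, and relation label, with head id 0 marking the root), the choice to represent an sWord as the hyphen-joined relation labels along the path, and the function names `build_swords` and `relative_frequencies` are all assumptions made for illustration.

```python
from collections import Counter


def build_swords(sentence):
    """Return one sWord per leaf of a dependency tree.

    Assumed input: a list of (token_id, head_id, relation) tuples,
    where head_id 0 marks the root. The representation of an sWord as
    hyphen-joined relation labels is an illustrative assumption.
    """
    heads = {tid: head for tid, head, _ in sentence}
    rels = {tid: rel for tid, _, rel in sentence}
    # A leaf is a token that no other token points to as its head.
    non_leaves = set(heads.values())
    leaves = [tid for tid in heads if tid not in non_leaves]

    swords = []
    for leaf in leaves:
        path = []
        node = leaf
        # In a tree the path from a leaf to the root is unique,
        # so following head links yields the shortest path.
        while node != 0:
            path.append(rels[node])
            node = heads[node]
        swords.append("-".join(path))
    return swords


def relative_frequencies(sentences):
    """Relative frequency of each sWord across a list of sentences."""
    counts = Counter(sw for s in sentences for sw in build_swords(s))
    total = sum(counts.values())
    return {sw: n / total for sw, n in counts.items()}
```

Relative frequency tables computed this way for two texts could then be compared sWord by sWord, which is the kind of comparison the abstract reports for Polybius book 1 and books 9–10.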

    Visualization of Co-authorship in DIT Arrow

    With the popularization of information technology and the unprecedented development of online reading, library management and services face severe challenges, and the traditional library operation mode has made it difficult to optimize services. Library collections and systematic management have also been seriously affected. However, with the development of visualization techniques in management and service, libraries can largely alleviate the effect of the current networked information environment and advance the intellectual development of the library field. Based on survey results, this study provides empirical evidence that the force-directed layout performs statistically significantly better than the radial layout for visualizing co-authorship in the DIT Arrow repository.