166 research outputs found

    Drawing Elena Ferrante's Profile. Workshop Proceedings, Padova, 7 September 2017

    Elena Ferrante is an internationally acclaimed Italian novelist whose real identity has been kept secret by E/O publishing house for more than 25 years. Owing to her popularity, major Italian and foreign newspapers have long tried to discover her real identity. However, only a few attempts have been made to foster a scientific debate on her work. In 2016, Arjuna Tuzzi and Michele Cortelazzo led an Italian research team that conducted a preliminary study and collected a well-founded, large corpus of Italian novels comprising 150 works published in the last 30 years by 40 different authors. Moreover, they shared their data with a select group of international experts on authorship attribution, profiling, and analysis of textual data: Maciej Eder and Jan Rybicki (Poland), Patrick Juola (United States), Vittorio Loreto and his research team, Margherita Lalli and Francesca Tria (Italy), George Mikros (Greece), Pierre Ratinaud (France), and Jacques Savoy (Switzerland). The chapters of this volume report the results of this endeavour that were first presented during the international workshop Drawing Elena Ferrante's Profile in Padua on 7 September 2017 as part of the 3rd IQLA-GIAT Summer School in Quantitative Analysis of Textual Data. The fascinating research findings suggest that Elena Ferrante’s work definitely deserves “many hands” as well as an extensive effort to understand her distinct writing style and the reasons for her worldwide success.

    Authorship Categorization With Neural Network

    This paper explores the use of neural networks in author classification and also examines the effect of stylometry. The choice of algorithm and descriptors is an important issue in this research. Methods for multi-topic machine learning of an authorship attribution classifier were investigated using texts from novels as the data set. A feed-forward artificial neural network trained with back propagation is proposed to classify the texts of authors using a set of lexical descriptors. The results show that two novels by the Turkish authors Peyami Safa, Orhan Pamuk and Mustafa Necati Sepetcioglu were successfully classified.

    Teaching Neural Networks to Classify the Authors of Texts

    Much research has been done on author classification using various methodologies, one of which is artificial neural networks. Author classification commonly relies on more than two descriptors. In this paper we propose a means of using an artificial neural network to classify the authors of texts using only two descriptors: the number of words in a paragraph and the number of characters per word in a paragraph. The approach uses committee machines based on ensemble averaging. The basic idea is to solve a complex computational task by dividing it into a number of computationally simple tasks and then combining the solutions of these tasks. High performance is achieved because the committee performs much better than its single best constituent in isolation. Our results show that with this approach we succeeded in correctly classifying the works of Leo Tolstoy and George Orwell.
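The two descriptors and the ensemble-averaging step named in this abstract can be illustrated with a minimal sketch. The function names are hypothetical, whitespace tokenisation is an assumption, and the committee members are represented only by their per-class output scores; the paper's actual networks and preprocessing are not reproduced here.

```python
from __future__ import annotations


def paragraph_descriptors(paragraph: str) -> tuple[int, float]:
    """Return (word count, mean characters per word) for one paragraph.

    Assumption: words are whitespace-separated tokens.
    """
    words = paragraph.split()
    if not words:
        return 0, 0.0
    return len(words), sum(len(w) for w in words) / len(words)


def ensemble_average(member_outputs: list[list[float]]) -> list[float]:
    """Average per-class scores across committee members.

    Each inner list holds one member's scores, one entry per author class.
    """
    n = len(member_outputs)
    return [sum(scores) / n for scores in zip(*member_outputs)]


def predict_author(member_outputs: list[list[float]]) -> int:
    """Return the index of the class with the highest averaged score."""
    averaged = ensemble_average(member_outputs)
    return max(range(len(averaged)), key=averaged.__getitem__)
```

For example, `paragraph_descriptors("War and peace")` yields three words with a mean length of 11/3 characters, and a committee whose members output `[0.8, 0.2]` and `[0.6, 0.4]` votes for class 0 after averaging.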

    Classification of Texts' Authorship Using a Regression Model on Compressed Data

    2010 Mathematics Subject Classification: 68T50, 62H30, 62J05. An algorithm for text authorship identification is proposed. The procedure is based on Kolmogorov complexity and uses regression models on the lengths of the compressed texts. The classification employs the estimates of the regression parameters. Different combinations of compressor parameters and preliminary processing of the data are examined using prose texts of a few English classics.
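The idea of regressing compressed length on text length can be sketched as follows. This is only an illustration under stated assumptions: zlib stands in for the compressor, the model is a simple least-squares line fitted over growing prefixes of one text, and all function names are hypothetical. The abstract does not specify the compressor, the regression model, or the preprocessing the authors used.

```python
import zlib


def compressed_length(data: bytes, level: int = 9) -> int:
    """Length of the zlib-compressed data, a rough proxy for complexity."""
    return len(zlib.compress(data, level))


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def compression_profile(text: str, steps: int = 10):
    """Fit compressed length against prefix length for one text.

    Returns (intercept, slope); these estimates could then serve as
    features for a classifier (an assumption, not the paper's exact setup).
    """
    data = text.encode("utf-8")
    if len(data) < steps:
        raise ValueError("text is too short for the requested number of steps")
    sizes = [len(data) * k // steps for k in range(1, steps + 1)]
    lengths = [compressed_length(data[:s]) for s in sizes]
    return fit_line(sizes, lengths)
```

A highly repetitive text produces a slope close to zero, while text with little redundancy produces a larger slope, which is what makes the fitted parameters usable as a stylistic signature.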

    Text stylometry for chat bot identification and intelligence estimation.

    Authorship identification is a technique used to identify the author of an unclaimed document by finding traits that match those of the original author. It has great potential for applications in forensics. It can also be used to identify chat bots, a form of intelligent software created to mimic human conversation, by their unique style. The online criminal community is using chat bots as a new way to steal private information and commit fraud and identity theft, so identifying chat bots by their style is becoming essential to counter online criminal activity. Researchers have recognized the need to advance the understanding of chat bots and to design programs that prevent criminal activities, whether identity theft or even a terrorist threat. The more research advances chat bots’ ability to perceive humans, the more the research community must do to confront those threats. This research went further by studying whether chat bots exhibit behavioral drift. Stylometric analysis of text has been the goal of many researchers, who have experimented with many features and combinations of features. A novel feature was proposed that is based on Term Frequency-Inverse Document Frequency (TF-IDF) and implemented on byte-level n-grams. Term Frequency-Inverse Token Frequency (TF-ITF) used these terms to create the feature. The initial experiments on collected data demonstrated the feasibility of this approach. Additional versions of the feature were created and tested for authorship identification. Results demonstrated that the feature successfully identified authors of text, and additional experiments showed that the feature is language independent: it successfully identified authors of a German text. Furthermore, the feature was used to measure text similarity at the book level and the paragraph level.
    Finally, a selective combination of features was used to classify text ranging from kindergarten level to scientific research and novels. The feature combination measured the Quality of Writing (QoW) and the complexity of text, a first step toward correlating these with the author’s IQ as a future goal.
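As a point of reference for the byte-level n-gram weighting mentioned in this abstract, the sketch below computes standard TF-IDF over byte n-grams. It is not the TF-ITF feature itself, whose definition is not given in the abstract; the function names, the n-gram size of 3, and the unsmoothed logarithmic IDF are all assumptions made for illustration.

```python
import math
from collections import Counter


def byte_ngrams(text: str, n: int = 3) -> list:
    """Split the UTF-8 bytes of a text into overlapping n-grams."""
    data = text.encode("utf-8")
    return [data[i:i + n] for i in range(len(data) - n + 1)]


def tfidf_vectors(documents: list, n: int = 3) -> list:
    """Return one {n-gram: weight} dict per document.

    Weight = (relative frequency in the document) * log(N / document frequency).
    This is plain TF-IDF, used here only as a stand-in for TF-ITF.
    """
    counts = [Counter(byte_ngrams(doc, n)) for doc in documents]
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())
    total_docs = len(documents)
    vectors = []
    for c in counts:
        length = sum(c.values())
        vectors.append({
            gram: (k / length) * math.log(total_docs / doc_freq[gram])
            for gram, k in c.items()
        })
    return vectors
```

Working on bytes rather than words means no tokeniser or language-specific preprocessing is needed, which is consistent with the language independence reported above.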