3 research outputs found

    Automated authorship attribution using advanced signal classification techniques

    In this paper, we develop two automated authorship attribution schemes, one based on Multiple Discriminant Analysis (MDA) and the other based on a Support Vector Machine (SVM). The classification features we exploit are based on word frequencies in the text. We preprocess each text by stripping it of all characters except a-z and space, in order to increase the portability of the software to different types of texts. We test the methodology on a corpus of undisputed English texts, and use leave-one-out cross-validation to demonstrate classification accuracies in excess of 90%. We further test our methods on the Federalist Papers, which have a partly disputed authorship and a fair degree of scholarly consensus. Finally, we apply our methodology to the question of the authorship of the Letter to the Hebrews by comparing it against a number of original Greek texts of known authorship. These tests identify where some of the limitations lie, motivating a number of open questions for future work. An open source implementation of our methodology is freely available for use at https://github.com/matthewberryman/author-detection.
    Maryam Ebrahimpour, Tālis J. Putniņš, Matthew J. Berryman, Andrew Allison, Brian W.-H. Ng, Derek Abbott
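
The preprocessing and evaluation steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: lowercasing before filtering, replacing removed characters with a space, and the helper names `normalize`, `word_frequencies`, and `leave_one_out_accuracy` are all assumptions. A simple nearest-centroid rule stands in for the MDA and SVM classifiers named in the abstract.

```python
import re
from collections import Counter


def normalize(text: str) -> str:
    """Keep only a-z and space (assumption: lowercase first, and
    replace every other character with a space rather than deleting it)."""
    kept = re.sub(r"[^a-z ]+", " ", text.lower())
    return " ".join(kept.split())  # collapse runs of spaces


def word_frequencies(text: str, vocabulary: list[str]) -> list[float]:
    """Relative frequency of each vocabulary word in the normalized text."""
    words = normalize(text).split()
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty texts
    return [counts[w] / total for w in vocabulary]


def leave_one_out_accuracy(samples: list[list[float]], labels: list[str]) -> float:
    """Leave-one-out cross-validation with a nearest-centroid stand-in
    classifier (hypothetical substitute for MDA/SVM)."""
    correct = 0
    for i, held_out in enumerate(samples):
        # Build one centroid per author from every sample except the held-out one.
        centroids = {}
        for author in set(labels):
            rows = [s for j, s in enumerate(samples) if j != i and labels[j] == author]
            if rows:
                centroids[author] = [sum(col) / len(rows) for col in zip(*rows)]
        # Predict the author whose centroid is closest in squared Euclidean distance.
        predicted = min(
            centroids,
            key=lambda a: sum((x - c) ** 2 for x, c in zip(held_out, centroids[a])),
        )
        correct += predicted == labels[i]
    return correct / len(samples)
```

In practice the feature vectors would be built over a fixed vocabulary of frequent words and passed to a real classifier; the loop structure of the cross-validation stays the same.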

    Identifying Human Trafficking Networks in Louisiana by Using Authorship Attribution and Network Modeling

    Human trafficking, or modern slavery, is a problem that has plagued every U.S. state, in both urban and rural areas. During the past decades, online advertisements for sex trafficking have rapidly increased in number. The advancement of the Internet and smartphones has made it easier for sex traffickers to contact and recruit their victims and to advertise and sell them online. It has also made it more difficult for law enforcement to trace the victims and identify the traffickers. Sadly, more than fifty percent of the victims of sex trafficking are children, many of whom are exploited through the Internet. The first step in preventing and fighting human trafficking is to identify the traffickers. The primary goal of this study is to identify potential organized sex trafficking networks in Louisiana by analyzing the ads posted online in Louisiana and its five neighboring states. The secondary goal of this study is to examine the possibility of using authorship attribution techniques (in addition to phone numbers and ad IDs) to group together the online advertisements that may have been posted by the same entity. The data used in this study was collected from the website Backpage over a period of ten months. After cleaning the data set, we were left with 123,436 ads from 47 cities in the specified area. Through the application of network analysis, we found many entities that are potentially such networks, all of which posted a large number of ads with many phone numbers in different cities. We also identified the time period in which each phone number was used and the cities and states for which each entity posted ads, which shows how these entities moved between different cities and states. The four supervised machine learning methods that we used to classify the collected advertisements are Support Vector Machines (SVMs), the Naïve Bayesian classifier, Logistic Regression, and Neural Networks. We calculated 40 accuracy rates, 35 of which were over 90% for classifying any number of ads per entity, as long as each entity (or author) posted more than 10 ads.
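
One way to picture the grouping of ads into entities through shared phone numbers is as connected components of a graph. The sketch below is an assumption about how such linking could work, not the study's actual procedure: the function name `group_ads_by_phone`, the input shape, and the fictitious phone numbers are all hypothetical.

```python
from collections import defaultdict


def group_ads_by_phone(ads: dict[str, set[str]]) -> list[set[str]]:
    """Group ad IDs that are linked, directly or transitively, by a shared
    phone number. `ads` maps an ad ID to the phone numbers found in that ad.
    Uses a union-find structure to compute connected components."""
    parent = {ad_id: ad_id for ad_id in ads}

    def find(x: str) -> str:
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link each ad to the first ad seen with the same phone number.
    first_ad_for_phone: dict[str, str] = {}
    for ad_id, phones in ads.items():
        for phone in phones:
            if phone in first_ad_for_phone:
                union(first_ad_for_phone[phone], ad_id)
            else:
                first_ad_for_phone[phone] = ad_id

    groups: dict[str, set[str]] = defaultdict(set)
    for ad_id in ads:
        groups[find(ad_id)].add(ad_id)
    return list(groups.values())
```

With four hypothetical ads where `a1` and `a2` share one number and `a2` and `a3` share another, the first three ads end up in a single group even though `a1` and `a3` have no number in common; an ad with an unrelated number stays on its own. Authorship attribution, as the abstract describes, would then be an additional signal for linking ads that share no number at all.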

    Continuous User Authentication Using Multi-Modal Biometrics

    It is commonly acknowledged that mobile devices now form an integral part of an individual’s everyday life. Modern mobile handheld devices are capable of providing a wide range of services and applications over multiple networks. With their increasing capability and accessibility, they introduce additional demands in terms of security. This thesis explores the need for authentication on mobile devices and proposes a novel mechanism to improve on current techniques. The research begins with an intensive review of mobile technologies and the current security challenges that mobile devices experience, to illustrate the imperative of authentication on mobile devices. The research then highlights the existing authentication mechanisms and their wide range of weaknesses. To this end, biometric approaches are identified as an appropriate solution and an opportunity for security to be maintained beyond point-of-entry. Indeed, by utilising behavioural biometric techniques, authentication can be performed in a continuous and transparent fashion. This research investigated three behavioural biometric techniques based on SMS texting activities and messages, looking to apply these techniques as a multi-modal biometric authentication method for mobile devices. The results showed that linguistic profiling, keystroke dynamics and behaviour profiling can be used to discriminate users with overall Equal Error Rates (EER) of 12.8%, 20.8% and 9.2% respectively. The results also showed clearly that a combination of biometrics classifies better than any single biometric technique, achieving an EER of 3.3%. Based on these findings, a novel architecture for multi-modal biometric authentication on mobile devices is proposed. The framework is able to provide robust, continuous and transparent authentication in standalone and server-client modes regardless of mobile hardware configuration. The framework is able to continuously maintain the security status of the devices. With a high level of security status, users are permitted to access sensitive services and data. On the other hand, with a low level of security status, users are required to re-authenticate before accessing sensitive services or data.
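
The Equal Error Rate quoted in this abstract is the operating point at which the false acceptance rate equals the false rejection rate. A minimal sketch of estimating it from match scores is given below; the function name `equal_error_rate`, the convention that higher scores mean "more likely genuine", and the sample scores are assumptions for illustration and are not taken from the thesis.

```python
def equal_error_rate(genuine: list[float], impostor: list[float]) -> float:
    """Approximate the EER from two lists of match scores.

    Assumes higher scores indicate the legitimate user. For each candidate
    threshold, a sample is accepted when its score is >= the threshold.
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    best_gap = None
    best_rate = 0.0
    for threshold in sorted(set(genuine) | set(impostor)):
        # False acceptance: impostor scores that pass the threshold.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        # False rejection: genuine scores that fail the threshold.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            best_rate = (far + frr) / 2
    return best_rate
```

With only a handful of scores the two error curves may never cross exactly, which is why the sketch reports the midpoint at the closest threshold; with larger score sets, interpolating between thresholds gives a smoother estimate.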