4 research outputs found

    Detection of illicit behaviours and mining for contrast patterns

    This thesis describes a set of novel algorithms and models designed to detect illicit behaviour. This includes development of domain-specific solutions, focusing on anti-money laundering and detection of opinion spam. In addition, advancements are presented for the mining and application of contrast patterns, which are a useful tool for characterising illicit behaviour. For anti-money laundering, this thesis presents a novel approach for detection based on analysis of financial networks and supervised learning. This includes the development of a network model, features extracted from this model, and evaluation of classifiers trained using real financial data. Results indicate that this approach successfully identifies suspicious groups whose collaborative behaviour is indicative of money laundering. For the detection of opinion spam, this thesis presents a model of reviewer behaviour and a method for detection based on statistical anomaly detection. This method considers review ratings, and does not rely on text-based features. Evaluation using real data shows that spammers are successfully identified. Comparison with existing methods shows a small improvement in accuracy, but significant improvements in computational efficiency. This thesis also considers the application of contrast patterns to network analysis and presents a novel algorithm for mining contrast patterns in a distributed system. Contrast patterns may be used to characterise illicit behaviour by contrasting illicit and non-illicit behaviour and uncovering significant differences. However, existing mining algorithms are limited by serial processing, making them unsuitable for large data sets. This thesis advances the current state of the art, describing an algorithm for mining in parallel. This algorithm is evaluated using real data and is shown to achieve a high level of scalability, allowing mining of large, high-dimensional data sets.
In addition, this thesis explores methods for mapping network features to an item-space suitable for analysis using contrast patterns. Experiments indicate that contrast patterns may become a valuable tool for network analysis.

    Evaluating the Persuasiveness of Mobile Health: The Intersection of Persuasive System Design and Data Science

    Persuasive technology is an umbrella term that encompasses any software (e.g., mobile app) or hardware (e.g., smartwatch) designed to influence users to perform a preferable behavior once or on a long-term basis. Considering the ubiquitous nature of mobile devices across all socioeconomic groups, user behavior modification thrives under the personalized care that persuasive technology can offer. This research examines the roles psychological characteristics play in the perceived persuasiveness of mHealth screens. A review of the literature revealed a gap: developers of digital health technologies are often tasked with developing tools designed to engage patients, yet little emphasis has been placed on understanding what psychological characteristics motivate and demotivate their users to engage with digital health technologies. Developers must move past using a cookie-cutter, one-size-fits-all solution, and seek to develop digital health technologies designed to traverse the terrain that navigates between the fluid nature of goals and user preferences. This terrain is often determined by users’ psychological characteristics and demographic (control) variables. An experiment was designed to evaluate how psychological characteristics (self-efficacy, health consciousness, health motivation, and the Big Five personality traits) impact the perceived persuasiveness of digital health technologies utilizing the Persuasive System Design (PSD) framework. This study used multiple linear regressions and Contrast, a publicly available Python implementation of the contrast pattern mining algorithm Search and Testing for Understandable Consistent Contrasts (STUCCO), to study the multifaceted needs of the users of digital health technologies based on psychological characteristics.
The results of this experiment show psychological characteristics (self-efficacy, health consciousness, health motivation, and extraversion) enhancing the perceived persuasiveness of digital health technologies. The findings of the study revealed that screens utilizing techniques for the primary task support have high perceived persuasiveness scores. System credibility techniques were found to be a contributor to perceived persuasiveness and should be used in the development of persuasive technologies. The results of this study show practitioners should abstain from using social support techniques. Persuasive techniques from the social support category were found to have very low perceived persuasiveness scores, which indicates a lower ability to persuade mHealth app users to utilize the tool. The findings strongly suggest the distribution of perceived persuasiveness shifts from negatively skewed to positively skewed as participants get older. Additionally, this shift occurs earlier in females (i.e., in the 40-59 age group) compared to males, who do not shift until the oldest age group (i.e., the 60 and older age group). The results imply that an individual user’s psychological characteristics affect the perceived persuasiveness of mHealth screens, and that combinations of persuasive principles and psychological characteristics lead to greater perceived persuasiveness.
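
For readers unfamiliar with STUCCO, the core idea can be illustrated with a small sketch: a candidate pattern counts as a contrast when the proportion of records containing it differs significantly between groups, which is commonly checked with a chi-square test on a contingency table. The Python below is only an illustration under that assumption. It does not use the Contrast package mentioned in the abstract, and the function name `chi_square_2x2` and the counts are hypothetical.

```python
def chi_square_2x2(present_a, total_a, present_b, total_b):
    """Chi-square statistic for one pattern across two groups.

    Rows of the table: pattern present / pattern absent.
    Columns: group A / group B.
    """
    observed = [
        [present_a, present_b],
        [total_a - present_a, total_b - present_b],
    ]
    n = total_a + total_b
    row_totals = [sum(row) for row in observed]
    col_totals = [total_a, total_b]

    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:
                continue  # an empty row or column contributes nothing
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat


# Hypothetical counts: the pattern appears in 30 of 50 records in group A
# and in 10 of 50 records in group B.
stat = chi_square_2x2(30, 50, 10, 50)

# 3.841 is the chi-square critical value for 1 degree of freedom at 0.05.
is_contrast = stat > 3.841
```

A full STUCCO implementation also searches over combinations of attribute values and corrects the significance level for multiple comparisons, which this sketch leaves out.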

    Distributed mining of contrast patterns

    No full text
    In this paper we propose a novel algorithm for mining contrast patterns using a distributed, MapReduce-like framework. Contrast patterns describe differences between contrasted data sets and have previously been used for building highly accurate classifiers. However, mining for contrast patterns is a computationally expensive task and existing algorithms are designed to run in a sequential manner on a single machine. Consequently, existing approaches are unable to handle dense, high-volume and high-dimensional databases. Our algorithm addresses this problem by partitioning the search space for contrast patterns into small, independent units. These units can be mined in parallel, providing a scalable solution for mining large data sets. Using three different real-world data sets, we test an implementation of our algorithm on a Spark cluster. Results of these tests indicate that our algorithm achieves a high degree of parallelism and scalability.
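
The partitioning idea can be sketched in a few lines of Python. This is not the paper's algorithm: it assumes a simple support-difference criterion, splits the search space by the smallest item in each pattern so that every unit is independent, and uses a local thread pool as a stand-in for the Spark cluster purely to show that units need no coordination. All function names, parameters and data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def support(pattern, transactions):
    """Fraction of transactions that contain every item of the pattern."""
    if not transactions:
        return 0.0
    return sum(pattern <= t for t in transactions) / len(transactions)


def mine_unit(first_item, items, pos, neg, min_diff, max_len):
    """Mine one independent unit: patterns whose smallest item is first_item."""
    rest = [i for i in items if i > first_item]
    found = []
    for k in range(max_len):  # k = number of items added to first_item
        for extra in combinations(rest, k):
            pattern = frozenset((first_item,) + extra)
            diff = support(pattern, pos) - support(pattern, neg)
            if abs(diff) >= min_diff:
                found.append((tuple(sorted(pattern)), round(diff, 3)))
    return found


def mine_contrast_patterns(pos, neg, min_diff=0.5, max_len=2):
    """Split the search space by first item and mine the units concurrently."""
    items = sorted(set().union(*pos, *neg))
    with ThreadPoolExecutor() as pool:
        units = pool.map(
            lambda i: mine_unit(i, items, pos, neg, min_diff, max_len), items
        )
        return sorted(p for unit in units for p in unit)


# Hypothetical toy data: two small contrasted data sets.
pos = [frozenset("ab"), frozenset("abc"), frozenset("ab")]
neg = [frozenset("c"), frozenset("bc"), frozenset("a")]
patterns = mine_contrast_patterns(pos, neg)
# → [(('a',), 0.667), (('a', 'b'), 1.0), (('b',), 0.667)]
```

Because each unit only reads the two data sets and writes its own result list, the same `mine_unit` function could in principle be handed to a distributed map step instead of a thread pool.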

    Distributed Mining of Contrast Patterns

    No full text