
    Enhancing the Prediction of Missing Targeted Items from the Transactions of Frequent, Known Users

    The ability of an individual grocery retailer to obtain a single view of its customers across all of their grocery purchases remains elusive, and is considered the “holy grail” of grocery retailing. This has become increasingly important in recent years, especially in the UK, where competition has intensified, shopping habits and demographics have changed, and price sensitivity has increased. Whilst numerous studies have examined independent items that are frequently bought together, little research has been conducted on using this knowledge of frequent itemsets to support decision making for targeted promotions. Indeed, an effective targeted-promotions approach may be seen as an outcome of the “holy grail”, as it allows retailers to promote the right item, to the right customer, with the right incentives, driving up revenue, profitability, and customer share whilst minimising costs. Given this, the key and original contribution of this study is the development of the market target (mt) model, the clustering approach, and the computer-based algorithm to enhance targeted promotions. Tests conducted on large-scale consumer panel data, with over 32,000 customers and 51 million individual scanned items per year, show that the mt model and the clustering approach successfully identify both the best items and the best customers to target. Further, the algorithm segregates customers into differing categories of loyalty (four in this case) to enable retailers to offer customised incentive schemes to each group, thereby enhancing customer engagement whilst preventing unnecessary revenue erosion. The proposed model is compared with both a recently published approach and the cross-sectional shopping patterns of the customers on the consumer scanner panel.
Tests show that the proposed approach outperforms the alternative in that it significantly reduces the probability of “false negatives” and “false positives” in the target customer set. Tests also show that the customer segmentation approach is effective: customers classed as highly loyal to a grocery retailer are indeed loyal, whilst those classified as “switchers” do indeed have low levels of loyalty to the selected retailer. Applying the mt model to other fields has not only been novel but has also yielded success. School attendance is improved with the aid of the mt model applied to attendance data. In this regard, an action research study involving the proposed mt model and approach, conducted at a local UK primary school, has resulted in the school now meeting the required attendance targets set by the government, and it has halved its persistent absenteeism for the first time in four years. In medicine, the mt model is seen as a useful tool that could rapidly uncover associations that may lead to new research hypotheses, whilst in crime prevention, the mt value may be used as an effective, tangible efficiency metric that leads to enhanced crime prevention outcomes and supports stronger community engagement. Future work includes the development of a software program for improving school attendance that will be offered to all schools, while further progress will be made on demonstrating the effectiveness of the mt value as a tangible crime prevention metric.
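
The abstract above builds on frequent itemsets mined from grocery transactions. As an illustrative sketch only (the mt model's own formulation is not given here), the underlying support-counting step that frequent-itemset approaches share can be written as:

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support, max_size=2):
    """Count itemsets of up to `max_size` items and keep those appearing
    in at least `min_support` transactions (classic support counting)."""
    counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # deduplicate and fix item order
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}

# toy baskets (hypothetical data, not the consumer panel used in the study)
baskets = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["milk", "eggs"],
    ["bread", "milk"],
]
frequent = frequent_itemsets(baskets, min_support=3)
```

With `min_support=3`, the pair `("bread", "milk")` survives (3 of 4 baskets) while `eggs` drops out; targeted promotions would then be built on such surviving itemsets.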

    Generating High Precision Classification Rules for Screening of Irrelevant Studies in Systematic Review Literature Searches

    Systematic reviews aim to produce repeatable, unbiased, and comprehensive answers to clinical questions. Systematic reviews are an essential component of modern evidence-based medicine; however, due to the risk of omitting relevant research, they are highly time-consuming to create and are largely conducted manually. This thesis presents a novel framework for partial automation of systematic review literature searches. We exploit the ubiquitous multi-stage screening process by training the classifier using annotations made by reviewers in previous screening stages. Our approach has the benefit of integrating seamlessly with the existing screening process, minimising disruption to users. Ideally, classification models for systematic reviews should be easily interpretable by users. We propose a novel, rule-based algorithm for use with our framework. A new approach for identifying redundant associations when generating rules is also presented. The proposed approach to redundancy seeks to exclude both redundant specialisations of existing rules (those with additional terms in their antecedent) and redundant generalisations (those with fewer terms in their antecedent). We demonstrate the ability of the proposed approach to improve the usability of the generated rules. The proposed rule-based algorithm is evaluated by simulated application to several existing systematic reviews. Workload savings of up to 10% are demonstrated. There is an increasing demand for systematic reviews related to a variety of clinical disciplines, such as diagnosis. We examine reviews of diagnosis and contrast them against more traditional systematic reviews of treatment. We demonstrate that existing challenges, such as target class heterogeneity and high data imbalance, are even more pronounced for this class of reviews.
The described algorithm accounts for this by seeking to label subsets of non-relevant studies with high precision, avoiding the need to generate a high-recall model of the minority class.
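
The screening idea above, labelling subsets of non-relevant studies with high precision rather than modelling the rare relevant class, can be sketched with simple term-based rules over labelled training data. This is a hypothetical simplification, not the thesis's actual algorithm:

```python
def rule_matches(rule_terms, study_text):
    """A rule fires when every term in its antecedent occurs in the text."""
    text = study_text.lower()
    return all(term in text for term in rule_terms)

def rule_precision(rule_terms, studies):
    """Precision of a rule predicting 'not relevant': among the studies the
    rule fires on, the fraction actually labelled irrelevant by reviewers."""
    fired = [s for s in studies if rule_matches(rule_terms, s["text"])]
    if not fired:
        return 0.0
    return sum(not s["relevant"] for s in fired) / len(fired)

# toy labelled screening data (hypothetical)
studies = [
    {"text": "A mouse model of arthritis", "relevant": False},
    {"text": "Randomised trial of drug X in humans", "relevant": True},
    {"text": "In vitro mouse cell assay", "relevant": False},
]
```

Here `rule_precision(("mouse",), studies)` is 1.0: the rule fires only on irrelevant studies, so screening them out saves reviewer workload without risking a missed relevant study. A redundant specialisation such as `("mouse", "assay")` would exclude a strict subset of the same studies, which is the kind of redundancy the thesis's approach prunes.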

    19th SC@RUG 2022 proceedings 2021-2022


    Generation and Selection of Graph Pattern Sets with the MDL Principle

    Nowadays, large quantities of graph data can be found in many fields, encoding information about their respective domains. Such data can reveal useful knowledge to the user who analyses it. However, the size and complexity of real-life datasets hinder their usage by human analysts. To help users, pattern mining approaches extract frequent local structures, called patterns, from the data, so that analysts can focus on inferring knowledge from them instead of analysing the whole data at once. A well-known problem in pattern mining is the so-called pattern explosion: even on small datasets, the set of patterns extracted by classic pattern mining approaches can be very large and contain many redundancies. In this thesis we propose three approaches that use the Minimum Description Length (MDL) principle in order to generate and select small, human-sized sets of descriptive graph patterns from graph data. For that, we instantiate the MDL principle in a graph pattern mining context and propose MDL measures to evaluate sets of graph patterns. We also introduce the notion of ports, which allows the data to be described as a composition of pattern occurrences with no loss of information. We evaluate all our contributions on real-life graph datasets from different domains, including the semantic web.
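
The MDL principle used above scores a pattern set by a two-part description length, L(model) + L(data | model): a good pattern set is one that compresses the data. A toy itemset-flavoured sketch (the thesis's actual graph encodings and port mechanism are far richer) might look like:

```python
import math

def description_length(patterns, transactions):
    """Two-part MDL score: L(model) + L(data | model), in bits.
    Toy encoding: each item costs log2(|alphabet|) bits; the model lists
    every pattern's items, and each transaction is covered greedily by
    pattern references plus leftover items encoded one by one."""
    alphabet = {item for t in transactions for item in t}
    item_bits = math.log2(len(alphabet))
    pattern_bits = math.log2(len(patterns)) if len(patterns) > 1 else 1.0
    model_len = sum(len(p) * item_bits for p in patterns)
    data_len = 0.0
    for t in transactions:
        rest = set(t)
        for p in patterns:  # greedy cover with pattern occurrences
            if set(p) <= rest:
                data_len += pattern_bits
                rest -= set(p)
        data_len += len(rest) * item_bits  # residual items, one by one
    return model_len + data_len

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}]
with_pattern = description_length([("a", "b")], transactions)
without_patterns = description_length([], transactions)
```

Because `("a", "b")` recurs in every transaction, the pattern set containing it yields a shorter total description than encoding the raw items, so MDL selects it; redundant or rare patterns only inflate L(model) and are rejected, which is how the principle keeps pattern sets small.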

    A bottom-up approach to real-time search in large networks and clouds
