    Exploring Data Hierarchies to Discover Knowledge in Different Domains

    New Approach for Market Intelligence Using Artificial and Computational Intelligence

    Small and medium-sized retailers are central to the private sector and vital contributors to economic growth, but they often face enormous challenges in realizing their full potential. Financial pitfalls, inadequate access to markets, and difficulties in exploiting technology have prevented them from achieving optimal productivity. Market Intelligence (MI) is the knowledge extracted from numerous internal and external data sources, aimed at providing a holistic view of the state of the market and informing marketing-related decision-making in real time. A related, burgeoning phenomenon and crucial topic in marketing is Artificial Intelligence (AI), which entails fundamental changes to the skill sets marketers require. A vast amount of knowledge is stored in retailers’ point-of-sale databases, but the format of this data often makes that knowledge hard to access and identify. As a powerful AI technique, Association Rules Mining helps to identify frequently associated patterns stored in large databases to predict customers’ shopping journeys. Consequently, the method has emerged as the key driver of cross-selling and upselling in the retail industry. At the core of this approach is Market Basket Analysis, which captures knowledge from heterogeneous customer shopping patterns and examines the effects of marketing initiatives. Apriori, which enumerates frequent itemsets purchased together (as market baskets), is the central algorithm in the analysis process. Problems arise because Apriori lacks computational speed and is weak at providing intelligent decision support. As the number of simultaneous database scans grows, the computation cost increases and performance decreases dramatically. Moreover, there are shortcomings in decision support, especially in methods for finding rarely occurring events and for identifying a brand's rising popularity before it peaks.
As the objective of this research is to find intelligent ways to help small and medium-sized retailers grow with an MI strategy, we demonstrate the effects of AI with algorithms for data preprocessing, market segmentation, and finding market trends. Using the sales database of a small, local retailer, we show how our Åbo algorithm increases mining performance and intelligence, and how it helps to extract valuable marketing insights for assessing demand dynamics and product popularity trends. We also show how this results in commercial advantage and tangible return on investment. Additionally, an enhanced normal distribution method assists data preprocessing and helps to explore different types of potential anomalies.
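The abstract above describes Apriori as enumerating frequent itemsets level by level, with a database scan per level. A minimal sketch of that textbook procedure follows; it is not the thesis's Åbo algorithm, and the function name `apriori`, the `min_support` count threshold, and the toy baskets are all assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    min_support is an absolute count (an assumption of this sketch).
    """
    baskets = [frozenset(t) for t in transactions]

    # Level 1: count single items with one pass over the data.
    counts = {}
    for basket in baskets:
        for item in basket:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # One scan of the database per level to count surviving candidates.
        counts = {c: sum(1 for basket in baskets if c <= basket)
                  for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical toy data, not taken from the thesis.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]
frequent_itemsets = apriori(baskets, min_support=3)
```

The repeated scan inside the loop is the cost the abstract points to: each level requires another full pass over the transactions, so runtime grows with both the database size and the length of the longest frequent itemset.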

    Mining Antagonistic Communities From Social Networks

    In this thesis, we examine the problem of mining antagonistic communities from social networks. In social networks, people with opposing opinions normally behave differently and form sub-communities, each of which contains people sharing some common behaviors. In one scenario, people with opposing opinions differ in their views on a set of items. In another scenario, people explicitly express whom they agree with, like, or trust, as well as whom they disagree with, dislike, or distrust. Based on these two scenarios, we define indirect and direct antagonistic groups, and we have developed algorithms to mine both types. For indirect antagonistic group mining, our algorithm explores the search space of all possible antagonistic groups, starting from groups of size two and then searching for groups of larger sizes. We have als

    Exploring fish purchasing behaviour using data analytics

    In recent decades, the retail sector has changed significantly as a result of globalization, increased competitiveness, and the transformation of consumers' purchasing behaviour. This paradigm shift also applies to the fresh fish sector, which has attracted the interest of researchers internationally for political and economic reasons.
Given this competitive environment, which values quality and customer service at acceptable cost, it is necessary to adopt customer-focused strategies. This thesis is part of the ValorMar project, which was born from the commitment of a broad set of entities, from companies to research centers, positioned by the relevance of the maritime economy in the fishery value chain. Thus, this dissertation seeks to understand the relationships that are critical to customers' decision-making when buying fresh fish. To this end, transactional data and data mining techniques appropriate to the problem will be used. The proposed methodology aims not only to identify customers using segmentation (clustering) techniques, but also to analyze the market basket of a fresh fish customer. These analyses will show that extracting knowledge from large databases makes it possible to improve companies' strategic decisions and their relationships with customers.
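As a rough illustration of what a market basket analysis on transactional data measures, the sketch below computes the standard support, confidence, and lift of a single association rule. The function name `rule_metrics`, the product names, and the baskets are hypothetical and are not drawn from the ValorMar data or the thesis's actual method.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    baskets = [set(t) for t in transactions]
    n = len(baskets)
    a, c = set(antecedent), set(consequent)

    n_a = sum(1 for b in baskets if a <= b)         # baskets with antecedent
    n_c = sum(1 for b in baskets if c <= b)         # baskets with consequent
    n_ac = sum(1 for b in baskets if (a | c) <= b)  # baskets with both

    support = n_ac / n if n else 0.0
    confidence = n_ac / n_a if n_a else 0.0
    # Lift compares the rule's confidence with the consequent's base rate.
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical fresh-fish baskets, for illustration only.
fish_baskets = [
    {"salmon", "lemon"},
    {"salmon", "lemon", "dill"},
    {"cod", "potatoes"},
    {"salmon", "dill"},
    {"cod", "lemon"},
]
support, confidence, lift = rule_metrics(fish_baskets, {"salmon"}, {"lemon"})
```

A lift above 1 suggests the two products appear together more often than their individual frequencies would predict, which is the kind of relationship such an analysis would flag for closer inspection.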

    Enhancing the Prediction of Missing Targeted Items from the Transactions of Frequent, Known Users

    The ability of individual grocery retailers to have a single view of their customers across all of their grocery purchases remains elusive and is considered the “holy grail” of grocery retailing. This has become increasingly important in recent years, especially in the UK, where competition has intensified, shopping habits and demographics have changed, and price sensitivity has increased. Whilst numerous studies have been conducted on understanding independent items that are frequently bought together, little research has been conducted on using this knowledge of frequent itemsets to support decision-making for targeted promotions. Indeed, an effective targeted promotions approach may be seen as an outcome of the “holy grail”, as it allows retailers to promote the right item to the right customer, using the right incentives, to drive up revenue, profitability, and customer share whilst minimising costs. Given this, the key and original contribution of this study is the development of the market target (mt) model, the clustering approach, and the computer-based algorithm to enhance targeted promotions. Tests conducted on large-scale consumer panel data, with over 32,000 customers and 51 million individual scanned items per year, show that the mt model and the clustering approach successfully identify both the best items and the best customers to target. Further, the algorithm segregates customers into differing categories of loyalty (four in this case) to enable retailers to offer customised incentive schemes to each group, thereby enhancing customer engagement whilst preventing unnecessary revenue erosion. The proposed model is compared with both a recently published approach and the cross-sectional shopping patterns of the customers on the consumer scanner panel.
Tests show that the proposed approach outperforms the other approach in that it significantly reduces the probability of having “false negatives” and “false positives” in the target customer set. Tests also show that the customer segmentation approach is effective: customers who are classed as highly loyal to a grocery retailer are indeed loyal, whilst those classified as “switchers” do indeed have low levels of loyalty to the selected grocery retailer. Applying the mt model to other fields has been not only novel but also successful. School attendance improved when the mt model was applied to attendance data. In this regard, an action research study involving the proposed mt model and approach, conducted at a local UK primary school, has resulted in the school now meeting the attendance targets set by the government and halving its persistent absenteeism for the first time in four years. In medicine, the mt model is seen as a useful tool that could rapidly uncover associations that may lead to new research hypotheses, whilst in crime prevention, the mt value may be used as an effective, tangible efficiency metric that will lead to enhanced crime prevention outcomes and support stronger community engagement. Future work includes the development of a software program for improving school attendance that will be offered to all schools, while further progress will be made on demonstrating the effectiveness of the mt value as a tangible crime prevention metric.