    How do fashion retail customers search on the Internet?: Exploring the use of data mining tools to enhance CRM

    This paper seeks to determine the usefulness of data mining tools to SMEs in developing customer relationship management (CRM) in the fashion retail sector. Kalakota & Robinson’s (1999, p.114) model of ‘The Three Phases of CRM’ acts as a basis for exploring the use of data mining software. The paper reviews the nature and type of data available for collection and its relevance to CRM, and provides an advisory framework that practitioners can use to examine the scope and limitations of using data analysis to improve CRM. The data mining tool examined was Google Analytics (GA), an online freeware tool that enables businesses to understand how people find their site, how they navigate through it, and, ultimately, how they do or don’t become customers of it (Google Analytics, 2009). Establishing these relationships should lead retailers to develop enhanced web site aesthetics and functionality that coincide with consumer expectations. The paper finds that the competitive nature and homogeneity of the fashion retail sector require retailers to improve the ‘reach, richness and affiliation’ (Hackney et al.) of their sites by using technology to explore CRM.

    Analytical customer relationship management in retailing supported by data mining techniques

    Doctoral thesis. Industrial Engineering and Management. Faculdade de Engenharia, Universidade do Porto.

    Recommender Systems for Grocery Retail - A Machine Learning Approach

    Recommender systems are present in our daily activities at different moments, such as when choosing a song to listen to or when shopping online. It is an everyday reality for people to rely on computer systems to simplify routine decisions. Grocery shopping is an essential part of people’s lives and a frequent activity. Despite being a common habit, each customer has unique routines, needs and preferences regarding products and brands. This information is valuable for grocery retailers to know their customers better and to improve their marketing and operational activities. This dissertation aims to apply machine learning algorithms to the development of a recommender system capable of preparing personalized grocery shopping lists. The proposed architecture is designed to allow integration with different grocery retailers and to support distinct TensorFlow algorithms. The process of extracting information from the dataset as features was explored, as well as the tuning of the model hyperparameters, to obtain better results. The recommendation engine is exposed via a distributed software architecture designed to allow retailers to integrate the recommender system with different existing solutions (e.g., websites or mobile applications). A case study to validate the implemented solution was performed, integrating it with a public dataset provided by Instacart. A comparison study between different machine learning algorithms over the adopted dataset has led to the choice of the gradient boosted trees algorithm. The solution developed in the case study was compared against two non-machine learning approaches at predicting the last purchase of 360 arbitrary test customers. A pattern mining-based solution and a SQL-based heuristic were used. Different evaluation metrics (namely, the average accuracy, precision, recall, and f1-score) were registered.
The way association rules of different strengths were reflected in the predictions of the developed solution was also analyzed. The gradient boosted trees-based implementation from the case study outperformed the compared solutions on the evaluation metrics and showed a higher capability of predicting at least one correct item per customer. It also became evident that the strictest association rules were frequently found in the recommendations. The adopted solution and algorithm have shown promising results and a remarkable capability to provide meaningful predictions to the different customers, evidencing their capability to add value to grocery retail. Nevertheless, there is still potential for further expansion.
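
    The evaluation described in this abstract (average precision, recall and f1-score over test customers, plus the share of customers with at least one correct item) can be illustrated with a minimal sketch. This is not the dissertation's code: the function names, the `hit_rate` label and the toy baskets are hypothetical, and accuracy is left out because the abstract does not say how it was defined for baskets.

```python
from statistics import mean

def basket_scores(predicted, actual):
    """Precision, recall and F1 for one customer's predicted vs. actual basket."""
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

def evaluate(predictions, last_orders):
    """Average per-customer scores; hit_rate is the share with >= 1 correct item."""
    scores = [basket_scores(predictions.get(c, []), last_orders[c])
              for c in last_orders]
    return {
        "precision": mean(s[0] for s in scores),
        "recall": mean(s[1] for s in scores),
        "f1": mean(s[2] for s in scores),
        "hit_rate": mean(1.0 if s[0] > 0 else 0.0 for s in scores),
    }

# Hypothetical toy data: the true last order and a recommended list per customer.
last_orders = {"c1": ["milk", "eggs", "bread"], "c2": ["apples", "rice"]}
predictions = {"c1": ["milk", "bread", "butter", "tea"], "c2": ["pasta"]}

report = evaluate(predictions, last_orders)
print(report)
```

    Averaging per customer, rather than pooling all items, keeps customers with large baskets from dominating the result; whether the dissertation averaged this way is an assumption.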

    Recommender Systems for Scientific and Technical Information Providers

    Providers of scientific and technical information are a promising application area for recommender systems because of the high search costs for their goods and the general problem of assessing the quality of information products. Nevertheless, the usage of recommendation services in this market is still in its infancy. This book presents economic concepts, statistical methods and algorithms, technical architectures, as well as experiences from case studies on how recommender systems can be integrated

    Big data analytics in healthcare : are end-users ready?

    This dissertation aims to understand whether end-users are aware of big data analytics and, if so, whether the perceived value of healthcare products that use big data techniques is sufficient to outweigh their concerns about sharing personal data. It also tests whether they are interested in purchasing such products. The theoretical foundations are based on the Theory of Reasoned Action, which studies the human decision-making process. Based on data from a questionnaire directed to end-users, a Chi-square test examines whether an association exists between the different variables, and Simple Linear Regressions evaluate the strength of the associations. The results obtained from both types of tests indicate that a higher perception of value from health products that require the use of big data technologies (PV) is positively correlated with a greater willingness to share personal data (WTS), a higher willingness to buy (WTB), and positive word of mouth for both sharing data (WoM_sd) and purchasing such devices (WoM_d). Finally, three Multiple Regression models are created. The first model explains WTB as positively influenced by PV, WTS, WoM_d and WoM_sd. The second regression tests the WoM_d dimension as a result of PV, WTB and WTS. The third model shows that WoM_sd is explained by PV, WTB and WTS. These three models are in line with the previous conclusions obtained from both the Chi-square test and the Simple Linear Regressions.
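
    As an illustration of the Chi-square test of association mentioned in this abstract, the sketch below computes Pearson's statistic for a contingency table using only the standard library. The 2x2 counts (high or low PV against willing or unwilling to share data) are invented for the example and are not the dissertation's data; the p-value formula holds only for one degree of freedom, and no continuity correction is applied.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

def p_value_one_dof(stat):
    """Upper-tail p-value of a chi-square statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

# Hypothetical survey counts: rows = high PV, low PV; columns = WTS yes, WTS no.
observed = [[30, 10],
            [15, 25]]

stat, dof = chi_square(observed)
p = p_value_one_dof(stat)
print(f"chi2={stat:.3f}, dof={dof}, p={p:.5f}")
```

    A small p-value only signals that the two variables are associated; the strength and direction of the relationship are what the regressions in the abstract are used for.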

    Unsupervised Learning Framework for Customer Requisition and Behavioral Pattern Classification

    Maintaining a healthy organization-customer relationship has a positive influence on customers’ behavioral tendencies regarding product and service preferences, buying behavior, loyalty, satisfaction, and so on. To achieve this, an in-depth analysis of customers’ characteristics and purchasing behavior trends is required. This paper proposes a hybrid unsupervised learning framework consisting of the k-means algorithm and self-organizing maps (SOMs) for customer segmentation and behavior analysis. The k-means algorithm was used to partition the entire input space of the customers’ transaction dataset into 3 and 4 disjoint segments based on customers’ frequency (F) and monetary value (MV). The SOM provided visualization of the underlying clusters and discovered customers’ relationships in the dataset. Interaction of the F and MV clusters resulted in 12 sub-clusters. An in-depth analysis of each sub-cluster was also performed, and appropriate customer relationship management (CRM) strategies were established for each sub-cluster. The discovered knowledge will guide effective allocation of resources to each customer cluster and support other organizational decision functions required by CRM systems. Keywords: customer relationship, data mining, k-means, pattern recognition, self organizing ma
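
    The segmentation step in this abstract can be sketched as follows: cluster customers on frequency and on monetary value separately, then cross the two labelings to obtain up to 3 x 4 = 12 sub-clusters. Several things here are assumptions rather than the paper's method: that F gets 3 segments and MV gets 4 (the abstract does not say which is which), the deterministic initialization, and the toy customer data. The SOM visualization is not covered.

```python
def kmeans_1d(values, k, iterations=50):
    """Lloyd's k-means on scalar values; returns one cluster label per value."""
    ordered = sorted(values)
    # Deterministic start: pick initial centres spread across the sorted values.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest centre.
        labels = [min(range(k), key=lambda j: abs(v - centres[j])) for v in values]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centres.append(sum(members) / len(members) if members else centres[j])
        if new_centres == centres:
            break
        centres = new_centres
    return labels

# Hypothetical customers as (purchase frequency, monetary value) pairs.
customers = [(1, 5), (2, 50), (10, 6), (11, 200), (30, 900), (31, 52),
             (1, 210), (2, 950), (10, 50), (11, 5), (30, 200), (31, 900)]

f_labels = kmeans_1d([f for f, _ in customers], k=3)     # assumed: 3 F segments
mv_labels = kmeans_1d([mv for _, mv in customers], k=4)  # assumed: 4 MV segments

# Crossing the two labelings gives each customer an (F, MV) sub-cluster.
sub_clusters = {}
for customer, pair in zip(customers, zip(f_labels, mv_labels)):
    sub_clusters.setdefault(pair, []).append(customer)

for pair, members in sorted(sub_clusters.items()):
    print(pair, members)
```

    Not every one of the 12 possible combinations has to be populated; with real transaction data, empty or very small sub-clusters are themselves informative for deciding where CRM effort is worth spending.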