User voice overview: topic recognition and sentiment analysis of customer feedback in the B2C sector

Abstract

Deep customer knowledge is a competitive advantage for many companies, especially in the B2C field, so integrating Support teams into the Product organization is now considered essential. However, establishing a proper feedback loop between those teams is a common pitfall, especially for companies that handle a large number of customer contacts. Consolidating feedback and getting an overview of the users’ voice is a challenging task, and this research study investigates whether Natural Language Processing (NLP) can help to automate this manual process. This thesis describes the implementation of a proof-of-concept prototype for conducting practical research in the domains of topic recognition and sentiment analysis. After introducing the main concepts and determining which NLP methods are best suited to the customer support setting, the research environment is defined: Zendesk is used for corpus extraction and Python for the prototype implementation. The GuidedLDA package is selected for topic recognition, VADER for sentiment analysis, and PyInquirer for the CLI implementation. To evaluate the performance of the prototype, it was tested on real customer request data. To supervise the results closely, the tests were first carried out on a small dataset using a supervised approach. Afterwards, tests were performed on larger datasets and the accuracy rate was evaluated using random samples. These experiments showed satisfactory results for both evaluation methods. The supervised tests reached an accuracy rate of 94% for topic recognition and 93% for sentiment analysis. With the larger datasets, random sample evaluation showed a 75% success rate for topic assignments and an 80% success rate for sentiment recognition. These evaluations led to the conclusion that the chosen methods for automating feedback analysis can, in fact, be utilised in a “real world” scenario.
While occasional human input and supervision are still required, possible inaccuracies are outweighed by the advantages of applying the NLP algorithms. With the achieved accuracy rates, both GuidedLDA and VADER can be considered suitable for analysing the topics and sentiment of customer feedback and providing an overview of the users’ voice.
