24 research outputs found

    CORE: A Few-Shot Company Relation Classification Dataset for Robust Domain Adaptation

    We introduce CORE, a dataset for few-shot relation classification (RC) focused on company relations and business entities. CORE includes 4,708 instances of 12 relation types with corresponding textual evidence extracted from company Wikipedia pages. Company names and business entities pose a challenge for few-shot RC models due to the rich and diverse information associated with them. For example, a company name may represent the legal entity, products, people, or business divisions depending on the context. Therefore, deriving the relation type between entities is highly dependent on textual context. To evaluate the performance of state-of-the-art RC models on the CORE dataset, we conduct experiments in the few-shot domain adaptation setting. Our results reveal substantial performance gaps, confirming that models trained on different domains struggle to adapt to CORE. Interestingly, we find that models trained on CORE show improved out-of-domain performance, which highlights the importance of high-quality data for robust domain adaptation. Specifically, the information richness embedded in business entities allows models to focus on contextual nuances, reducing their reliance on superficial clues such as relation-specific verbs. In addition to the dataset, we provide relevant code snippets to facilitate reproducibility and encourage further research in the field. Comment: Accepted to EMNLP 2023 main conference.

    L’innovation dans la notation des clients pour le secteur des services financiers

    This dissertation improves customer scoring. Customer scoring is important for companies in their decision-making processes because it helps to solve key managerial issues, such as deciding which customers to target for a marketing campaign or assessing which customers are likely to leave the company. The research in this dissertation makes several contributions in three areas of the customer scoring literature. First, new sources of data are used to score customers. Second, the methodology for going from data to decisions is improved. Third, customer life event prediction is proposed as a new application of customer scoring. All results presented in this dissertation are derived from real data and are of great academic value as well as great commercial relevance.

    Innovation in customer scoring for the financial services industry


    Customer Lifetime Value Modeling with Applications in Python and R: Lessons and Experiences from Industry and Research on how to Become a Customer-Centric

    Firms and organisations cannot exist without customers. Customers are the key ingredient that makes a firm profitable and adds shareholder and societal value. Despite recent technological advances in data storage, processing and analysis, many small to large-scale firms are still struggling to quantify customer value, optimise customer relationships, facilitate customer experiences and identify customer journeys. Because product portfolios expand almost continuously, with new products and services being developed and marketed on an ongoing basis through a diversity of existing and innovative channels, modeling customer lifetime value is far from a simple exercise, and many challenges and difficulties arise. More specifically, throughout our dealings with firms, we often found that simple questions such as "Who is actually your customer?", "Who are your most valuable customers?", "What is the best way to acquire new customers?", "Why do your customers leave you?", "What product/service should be offered to what customer?", "How can you sell more to your customers?" and "How do you measure customer value?" provoked intense (if not fierce) discussions, with answers not always readily available or uniformly agreed upon by business practitioners across different departments. This book tries to answer exactly these questions using data-driven and analytical techniques and insights. We try to provide a clear and to-the-point guide on how to define, quantify, model and deploy Customer Lifetime Value (CLV) models from various perspectives, by first identifying and defining the key problems and then offering ways to tackle them using carefully selected data combined with state-of-the-art analytics.

    Spline-rule ensemble classifiers with structured sparsity regularization for interpretable customer churn modeling

    An important business domain that relies heavily on advanced statistical and machine learning algorithms to support operational decision-making is customer retention management. Customer churn prediction is a crucial tool to support customer retention. It allows early identification of customers who are at risk of abandoning the company and provides insights into why those customers are at risk. Hence, customer churn prediction models should complement predictive performance with model insights. Inspired by their ability to reconcile strong predictive performance and interpretability, this study introduces rule ensembles and their extension, spline-rule ensembles, as a promising family of classification algorithms to the customer churn prediction domain. Spline-rule ensembles combine the flexibility of a tree-based ensemble classifier with the simplicity of regression analysis. They do, however, neglect the relatedness between potentially conflicting model components, which can introduce unnecessary complexity in the models and compromise model interpretability. To tackle this issue, a novel algorithmic extension, spline-rule ensembles with sparse group lasso regularization (SRE-SGL), is proposed to enhance interpretability through structured regularization. Experiments on fourteen real-world customer churn data sets in different industries (i) demonstrate the superior predictive performance of spline-rule ensembles with sparse group lasso regularization over a set of powerful benchmark methods in terms of AUC and top decile lift; (ii) show that spline-rule ensembles with sparse group lasso regularization significantly outperform conventional rule ensembles whilst performing at least as well as conventional spline-rule ensembles; and (iii) illustrate the interpretable nature of a spline-rule ensemble model and the advantage of structured regularization in SRE-SGL by means of a case study on customer churn prediction for a telecommunications company.
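Top decile lift, one of the two evaluation metrics named in this abstract, is commonly computed as the churn rate among the 10% of customers with the highest predicted scores divided by the churn rate in the whole population. The sketch below illustrates that common definition only; the function name and the tie handling are assumptions for illustration and are not taken from the study.

```python
def top_decile_lift(y_true, scores):
    """Churn rate in the top 10% highest-scored customers divided by
    the overall churn rate (hypothetical helper, common definition)."""
    if len(y_true) != len(scores) or len(y_true) == 0:
        raise ValueError("y_true and scores must be non-empty and of equal length")
    base_rate = sum(y_true) / len(y_true)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no churners")
    # Rank customers from highest to lowest predicted churn score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    # Keep at least one customer so that small samples still work.
    n_top = max(1, len(ranked) // 10)
    top_rate = sum(label for _, label in ranked[:n_top]) / n_top
    return top_rate / base_rate
```

A lift of 1 means the model ranks churners no better than random selection; larger values mean churners are concentrated among the highest-scored customers.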

    Leveraging fine-grained transaction data for customer life event predictions

    This real-world study with a large European financial services provider combines aggregated customer data, including customer demographics, behavior and contact with the firm, with fine-grained transaction data to predict four different customer life events: moving, birth of a child, new relationship, and end of a relationship. The fine-grained transaction data (approximately 60 million debit transactions involving around 132,000 customers and more than 1.5 million different counterparties over a one-year period) reveal a pseudo-social network that supports the derivation of behavioral similarity measures. To advance decision support systems literature, this study validates the proposed customer life event prediction model in a real-world setting in the financial services industry; compares models that rely on aggregated data, fine-grained transaction data, and their combination; and extends existing methods to incorporate fine-grained data that preserve recency, frequency, and monetary value information of the transactions. The results show that the proposed model predicts life events significantly better than random guessing, especially with the combination of fine-grained transaction and aggregated data. Incorporating recency, frequency, and monetary value information of fine-grained transaction data also significantly improves performance compared with models based on binary logs. Fine-grained transaction data account for the largest part of the total variable importance for all but one of the life events.
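The contrast this abstract draws between binary logs and recency, frequency, and monetary value (RFM) information can be illustrated with a small sketch. It assumes transactions arrive as `(customer_id, counterparty_id, date, amount)` tuples and computes one feature set per customer-counterparty pair. The record layout, the function name `rfm_features`, and the field names are hypothetical; the study's actual feature construction is not described in the abstract.

```python
from collections import defaultdict
from datetime import date


def rfm_features(transactions, reference_date):
    """Summarise each (customer, counterparty) pair by recency, frequency
    and monetary value; 'binary' only records that the pair transacted."""
    stats = defaultdict(lambda: {"last": None, "frequency": 0, "monetary": 0.0})
    for customer, counterparty, tx_date, amount in transactions:
        pair = stats[(customer, counterparty)]
        pair["frequency"] += 1          # number of transactions
        pair["monetary"] += amount      # total amount spent
        if pair["last"] is None or tx_date > pair["last"]:
            pair["last"] = tx_date      # most recent transaction date
    return {
        key: {
            "binary": 1,
            "recency_days": (reference_date - pair["last"]).days,
            "frequency": pair["frequency"],
            "monetary": pair["monetary"],
        }
        for key, pair in stats.items()
    }


# Illustrative data only, not from the study.
example = [
    ("c1", "shop_a", date(2020, 1, 10), 20.0),
    ("c1", "shop_a", date(2020, 3, 1), 30.0),
    ("c2", "shop_a", date(2020, 2, 1), 5.0),
]
features = rfm_features(example, reference_date=date(2020, 3, 31))
```

In a binary log both customers look identical with respect to `shop_a`, whereas the RFM view separates a recent, repeat, higher-spending customer from an older one-off purchase.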

    A decision support framework to incorporate textual data for early student dropout prediction in higher education

    Managing student dropout in higher education is critical, considering its substantial impacts on students' lives, academic institutions, and society as a whole. Predictive modeling can be instrumental for this task, as a means to identify dropouts proactively on the basis of student characteristics and academic performance. Textual student feedback might also be relevant for enhancing these predictions; this article proposes a hybrid decision support framework that combines predictive modeling with student segmentation efforts. A real-life data set from a French higher education institution, containing information on 14,391 students and 62,545 feedback documents, confirms the superior performance of the proposed framework, in terms of the area under the curve and top decile lift, compared with various benchmarks. In contributing to decision support system research, this study (1) proposes a new framework for automatic, data-driven segmentation of students based on textual data; (2) compares multiple text representation methods and confirms that incorporating student textual feedback data improves the predictive performance of student dropout models; and (3) establishes useful insights to help decision-makers anticipate and manage student dropout behaviors.