Optimizing collection processes using Conservative Q-Learning

Abstract

This study proposes a reinforcement learning framework based on Conservative Q-Learning (CQL) to optimize debt collection strategies while mitigating customer churn. Traditional rule-based approaches often fail to adapt to individual customer profiles or evolving behaviors; to address this limitation, our framework dynamically recommends actions tailored to each customer's characteristics. Using customer datasets, we evaluate the proposed model across a range of reward-coefficient settings, where the coefficient represents the potential future profit lost if a customer churns. The results show that while standard Q-learning generally underperforms the rule-based strategy, CQL achieves overall performance comparable to rule-based approaches. Notably, product-level analysis shows statistically significant improvements for general-purpose installment loan (GPL) customers, while outcomes for credit card (CC) and overdraft (OD) customers are weaker. This likely reflects reinforcement learning's tendency to prioritize higher-value cases, suggesting that product-specific models may further enhance performance across loan types.
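The abstract carries no code, but a rough sketch may help situate CQL relative to the standard Q-learning baseline it is compared against. The sketch below shows the discrete-action CQL loss: an ordinary temporal-difference error plus a conservative penalty that pushes down Q-values for all actions (via a log-sum-exp) while pushing up Q-values for actions actually taken in the logged collection data. This is a minimal illustration under assumed conventions, not the paper's implementation; the network, the batch layout, and the conservatism weight alpha (distinct from the paper's churn reward coefficient) are all assumptions.

```python
import torch
import torch.nn.functional as F

def cql_loss(q_net, target_q_net, batch, gamma=0.99, alpha=1.0):
    """One CQL update on a batch of logged (s, a, r, s') transitions.

    Assumed batch layout (illustrative, not from the paper):
      states [B, d] float, actions [B] long, rewards [B] float,
      next_states [B, d] float, dones [B] float (1.0 if terminal).
    The action space is discrete, e.g. a fixed set of collection
    actions such as call / SMS / restructuring offer.
    """
    s, a, r, s2, done = (batch["states"], batch["actions"],
                         batch["rewards"], batch["next_states"],
                         batch["dones"])

    # Standard TD target from a frozen target network.
    with torch.no_grad():
        target = r + gamma * (1 - done) * target_q_net(s2).max(dim=1).values

    q_all = q_net(s)                                      # Q(s, .) for all actions
    q_taken = q_all.gather(1, a.unsqueeze(1)).squeeze(1)  # Q(s, a) for logged actions
    td_loss = F.mse_loss(q_taken, target)

    # Conservative penalty: lower Q-values across all actions (log-sum-exp)
    # while raising Q-values of actions observed in the dataset, so the
    # policy avoids overestimating actions never tried on real customers.
    conservative = (torch.logsumexp(q_all, dim=1) - q_taken).mean()

    return td_loss + alpha * conservative
```

Larger alpha makes the learned policy stay closer to the historically logged collection actions, which is what distinguishes CQL from the plain Q-learning baseline reported to underperform in the abstract.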

This paper was published in eResearch@Ozyegin.
