Customer Segmentation Using Machine Learning Model: An Application of RFM Analysis

Document Type

Article

Publication Date

Spring 9-8-2023

Abstract

Machine learning (ML) encompasses a diverse array of supervised and unsupervised techniques that facilitate prediction, classification, and anomaly detection. Customer churn prediction is a prominent application of these techniques. To forecast customer switching, data scientists employ a variety of demographic, social, transactional, and behavioral variables. Unfortunately, many businesses in the United Kingdom still lack the comprehensive and adaptable consumer data required for accurate analysis. As a result, they rely heavily on data produced by enterprise resource planning systems, which is primarily transactional in nature. Consequently, businesses are often limited to modeling and forecasting on transactional data alone, usually without advanced techniques such as recency, frequency, and monetary (RFM) analysis or ML, and are unlikely to invest significantly in marketing research or other customer-related sources. The major objective of the current work is therefore to combine ML and RFM analysis for churn prediction using mostly transactional data. The data, an online retail dataset, was obtained from a dataset search website. RFM scores are computed for every customer from the available data, together with a churn metric that indicates whether or not the customer has made a transaction within a limited time. The paper compares two clustering techniques, K-means and DBSCAN. The results suggest that dividing customers into six distinct clusters is the more practical and straightforward approach.
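As a rough illustration of the RFM and churn computation described in the abstract, the sketch below derives recency, frequency, and monetary values per customer from a handful of transactions. The sample transactions, the `rfm_table` helper, the snapshot date, and the 90-day churn window are all assumptions made for this example; the abstract does not state the paper's actual churn window or scoring scheme.

```python
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount).
# These rows are invented for illustration, not taken from the paper's dataset.
transactions = [
    ("A", date(2023, 1, 5), 40.0),
    ("A", date(2023, 3, 20), 60.0),
    ("B", date(2022, 11, 2), 15.0),
    ("C", date(2023, 3, 28), 120.0),
    ("C", date(2023, 2, 14), 80.0),
    ("C", date(2023, 1, 9), 30.0),
]


def rfm_table(rows, snapshot, churn_window_days=90):
    """Return {customer_id: {recency, frequency, monetary, churned}}.

    recency   = days between the snapshot date and the last purchase
    frequency = number of transactions
    monetary  = total amount spent
    churned   = True if no purchase within churn_window_days (assumed rule)
    """
    totals = {}
    for customer, day, amount in rows:
        entry = totals.setdefault(
            customer, {"last": day, "frequency": 0, "monetary": 0.0}
        )
        entry["last"] = max(entry["last"], day)
        entry["frequency"] += 1
        entry["monetary"] += amount

    result = {}
    for customer, entry in totals.items():
        recency = (snapshot - entry["last"]).days
        result[customer] = {
            "recency": recency,
            "frequency": entry["frequency"],
            "monetary": entry["monetary"],
            "churned": recency > churn_window_days,
        }
    return result


rfm = rfm_table(transactions, snapshot=date(2023, 3, 31))
for customer, values in sorted(rfm.items()):
    print(customer, values)
```

In a setup like this, the three RFM values per customer would typically be scaled and then passed to a clustering algorithm such as K-means or DBSCAN, which is the comparison the abstract describes.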

Comments

The topic is relevant and well motivated, particularly the focus on churn prediction using predominantly transactional data. However, the presentation could be improved by reducing repetition and tightening the narrative, especially in the discussion of data limitations and methodological choices. Clarifying the definition of the churn metric and more explicitly stating the main contribution and key findings would enhance clarity. Minor language and stylistic refinements are also recommended to improve readability.
