Leveraging Non-respondent Data in Customer Satisfaction Modeling
Picture created by Uwe Kils (iceberg) and User:Wiska Bodo (sky), CC BY-SA 3.0, via Wikimedia Commons.
Retaining customers is a constant challenge for marketers, particularly for businesses with a fragile customer base in highly competitive markets. Marketers need to be proactive in designing effective service strategies that improve customer satisfaction and retention. Because a customer’s satisfaction changes over time, companies conduct customer satisfaction surveys frequently.
In market research practice, collecting survey data is prohibitively expensive, and obtaining responses from all current customers is nearly impossible. More importantly, response rates for customer satisfaction surveys are very low. This introduces a bias in statistical estimates if the salient features of non-respondents are completely ignored.
When only respondents’ data is considered during customer satisfaction modeling, the model represents the attitudes, opinions and other information of respondents alone, even though non-respondent customers are a huge portion of the firm’s ecosystem. Such bias has a huge impact on both the reliability and validity of survey study findings.
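The bias described above can be illustrated with a small simulation. This is a minimal sketch, not taken from the study: the 1-to-5 satisfaction scale, the function name `simulate_nonresponse_bias`, and the response model (less satisfied customers are assumed to respond more often) are all hypothetical choices made only to show how a respondent-only estimate can drift away from the population value.

```python
import random


def simulate_nonresponse_bias(n_customers=100_000, seed=0):
    """Compare the population mean score with the respondent-only mean."""
    rng = random.Random(seed)
    all_scores = []
    respondent_scores = []
    for _ in range(n_customers):
        # Hypothetical satisfaction score on a 1-5 scale, uniform in the population.
        score = rng.randint(1, 5)
        all_scores.append(score)
        # Assumed response model (illustrative only): the probability of
        # responding falls from 0.21 at score 1 to 0.05 at score 5.
        p_respond = 0.25 - 0.04 * score
        if rng.random() < p_respond:
            respondent_scores.append(score)
    true_mean = sum(all_scores) / len(all_scores)
    respondent_mean = sum(respondent_scores) / len(respondent_scores)
    response_rate = len(respondent_scores) / n_customers
    return true_mean, respondent_mean, response_rate


true_mean, respondent_mean, response_rate = simulate_nonresponse_bias()
print(f"response rate:   {response_rate:.3f}")
print(f"population mean: {true_mean:.3f}")
print(f"respondent mean: {respondent_mean:.3f}")
```

Under these assumptions the respondent-only mean understates the population mean, because the customers who answer are not a random sample. If the assumed response model were reversed, the bias would point the other way; the direction depends entirely on how responding relates to satisfaction.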
The relationship between response rate and modeling quality rests on the assumption that the higher the response rate, the greater the probability that the responding customers represent the entire population. Based on the gaps in prior studies and market research practice discussed above, we pose the following research questions in this study: (1) How can we leverage non-respondent data to build a valid model for customer satisfaction? (2) Can non-respondents’ data be useful in predicting customer satisfaction?
We propose a novel end-to-end machine learning framework that leverages non-respondents’ data for time-aware customer satisfaction modeling. The framework models customer satisfaction and satisfaction time based on domain-driven and data-driven attributes, and it predicts customer satisfaction while taking non-respondents’ data into account during modeling.
To learn from non-respondents (NR), we propose a learn-to-rank approach that uses NR data to build an accurate satisfaction model. We find that NR data significantly improves the quality of customer satisfaction prediction. We conduct extensive experiments on a real dataset provided by an auto insurance company and demonstrate that the proposed framework predicts the satisfaction or dissatisfaction time effectively and outperforms extant methods in the literature.
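To give a feel for what a learn-to-rank formulation can look like, here is a generic pairwise ranking sketch. It is not the method used in the study. The two features, the toy data, and in particular the assumption that non-respondents rank between satisfied and dissatisfied respondents are hypothetical and chosen only for illustration.

```python
import math


def train_pairwise_ranker(pairs, n_features, epochs=200, lr=0.1):
    """Fit linear weights so that score(better) > score(worse) for each pair."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Gradient factor of the pairwise logistic loss log(1 + exp(-margin)).
            g = -1.0 / (1.0 + math.exp(margin))
            w = [wi - lr * g * di for wi, di in zip(w, diff)]
    return w


def score(w, x):
    """Linear ranking score for one customer."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical normalized features: [claim_rate, tenure].
satisfied = [[0.0, 0.9], [0.1, 0.8]]
non_respondents = [[0.4, 0.5], [0.5, 0.6]]
dissatisfied = [[0.9, 0.2], [0.8, 0.1]]

# Illustrative assumption only: satisfied > non-respondent > dissatisfied.
tiers = [satisfied, non_respondents, dissatisfied]
pairs = [
    (hi, lo)
    for i, upper in enumerate(tiers)
    for lower in tiers[i + 1:]
    for hi in upper
    for lo in lower
]

weights = train_pairwise_ranker(pairs, n_features=2)
for name, group in [("satisfied", satisfied),
                    ("non-respondent", non_respondents),
                    ("dissatisfied", dissatisfied)]:
    print(name, [round(score(weights, x), 2) for x in group])
```

The point of the sketch is that a ranking objective only needs relative orderings between customers, so customers without a survey label can still contribute training pairs once some ordering assumption about them is made.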
Our framework extends existing modeling approaches in the literature and improves the accuracy of prediction algorithms for customer satisfaction. It also offers a practical contribution to marketers by mitigating the challenges and costs involved in frequent customer satisfaction surveys and non-response follow-up campaigns.
One of the main implications of this study is that non-respondents can be as valuable as respondents in customer satisfaction modeling. A low response rate can introduce a bias in statistical estimates if the salient features of non-respondents are completely ignored. Because non-respondents usually represent a larger portion of the customer base, the bias may have a significant impact on both the reliability and validity of models and their findings.

To learn more, see the full article: