Analyzing Customer Churn Using Logistic Regression
Added on 2020/05/04
AI Summary
The task at hand is to analyze customer churn in the telecom industry through logistic regression modeling. This involves understanding various factors that lead to churn such as customer demographics, service usage patterns, and communication clarity. The analysis begins with exploratory data techniques like clustering to identify high-risk customers and proceeds with feature selection using methods such as F-score to hone in on significant predictors of churn. Logistic regression is then applied to predict the likelihood of a customer discontinuing services. The model's performance is assessed using metrics like recall, precision, and accuracy, ultimately selecting decision trees for their balanced performance in this classification task. Insights from this analysis are leveraged to recommend targeted acquisition strategies, compliance improvements in sales processes, and proactive customer retention efforts. These recommendations aim at reducing churn by tailoring campaigns, enhancing communication clarity, optimizing service offerings, and providing personalized plans based on purchasing behaviors.

Contents
1.1 Task 1: Conduct descriptive analysis based on the customer data and construct customer
profiles for each desired customer group
1.2 Task 2: Developing and evaluating models to predict propensity to churn
1.2.1 Data Partition
1.2.2 Variable Selection
1.2.3 Model Building
1.2.4 Decision Tree
1.2.5 Gradient Boosting Method
1.2.6 Model Comparison
1.3 Campaign recommendations based on insights obtained from Tasks 1 and 2
1.3.1 Targeted Acquisition
1.3.2 Effective & compliant sales
1.3.3 Customer Retention
1.1 Task 1: Conduct descriptive analysis based on the customer data and construct
customer profiles for each desired customer group.
a. Churn Rate by Contract Type

Figure 1 Churn rate by contract type
As the graph shows, the churn rate is markedly high for “PAYG” customers (almost 50%), while
customers on two-year contracts churn very rarely. This may be because those customers are
bound for the contract term and have already paid in full.
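The churn rate per category behind a chart like Figure 1 can be sketched as follows. This is a minimal illustration, assuming each customer is a record with a contract field and a boolean churn flag; the field names and the toy records are hypothetical and are not the report's data.

```python
from collections import defaultdict

def churn_rate_by_group(records, group_key, churn_key="churn"):
    """Return {group value: share of records in that group that churned}."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in records:
        totals[row[group_key]] += 1
        churned[row[group_key]] += int(bool(row[churn_key]))
    return {group: churned[group] / totals[group] for group in totals}

# Hypothetical toy records, used only to show the calculation.
customers = [
    {"contract": "PAYG", "churn": True},
    {"contract": "PAYG", "churn": False},
    {"contract": "Two year", "churn": False},
    {"contract": "Two year", "churn": False},
]
rates = churn_rate_by_group(customers, "contract")
```

The same helper could be reused for the other profile variables in this task by changing `group_key`.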
b. Churn rate when the friend has churned

Figure 2 Churn rate if the friend has churned
If a friend has churned, the customer is highly likely to churn as well, so there is a clear
peer effect on churn. This also shows that churn is influenced by external factors in addition
to the individual’s own behaviour (Fathi and Khorasani, 2013).
c. Churn rate by people who watch Internet movies
The churn rate of people who watch internet movies is shown in the figure below. The
figure shows no significant difference in churn rate between customers who watch
internet movies and those who do not, so this variable provides little information for
segmenting customers. In other words, an individual’s churn behaviour cannot be
determined from the habit of watching internet movies, and other factors must be
considered to explain churn.
Figure 3 Churn rate of people who watch internet movies
d. Churn rate for customers who have dependents
Figure 4 Churn rate of people with dependents
From the graph we can see that customers who have dependents are less likely to
churn than customers without dependents.
e. Churn rate for customers who subscribed to technical support
Figure 5 Churn rate of people with subscription to technical support
Customers who are not subscribed to technical support are very likely to churn. From
this it can be inferred that technical support is helpful in retaining customers.
Another possible reason is that the customer interface is not very user friendly, so the
subscription can only be enjoyed with technical support. If so, it would be better for
the company to simplify the entire process rather than provide technical support to
everyone.
f. Churn rate for people with e-bill
Figure 6
Figure 7 Churn rate of people with e-bill
It can be inferred that customers who pay their bills online have a higher churn rate than
those who do not. This might be because customers who pay online are more technologically
adept and can cancel their subscription very easily. Customers who pay offline, on the other
hand, may be less technologically adept, so awareness of easy cancellation methods would be
lower in this segment.
g. Churn rate by customer loyalty

For this analysis we first divided the customers into four buckets, based on the quartile each
customer falls in. The churn rate was then analysed for each of the four buckets.
Figure 8 Churn rate of people in different buckets
The churn rates for the different buckets are shown in Figure 8. There is little difference in
churn rate between new and loyal customers: the rate is almost equal across all four buckets.
Customers in the first quartile have a churn rate of 30%, followed by 32% in the second
quartile (Iaci and Singh, 2012; Trebuna, Halcinova and Fil’o, 2014).
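The quartile bucketing described above can be sketched as follows. The report does not name the variable used to measure loyalty, so the tenure values here are an assumption made purely for illustration, and the helper name is hypothetical.

```python
import bisect
import statistics

def quartile_buckets(values):
    """Assign each value a bucket index 0..3 according to its quartile."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    cuts = statistics.quantiles(values, n=4)
    return [bisect.bisect_right(cuts, v) for v in values]

# Hypothetical tenure in months for eight customers (not the report's data).
tenure = [1, 2, 3, 4, 5, 6, 7, 8]
buckets = quartile_buckets(tenure)
```

Once every customer has a bucket index, the churn rate per bucket is simply the share of churned customers within each index.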
1.2 Task 2: Developing and evaluating models to predict propensity to churn
The first step in building predictive models for churn analysis is to understand the data.
The data provided has 3333 observations & 21 variables.
The data was imported into SAS EM and the Churn_ variable was labelled as ‘target’. A few
variables, such as code and phone number, were removed because they are not useful for the
business requirement.
Using the Stats Explorer, the important variables were identified and plotted in order.
Figure 9 Stats explorer in SAS EM
As can be seen, Day_charge and Day_Mins are highly important variables, while night minutes
and evening calls are less important.
Note: We have used Cramér’s V to show the variable importance. Cramér’s V is normalised to lie
between 0 and 1, which makes it comparable across variables measured on different scales.
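For reference, Cramér’s V can be computed from a contingency table of an input against the target. The sketch below is a plain implementation of the standard formula, V = sqrt(chi-square / (n * (min(rows, columns) - 1))); it assumes every row and column total is non-zero, and continuous inputs such as Day_Mins would first have to be binned, which is not shown here. The example tables are hypothetical.

```python
import math

def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

# Hypothetical 2x2 tables: rows = input category, columns = churn False/True.
perfect_association = cramers_v([[10, 0], [0, 10]])
no_association = cramers_v([[5, 5], [5, 5]])
```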
1.2.1 Data Partition
Next we split the data into training and validation sets, with 70% of the observations going
to training and the remaining 30% to validation.
We have to make sure that the target variable has the same distribution in both the training
and validation data sets. If this check is not done, the sample may be biased and the
validation results would not represent the model’s performance well.
Data=TRAIN
Variable  Numeric Value  Formatted Value  Frequency Count  Percent  Label
Churn_    .              False.           1994             85.5427  Churn?
Churn_    .              True.            337              14.4573  Churn?
Data=VALIDATE
Variable  Numeric Value  Formatted Value  Frequency Count  Percent  Label
Churn_    .              False.           856              85.4291  Churn?
Churn_    .              True.            146              14.5709  Churn?
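A stratified 70/30 split of this kind can be sketched as follows. This is not the SAS EM partition node, only a minimal illustration of the idea; the function names, the seed, and the toy labels are assumptions. The last two lines recompute the churn percentages from the counts in the two tables above.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_fraction=0.7, seed=42):
    """Split row indices so each class keeps roughly the same share in both sets."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    train, validate = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train.extend(indices[:cut])
        validate.extend(indices[cut:])
    return sorted(train), sorted(validate)

def positive_rate(labels, indices):
    """Share of the selected rows whose label is True."""
    return sum(1 for i in indices if labels[i]) / len(indices)

# Toy labels with a 20% churn rate (hypothetical, not the report's data).
labels = [True] * 20 + [False] * 80
train_idx, valid_idx = stratified_split(labels)

# Churn percentages recomputed from the counts in the two tables.
train_churn_pct = 100 * 337 / (1994 + 337)
validate_churn_pct = 100 * 146 / (856 + 146)
```

The two recomputed percentages differ by roughly 0.1 percentage points, which is the check the text asks for.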
1.2.2 Variable Selection
Next we ran variable clustering to identify groups of similar variables.
Figure 10 Selecting the cluster of variables in SAS
As shown in the table below there are 6 different clusters of variables. The variable with
the smallest 1-R2 is selected to represent each cluster.
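In SAS variable clustering output, this selection statistic is usually reported as the 1-R² ratio. Assuming that is the quantity meant here, it can be written as follows, where the two R² terms (notation introduced only for this note) are the squared correlation of a variable with its own cluster component and with the nearest other cluster component:

```latex
\text{1-}R^2\ \text{ratio} \;=\; \frac{1 - R^2_{\text{own cluster}}}{1 - R^2_{\text{next closest cluster}}}
```

A small ratio means the variable is strongly tied to its own cluster and only weakly tied to the others, which is why it is a reasonable representative.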
1.2.3 Model Building:
Logistic Regression
We built a logistic regression model because the target variable is binary. In other words,
the target variable can take only two values, 0 or 1, where 1 indicates success and 0
indicates failure (Dutta, Bandopadhyay and Sengupta, 2012).
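As a reminder of how a fitted logistic regression turns inputs into a probability, here is a minimal sketch of the logistic function applied to a linear score. The intercept, coefficients, and feature values are made up for illustration and are not the fitted values from the report's model.

```python
import math

def churn_probability(intercept, coefficients, features):
    """Logistic model: P(target = 1) = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear_score = intercept + sum(
        coefficients[name] * features[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear_score))

# Hypothetical coefficients and one hypothetical customer.
example_coefficients = {"Day_Mins": 0.01, "Eve_calls": 0.0}
example_customer = {"Day_Mins": 200.0, "Eve_calls": 90.0}
p = churn_probability(-3.0, example_coefficients, example_customer)
```

A customer would then be classified as a churner when the probability exceeds a chosen cut-off, commonly 0.5.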

Figure 11 Results from the logistic regression model
In total, 8 variables are selected for the final model. The Wald chi-square indicates the
strength of each variable’s relationship to the target variable. From the results we can
infer that Int_l_plan has a strong negative relationship with churn rate. Variables with
p-values above 0.05 are considered statistically insignificant; Eve_calls is
insignificant in the model. All other variables have p-values below 0.05, suggesting that
they have a significant impact on the target variable.
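The Wald chi-square and its p-value can be reproduced from a coefficient and its standard error. The sketch below uses the standard definition (estimate divided by standard error, squared) and the chi-square distribution with one degree of freedom; the numeric inputs are hypothetical and are not read from Figure 11.

```python
import math

def wald_test(coefficient, std_error):
    """Return (Wald chi-square, p-value) for a single coefficient, 1 d.f."""
    wald = (coefficient / std_error) ** 2
    # Survival function of chi-square with 1 d.f.: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return wald, p_value

# Hypothetical estimates: one clearly significant, one not.
strong_wald, strong_p = wald_test(coefficient=-2.0, std_error=0.4)
weak_wald, weak_p = wald_test(coefficient=0.1, std_error=0.2)
```

With these made-up inputs the first variable would pass the 0.05 threshold and the second would not, mirroring how Eve_calls is judged in the text.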
1.2.4 Decision Tree
Decision trees are a good method for predictive analytics. They help the business identify
the key variables. From a modelling perspective they are also quite easy to use, as they
require little time for data preparation (Fokin and Hagrot, 2016).
Figure 12 Results from the decision tree
Note: Please check the SAS results to view the output in better quality.
Figure 13 Results from the training and validation data
The graph above shows the percentage of true (churn) cases in each node for the training and
validation sets. Except for a few nodes, the model has performed well: for most nodes the
training and validation percentages are almost equal. Only nodes 8 and 11 show some
difference between the two.
1.2.5 Gradient Boosting Method
Gradient boosting is a machine learning technique for regression or classification that
produces its predictions from an ensemble of weak prediction models.
Figure 14 Results from the Gradient boosting model
We obtained the important variables from the gradient boosting algorithm. Since it splits
the data and builds many trees, it is not feasible to visualize all the trees here.
1.2.6 Model Comparison
Figure 15 Comparison of models with ROC curve
The ROC curve shows how much discriminating power we gain by using the underlying
model. From the graph, we can see that the decision tree’s curve rises highest above the
baseline. The greater the distance between the baseline and the ROC curve, the better the
model. The decision tree therefore comes out as the best of the models included in the
analysis (Fokin and Hagrot, 2016).
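The distance between the ROC curve and the baseline is commonly summarised by the area under the curve (AUC), where 0.5 corresponds to the baseline and 1.0 to perfect ranking. The report does not state AUC values, so the sketch below only illustrates the calculation on hypothetical scores; it uses the pairwise-ranking definition rather than tracing the curve.

```python
def auc(scores, labels):
    """AUC as the share of (churner, non-churner) pairs ranked correctly.

    Ties between a churner and a non-churner count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    correct = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(positives) * len(negatives))

# Hypothetical predicted churn probabilities and true outcomes.
example_labels = [True, True, False, False]
good_model_auc = auc([0.9, 0.8, 0.3, 0.1], example_labels)
uninformative_auc = auc([0.5, 0.5, 0.5, 0.5], example_labels)
```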
Figure 16 Results from the event classification table
Also, since this is a classification problem, we can calculate accuracy, recall, and
precision for each model.

Recall = tp / (tp + fn)
Precision = tp / (tp + fp)
where tp = true positives
fp = false positives
fn = false negatives
Regression: recall = 0.16, precision = 0.54
Decision tree: recall = 0.55, precision = 0.90
Gradient Boosting: recall = 0.16, precision = 1.00
As shown above, logistic regression has a recall of 0.16 and a precision of 0.54, while the
decision tree has a recall of 0.55 and a precision of 0.90. Gradient boosting shows a recall
of 0.16 and a precision of 1. Even though the precision of gradient boosting is higher than
that of the decision tree, its recall is very low. Based on these results, we selected the
decision tree as the best model.
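One way to make this trade-off explicit is the F1 score, the harmonic mean of precision and recall. F1 is not used in the original analysis; it is added here only as an illustration, computed from the precision and recall values reported above. The confusion counts in the first example are toy numbers, not the report's classification table.

```python
def precision_recall(tp, fp, fn):
    """Precision = tp / (tp + fp); recall = tp / (tp + fn)."""
    return tp / (tp + fp), tp / (tp + fn)

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Toy confusion counts, only to show the formulas in use.
toy_precision, toy_recall = precision_recall(tp=8, fp=2, fn=2)

# Precision and recall as reported for the three models.
reported = {
    "Regression": (0.54, 0.16),
    "Decision tree": (0.90, 0.55),
    "Gradient Boosting": (1.00, 0.16),
}
f1_by_model = {name: f1_score(p, r) for name, (p, r) in reported.items()}
best_model = max(f1_by_model, key=f1_by_model.get)
```

Under this measure the decision tree again scores highest (about 0.68, against roughly 0.25 and 0.28 for the other two), which agrees with the selection made in the text.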
1.3 Campaign recommendations based on insights obtained from Tasks 1
and 2
Based on the analysis, the business can implement various customer-focused policies.
1.3.1 Targeted Acquisition:
The business should target only those customers who are less likely to cancel. Based on the
customer profiles, the business should create personalized campaigns and sales offers. From
Task 1 we know that friends can influence a person’s decision to churn, so awareness of the
product should be built among customers. Clear communication with the customer would also be
very useful for retention. Similarly, the company can offer a discount to customers who are
more likely to cancel the plan. It can also offer the plan at a lower price if the customer
opts for an entire year (Oghojafor, Bakarea and Omoera, 2012).



