Data Mining Business Case Analysis

Added on  2020/03/23

Report
AI Summary
This report presents a detailed analysis of a data mining business case, focusing on Principal Component Analysis (PCA) and Naïve Bayes probability calculations. It discusses the advantages and disadvantages of PCA, the interpretation of customer data, and the implications for financial and operational performance. The analysis includes pivot tables and probability assessments based on customer behavior regarding credit card usage and loan acceptance.
DATA MINING
BUSINESS CASE ANALYSIS
STUDENT ID
Question 1
a) Output of PCA using XLMiner
Interpretation
Taking 80% as the share of total variance that ought to be explained, only principal components
1, 2, 3 and 4 are retained, while the remaining components are treated as noise.
Based on the PCM (principal components matrix), the key features for these shortlisted principal
components, judged by the magnitude of the respective loadings, are identified below.
PC1 – Key features – X1 and X2 (Aspect of performance: Financial)
PC2 – Key features – X4 and X8 (Aspect of performance: Operational)
PC3 – Key features – X3 and X7 (Aspect of performance: Financial)
PC4 – Key features – X1 and X3 (Aspect of performance: Financial)
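
The selection rule described above (retain components until 80% of the variance is explained, then read off the variables with the largest loadings) can be sketched in Python. This is only an illustration under stated assumptions: the report used XLMiner, the data below are random placeholders rather than the utilities dataset, and the function names `components_for_variance` and `top_features` are hypothetical.

```python
import numpy as np

def components_for_variance(X, threshold=0.80):
    """Return (k, ratio, loadings): k is the smallest number of principal
    components whose cumulative share of variance reaches `threshold`."""
    cov = np.cov(X, rowvar=False)            # covariance matrix (columns are variables)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh suits symmetric matrices; ascending order
    order = np.argsort(eigvals)[::-1]        # reorder so the largest variance comes first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = eigvals / eigvals.sum()          # share of total variance per component
    k = int(np.searchsorted(np.cumsum(ratio), threshold)) + 1
    return min(k, len(ratio)), ratio, eigvecs

def top_features(loadings, component, names, n=2):
    """Names of the n variables with the largest absolute loading on one component."""
    idx = np.argsort(np.abs(loadings[:, component]))[::-1][:n]
    return [names[i] for i in idx]

# Placeholder data only (assumption): 8 columns named X1..X8, not the report's dataset
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
names = [f"X{i}" for i in range(1, 9)]

k, ratio, loadings = components_for_variance(X, threshold=0.80)
for j in range(k):
    print(f"PC{j + 1}: {ratio[j]:.1%} of variance, key features {top_features(loadings, j, names)}")
```

With real data, the printed lines would play the role of the PC1 to PC4 summary listed above.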
Data normalisation need – No
Reason: Scale distortion does not seem significant, as shown by the respective contributions
of the components to the total variance in the PCA. A significant difference in scales would
lead to a much more skewed result than the one obtained. Hence, PCA can be carried out without
normalising the utilities data given for this exercise.
(b) The advantages and disadvantages of principal component analysis (PCA) are given
below:
Advantages – The structure of a given set of variables is easily visualised with the help of
PCA. It is a dimension reduction technique that condenses a large set of correlated variables
into a smaller set of variables called principal components. It reduces the noise in the
variable set, which can give more accurate results. Because the complexity of the data is
reduced, visualising the data also becomes simpler. This technique is
common in “Criminal investigation.” The newly developed variables (principal components)
show zero correlation with one another. Therefore, it is easy to
distinguish how much of the variance each component explains.
Disadvantages – This technique is appropriate only for sets of variables that show linear relations.
Further, the components produced by PCA are difficult to interpret. Also, when the variables
differ widely in their variances, the data need to be normalised before
applying PCA in order to mitigate the risk arising from differences in scale. The
resulting “covariance matrix” is difficult to analyse. At times, when the training data is taken
from a blind source, the simple invariance also cannot be evaluated in PCA. PCA is also
sensitive to the scaling of the variables. PCA is not considered as powerful as
other methods that are also suitable for complex relations and categorical variables.
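
The point above about normalising data whose variances differ widely can be illustrated with a small sketch. It assumes two independent, randomly generated variables on very different scales (hypothetical data, not the report's), and compares the share of variance taken by the first principal component before and after z-score normalisation. The helper name `first_pc_share` is an assumption made for this example.

```python
import numpy as np

def first_pc_share(X):
    """Share of total variance captured by the first principal component."""
    eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))
    return eigvals.max() / eigvals.sum()

rng = np.random.default_rng(0)
# Hypothetical data: two independent variables, the second on a much larger scale
X = np.column_stack([rng.normal(0, 1, 500), rng.normal(0, 1000, 500)])
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # z-score normalisation

raw_share = first_pc_share(X)   # dominated by the large-scale variable
std_share = first_pc_share(Z)   # close to an even split for independent variables
print(f"raw: {raw_share:.3f}, normalised: {std_share:.3f}")
```

On raw data the large-scale variable absorbs almost all of the variance, whereas after normalisation neither variable dominates.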
Question 2
The training data of 3000 customers was derived by applying the standard partition in
XLMiner, and this data was used to create the pivot table, which is shown in Excel.
(a) Pivot table for training data is shown below:
Column variable : Online
First row variable : CC (credit card)
Second row variable : Loan (Personal loan)
Variable = 0 (Customer does not use the respective variable)
Variable =1 (Customer uses the respective variable)
For example, CC = 1 represents that the customer uses the credit card issued by Universal
Bank. Further, Loan = 0 represents that the customer would not accept the loan offer provided
by Universal Bank.
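
A pivot table with this layout (Online as the column variable, CC and Loan as nested row variables) can also be built outside Excel. The sketch below uses `pandas.crosstab` on randomly generated 0/1 data. The data frame `train` and its column names are assumptions standing in for the training partition, so the counts it prints are not the report's counts.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 3000
# Hypothetical stand-in for the training partition; values are random 0/1 flags
train = pd.DataFrame({
    "CC": rng.integers(0, 2, n),
    "Loan": rng.integers(0, 2, n),
    "Online": rng.integers(0, 2, n),
})

# Rows: CC then Loan; columns: Online; cells: number of customers
pivot = pd.crosstab([train["CC"], train["Loan"]], train["Online"])
print(pivot)
```

Each cell counts the customers with that combination of CC, Loan and Online values, which is the count the later probability calculations draw on.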
(b) Total number of customers who hold the credit card and also use the Universal Bank
service online = 522
Of these, the number of customers who would also accept the loan offer = 51
Probability = 51/522 = 0.0977
9.77% is the probability that a randomly chosen customer will accept the loan offer, given
that the customer has a credit card and uses the online banking service.
(c) The two pivot tables and probability quantities are shown below:
Pivot table (1)
Column variable : Online
Row variable : Loan
Pivot table (2)
Column variable : Credit Card
Row variable : Loan
Probability quantities
(i) Proportion of customers who have a credit card among those who would accept the loan offer: $P(\mathrm{CC}=1 \mid \mathrm{Loan}=1) = \frac{93}{304} = 0.305$
(ii) Probability $P(\mathrm{Online}=1 \mid \mathrm{Loan}=1) = \frac{183}{304} = 0.601$
(iii) Proportion of customers who would accept the loan offer: $P(\mathrm{Loan}=1) = \frac{304}{3000} = 0.101$
(iv) Probability $P(\mathrm{CC}=1 \mid \mathrm{Loan}=0) = \frac{800}{2696} = 0.296$
(v) Probability $P(\mathrm{Online}=1 \mid \mathrm{Loan}=0) = \frac{1586}{2696} = 0.588$
(vi) Probability $P(\mathrm{Loan}=0) = \frac{2696}{3000} = 0.898$
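
As a check on the arithmetic, the six quantities can be recomputed from the counts quoted above. This is a minimal sketch. The variable names are assumptions, and the helper `trunc3` truncates (rather than rounds) to three decimals, because the figures in the list match the ratios truncated in that way.

```python
import math

# Counts quoted in the list above (training data of 3000 customers)
n_total = 3000
n_loan1, n_loan0 = 304, 2696
cc1_loan1, cc1_loan0 = 93, 800
online1_loan1, online1_loan0 = 183, 1586

def trunc3(x):
    """Truncate to three decimal places, as the listed figures appear to be."""
    return math.floor(x * 1000) / 1000

quantities = {
    "P(CC=1|Loan=1)": cc1_loan1 / n_loan1,
    "P(Online=1|Loan=1)": online1_loan1 / n_loan1,
    "P(Loan=1)": n_loan1 / n_total,
    "P(CC=1|Loan=0)": cc1_loan0 / n_loan0,
    "P(Online=1|Loan=0)": online1_loan0 / n_loan0,
    "P(Loan=0)": n_loan0 / n_total,
}
for name, value in quantities.items():
    print(f"{name} = {trunc3(value):.3f}")
```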
(d) The Naïve Bayes probability has been calculated from the quantities computed above.
$P(A) = 0.305 \times 0.601 \times 0.101$
$P(T) = (0.305 \times 0.601 \times 0.101) + (0.296 \times 0.588 \times 0.898)$
$P(\mathrm{Loan}=1 \mid \mathrm{CC}=1, \mathrm{Online}=1) = \frac{0.305 \times 0.601 \times 0.101}{(0.305 \times 0.601 \times 0.101) + (0.296 \times 0.588 \times 0.898)} = 10.6\%$
The Naïve Bayes probability for the training data is 10.60%.
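
The calculation in part (d) can be reproduced with a few lines of Python. This is a sketch only: the function name `naive_bayes_posterior` and its parameter names are assumptions, and the inputs are the three-decimal quantities from part (c).

```python
def naive_bayes_posterior(p_cc_1, p_online_1, prior_1, p_cc_0, p_online_0, prior_0):
    """P(Loan=1 | CC=1, Online=1) under the naive (conditional independence) assumption.

    Arguments ending in _1 are conditioned on Loan=1, those ending in _0 on Loan=0.
    """
    numerator = p_cc_1 * p_online_1 * prior_1                 # P(A) in the text
    denominator = numerator + p_cc_0 * p_online_0 * prior_0   # P(T) in the text
    return numerator / denominator

posterior = naive_bayes_posterior(0.305, 0.601, 0.101, 0.296, 0.588, 0.898)
print(f"P(Loan=1 | CC=1, Online=1) = {posterior:.1%}")
```

The numerator multiplies the two class-conditional probabilities by the prior for Loan = 1, and the denominator adds the matching product for Loan = 0, so the result is a proper probability between 0 and 1.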
(e) Customers who are active users of the online service and have also been issued a credit card
by the bank would be the most likely candidates for receiving a personal loan, on the basis of
the computations in parts (c) and (d).