Data Mining Analysis with XL Miner


Added on 2020/03/15

AI Summary

This assignment applies data mining techniques using XL Miner. It analyzes association rules based on customer purchase behavior and uses hierarchical clustering to group customers into distinct segments. The analysis highlights the role of the minimum confidence level in rule generation and explores how normalization affects distance calculations in clustering. The assignment concludes by discussing cluster targeting and offer strategies based on the identified customer groups.


DATA MINING
STUDENT ID:

Question 1
XL Miner Output
i) The association rules are listed above in decreasing order of lift ratio. The first three rules
are highlighted below.
• In accordance with rule 1, the brush purchase tends to be followed by the nail polish
purchase. The associated confidence level is 100%.
• In accordance with rule 2, the nail polish purchase tends to be followed by the brush
purchase. The associated confidence level is 63.22%.
• In accordance with rule 3, the nail polish purchase tends to be followed by the bronzer
purchase. The associated confidence level is 59.20%.
ii) Lowering the minimum confidence level would bring roughly the first two dozen rules into the
output. Rule 16 and Rule 17 are redundant. A similar redundancy is noticeable for Rule 2 when
compared with Rule 1, although the confidence level of the former is much lower (Ana, 2014).
These rules have significant utility because they outline customer buying behaviour. However, this
utility can only be assessed if the rules are considered as a whole rather than individually. Due
consideration must also be given to the individual characteristics of each rule, which are captured
by two measures, namely support and confidence (Abramowicz, 2013).
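The support and confidence measures mentioned above, together with the lift ratio used to rank the rules, can be sketched as follows. This is a minimal illustration only: the five baskets are hypothetical toy data, not the assignment's data set, and the function names are chosen for this example rather than taken from XL Miner.

```python
# Hypothetical toy baskets; these are NOT the assignment's data.
transactions = [
    {"brush", "nail polish"},
    {"nail polish", "bronzer"},
    {"nail polish"},
    {"brush", "nail polish", "bronzer"},
    {"bronzer"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    return sum(itemset <= b for b in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Estimated P(consequent | antecedent)."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

def lift(antecedent, consequent, baskets):
    """Confidence divided by the consequent's baseline support."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)

brush, polish = {"brush"}, {"nail polish"}
conf_brush_to_polish = confidence(brush, polish, transactions)
conf_polish_to_brush = confidence(polish, brush, transactions)
lift_brush_to_polish = lift(brush, polish, transactions)
lift_polish_to_brush = lift(polish, brush, transactions)
```

In this toy example the two mirrored rules share the same lift but have different confidence values, which is the kind of pattern that makes a pair of rules look redundant when they are ranked by lift ratio.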
iii) The number of rules displayed is driven by the minimum confidence level selected in XL Miner.
If it is raised to 75%, rules with a confidence below this level are not displayed. Hence, only one
rule appears in the output for this case, as is apparent from the output attached.
However, caution is needed when raising this level to very high values, as significant rules may be
omitted (Ragsdale, 2014).
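The effect of the minimum confidence level can be sketched with the three rules whose confidence values are quoted in part i). The remaining rules in the XL Miner output are not reproduced in this document, so they are left out; the 50% threshold and the function name `filter_rules` are assumptions made purely for illustration.

```python
# Confidence values (in %) for the first three rules as reported in part i).
# The other rules from the XL Miner output are not available here.
rules = [
    ("brush", "nail polish", 100.00),
    ("nail polish", "brush", 63.22),
    ("nail polish", "bronzer", 59.20),
]

def filter_rules(rules, min_confidence):
    """Keep rules whose confidence (in %) meets the minimum threshold."""
    return [r for r in rules if r[2] >= min_confidence]

kept_at_75 = filter_rules(rules, 75.0)  # threshold discussed in part iii)
kept_at_50 = filter_rules(rules, 50.0)  # hypothetical lower threshold
```

Among these three rules, only the brush to nail polish rule survives the 75% threshold, while all three pass the hypothetical 50% threshold.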
Question 2
a) XL Miner Output

In line with the dendrogram produced above, three major clusters are observable when a horizontal
line is drawn at a threshold distance of 1000.
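Drawing a horizontal line across a dendrogram amounts to stopping the merging process once the closest pair of clusters is further apart than the threshold. The sketch below illustrates this under stated assumptions: it uses single linkage (the linkage method used in the assignment is not stated), and the one-feature points are hypothetical values chosen so that a threshold of 1000 yields three groups.

```python
import math

def single_linkage_cut(points, threshold):
    """Merge the two closest clusters until the gap exceeds `threshold`."""
    clusters = [[p] for p in points]

    def gap(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(p, q) for p in a for q in b)

    while len(clusters) > 1:
        d, i, j = min(
            (gap(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > threshold:
            break  # this is where the horizontal line crosses the tree
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical one-feature points (not the assignment's data).
points = [(0,), (200,), (400,), (5000,), (5300,), (12000,)]
groups = single_linkage_cut(points, threshold=1000)
```

Raising the threshold merges more branches and gives fewer clusters; lowering it gives more.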
b) If the data are not normalised, two related issues can arise. First, the distance computation
is distorted because variables measured on larger scales receive a higher weight. Second, the
distance measure ends up being dominated by the variable with the largest scale, and hence
normalisation becomes necessary (Shmueli et al., 2016).
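The scale effect can be illustrated with a small sketch. The three customers and the two variables (balance and flight transactions) are hypothetical, and z-score standardisation is assumed as the normalisation method; the assignment's actual settings may differ.

```python
import math
from statistics import mean, pstdev

# Hypothetical customers as (balance, flight_transactions); not the real data.
customers = [(20000.0, 2.0), (21000.0, 30.0), (60000.0, 3.0)]

def zscore_columns(rows):
    """Rescale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(row, mus, sigmas)) for row in rows]

a, b, c = customers
# Raw Euclidean distances: the balance column dominates.
raw_ab, raw_ac = math.dist(a, b), math.dist(a, c)

na, nb, nc = zscore_columns(customers)
# Distances after standardisation: both variables contribute.
norm_ab, norm_ac = math.dist(na, nb), math.dist(na, nc)
```

On the raw data, customer b appears far closer to a than c does, even though b differs sharply from a in flight transactions. After standardisation the two distances are of similar size, because the large flight-transaction gap is no longer drowned out by the balance scale.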
c) The cluster labelling is carried out below (Ana, 2014).
The above table has been drawn based on the following output for different clusters obtained
through hierarchical clustering.
Cluster 1:

Cluster 2:
Cluster 3:

d) K-Means Clustering (XL Miner)
Based on the above output, the following cluster labels can be assigned.
Middle Class Flyers – Cluster 2 (limited flight transactions along with very limited non-flight
bonus transactions)
High Net-worth Flyers – Cluster 1 (the highest balance along with the highest flight transactions in
the last one year)
Non-Frequent Flyers – Cluster 3 (the highest non-flight bonus transactions of all the clusters, but
comparatively very few flight transactions)
It follows that the two clustering methods do not produce the same results (Shmueli et al., 2016).
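One reason the two methods can disagree is that k-means does not build a merge tree; it repeatedly assigns points to the nearest centroid and then recomputes the centroids. A minimal sketch of that procedure (Lloyd's algorithm) is given below. The six points, the two variable names and the choice of starting centroids are hypothetical, and this is not a reproduction of XL Miner's implementation.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids

# Hypothetical, already-normalised (flight, non-flight bonus) scores.
points = [(0.1, 0.2), (0.2, 0.1), (2.0, 0.3), (2.2, 0.2), (0.2, 2.1), (0.1, 2.3)]
labels, centroids = kmeans(points, centroids=[points[0], points[2], points[4]])
```

Because the outcome depends on the starting centroids and on the number of clusters requested, the resulting groups need not coincide with those obtained by cutting a dendrogram.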
e) Cluster Targeting And Offer

References
Abramowicz, W. (2013) Business Information Systems Workshops: BIS 2013 International
Workshops (5th ed.). New York: Springer.
Ragsdale, C. (2014) Spreadsheet Modeling and Decision Analysis: A Practical Introduction to
Business Analytics (7th ed.). London: Cengage Learning.
Shmueli, G., Bruce, C.P., Yahav, I., Patel, R. N., Kenneth, C., & Lichtendahl, J. (2016) Data
Mining For Business Analytics: Concepts Techniques and Application (2nd ed.). London:
John Wiley & Sons.
Ana, A. (2014) Integration of Data Mining in Business Intelligence System (4th ed.). Sydney:
IGI Global.