Data Mining: Association & Hierarchical Clustering

Added on  2020/03/16

AI Summary
This assignment delves into data mining concepts, focusing on association rule mining and hierarchical clustering. It examines the identification of patterns and relationships within datasets using XL Miner software. The analysis includes generating non-redundant association rules, interpreting cluster structures, and comparing results from different clustering methods. Furthermore, it discusses targeting specific customer segments based on identified patterns.

Data Mining
Student Id and Name
[Pick the date]

Question 1
(i) The first association rule indicates that customers who purchase brushes also purchase
nail polish.
Associated confidence level: 100%
The second association rule indicates that customers who purchase nail polish also tend to
purchase brushes.
Associated confidence level: 63.22%
The third association rule indicates that customers who purchase nail polish also tend to
purchase bronzer.
Associated confidence level: 59.19%
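The confidence levels quoted above can be illustrated with a minimal sketch. The baskets below are hypothetical and are not the assignment's data; the helper names `support` and `confidence` are likewise chosen only for illustration. Confidence is taken here as the share of transactions containing the antecedent that also contain the consequent.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Share of antecedent transactions that also contain the consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Hypothetical baskets (assumption): every brush buyer also buys nail polish,
# but only half of the nail polish buyers buy brushes.
baskets = [
    {"brushes", "nail polish"},
    {"brushes", "nail polish", "blush"},
    {"nail polish", "blush"},
    {"nail polish"},
    {"blush"},
]

conf_brushes_to_polish = confidence(baskets, {"brushes"}, {"nail polish"})
conf_polish_to_brushes = confidence(baskets, {"nail polish"}, {"brushes"})
```

In this toy data the two directions of the same item pair give different confidence values, which mirrors why the first two rules in the output carry different confidence levels.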
(ii) Redundant rules are a common problem in association rule mining, and hence they need to
be identified and eliminated. For the present output, an apt candidate for redundancy is rule 2.
Its ancestor rule is essentially rule 1: both rules involve the same two items (brushes and
nail polish), and the lift ratio for the two is the same. Hence, rule 2 does not provide any
incremental value and is redundant (Hong, Kuo & Chi, 1999).
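The utility analysis of the association rules needs to pay attention to the two measures
identified below.
Lift ratio: indicates the strength of the association, i.e. how much more likely the consequent
is when the antecedent is present than it is across all transactions
Confidence level: indicates how often the consequent is purchased in transactions that contain
the antecedent
Rules with a high value on at least one of these measures are usually considered useful, as they
contain valuable information regarding the underlying purchasing patterns of consumers (Berkhin,
2015).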
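The observation that two rules built from the same pair of items share one lift ratio can be checked with a short sketch. It assumes the usual definition, lift(A -> B) = confidence(A -> B) / support(B); the transaction counts are hypothetical and the function name `lift` is illustrative.

```python
def lift(n_total, n_antecedent, n_consequent, n_both):
    """Lift of a rule computed from raw transaction counts."""
    confidence = n_both / n_antecedent   # P(consequent | antecedent)
    benchmark = n_consequent / n_total   # P(consequent) across all transactions
    return confidence / benchmark


# Hypothetical counts (assumption, not taken from the XL Miner output).
n_total, n_brushes, n_polish, n_both = 1000, 150, 280, 150

lift_brushes_to_polish = lift(n_total, n_brushes, n_polish, n_both)
lift_polish_to_brushes = lift(n_total, n_polish, n_brushes, n_both)
```

Both calls reduce to n_both * n_total / (n_brushes * n_polish), so swapping antecedent and consequent changes the confidence but not the lift.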
(iii) XL Miner output (minimum confidence level changed to 75%)
Raising the minimum confidence level to 75% results in only one rule being displayed, as the
remaining rules, such as those with confidence levels of 63.22% and 59.19%, do not meet this
criterion. However, caution needs to be exercised when setting the confidence level too high, as
certain important rules with a high lift ratio may be excluded (Prithiviraj & Porkodi, 2011).
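The effect of the threshold can be reproduced with a small sketch that uses the three confidence levels reported in Question 1 (i). The rule labels and the helper `filter_rules` are illustrative names, not part of XL Miner.

```python
# Confidence levels (in percent) as reported for the first three rules.
rules = [
    ("rule 1", 100.00),
    ("rule 2", 63.22),
    ("rule 3", 59.19),
]


def filter_rules(rules, min_confidence):
    """Keep only the rules whose confidence meets the minimum threshold."""
    return [rule for rule in rules if rule[1] >= min_confidence]


kept = filter_rules(rules, min_confidence=75.0)
```

With a 75% threshold only rule 1 survives, while a threshold of 60% would also keep rule 2.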
Question 2
(a) Dendrogram
There are three clusters, based on the following evidence.
The highlighted dendrogram, assuming a cutoff at distance 995, shows three clusters.
The input supplied when computing the hierarchical clustering specified the formation of three clusters.
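The idea of reading the number of clusters off a dendrogram at a cutoff distance can be sketched as follows. This is a minimal illustration that assumes single linkage and Euclidean distance, neither of which is confirmed by the assignment; the points and the cutoff of 5 are hypothetical. Merging stops as soon as the closest pair of clusters is farther apart than the cutoff, which corresponds to drawing a horizontal line across the dendrogram at that height.

```python
import math


def single_linkage_clusters(points, cutoff):
    """Merge clusters until the closest pair is farther apart than `cutoff`."""
    clusters = [[i] for i in range(len(points))]

    def gap(a, b):
        # Single linkage: distance between the closest members of two clusters.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        distance, i, j = min(
            (gap(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if distance > cutoff:
            break  # every remaining merge lies above the cut line
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical points forming three well-separated groups (assumption).
points = [(0, 0), (1, 0), (10, 0), (11, 0), (30, 0), (31, 0)]
clusters = single_linkage_clusters(points, cutoff=5)
```

A larger cutoff produces fewer clusters; for these points a cutoff of 10 merges the first two groups and leaves two clusters.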
(b) The absence of normalisation can lead to multiple issues. To begin with, the distances between
the centroids would not be accurate, as a scale effect would be present, and hence the dendrogram
and clusters would not be accurate. In such a case, the variable with the larger scale would
dominate the clustering process and lead to inaccurate results. Hence it is
advisable to normalise the variables or, alternatively, to give equal weight to the variables of
interest (Chau et al., 2006).
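The scale effect can be illustrated with a short sketch. It assumes z-score normalisation (one common choice; the assignment does not state which method XL Miner applied) and four hypothetical customers described by balance miles and flight transactions.

```python
import math
from statistics import mean, pstdev


def zscore_columns(rows):
    """Rescale every column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [mean(column) for column in columns]
    spreads = [pstdev(column) for column in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, spreads))
        for row in rows
    ]


# Hypothetical customers (assumption): (balance miles, flight transactions).
customers = [(20000, 1), (21000, 12), (60000, 2), (61000, 11)]
scaled = zscore_columns(customers)

# Distances from customer 0 to customers 1 and 2, before and after scaling.
raw_to_1 = math.dist(customers[0], customers[1])
raw_to_2 = math.dist(customers[0], customers[2])
scaled_to_1 = math.dist(scaled[0], scaled[1])
scaled_to_2 = math.dist(scaled[0], scaled[2])
```

On the raw values, customer 0 appears closest to customer 1 because the miles column dwarfs the transaction counts, even though their flight activity differs sharply. After scaling, both variables contribute comparably and customer 2 becomes the nearer neighbour.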
(c) XL Miner Output (Cluster 1)
Features: Low number of transactions in the last 12 months for both flight and non-flight
transactions. The balance miles that qualify for awards are at the lowest level amongst the clusters.
Suitable Name: “Middle Class Flyer”
XL Miner Output (Cluster 2)
Features: The number of flight transactions over the preceding year is high. The same holds for
non-flight bonus transactions. The balance miles that qualify for awards are at the highest level
amongst the clusters (Shumueli et al., 2016).
Suitable Name: “High Networth Flyer”
XL Miner Output (Cluster 3)
Features: Different behavior for flight and non-flight bonus transactions. The former is quite
low while the latter is the highest amongst all the clusters (Zaki, 2000).
Suitable Name: “Non-frequent Flyer”
(d) XL Miner Output (K Means Clustering)
The patterns can be compared only if each cluster in the K means clustering represents the same
segment as the corresponding cluster in the hierarchical clustering (Liebowitz, 2015).
The first cluster considered is cluster 1. Its critical aspects are as follows.
Very high number of flight-based transactions (>15)
Very high balance miles (>200,000)
Very high value for Qual_Miles
It is apparent that the segment captured by Cluster 1 is “High Networth Flyers”. But this
classification does not match the hierarchical clustering output, where Cluster 1 corresponds to
the “Middle Class Flyer”. Hence a difference in pattern is established and there is no need to
compare the other clusters (Ana, 2014).
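One way to check whether a K means cluster represents the same segment as a hierarchical cluster, independently of how each method happens to number its clusters, is to cross-tabulate the two sets of labels record by record. The sketch below uses hypothetical label vectors; the helper names `cross_tab` and `best_match` are illustrative and not part of XL Miner.

```python
from collections import Counter


def cross_tab(labels_a, labels_b):
    """Count how many records fall into each pair of (label in A, label in B)."""
    return Counter(zip(labels_a, labels_b))


def best_match(labels_a, labels_b):
    """For each cluster in A, find the cluster in B that shares the most records."""
    table = cross_tab(labels_a, labels_b)
    match = {}
    for a in sorted(set(labels_a)):
        shared = {b: n for (x, b), n in table.items() if x == a}
        match[a] = max(shared, key=shared.get)
    return match


# Hypothetical cluster assignments for eight records (assumption).
hierarchical = [1, 1, 1, 2, 2, 3, 3, 3]
k_means = [2, 2, 2, 1, 1, 3, 3, 1]

matches = best_match(hierarchical, k_means)
```

In this toy example hierarchical cluster 1 lines up with K means cluster 2 and vice versa, so a comparison by cluster number alone would report a mismatch even though the memberships largely agree.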
(e) Target and Offers
References
Ana, A. (2014). Integration of Data Mining in Business Intelligence System (4th ed.). Sydney:
IGA Global.
Berkhin, P. (2015). Survey of Clustering Data Mining Techniques. Accrue Software, Inc., 123-47.
https://www.cc.gatech.edu/~isbell/reading/papers/berkhin02survey.pdf
Chau, M., Cheng, R., Kao, B., & Ng, J. (2006). Uncertain Data Mining: An Example in Clustering
Location Data. Retrieved from
https://link.springer.com/chapter/10.1007%2F11731139_24
Hong, P.T., Kuo, S.C., & Chi, C.S. (1999). Mining Association Rules From Quantitative Data.
Intelligent Data Analysis, 3, 363-376.
http://www.sciencedirect.com/science/article/pii/S1088467X99000281
Liebowitz, J. (2015). Business Analytics: An Introduction (2nd ed.). New York: CRC Press.
Prithiviraj, P., & Porkodi, R. (2011). A Comparative Analysis of Association Rule Mining
Algorithms in Data Mining: A Study. Retrieved from
http://www.imedpub.com/articles/a-comparative-analysis-of-association-rulemining-algorithms-in-data-mining-a-study.pdf
Shumueli, G., Bruce, C.P., Yahav, I., Patel, R.N., Kenneth, C., & Lichtendahl, J. (2016). Data
Mining For Business Analytics: Concepts, Techniques and Applications (2nd ed.). London:
John Wiley & Sons.
Zaki, M.J. (2000). Generating non-redundant association rules. In: Proceedings of the ACM
SIGKDD, pp. 34–43.