Data Mining Analysis & Clustering Techniques

Added on  2020/03/16

AI Summary
This data mining assignment analyzes customer purchase patterns using association rule mining, revealing relationships between products like brushes, nail polish, and bronzer. It further delves into hierarchical and K-means clustering techniques to segment customers based on their flight and bonus transaction behaviors. The analysis includes interpreting dendrogram outputs, understanding the impact of normalization, and identifying distinct customer clusters for targeted marketing strategies.
DATA MINING
STUDENT ID:
Question 1
Output (Association Rules)
i) Interpretation
Rule 1 indicates, at a 100% confidence level, that a purchase of a brush is accompanied by a
purchase of nail polish. Rule 2 indicates, at a 63.22% confidence level, that a purchase of nail
polish is accompanied by a purchase of a brush. Rule 3 indicates, at a 59.19% confidence level,
that a purchase of nail polish is accompanied by a purchase of bronzer.
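As a rough illustration of how such confidence figures arise, the sketch below computes confidence as the share of transactions containing the antecedent that also contain the consequent. The baskets and the `confidence` helper are hypothetical and are not the assignment's data; they only show why a brush-to-nail-polish rule and its reverse can carry different confidence levels.

```python
def confidence(transactions, antecedent, consequent):
    """Share of transactions with the antecedent that also hold the consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    with_both = sum(
        1 for t in transactions if (antecedent | consequent) <= set(t)
    )
    if with_antecedent == 0:
        raise ValueError("antecedent never occurs, confidence is undefined")
    return with_both / with_antecedent


# Hypothetical baskets for illustration only (not the assignment's dataset).
baskets = [
    {"brush", "nail polish"},
    {"brush", "nail polish", "bronzer"},
    {"nail polish", "bronzer"},
    {"nail polish"},
    {"bronzer"},
]

# Every basket with a brush also has nail polish, but not the other way round.
brush_to_polish = confidence(baskets, {"brush"}, {"nail polish"})
polish_to_brush = confidence(baskets, {"nail polish"}, {"brush"})
print(brush_to_polish, polish_to_brush)
```

The asymmetry comes from the denominator: the two rules share the same joint count but divide it by different antecedent counts.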
ii) A rule is redundant with respect to another rule when it carries the same support and
confidence as that rule for every dataset. In the given data, rule 2 appears redundant with
respect to rule 1, since both rules involve the same items (brush and nail polish); however, the
confidence level of rule 2 is lower than that of rule 1.
iii) The minimum confidence level chosen influences the number of rules displayed in the output.
For instance, the output outlined above lists rules with a confidence level greater than 50%. If
the minimum confidence level is increased to 75%, only the first rule would be listed, since it is
the only one that fulfils the condition of having a confidence level of at least 75%.
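The effect of the threshold can be checked with a short sketch that filters the three reported rules by minimum confidence. The confidence values are those quoted in part i); the dictionary layout and the `rules_meeting` helper are assumptions made for illustration and not the output format of any particular mining tool.

```python
# Confidence levels (in percent) as reported in part i).
rule_confidence = {
    "rule 1: brush -> nail polish": 100.00,
    "rule 2: nail polish -> brush": 63.22,
    "rule 3: nail polish -> bronzer": 59.19,
}


def rules_meeting(min_confidence, rules):
    """Return the names of rules whose confidence is at least min_confidence."""
    return [name for name, conf in rules.items() if conf >= min_confidence]


at_50 = rules_meeting(50, rule_confidence)  # all three rules pass
at_75 = rules_meeting(75, rule_confidence)  # only rule 1 passes
print(at_50)
print(at_75)
```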
Question 2
a) Dendrogram Output
For the above dendrogram, assuming a cut-off distance of 1,000, three clusters would be observed,
along with a number of smaller sub-clusters.
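Cutting a dendrogram at a given distance amounts to merging clusters until the closest remaining pair is farther apart than the cut-off. The sketch below shows this idea under stated assumptions: it uses single linkage on one-dimensional values, whereas the linkage method and variables behind the assignment's dendrogram are not stated, and the `values` list is hypothetical data chosen so that a cut-off of 1,000 leaves three clusters.

```python
def cut_single_linkage(points, cutoff):
    """Agglomerative single-linkage clustering, stopped at a distance cut-off."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest single-linkage distance.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        distance, i, j = best
        if distance > cutoff:
            break  # every remaining merge would sit above the cut-off line
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical one-dimensional values (not the assignment's data).
values = [100, 300, 2500, 2900, 6000, 6400, 6500]
clusters_at_1000 = cut_single_linkage(values, cutoff=1000)
print(len(clusters_at_1000), clusters_at_1000)
```

Raising the cut-off merges more clusters and lowering it splits them, which is why the number of clusters read off a dendrogram depends on where the line is drawn.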
b) The absence of normalisation would give rise to the following issues.
The distance computation would be inaccurate unless equal weights are given to the
variables of interest.
The variable with the largest scale would tend to dominate the distance measure; hence,
normalisation is critical to eliminate this effect.
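The scale effect described above can be seen in a small sketch that compares Euclidean distances before and after z-score normalisation. The three customers and the two variables (a miles balance and a count of flight transactions) are hypothetical, and z-scoring is only one of several possible normalisation schemes.

```python
import math
from statistics import mean, pstdev


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def z_score_columns(rows):
    """Rescale each column to mean 0 and standard deviation 1.

    Assumes no column is constant (a zero standard deviation would divide by zero).
    """
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) for c in columns]
    return [
        tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in rows
    ]


# Hypothetical customers: (miles balance, flight transactions in 12 months).
customers = [(20000, 1), (21000, 15), (60000, 2)]

# Raw distances are driven almost entirely by the miles balance.
raw_01 = euclidean(customers[0], customers[1])
raw_02 = euclidean(customers[0], customers[2])

# After normalisation both variables contribute on a comparable scale.
scaled = z_score_columns(customers)
scaled_01 = euclidean(scaled[0], scaled[1])
scaled_02 = euclidean(scaled[0], scaled[2])
print(raw_01, raw_02, scaled_01, scaled_02)
```

On the raw data the first two customers look nearly identical even though their flight activity differs sharply; after rescaling, that difference is no longer swamped by the miles balance.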
c) The various clusters are named in the following table.
d) K-Means Clustering
Observation: Cluster 1 – High Net-worth Flyers, since the average number of flight transactions
in the last 12 months exceeds 15.
Cluster 2 – Middle Class Flyers, since the non-flight bonus transactions and the miles earned
from them are the lowest.
Cluster 3 – Non-Frequent Flyers, since the non-flight bonus transactions and the miles earned
from them are quite significant while the flight transactions are quite low.
Conclusion: The results obtained from K-means clustering do not resemble the corresponding
results from hierarchical clustering.
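For reference, the K-means procedure used in part d) can be sketched as repeated assignment of each point to its nearest centroid followed by recomputation of the centroids. This is a minimal sketch under stated assumptions: the points (flight transactions, non-flight bonus transactions), the fixed starting centroids and the `k_means` helper are all hypothetical, and the assignment's actual tool, variables and initialisation are not stated.

```python
import math


def k_means(points, centroids, iterations=100):
    """Lloyd's algorithm: assign to nearest centroid, then update centroids."""
    centroids = list(centroids)
    groups = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins the group of its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)), key=lambda i: math.dist(p, centroids[i])
            )
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group
        # (an empty group keeps its previous centroid).
        updated = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return centroids, groups


# Hypothetical customers: (flight transactions, non-flight bonus transactions).
points = [
    (16, 10), (18, 12), (17, 11),  # many flight transactions
    (3, 1), (4, 2), (2, 1),        # few transactions of either kind
    (1, 20), (2, 22), (1, 21),     # few flights, many bonus transactions
]
# Fixed starting centroids keep the example deterministic.
centroids, groups = k_means(points, [points[0], points[3], points[6]])
print(centroids)
```

Because K-means depends on the number of clusters and the starting centroids, while hierarchical clustering depends on the linkage method and cut-off, the two approaches need not produce matching segments.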
e) The clusters to be targeted, along with the requisite offers, are outlined as follows.