Data Mining and Business Intelligence

Added on  2020/03/16

AI Summary
This assignment delves into the applications of data mining and visualization in business intelligence. It covers association rule mining, interpreting generated rules, and exploring cluster analysis techniques like K-means. The assignment discusses how to analyze customer purchasing patterns, identify clusters based on transaction data, and formulate targeted business strategies based on cluster insights.

Data Mining and Visualization for Business Intelligence
Assignment - 3
[Pick the date]
Student Name
Contents
1.1 Association Rules
1.1.1 I)
1.1.2 II)
1.1.3 III)
1.2 Cluster Analysis
1.2.1 A)
1.2.2 B)
1.2.3 C)
1.2.4 D)
1.2.5 E)
1.1 Association Rules
1.1.1 I)
Interpretation of the first 3 rules:
Rule 1:
When a customer buys Brushes & Concealer (the antecedent, A), they also tend to
buy Bronzer & Nail Polish (the consequent, C). The rule has a confidence of 80%.
The antecedent A appears in 77 transactions and the consequent C in 103; both
occur together in 62 transactions, giving a lift ratio of 3.90 (Gupta, Garg, &
Sharma, 2014; Rajak & Gupta, 2008; Sujatha & CH, 2011).
Rule 2:
This is the reverse of Rule 1: when a customer buys Nail Polish & Bronzer, they
also tend to buy Brushes & Concealer. The support for A is 103, the support for C
is 77, and both occur together in 62 transactions. The confidence for this rule
is lower, at about 60% (62 of 103).
Rule 3:
If a customer purchases Nail Polish, Concealer & Bronzer together, they also tend
to buy Brushes; this rule has a confidence of 81%.
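The figures quoted for Rules 1 and 2 can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the total number of transactions is not stated in the text, so `N_TRANSACTIONS = 500` is an assumption, chosen because it reproduces a lift of about 3.9.

```python
# Minimal sketch: confidence and lift from the counts quoted for Rules 1 and 2.
# N_TRANSACTIONS is an assumption (not stated in the text); 500 is used because
# it reproduces a lift ratio of about 3.9.
N_TRANSACTIONS = 500


def confidence(support_both: int, support_antecedent: int) -> float:
    """Share of antecedent transactions that also contain the consequent."""
    return support_both / support_antecedent


def lift(support_both: int, support_antecedent: int,
         support_consequent: int, n_transactions: int) -> float:
    """Confidence divided by the baseline frequency of the consequent."""
    baseline = support_consequent / n_transactions
    return confidence(support_both, support_antecedent) / baseline


# Rule 1: {Brushes, Concealer} -> {Bronzer, Nail Polish}
rule1_conf = confidence(62, 77)
rule1_lift = lift(62, 77, 103, N_TRANSACTIONS)

# Rule 2: the reverse direction, so antecedent and consequent swap roles.
rule2_conf = confidence(62, 103)
rule2_lift = lift(62, 103, 77, N_TRANSACTIONS)

print(f"Rule 1: confidence={rule1_conf:.2f}, lift={rule1_lift:.2f}")
print(f"Rule 2: confidence={rule2_conf:.2f}, lift={rule2_lift:.2f}")
```

Lift is symmetric in A and C, so both directions share the same lift; only the confidence differs, because the denominator changes from 77 to 103.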
1.1.2 II)
Several criteria help to judge how useful the rules generated by the algorithm are. First,
examine the confidence, which shows how often the consequent appears in transactions that
contain the antecedent. Second, the rule should be logical and backed by business
understanding. For example, Rule 6 has a confidence above 80% and a lift ratio of 3.7, and
it also makes logical sense. Hence, this rule can be considered an effective rule to apply.
1.1.3 III)
When the minimum confidence is set at 75%, the number of association rules
decreases, because the algorithm keeps only those rules whose confidence is
greater than or equal to 75%. Confidence is the ratio of the support for A and C
together to the support for A alone. Hence, a rule qualifies only when a larger
share of the transactions containing the antecedent also contain the consequent.
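The effect of the threshold can be illustrated with the two rules whose counts are given in section 1.1.1. This is a minimal sketch: the dictionary layout and the function name are illustrative assumptions, and the 50% comparison threshold is hypothetical, used only to show the contrast with 75%.

```python
# Minimal sketch: keep only rules whose confidence meets a minimum threshold.
# Counts for Rules 1 and 2 come from the text; the data layout, function name
# and the 50% comparison threshold are illustrative assumptions.
rules = [
    {"name": "Rule 1", "support_a": 77, "support_a_and_c": 62},
    {"name": "Rule 2", "support_a": 103, "support_a_and_c": 62},
]


def filter_by_confidence(candidate_rules, min_confidence):
    """Return the names of rules with confidence >= min_confidence."""
    kept = []
    for rule in candidate_rules:
        # Confidence = support(A and C) / support(A)
        conf = rule["support_a_and_c"] / rule["support_a"]
        if conf >= min_confidence:
            kept.append(rule["name"])
    return kept


kept_at_50 = filter_by_confidence(rules, 0.50)
kept_at_75 = filter_by_confidence(rules, 0.75)
print("Kept at 50%:", kept_at_50)
print("Kept at 75%:", kept_at_75)
```

Raising the threshold can only remove rules from the list, never add them, which is why the rule count falls.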
1.2 Cluster Analysis
1.2.1 A)
There are 5 clusters in total. This number was specified in advance, before
running the algorithm in the software, which helps the algorithm to converge.
1.2.2 B)
When the data are not normalized, the scale of each variable affects the
calculated distance, so variables measured on larger scales dominate the measure.
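A small numerical illustration of this point, using only the Python standard library. The three customers and the two variables (an account balance and a flight count) are hypothetical values invented for the example, and z-score scaling is one common normalization, assumed here because the text does not say which method was used.

```python
# Minimal sketch: Euclidean distance before and after z-score normalization.
# The customer values and variable choice are hypothetical, for illustration.
import math
from statistics import mean, pstdev


def euclidean(a, b):
    """Straight-line distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def zscore_columns(rows):
    """Rescale every column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stdevs = [pstdev(col) for col in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, stdevs))
        for row in rows
    ]


# (balance, flights) for three hypothetical customers
customers = [
    (20000.0, 2.0),
    (21000.0, 30.0),   # similar balance to the first, very different flights
    (60000.0, 3.0),    # similar flights to the first, very different balance
]

raw_01 = euclidean(customers[0], customers[1])
raw_02 = euclidean(customers[0], customers[2])

scaled = zscore_columns(customers)
scaled_01 = euclidean(scaled[0], scaled[1])
scaled_02 = euclidean(scaled[0], scaled[2])

print(f"raw:    d(0,1)={raw_01:.1f}  d(0,2)={raw_02:.1f}")
print(f"scaled: d(0,1)={scaled_01:.2f}  d(0,2)={scaled_02:.2f}")
```

On the raw values the balance column decides the distances almost entirely; after scaling, a large difference in flights counts about as much as a large difference in balance.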
1.2.3 C)
[Figure: Dendrogram with 30 leaves]
There are 5 clusters marked with different colors.
1.2.4 D)
We ran K-means clustering with five centroids. With the maximum number
of iterations set at 50, the algorithm converged (it found the optimal
clusters based on distance). The clusters given by the hierarchical and
K-means methods are the same, so both techniques lead to the same
results.
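The K-means procedure described here can be sketched as follows. This is a minimal standard-library illustration, not the software used in the assignment: the function name, the choice of the first k points as starting centroids, and the toy data (two obvious groups, so k = 2 rather than five) are all assumptions.

```python
# Minimal K-means sketch (illustrative; not the assignment's software).
# Assumptions: first-k initialization, Euclidean distance, hypothetical data.
import math


def kmeans(points, k, max_iter=50):
    """Return (centroids, labels, iterations_used)."""
    centroids = [points[i] for i in range(k)]
    labels = [None] * len(points)
    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Converged when no point changes cluster.
        if new_labels == labels:
            return centroids, labels, iteration
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, labels, max_iter


points = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
centroids, labels, iterations = kmeans(points, k=2, max_iter=50)
print("labels:", labels)
print("iterations used:", iterations)
```

The iteration cap only acts as a safety limit; on well-separated data the assignments usually stop changing long before 50 iterations are reached.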

1.2.5 E)
Before making offers to customers based on the clusters, every cluster
must be examined to understand the customer personas that fall into it,
and each cluster should be validated against that understanding. For
example, customers with higher balances should typically fall in one
cluster, and the results confirm this: they all fall in cluster 4.
People in cluster 4 can therefore be offered more reward points to
retain them, as they generate higher revenues for the business. Sending
them gifts or offering preferred seat selection can show appreciation
for their loyalty. People who generally do not transact fall in cluster
1. Offers specific to this segment, such as discounts and reward points,
can be used so that the business can extract more value from the segment
(Correa, González, Nieto, & Amezquita, 2012; Iaci & Singh, 2012;
Trebuna, Halcinova, & Fil’o, 2014).
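One way to examine each cluster before designing offers is to compute the average of every variable per cluster. The sketch below assumes hypothetical rows of (balance, transactions) and hypothetical cluster labels; only the idea that cluster 4 holds high-balance customers and cluster 1 holds rarely transacting customers comes from the text.

```python
# Minimal sketch: per-cluster averages to help describe customer personas.
# The rows, labels and function name are hypothetical, for illustration only.
from collections import defaultdict
from statistics import mean


def profile_clusters(rows, labels):
    """Map each cluster label to the column-wise mean of its members."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    return {
        label: tuple(mean(column) for column in zip(*members))
        for label, members in grouped.items()
    }


# (balance, transactions) for four hypothetical customers
rows = [(90000, 12), (110000, 15), (5000, 0), (7000, 1)]
labels = [4, 4, 1, 1]

profiles = profile_clusters(rows, labels)
for label in sorted(profiles):
    print(f"cluster {label}: mean (balance, transactions) = {profiles[label]}")
```

Comparing these averages across clusters gives a quick check that a cluster matches the persona an offer is aimed at.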
References
Correa, A., González, A., Nieto, C., & Amezquita, D. (2012). Constructing a Credit Risk
Scorecard using Predictive Clusters. SAS Global Forum.
Gupta, A. K., Garg, R. R., & Sharma, V. K. (2014). Association Rule Mining Techniques
between Set of Items. International Journal of Intelligent Computing and Informatics, 1(1).
Iaci, R., & Singh, A. K. (2012). Clustering high dimensional sparse casino player tracking
datasets. UNLV Gaming Research & Review Journal, 16(1), 21–43.
Rajak, A., & Gupta, M. (2008). Association Rule Mining: Applications in Various Areas. In
International Conference on Data Management.