
Assignment on Data Mining

DATA MINING
STUDENT ID:
Question 1
The output for association rules is shown below.
i) Rule 1: If an individual buys brush, then nail polish is also purchased.
Confidence level: 100%
Rule 2: If an individual buys nail polish, then brush is also purchased.
Confidence level: 63.22%
Rule 3: If an individual buys nail polish, then bronzer is also purchased.
Confidence level: 59.19%
ii) Rule 2 is an instance of a redundant rule: Rule 1 already establishes the
relation between nail polish and brush, and it does so with a higher confidence level.
The utility of the given rules is assessed primarily through their confidence
levels, which estimate the probability that the consequent is purchased given
that the antecedent is purchased. For instance, Rule 1 is highly useful because
its confidence level is 100%, which implies that every purchase of brush in the
data is accompanied by a purchase of nail polish (Ana, 2014).
The utility of a rule depends on two parameters:
Confidence level
Lift ratio
Ideally, both should be high, but a rule may still be significant if only one of
them is high. Together, these two measures provide the relevant information
about the purchase patterns (Abramowics, 2013).
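As a rough illustration of how these two measures are computed, the sketch below derives support, confidence and lift from a small set of baskets. The baskets and the function names are hypothetical and are not taken from the assignment's data set or from XLMiner; they only show the standard definitions.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Confidence divided by the baseline support of the consequent."""
    return confidence(transactions, antecedent, consequent) / support(
        transactions, consequent
    )


# Hypothetical baskets, made up for illustration only.
baskets = [
    {"brush", "nail polish"},
    {"brush", "nail polish", "bronzer"},
    {"nail polish", "bronzer"},
    {"nail polish"},
    {"bronzer"},
]

print(confidence(baskets, {"brush"}, {"nail polish"}))  # every brush basket has nail polish
print(confidence(baskets, {"nail polish"}, {"brush"}))  # the reverse rule is weaker
print(lift(baskets, {"brush"}, {"nail polish"}))        # above 1 means positive association
```

A lift ratio above 1 indicates that the antecedent and consequent occur together more often than they would if they were independent, which is why a rule with moderate confidence but high lift can still be worth keeping.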
iii) Relevant output:
The chosen minimum confidence level determines how many association rules are
displayed, because only rules whose confidence exceeds that minimum are shown.
Hence, for the given case, the output consists of only one rule, the one whose
confidence level exceeds 75%. A high minimum can also have adverse
consequences, since some rules with a high lift ratio may be discarded
(Shumueli et al., 2016).
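The effect of the threshold can be sketched with the three rules reported in Question 1. The tuple layout and the function name `filter_rules` are hypothetical; the sketch also assumes an inclusive comparison, which makes no difference for these three confidence values.

```python
# (antecedent, consequent, confidence in %) as reported in Question 1.
rules = [
    ("brush", "nail polish", 100.00),
    ("nail polish", "brush", 63.22),
    ("nail polish", "bronzer", 59.19),
]


def filter_rules(rules, min_confidence):
    """Keep only the rules whose confidence reaches the minimum.

    Assumption: the threshold is treated as inclusive (>=).
    """
    return [rule for rule in rules if rule[2] >= min_confidence]


kept = filter_rules(rules, 75.0)
for antecedent, consequent, conf in kept:
    print(f"{antecedent} -> {consequent}: {conf:.2f}%")
```

Lowering the minimum to 60% in this sketch would bring the second rule back, which shows how directly the threshold controls the size of the output.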
Question 2
a) The dendrogram for the given data is shown below.
Three main clusters are visible in the above dendrogram; they become apparent
at a distance cutoff of 1000.
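To illustrate what a distance cutoff means, the following minimal sketch merges clusters bottom-up and stops once the closest pair of clusters is farther apart than the cutoff. It assumes single linkage and Euclidean distance, neither of which is stated in the assignment, and the points are made up; it is not the procedure XLMiner uses.

```python
from itertools import combinations
from math import dist


def cut_by_distance(points, cutoff):
    """Agglomerative clustering that stops merging above `cutoff`.

    Assumption: single linkage (distance between the closest members).
    """
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest single-linkage distance.
        (i, j), d = min(
            (
                ((i, j), min(dist(a, b) for a in clusters[i] for b in clusters[j]))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda pair: pair[1],
        )
        if d > cutoff:
            break  # every remaining merge would exceed the cutoff
        clusters[i].extend(clusters[j])
        del clusters[j]  # safe because i < j
    return clusters


# Hypothetical observations forming three well-separated groups.
points = [(0, 0), (100, 50), (5000, 5000), (5200, 5100), (10000, 0), (10300, 200)]

clusters = cut_by_distance(points, cutoff=1000)
print(len(clusters))
```

Raising the cutoff corresponds to drawing the horizontal line higher on the dendrogram, so fewer and larger clusters remain.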