Data Mining & Visualization for Business Intelligence


Added on  2020-03-23

Data Mining and Visualization for Business Intelligence
Assignment - II
Student id
[Pick the date]
Question 1

(a) The XLMiner output for the utilities data is shown below.

In this output, the principal component matrix reports the coefficients (loadings) obtained for each feature on each of the principal components. These coefficients are essential for shortlisting the critical features, which is done by comparing their magnitudes. For principal component 1, the most significant features are x1 and x2, since their coefficients are the two largest. In the same way, the key utilities parameters can be identified for any given principal component. It is noteworthy that the features that load heavily on a component collectively represent a particular aspect of the functioning of utilities, which allows their relative importance to be understood.
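The shortlisting step described above can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the data are synthetic stand-ins (the utilities data and the XLMiner output are not reproduced here), and the helper names `pca_loadings` and `top_features` are hypothetical, not part of XLMiner.

```python
import numpy as np

def pca_loadings(X):
    """Return (explained_variance_ratio, loadings) for X with rows as observations.

    loadings[:, k] holds the coefficients of principal component k+1.
    """
    Xc = X - X.mean(axis=0)                 # centre each feature
    cov = np.cov(Xc, rowvar=False)          # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # re-order to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals / eigvals.sum(), eigvecs

def top_features(loadings, names, component=0, k=2):
    """Names of the k features with the largest absolute coefficient on one component."""
    idx = np.argsort(np.abs(loadings[:, component]))[::-1][:k]
    return [names[i] for i in idx]

# Synthetic stand-in data (NOT the utilities data): x1 and x2 share a common factor.
rng = np.random.default_rng(0)
factor = rng.normal(size=200)
X = np.column_stack([
    factor + 0.1 * rng.normal(size=200),   # x1
    factor + 0.1 * rng.normal(size=200),   # x2
    0.3 * rng.normal(size=200),            # x3, mostly noise
])
names = ["x1", "x2", "x3"]

ratio, loadings = pca_loadings(X)
print("variance share per component:", ratio)
print("key features on component 1:", top_features(loadings, names))
```

Features are ranked by the absolute value of the coefficient because the sign of a principal component is arbitrary, so only the magnitude indicates how strongly a feature contributes.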
A key observation in PCA is that the variables are measured on different scales, so their absolute variances differ. In certain cases this gives an intrinsic advantage to variables with large values, because their importance tends to be overestimated in the total variance matrix. That is not evident here: principal component 1 accounts for less than 30% of the variance, which augurs well. Further, normalising the utilities data did not yield better results, so normalisation is not required here.

(b) Advantages of PCA method:
It is a reduction method that condenses a large number of variables into a smaller number of variables, known as principal components.
PCA provides a detailed description of the image or structure of the variable set.
It is recommendable when the variables exhibit linear relations.
It is based on a maximum-variance technique: the focus is on maximising the variance of the variables and setting aside the variables with the least variance, which makes the critical components easier to identify.
The magnitude of the new components is easy to find because the resulting PCA summary contains an orthogonal matrix.

Disadvantages of PCA method:
PCA is useful only when the aim is to determine the magnitude of the components, because it cannot provide a reliable estimate of the direction of the components, owing to the complex distribution in the dimensional cloud.
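The scale effect discussed in part (a) can be illustrated with a small NumPy sketch. It uses synthetic data, not the utilities data, and the function name `explained_ratio` is a hypothetical helper. It compares the share of variance carried by the first component with and without normalisation when one variable is measured on a much larger scale.

```python
import numpy as np

def explained_ratio(X, standardise=False):
    """Share of total variance carried by each principal component, largest first."""
    Xc = X - X.mean(axis=0)
    if standardise:
        Xc = Xc / Xc.std(axis=0, ddof=1)     # unit variance per feature
    eigvals = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))
    eigvals = eigvals[::-1]                  # eigvalsh returns ascending order
    return eigvals / eigvals.sum()

# Synthetic example: three independent variables, one on a far larger scale.
rng = np.random.default_rng(1)
n = 300
X = np.column_stack([
    1000.0 * rng.normal(size=n),   # large-scale variable
    rng.normal(size=n),
    rng.normal(size=n),
])

raw = explained_ratio(X)
std = explained_ratio(X, standardise=True)
print("first component share, raw data:       ", raw[0])
print("first component share, normalised data:", std[0])
```

When a single large-scale variable dominates, the first component of the raw data absorbs almost all of the variance. A modest share for the first component on the raw data, as reported for the utilities data, suggests that no such dominance is present.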
