
ITC 516 Data Mining and Visualization for Business Intelligence

   

Added on  2020-03-23

ITC 516 DATA MINING AND VISUALISATION FOR BUSINESS INTELLIGENCE
Assessment - II
Student Id & Name
[Pick the date]
Question 1

a) The result of the principal component analysis (PCA) is highlighted below.

Analysis of the PCA matrix

The first four principal components together explain about 80% of the variance, so the analysis of the PCA matrix can be restricted to these four components.

The first principal component, based on its key parameters (x1 and x2), seems to represent the utility's financial performance.

The second principal component, based on its key parameters (x4 and x8), seems to represent the utility's operational performance.

The third principal component, based on its key parameters (x3 and x7), seems to represent the utility's production cost for electricity.

The fourth principal component, based on its key parameters (x1 and x3), seems to represent the utility's fixed cost in relation to electricity.
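The 80% cut-off used above can be illustrated with a minimal sketch. It assumes the components are obtained from the eigenvalues of the covariance matrix; the data below are random placeholders rather than the utility dataset, and the function names are hypothetical.

```python
import numpy as np

def explained_variance_ratio(X):
    """Share of total variance captured by each principal component, largest first."""
    Xc = X - X.mean(axis=0)                    # center each variable
    cov = np.cov(Xc, rowvar=False)             # covariance matrix of the variables
    eigvals = np.linalg.eigvalsh(cov)[::-1]    # eigenvalues sorted in descending order
    eigvals = np.clip(eigvals, 0, None)        # guard against tiny negative round-off
    return eigvals / eigvals.sum()

def components_needed(ratios, threshold=0.80):
    """Smallest number of leading components whose cumulative share reaches the threshold."""
    cumulative = np.cumsum(ratios)
    return int(np.searchsorted(cumulative, threshold) + 1)

# Placeholder data (assumption): 30 observations of 8 variables, x1..x8.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 8))

ratios = explained_variance_ratio(X)
k = components_needed(ratios, threshold=0.80)
print("variance shares:", np.round(ratios, 3))
print("components needed for 80%:", k)
```

Applied to the real data, a result of four from `components_needed` would correspond to the conclusion stated above.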
Need for Normalization

Normalization is needed when the variables are measured on markedly different scales, because PCA emphasizes the directions of highest variance and therefore magnifies the effect of scale. This is usually reflected in the total variance matrix, which shows the contribution of each principal component. In this case, however, that matrix shows no need for normalization prior to PCA, since no single principal component makes a disproportionate contribution that would render the other components insignificant.

b) The advantages and disadvantages of applying PCA are listed below.

Advantages
- It provides an estimate of the structure of the data.
- The principal components are easy to visualize in m- and p-dimensional space.
- It reduces the risk of over-fitting.
- The principal components are orthogonal to one another and hence easier to interpret.

Disadvantages / limitations
- It is not suitable when the data variables show non-linear association.
- It is difficult to examine the exact direction of a principal component.
- Data that are not well summarized by a Gaussian distribution are poorly suited to PCA.
- It is difficult to identify the principal component with the highest variance.
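The point about normalization made above can be checked with a short sketch. It compares the variance shares of the principal components with and without standardizing the variables; the data are synthetic placeholders in which one variable is deliberately put on a much larger scale, and `pca_variance_shares` is a hypothetical helper name.

```python
import numpy as np

def pca_variance_shares(X, standardize=False):
    """Variance share of each principal component, largest first."""
    Xc = X - X.mean(axis=0)                       # center each variable
    if standardize:
        Xc = Xc / X.std(axis=0, ddof=1)           # rescale each variable to unit variance
    eigvals = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))[::-1]
    return eigvals / eigvals.sum()

# Synthetic placeholder data (assumption): three independent variables,
# with the first one measured on a scale 1000 times larger than the others.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:, 0] *= 1000

raw = pca_variance_shares(X)                      # PCA on the raw scales
scaled = pca_variance_shares(X, standardize=True) # PCA after normalization

print("raw shares:   ", np.round(raw, 4))
print("scaled shares:", np.round(scaled, 4))
```

When one component dominates the raw shares but not the standardized ones, the dominance comes from scale alone, which is the situation in which normalization before PCA is warranted.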
