Assignment Data Mining & Visualization for Business Intelligence


Added on  2020-03-23

Data Mining and Visualization for Business Intelligence
Assignment
Student id/name
[Pick the date]
Question 1 (a)

The given data, which captures features of 32 US-based utilities, has been analysed using PCA, and the results obtained are shared below.

The variance matrix indicates that the four most significant principal components account for 80% of the cumulative variance. Limiting the objective to explaining only this much variance, the following conclusions can be drawn from the principal component matrix about the identified principal components (Shmueli et al., 2016).

The above has been derived by considering, for each principal component, the two features with the highest absolute values. The sign is ignored because it captures only direction.

Because the variables in the dataset are measured on different scales, a critical decision is whether the data need to be normalised before PCA is run. For the given dataset this is not a problem, as reflected in the variance matrix, where the first principal component represents about one quarter of the cumulative variance. This share does not change significantly when the data are normalised before running PCA. Hence, data normalisation is not needed here (Hofmann & Chisholm, 2016).

Question 1 (b)

Advantages of using Principal Component Analysis (PCA) are shown below (Kudyba & Hoptroff, 2012).

- It provides easy-to-understand results, typically in the form of an orthogonal covariance matrix.
- It is the most common and useful method for commenting on the actual structure of the data.
- It is a reduction process that condenses complex data into a smaller number of variables.
- The newly derived variables, known as principal components, are uncorrelated with one another (correlation coefficient = 0).

Disadvantages of using Principal Component Analysis (PCA) are shown below (Hofmann & Chisholm, 2016).

- The technique fails to capture structure among variables that are not related to each other linearly.
- The procedure cannot be applied to categorical variables.
- At times, the PCA output has a complex structure that results from the dot products of the principal components.
- Determining the magnitude of a principal component is easy, but determining its real direction is hard, because each variable has its own path, which sometimes makes interpretation complex.

Question 2 (a)

Universal Bank
Total customer data = 5000
Partition:
Training = 60%
Validation = 40%
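The 60/40 partition stated above can be sketched as follows. This is a minimal illustration and not the procedure used in the assignment: it relies only on Python's standard library, and the function name `partition_indices`, the seed value, and the use of plain row indices in place of the actual Universal Bank records are all assumptions made for the example.

```python
import random


def partition_indices(n_records, train_fraction=0.6, seed=1):
    """Shuffle record indices and split them into training and validation sets.

    Hypothetical helper: the name, the default seed, and the default fraction
    are illustrative choices, not taken from the assignment.
    """
    rng = random.Random(seed)           # fixed seed so the split is reproducible
    indices = list(range(n_records))    # stand-in for the 5000 customer rows
    rng.shuffle(indices)                # random order, so the split is unbiased
    n_train = round(n_records * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, valid_idx = partition_indices(5000)
print(len(train_idx), len(valid_idx))   # 3000 2000
```

With 5000 records this yields 3000 training rows and 2000 validation rows, and the two sets do not overlap. Fixing the seed keeps the split identical across runs, which makes later model comparisons repeatable.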