STAT 4002: Homework on Principal Component Analysis and Inferences
Added on 2022/09/18
Homework Assignment
AI Summary
This document presents a solution to a STAT 4002 homework assignment on multivariate techniques, focusing on Principal Component Analysis (PCA) and inferences on the mean vector. The assignment involves loading data, calculating the sample covariance matrix, finding eigenvectors, and implementing PCA with both the 'princomp' and 'prcomp' functions in R. The solution uses the scree plot to decide how many principal components to retain, compares the loadings from the two PCA implementations, and plots PC1 against PC2, PC1 against PC3, and PC2 against PC3. The assignment also addresses hypothesis tests on mean vectors (parallel groups, equal means, and horizontal groups), with the corresponding test statistics, p-values, and interpretations. R code and output are included for each step.

Statistics
#====
#Question 1
#Loading Data
load(file = "HW2Q1Data.rda")
#----
#Part a
#Sample Covariance Matrix
CovMat <- cov(carmean2)
CovMat
##            Economy    Service      Value      Price     Design     Sporty
## Economy  0.7462846 -0.2339921 -0.4213439  0.7783992 -0.4540909 -0.4221344
## Service -0.2339921  0.6552174  0.8175889 -0.7087945  0.5213636  0.5713043
## Value   -0.4213439  0.8175889  1.1853360 -1.1094862  0.7663636  0.8955336
## Price    0.7783992 -0.7087945 -1.1094862  1.4120158 -0.8959091 -1.0436759
## Design  -0.4540909  0.5213636  0.7663636 -0.8959091  0.7218182  0.7700000
## Sporty  -0.4221344  0.5713043  0.8955336 -1.0436759  0.7700000  1.0532806
## Safety  -0.2751186  0.8517391  1.1198024 -0.9521739  0.6790909  0.7577075
## Easy     0.2839723  0.1358300  0.1107115  0.1810079 -0.1040909 -0.1454743
##             Safety       Easy
## Economy -0.2751186  0.2839723
## Service  0.8517391  0.1358300
## Value    1.1198024  0.1107115
## Price   -0.9521739  0.1810079
## Design   0.6790909 -0.1040909
## Sporty   0.7577075 -0.1454743
## Safety   1.2593676  0.2202767
## Easy     0.2202767  0.3184585
#----
#Part b
#Eigen Vectors
EigenVectors <- eigen(CovMat)$vectors
EigenVectors
##             [,1]       [,2]        [,3]       [,4]        [,5]       [,6]
## [1,]  0.22074468 -0.5406128 -0.59342593  0.1091400  0.03715014 -0.2076344
## [2,] -0.30597795 -0.2790384  0.12849793 -0.2740828 -0.10755712  0.6891689
## [3,] -0.44346793 -0.2191251  0.06175501  0.3292515 -0.40319081  0.1158605
## [4,]  0.47767627 -0.2964806 -0.11642410 -0.4561908 -0.01677775  0.3512380
## [5,] -0.33268483  0.1386958 -0.20025752 -0.7204073 -0.36718744 -0.4111211
## [6,] -0.38569859  0.1557756 -0.71265406  0.1460579  0.13023590  0.2451503
## [7,] -0.41623018 -0.4565588  0.21554180 -0.1701540  0.66939195 -0.2419840
## [8,]  0.01098606 -0.4919455  0.13974456  0.1648590 -0.47363848 -0.2397225
## [,7] [,8]
## [1,] 0.43602713 0.245625524
## [2,] 0.47623487 -0.153492508
## [3,] -0.30152838 0.613346794
## [4,] -0.56452099 0.141161045
## [5,] 0.04250816 0.073078202
## [6,] -0.31721979 -0.346516075
## [7,] -0.19039136 -0.002274241
## [8,] -0.18652925 -0.628146827
EigenVectors[, c(1,2)]
## [,1] [,2]
## [1,] 0.22074468 -0.5406128
## [2,] -0.30597795 -0.2790384
## [3,] -0.44346793 -0.2191251
## [4,] 0.47767627 -0.2964806
## [5,] -0.33268483 0.1386958
## [6,] -0.38569859 0.1557756
## [7,] -0.41623018 -0.4565588
## [8,] 0.01098606 -0.4919455
#----
#Part c
Principal Component Analysis (PCA) is a statistical technique that
finds the directions along which the projected data points vary the
most, that is, the directions of maximum variance (Everitt & Skrondal,
2010). Covariance measures the degree to which two variables vary
together (Everitt & Skrondal, 2010). PCA and the covariance matrix
therefore share the same object of interest: the variation in the
data on its original scale. The correlation matrix, by contrast,
first standardizes every variable to unit variance, which discards
the differences in spread between the variables. For this reason the
covariance matrix, rather than the correlation matrix, is used for
the PCA computations here.
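The link between PCA and the covariance matrix can be checked numerically. The sketch below uses simulated data, since the car ratings in HW2Q1Data.rda are not reproduced here; the variable names `a`, `b`, `c` and their spreads are assumptions made purely for illustration. It shows that the component variances from a covariance-based PCA are the eigenvalues of `cov(X)`, whereas the eigenvalues of the correlation matrix always sum to the number of variables.

```r
# Simulated data with deliberately different spreads (assumption, not the homework data)
set.seed(1)
X <- cbind(a = rnorm(50, sd = 1),
           b = rnorm(50, sd = 5),
           c = rnorm(50, sd = 10))

# Eigenvalues of the covariance matrix are the variances of the principal components
eig_cov <- eigen(cov(X))$values
pc_var  <- prcomp(X, center = TRUE, scale. = FALSE)$sdev^2
all.equal(eig_cov, pc_var)

# Covariance-based PCA preserves the total variance of the original variables
all.equal(sum(eig_cov), sum(diag(cov(X))))

# Eigenvalues of the correlation matrix sum to the number of variables,
# because every variable has been standardized to variance 1
eig_cor <- eigen(cor(X))$values
sum(eig_cor)
```

With the correlation matrix, the information that `c` is far more variable than `a` is lost, which is the trade-off described above.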
#----
#Part d
#PCA using princomp and the covariance matrix (cor = FALSE)
pca1 <- princomp(carmean2, cor = FALSE)
summary(pca1)
## Importance of components:
##                           Comp.1    Comp.2     Comp.3     Comp.4    Comp.5
## Standard deviation     2.3053724 1.0468979 0.59199170 0.30848713 0.2690438
## Proportion of Variance 0.7557791 0.1558552 0.04983609 0.01353277 0.0102934
## Cumulative Proportion  0.7557791 0.9116344 0.96147048 0.97500325 0.9852967
##                             Comp.6      Comp.7      Comp.8
## Standard deviation     0.213864811 0.185036144 0.153034018
## Proportion of Variance 0.006504163 0.004868844 0.003330341
## Cumulative Proportion  0.991800815 0.996669659 1.000000000
#----
#Part e
#PCA using prcomp and center = TRUE
pca2 <- prcomp(carmean2, center = TRUE)
summary(pca2)
## Importance of components:
##                           PC1    PC2     PC3     PC4     PC5    PC6     PC7
## Standard deviation     2.3572 1.0704 0.60530 0.31542 0.27509 0.2187 0.18919
## Proportion of Variance 0.7558 0.1559 0.04984 0.01353 0.01029 0.0065 0.00487
## Cumulative Proportion  0.7558 0.9116 0.96147 0.97500 0.98530 0.9918 0.99667
##                            PC8
## Standard deviation     0.15647
## Proportion of Variance 0.00333
## Cumulative Proportion  1.00000
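The two summaries report slightly different standard deviations (2.3053724 from princomp against 2.3572 from prcomp for the first component) but identical proportions of variance. This is consistent with princomp using the divisor n and prcomp the divisor n - 1 when estimating variances. The sketch below illustrates this on simulated data; the sample size n = 23 is an assumption, chosen only because sqrt(23/22) is close to the ratio of the two reported values, and the actual number of rows in carmean2 is not shown in the source.

```r
# Simulated data (assumption): n = 23 observations on 4 variables
set.seed(2)
n <- 23
X <- matrix(rnorm(n * 4), nrow = n)

sd_princomp <- unname(princomp(X, cor = FALSE)$sdev)  # variances use divisor n
sd_prcomp   <- prcomp(X, center = TRUE)$sdev          # variances use divisor n - 1

# The standard deviations differ by the constant factor sqrt((n - 1) / n)
all.equal(sd_princomp, sd_prcomp * sqrt((n - 1) / n))

# The factor cancels in the proportions of variance, so these agree
all.equal(sd_princomp^2 / sum(sd_princomp^2),
          sd_prcomp^2 / sum(sd_prcomp^2))
```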
#----
#Part f
#Comparing loadings
pca1$loadings
##
## Loadings:
##         Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## Economy  0.221  0.541  0.593  0.109         0.208  0.436  0.246
## Service -0.306  0.279 -0.128 -0.274 -0.108 -0.689  0.476 -0.153
## Value   -0.443  0.219         0.329 -0.403 -0.116 -0.302  0.613
## Price    0.478  0.296  0.116 -0.456        -0.351 -0.565  0.141
## Design  -0.333 -0.139  0.200 -0.720 -0.367  0.411
## Sporty  -0.386 -0.156  0.713  0.146  0.130 -0.245 -0.317 -0.347
## Safety  -0.416  0.457 -0.216 -0.170  0.669  0.242 -0.190
## Easy            0.492 -0.140  0.165 -0.474  0.240 -0.187 -0.628
##
##                Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## SS loadings     1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
## Proportion Var  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
## Cumulative Var  0.125  0.250  0.375  0.500  0.625  0.750  0.875  1.000
pca2$rotation
##                 PC1        PC2         PC3        PC4         PC5        PC6
## Economy -0.22074468 -0.5406128 -0.59342593  0.1091400 -0.03715014  0.2076344
## Service  0.30597795 -0.2790384  0.12849793 -0.2740828  0.10755712 -0.6891689
## Value    0.44346793 -0.2191251  0.06175501  0.3292515  0.40319081 -0.1158605
## Price   -0.47767627 -0.2964806 -0.11642410 -0.4561908  0.01677775 -0.3512380
## Design   0.33268483  0.1386958 -0.20025752 -0.7204073  0.36718744  0.4111211
## Sporty   0.38569859  0.1557756 -0.71265406  0.1460579 -0.13023590 -0.2451503
## Safety   0.41623018 -0.4565588  0.21554180 -0.1701540 -0.66939195  0.2419840
## Easy    -0.01098606 -0.4919455  0.13974456  0.1648590  0.47363848  0.2397225
## PC7 PC8
## Economy -0.43602713 0.245625524
## Service -0.47623487 -0.153492508
## Value 0.30152838 0.613346794
## Price 0.56452099 0.141161045
## Design -0.04250816 0.073078202
## Sporty 0.31721979 -0.346516075
## Safety 0.19039136 -0.002274241
## Easy 0.18652925 -0.628146827
EigenVectors[, c(1,2)]
## [,1] [,2]
## [1,] 0.22074468 -0.5406128
## [2,] -0.30597795 -0.2790384
## [3,] -0.44346793 -0.2191251
## [4,] 0.47767627 -0.2964806
## [5,] -0.33268483 0.1386958
## [6,] -0.38569859 0.1557756
## [7,] -0.41623018 -0.4565588
## [8,] 0.01098606 -0.4919455
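Comparing the three sets of loadings, the magnitudes agree while the signs of some columns differ. An eigenvector is only determined up to a factor of -1, so this is expected and does not change the components' interpretation. The sketch below demonstrates the point on simulated data (an assumption, since carmean2 is not reproduced here) and shows one way to align signs before a direct comparison.

```r
# Simulated data (assumption): three variables with distinct variances
set.seed(3)
X <- matrix(rnorm(30 * 3), nrow = 30) %*% diag(c(3, 2, 1))

V_eigen  <- eigen(cov(X))$vectors
V_prcomp <- unname(prcomp(X, center = TRUE)$rotation)

# Columns may differ by a factor of -1, so compare absolute values
all.equal(abs(V_eigen), abs(V_prcomp))

# Align the signs column by column: the dot product of matching columns is +1 or -1
flip <- sign(colSums(V_eigen * V_prcomp))
all.equal(V_eigen, sweep(V_prcomp, 2, flip, `*`))
```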
#----
#Part g
The scree plot displays the principal components in order, together
with the variance that each component explains (Barbara & Susan,
2014). It is used to decide how many components to retain from the
PCA. In the scree plot for our case below, the curve drops sharply
over the first two components and flattens afterwards, which suggests
that the first two components are sufficient to summarize the car
means data. The summary output of the PCA supports this choice. The
proportion of variance of the first component is 0.7557791, so it
explains 75.57791% of the variance. The proportion of variance of the
second component is 0.1558552, so it explains 15.58552% of the
variance. Jointly, components 1 and 2 explain about 91.16% of the
variance.
#Scree Plot
screeplot(pca1, type = "l")
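The proportions quoted in Part g follow directly from the component standard deviations: each variance is the squared standard deviation, divided by the total. The sketch below redoes that arithmetic using the values copied from the summary(pca1) output; the 90% retention threshold in the last line is an assumption added for illustration and is not a rule stated in the assignment.

```r
# Standard deviations copied from the summary(pca1) output
sdev <- c(2.3053724, 1.0468979, 0.59199170, 0.30848713,
          0.2690438, 0.213864811, 0.185036144, 0.153034018)

variances <- sdev^2                      # variance of each component
prop_var  <- variances / sum(variances)  # proportion of variance per component
cum_prop  <- cumsum(prop_var)            # cumulative proportion

round(prop_var[1:2], 4)
round(cum_prop[2], 4)

# Smallest number of components reaching an assumed 90% threshold
which(cum_prop >= 0.90)[1]
```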
#----
#Part h
#Plot of PC1 and PC2
t = row.names(carmean2)   #row names used as point labels
y = pca1$scores           #principal component scores from princomp
plot(y[,1], y[,2], main="first vs. second PC", xlab="PC1", ylab="PC2", type="n")
text(y[,1], y[,2], labels=t, adj=c(0.5,0.5), cex=0.8, xpd=NA)
#Plot of PC1 and PC3
t = row.names(carmean2)
y = pca1$scores
plot(y[,1], y[,3], main="first vs. third PC", xlab="PC1", ylab="PC3", type="n")
text(y[,1], y[,3], labels=t, adj=c(0.5,0.5), cex=0.8, xpd=NA)
#Plot of PC2 and PC3
t = row.names(carmean2)   #row names used as point labels
y = pca1$scores           #principal component scores from princomp
plot(y[,2], y[,3], main="second vs. third PC", xlab="PC2", ylab="PC3", type="n")
text(y[,2], y[,3], labels=t, adj=c(0.5,0.5), cex=0.8, xpd=NA)
#Part i
In the loadings for PC1 below, the largest values in absolute terms
belong to Price (-0.47767627), Value (0.44346793) and Safety
(0.41623018). Loadings lie between -1 and 1, and the further a
loading is from 0, the more the variable contributes to the
component. Therefore, for principal component 1, the most
influential variables are Value, Price and Safety.
pca2$rotation[,1]
##     Economy     Service       Value       Price      Design      Sporty
## -0.22074468  0.30597795  0.44346793 -0.47767627  0.33268483  0.38569859
##      Safety        Easy
##  0.41623018 -0.01098606
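The ranking in Part i can be reproduced by sorting the PC1 loadings by absolute value. The sketch below copies the eight loadings from the output above into a named vector (the name `pc1` is introduced here for illustration) and also checks that the squared loadings sum to 1, as they should for a unit-length eigenvector.

```r
# PC1 loadings copied from the pca2$rotation[, 1] output
pc1 <- c(Economy = -0.22074468, Service = 0.30597795, Value = 0.44346793,
         Price = -0.47767627, Design = 0.33268483, Sporty = 0.38569859,
         Safety = 0.41623018, Easy = -0.01098606)

# Rank the variables by the absolute size of their loading
ranked <- sort(abs(pc1), decreasing = TRUE)
names(ranked)[1:3]

# Squared loadings of one component sum to 1 (up to rounding of the printed values)
sum(pc1^2)
```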
#====
#Question 2
#Loading Data
load(file = "HW2Q2Data.rda")
#----
#Part a
#Testing Whether the two groups are parallel
Hypothesis:
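H0: $C\mu_{\text{group1}} = C\mu_{\text{group2}}$
HA: $C\mu_{\text{group1}} \neq C\mu_{\text{group2}}$
Test Statistic:
$$T^2 = \frac{n_1 n_2}{n_1 + n_2}\,\left[C(\bar{x}_1 - \bar{x}_2)\right]^{\top}\left(C S_p C^{\top}\right)^{-1}\left[C(\bar{x}_1 - \bar{x}_2)\right]$$
where $\bar{x}_1$ and $\bar{x}_2$ are the sample mean vectors of the two groups, $S_p$ is the pooled covariance matrix and $C$ is the contrast matrix of successive differences.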
The resulting $T^2$ statistic, the corresponding F value and the p-value are 0.1716331,
0.0750895 and 0.9283975 respectively. Since the p-value of 0.9283975 is greater than the
0.05 level of significance, we fail to reject the null hypothesis: the data are consistent
with the two groups having parallel profiles.
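The second reported number (0.0750895) is consistent with the usual F transformation of Hotelling's T², F = (n1 + n2 - q - 1) / ((n1 + n2 - 2) q) × T², referred to an F distribution with q and n1 + n2 - q - 1 degrees of freedom. The sketch below redoes that conversion. The group sizes n1 = n2 = 5 are an assumption: they are not stated in the text, but they reproduce the reported F value and p-value.

```r
T2 <- 0.1716331   # Hotelling T^2 reported in the text
n1 <- 5           # assumed group sizes (not stated in the source)
n2 <- 5
q  <- 2           # rows of the contrast matrix C: p - 1 with p = 3 time points

df2    <- n1 + n2 - q - 1
F_stat <- df2 / ((n1 + n2 - 2) * q) * T2
p_val  <- pf(F_stat, df1 = q, df2 = df2, lower.tail = FALSE)

c(F = F_stat, p.value = p_val)
```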
#Selecting Groups 1 and 2 (columns 2 to 4 hold the 8am, 11am and 3pm measurements)
G1 <- as.matrix(plasma[plasma$group == "Group 1", 2:4])
G2 <- as.matrix(plasma[plasma$group == "Group 2", 2:4])
#Obtaining the number of observations in each group
n1 <- nrow(G1)
n2 <- nrow(G2)
#Computing the mean vector of each group
(m1 <- apply(G1, 2, mean))
##   8am  11am   3pm
## 132.4 143.4 126.0
(m2 <- apply(G2, 2, mean))
##   8am  11am   3pm
## 104.6 112.4  99.2
#Plotting the mean profiles of both groups
plot(m1, type = "b", ylim = c(95, 150), col = "red")
lines(m2, type = "b", lty = 2, col = "blue")
#Computing the covariance matrices of Groups 1 and 2 and the pooled covariance matrix
S1 <- var(G1)
S2 <- var(G2)
Sp <- ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)
#Constructing the contrast matrix C of successive differences
p <- ncol(G1)
q <- p - 1
v <- c(1, -1, rep(0, q))
v <- c(rep(v, p - 2), 1, -1)
(C <- matrix(v, nrow = q, ncol = p, byrow = TRUE))
##      [,1] [,2] [,3]
## [1,]    1   -1    0
## [2,]    0    1   -1
#Computing the inverse of C Sp C' and the difference in means between Groups 1 and 2
Sinv <- solve(C %*% Sp %*% t(C))
m <- m1 - m2
n <- n1 + n2
#Computing the T2 statistic
(T2 <- (n1 * n2 / n) * t(C %*% m) %*% Sinv %*% (C %*% m))
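The contrast matrix C built above takes successive differences between adjacent time points. As an aside, the same matrix can be written more directly as the difference of two shifted identity matrices. The helper name `successive_diff_contrast` below is hypothetical and is not part of the original solution.

```r
# Hypothetical helper: (p - 1) x p matrix whose rows are successive differences
successive_diff_contrast <- function(p) {
  q <- p - 1
  cbind(diag(q), 0) - cbind(0, diag(q))
}

C3 <- successive_diff_contrast(3)
C3

# Each row sums to zero, so a constant shift between the two mean vectors
# is removed and only differences in profile shape remain
rowSums(C3)
```

For p = 3 this gives the rows (1, -1, 0) and (0, 1, -1), matching the matrix C printed above.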