
About Statistics Question Answer 2022

   

Added on  2022-09-23

STATISTICS
STUDENT ID:
Question 1
Goodness of Fit
Based on the Minitab output, the following observations can be made about the
fit of the model.
The R² value is 0.469, which implies that the regression model explains only 46.9% of the
variation in the house price. The remaining 53.1% of the variation in the house price
is not accounted for by the current model.
With regard to the individual slope coefficients, number of bathrooms and number of people
are statistically significant even at the 1% significance level. The same cannot be concluded
for number of bedrooms, which is significant only at significance levels above 6%. Although
the model provides a reasonable fit, other relevant predictor variables need to be
introduced in order to improve its predictive power.
Testing of assumptions
Based on the residual plots, the residuals do not appear to be normally distributed. This is
evident from the normal probability plot, where there are outliers at the upper end, and from
the histogram of residuals, which is asymmetric. Further, the points in the residual plot do
not seem to be randomly scattered, since their spread along the X axis is not symmetric.
Taken together, these patterns indicate violations of the normality and homoscedasticity
(constant variance) assumptions linked with linear regression.
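The asymmetry of the residuals noted above can also be checked numerically. The following is a minimal sketch, assuming the residuals have been exported from Minitab as a list of numbers. The function name sample_skewness and the residual values are hypothetical and serve only as an illustration; they are not taken from the actual output.

```python
from statistics import fmean


def sample_skewness(values):
    """Moment-based skewness: third central moment / (variance ** 1.5)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # all values identical, no asymmetry to measure
    return m3 / m2 ** 1.5


# Hypothetical residuals with a few large positive values (upper-end outliers).
illustrative_residuals = [-3.0, -2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 8.0, 12.0]

skew = sample_skewness(illustrative_residuals)
print(f"Skewness of residuals: {skew:.2f}")
```

A value close to zero is consistent with a symmetric distribution, while a clearly positive value points to a long upper tail of the kind described for the normal probability plot. This is only a rough screen and does not replace the residual plots themselves.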
Question 2
a) The requisite hypotheses are given below.
Null Hypothesis: β3 = β4 = β5 = 0, which implies that all additional slope coefficients of the
extended model are insignificant and hence can be assumed to be zero.
Alternative Hypothesis: At least one of the above slope coefficients is non-zero and therefore
significant.
b) In order to compare the complete model with the reduced model, the F statistic can be
computed as follows.
F = ((SSE_Reduced - SSE_Complete)/Number of slope coefficients tested)/MSE_Complete
MSE_Complete = SSE_Complete/(n-(k+1))
where n is the sample size and k is the number of slope coefficients in the complete model.
In the complete model, k = 5, n = 40
MSE_Complete = (1830.44/(40-(5+1))) = 53.84
There are 3 slope coefficients being tested.
Also, SSE_Reduced = 3197.16, SSE_Complete = 1830.44
F statistic = ((3197.16-1830.44)/3)/53.84 = 8.46
To determine whether the null hypothesis can be rejected, the F critical value must be
determined.
Level of significance = 5%, df for numerator = 3, df for denominator = (40-(5+1)) = 34
For the above inputs, the critical value of F is approximately 2.88.
Since F statistic (8.46) > F critical (2.88), the null hypothesis is rejected in favour of the
alternative hypothesis.
This implies that at least one of the three slope coefficients added in the complete model
is statistically significant.
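The arithmetic in part b) can be reproduced with a short script. This is a minimal sketch using only the SSE values, sample size and coefficient counts stated above. The function name partial_f_test and its parameter names are assumptions introduced for illustration.

```python
def partial_f_test(sse_reduced, sse_complete, n, k_complete, num_tested):
    """Return (MSE of complete model, F statistic, numerator df, denominator df)."""
    df_num = num_tested                 # number of slope coefficients tested
    df_den = n - (k_complete + 1)       # error df of the complete model
    mse_complete = sse_complete / df_den
    f_stat = ((sse_reduced - sse_complete) / df_num) / mse_complete
    return mse_complete, f_stat, df_num, df_den


# Values taken from part b): n = 40, k = 5, three coefficients tested.
mse, f_stat, df_num, df_den = partial_f_test(
    sse_reduced=3197.16, sse_complete=1830.44, n=40, k_complete=5, num_tested=3
)
print(f"MSE = {mse:.2f}, F = {f_stat:.2f}, df = ({df_num}, {df_den})")
```

If SciPy is available, the critical value for these degrees of freedom could be obtained with scipy.stats.f.ppf(0.95, df_num, df_den) rather than read from a table.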
c) The complete model should be used to predict y. This is because the interaction effect
between the independent variables appears to be significant, as shown in part b). If the
reduced model is used, this effect would not be captured, which would lead to larger
residuals in the prediction of y.
Question 3
a) The prediction equation for the interaction model is given below.
b) The requisite hypotheses are stated below.
Null Hypothesis: β1 = β2 = β3 = 0, which implies that all slope coefficients of the interaction
model are insignificant and hence can be assumed to be zero.
Alternative Hypothesis: At least one of the above slope coefficients is non-zero and therefore
significant.
Based on the ANOVA output, F = 9391.97 with p value = 2.1108E-11
Since the p value (approximately 0.00) is less than the level of significance (0.05), the
available evidence is sufficient to reject the null hypothesis in favour of the alternative hypothesis.
Hence, it can be concluded that the given interaction model is statistically significant.
