Statistics Study Material
Verified. Added on 2022/11/26
AI Summary
This document provides study material for statistics including solved assignments, essays, and dissertations. It covers topics such as hypothesis testing, regression analysis, and prediction models. The content is suitable for students studying statistics in college or university courses.
STATISTICS
[DATE]
STUDENT ID
[Company address]
Question 1
(a) Claim: The online retail store would be profitable, i.e. the average order exceeds $85.
Null hypothesis Ho : μ ≤ 85
Alternative hypothesis Ha : μ > 85
As the population standard deviation of the order variable is not known, the relevant test
statistic is t.
The p value is lower than the significance level (assumed to be 5%), so there is sufficient
evidence to reject the null hypothesis in favour of the alternative hypothesis. Hence, the
data support the claim that the average order exceeds $85, and the online retail store would
be profitable.
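The right-tailed t test described above can be sketched in Python as follows. This is only an illustration: the order values are hypothetical (the assignment's data is not reproduced here), the helper name `one_sample_t` is made up for this example, and the critical value 1.833 is the upper 5% point of the t distribution with 9 degrees of freedom taken from a standard t table.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for testing H0: mean = mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    t_stat = (mean - mu0) / (s / math.sqrt(n))
    return t_stat, n - 1


# Hypothetical order values in dollars (NOT the assignment's data).
orders = [92.0, 88.5, 79.0, 101.2, 95.4, 84.3, 90.1, 97.8, 86.6, 93.1]

t_stat, df = one_sample_t(orders, mu0=85.0)

# Upper 5% critical value of t with 9 degrees of freedom (standard t table).
T_CRIT = 1.833

# Right-tailed test: reject H0 when t exceeds the critical value.
reject_h0 = t_stat > T_CRIT
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

Comparing the statistic with a tabulated critical value is equivalent to comparing the p value with the significance level, which is the form of the decision rule used in the text.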
(b) Claim: The proportion of people who have received an e-gift card for Christmas is higher
than 20%.
Null hypothesis Ho : p ≤ 0.20
Alternative hypothesis Ha : p > 0.20
Data coding: 1 = NO, 2 = YES
The p value is higher than the significance level (5%), so there is insufficient evidence to
reject the null hypothesis. Hence, the data do not support the claim that the proportion of
people who have received an e-gift card for Christmas is higher than 20%.
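A minimal sketch of a right-tailed one-proportion z test is given below. The counts (44 "YES" responses out of 200) are hypothetical and chosen only to illustrate a case where the null hypothesis is not rejected; the function name `proportion_z_test` is likewise an assumption of this example, not something taken from the assignment.

```python
import math
from statistics import NormalDist


def proportion_z_test(successes, n, p0):
    """Right-tailed z test for H0: p <= p0 against Ha: p > p0."""
    p_hat = successes / n
    # Standard error is computed under the null hypothesis value p0.
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    p_value = 1 - NormalDist().cdf(z)  # upper-tail probability
    return z, p_value


# Hypothetical counts (NOT the assignment's data): 44 of 200 received a card.
z, p_value = proportion_z_test(successes=44, n=200, p0=0.20)

ALPHA = 0.05
reject_h0 = p_value < ALPHA
print(f"z = {z:.4f}, p value = {p_value:.4f}, reject H0: {reject_h0}")
```

With these illustrative counts the sample proportion is 22%, yet the p value stays well above 5%, which mirrors the situation described in the text: a sample proportion above 20% is not by itself sufficient evidence for the claim.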
(c) Claim: A significant difference is present between the means of the two appraisers.
Null hypothesis Ho : μ_Appraiser1 − μ_Appraiser2 = 0
Alternative hypothesis Ha : μ_Appraiser1 − μ_Appraiser2 ≠ 0
As the population standard deviations of the two variables are unknown, the test statistic
is t. A two-sample independent t test is therefore required, and it is computed in Excel.
This is a two-tailed hypothesis test, so the two-tailed p value is used.
The two-tailed p value (0.7414) is higher than the significance level (5%), so there is
insufficient evidence to reject the null hypothesis. Hence, it can be concluded that no
significant difference is present between the mean assessments of the two appraisers.
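The two-sample statistic can be sketched as below. The example assumes the equal-variance (pooled) form of the independent t test; the source does not say which Excel variant was used, so this is an assumption. The appraisal values and the helper name `pooled_t` are hypothetical, and 2.306 is the two-tailed 5% critical value of t with 8 degrees of freedom from a standard t table.

```python
import math
import statistics


def pooled_t(x, y):
    """Return (t statistic, degrees of freedom) for the pooled two-sample t test."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t_stat = (statistics.mean(x) - statistics.mean(y)) / se
    return t_stat, n1 + n2 - 2


# Hypothetical assessments (NOT the assignment's data).
appraiser_1 = [250, 310, 275, 290, 265]
appraiser_2 = [245, 305, 280, 295, 270]

t_stat, df = pooled_t(appraiser_1, appraiser_2)

# Two-tailed 5% critical value of t with 8 degrees of freedom (standard t table).
T_CRIT = 2.306
reject_h0 = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.4f}, df = {df}, reject H0: {reject_h0}")
```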
Question 2
(a) Multiple regression model
Least squares regression line equation
Longevity = 3.244 + (0.451 × Mother) + (0.411 × Father) + (0.017 × Gmothers) + (0.087 × Gfathers)
(b) Interpretation of slope coefficients
Each interpretation below holds the other independent variables constant.
Mother: If the age of the mother changes by 1 year, then the expected longevity of the person
concerned changes by 0.451 years in the same direction.
Father: If the age of the father changes by 1 year, then the expected longevity of the person
concerned changes by 0.411 years in the same direction.
Gmothers: If the age of the grandmothers changes by 1 year, then the expected longevity of
the person concerned changes by 0.017 years in the same direction.
Gfathers: If the age of the grandfathers changes by 1 year, then the expected longevity of
the person concerned changes by 0.087 years in the same direction.
The requisite hypotheses for testing the significance of each slope are as follows.
Null hypothesis Ho : β = 0 (slope is not significant)
Alternative hypothesis Ha : β ≠ 0 (slope is significant)
Significance level (Alpha) = 0.05
Comparing the p value of each slope with the significance level, only the Mother and Father
slope coefficients are statistically significant.
(c) Longevity of man = ?
When Mother = 75, Father = 75, Grandmothers = 77, Grandfathers = 73
Least squares regression line equation
Longevity = 3.244 + (0.451 × Mother) + (0.411 × Father) + (0.017 × Gmothers) + (0.087 × Gfathers)
Longevity = 3.244 + (0.451 × 75) + (0.411 × 75) + (0.017 × 77) + (0.087 × 73)
Longevity ≈ 75.5
Hence, the predicted longevity of a man for the given inputs is about 75.5 years.
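The substitution above can be checked with a short Python sketch. The coefficients are the rounded values from the regression equation in the text; the function and dictionary names are made up for this illustration. With the rounded coefficients the sum comes to about 75.55, in line with the figure of roughly 75.5 years reported in the text.

```python
# Rounded coefficients as printed in the regression equation.
COEFFICIENTS = {
    "intercept": 3.244,
    "Mother": 0.451,
    "Father": 0.411,
    "Gmothers": 0.017,
    "Gfathers": 0.087,
}


def predict_longevity(mother, father, gmothers, gfathers):
    """Evaluate the fitted regression equation for one set of inputs."""
    return (
        COEFFICIENTS["intercept"]
        + COEFFICIENTS["Mother"] * mother
        + COEFFICIENTS["Father"] * father
        + COEFFICIENTS["Gmothers"] * gmothers
        + COEFFICIENTS["Gfathers"] * gfathers
    )


prediction = predict_longevity(mother=75, father=75, gmothers=77, gfathers=73)
print(f"Predicted longevity: {prediction:.3f} years")
```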
(d) Multiple regression model with Smoker dummy variable
Smoker: YES = 1, NO = 0
The key differences in the above model when compared with the original model are indicated
below.
The coefficients for mother and father have decreased, implying that their ages have a
lower impact on the longevity of the child.
Further, the significance of the coefficients for grandmother and grandfather has
improved in the model, as the corresponding p values of the slopes have decreased.
(e) Interpretation of smoker dummy variable: If the underlying person is a smoker, then the
expected longevity is lower by 3.719 years in comparison to a non-smoker, assuming all the
other variables are the same.
The requisite hypotheses for testing the significance of the slope are as follows.
Null hypothesis Ho : β = 0 (slope is not significant)
Alternative hypothesis Ha : β ≠ 0 (slope is significant)
Significance level (Alpha) = 0.01
The p value for smoker is approximately zero, which is lower than the significance level, so
the null hypothesis is rejected in favour of the alternative. Thus, the smoker variable is
statistically significant, and smoking has a significant effect on the length of life.
Question 3
(a) Multiple regression model
The regression equation is as follows.
Time = −28.427 + (0.604 × Boxes) + (0.374 × Weight)
The slope coefficients for the above regression model can be interpreted as shown below.
Boxes: If the number of boxes changes by 1, the time taken to unload changes by 0.604
minutes in the same direction, holding weight constant. The positive sign of the slope is in
line with expectations, as more time is required to unload more boxes.
Weight: If the weight changes by 100 kg, the time taken to unload changes by 0.374 minutes
in the same direction, holding the number of boxes constant. The positive sign of the slope
is in line with expectations, as more time is needed to unload a heavier load.
(b) Simple regression model with codes for time of day
The issue with the above model is that it assumes a uniform difference in unloading time
between morning and early afternoon and between early afternoon and late afternoon. This
arises from the use of numerical codes that are equidistant from one another.
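The equal-spacing problem can be illustrated with a small sketch. It contrasts a single numeric code (1, 2, 3) with a pair of 0/1 indicator variables that use morning as the baseline. The three-category indicator coding, the period labels, and the function names are assumptions of this example; they are a common way to avoid the problem, not the coding used in the assignment.

```python
# Single numeric code: every step between periods is forced to be the same size.
NUMERIC_CODE = {"morning": 1, "early afternoon": 2, "late afternoon": 3}


def numeric_effect(period, slope):
    """Contribution of time of day to the prediction under numeric coding."""
    return slope * NUMERIC_CODE[period]


def dummy_code(period):
    """Two 0/1 indicators with morning as the baseline category."""
    return {
        "early_afternoon": int(period == "early afternoon"),
        "late_afternoon": int(period == "late afternoon"),
    }


# With a hypothetical slope of 5 minutes per code step, both gaps are identical.
step_1 = numeric_effect("early afternoon", 5.0) - numeric_effect("morning", 5.0)
step_2 = numeric_effect("late afternoon", 5.0) - numeric_effect("early afternoon", 5.0)
print(step_1, step_2)

# Indicator coding gives each period its own coefficient, so gaps can differ.
for period in NUMERIC_CODE:
    print(period, dummy_code(period))
```

Under indicator coding a regression would estimate one coefficient per indicator, so the morning-to-early-afternoon and early-to-late-afternoon differences are no longer constrained to be equal.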
(c) Let the dummy variable be codes, i.e. code = 0 for morning and code = 1 for afternoon.
(d) Model B is better because, for this model, the codes variable is also statistically
significant, as indicated by the p value of approximately zero for the slope coefficient of
codes. Further, the R² value is higher for Model B, which implies that the predictive
ability of this model is superior to that of Model A.
(e) To establish whether unloading time is significantly affected by the time of day, it
needs to be determined whether the slope coefficient of the codes variable is statistically
significant.
The requisite hypotheses for testing the significance of the slope are as follows.
Null hypothesis Ho : β = 0 (slope is not significant)
Alternative hypothesis Ha : β ≠ 0 (slope is significant)
Significance level (Alpha) = 0.05
The p value for the slope coefficient of the codes variable is approximately zero, which is
lower than the significance level, so the null hypothesis is rejected in favour of the
alternative. Thus, the unloading time depends significantly on the time of day when
unloading takes place.
(f) Prediction of time required to unload truck = ?
Number of boxes = 100
Weight of boxes = 5000 kg
Codes for the three times of day = 1, 2, 3
Regression equation
Time = −41.422 + (0.644 × Boxes) + (0.349 × Weight) + (4.543 × Codes)
Time required to unload truck in the morning (code = 1)
Time = −41.422 + (0.644 × 100) + (0.349 × 5000) + (4.543 × 1) = 1774.85 minutes
Time required to unload truck in the early afternoon (code = 2)
Time = −41.422 + (0.644 × 100) + (0.349 × 5000) + (4.543 × 2) = 1779.39 minutes
Time required to unload truck in the late afternoon (code = 3)
Time = −41.422 + (0.644 × 100) + (0.349 × 5000) + (4.543 × 3) = 1783.93 minutes
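The three predictions can be reproduced with a short sketch using the rounded coefficients printed in the equation; the function name `predict_time` is made up for this illustration. With the rounded coefficients the results come out roughly 2.3 minutes below the figures in the text, which may reflect the use of unrounded coefficient estimates in the original calculation (an assumption; the source does not say). Weight is entered as 5000, following the substitution in the text; if Weight is instead measured in units of 100 kg, as the slope interpretation in Question 3(a) suggests, the input would be 50.

```python
def predict_time(boxes, weight, code):
    """Evaluate the fitted equation with the rounded coefficients from the text."""
    return -41.422 + 0.644 * boxes + 0.349 * weight + 4.543 * code


# Inputs as substituted in the text: 100 boxes, Weight = 5000, codes 1 to 3.
predictions = {code: predict_time(boxes=100, weight=5000, code=code) for code in (1, 2, 3)}

for code, minutes in predictions.items():
    print(f"code = {code}: {minutes:.3f} minutes")

# Each step in the code adds exactly the codes coefficient to the prediction.
step = predictions[2] - predictions[1]
print(f"difference per code step: {step:.3f} minutes")
```

The constant difference between consecutive codes is the equal-spacing assumption built into the numeric coding of time of day.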