Hypothesis Testing and Confidence Intervals
Added on 2023/01/13
AI Summary
This document discusses hypothesis testing and confidence intervals in statistics. It covers topics such as t-tests, effect size calculations, and confidence interval calculations. Examples and explanations are provided to help understand the concepts.
Question 1
Part (a)
We use a two-tailed paired-samples t-test to compare the scores obtained on the first and second attempts. A t-test is appropriate because the sample is small (n < 30) and the population standard deviation is unknown. A paired-samples test is appropriate because the same group of students took both attempts; the only difference is that the attempts were made at different times.
Null hypothesis (H0): The means for the two test attempts are the same, i.e. μ1 = μ2
Alternative hypothesis (H1): The means for the two test attempts differ, i.e. μ1 ≠ μ2
T-score calculation
Degrees of freedom: df = n − 1 = 15 − 1 = 14
$$t = \frac{\bar{d}}{\sqrt{s_d^2 / n}} = \frac{6.5333}{\sqrt{144.2667 / 15}} = 2.106675$$
P-value = 0.05366
Since the p-value (0.05366) exceeds α = 0.05, we do not reject the null hypothesis: there is insufficient evidence that the average scores on the first and second attempts differ.
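The arithmetic for the paired t-statistic can be checked with a short script. This is a minimal sketch that works only from the summary statistics quoted above (n, the mean difference, and the variance of the differences); the variable names are illustrative, and the critical value 2.145 is the standard two-tailed t-table entry for α = 0.05 with 14 degrees of freedom.

```python
import math

# Summary statistics taken from the worked solution above.
n = 15                 # number of paired observations
mean_diff = 6.5333     # mean of the paired differences
var_diff = 144.2667    # sample variance of the paired differences

df = n - 1
std_error = math.sqrt(var_diff / n)   # standard error of the mean difference
t_stat = mean_diff / std_error

# Two-tailed critical value for alpha = 0.05 and 14 degrees of freedom (t-table).
t_critical = 2.145
reject_null = abs(t_stat) > t_critical

print(f"df = {df}, t = {t_stat:.4f}, reject H0: {reject_null}")
```

Comparing |t| with the critical value gives the same decision as comparing the p-value with 0.05.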
Part (b)
Calculating the effect size for the given test scores using the t-score
$$\text{Effect size} = \frac{t}{\sqrt{n}} = \frac{2.106675}{\sqrt{15}} \approx 0.54$$
The effect size, i.e. Cohen's d, is about 0.54, which by the usual benchmarks (0.2 small, 0.5 medium, 0.8 large) is a medium effect. The difference between the first-attempt and second-attempt means is therefore not trivial in magnitude, even though it is not statistically significant at the 0.05 level.
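As an arithmetic check, Cohen's d for paired data can be computed two equivalent ways: as t/√n, or as the mean difference divided by the standard deviation of the differences. The sketch below assumes the summary statistics from Part (a); the variable names are illustrative. Note that dividing t by n instead of √n would give roughly 0.14, so the square root matters.

```python
import math

# Summary statistics taken from Part (a).
n = 15
t_stat = 2.106675
mean_diff = 6.5333
sd_diff = math.sqrt(144.2667)   # standard deviation of the paired differences

# Two equivalent routes to Cohen's d for a paired design.
d_from_t = t_stat / math.sqrt(n)
d_from_mean = mean_diff / sd_diff

print(f"d from t: {d_from_t:.4f}, d from mean/sd: {d_from_mean:.4f}")
```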
Part (c)
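The confidence interval for the mean difference is given by
$$\bar{d} \pm t_{0.025,\,14} \left( \frac{s_d}{\sqrt{n}} \right)$$
hence
$$6.5333 \pm 2.145 \times \left( \frac{12.0111}{\sqrt{15}} \right) = 6.5333 \pm 6.6522$$
95% Confidence Interval = (−0.12, 13.19)
The 95% confidence interval implies that the mean score on the second attempt is between 0.12 percentage points lower and 13.19 percentage points higher than the mean score on the first attempt. Because the interval contains zero, it agrees with the decision not to reject the null hypothesis.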
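The interval can be recomputed from the summary statistics as a check. This sketch assumes the standard error of the mean difference is s_d/√n, the same quantity used in the denominator of the t-statistic in Part (a); the variable names are illustrative. Dividing by n rather than √n would instead give roughly (4.82, 8.25), which understates the uncertainty.

```python
import math

# Summary statistics taken from the worked solution.
n = 15
mean_diff = 6.5333
sd_diff = 12.0111      # standard deviation of the paired differences
t_critical = 2.145     # two-tailed t value for alpha = 0.05, df = 14

std_error = sd_diff / math.sqrt(n)
margin = t_critical * std_error
lower, upper = mean_diff - margin, mean_diff + margin

print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```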
Question 2
Part (a)
We use a two-tailed independent-samples t-test to compare the results of Professor A's class and Professor B's class (assuming equal variances). The hypotheses are:
Null hypothesis (H0): The means for the two classes are the same, i.e. μA = μB
Alternative hypothesis (H1): The means for the two classes differ, i.e. μA ≠ μB
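T-score calculation
$$t = \frac{\bar{x}_A - \bar{x}_B}{\sqrt{\left[ \dfrac{\left( \sum A^2 - \dfrac{(\sum A)^2}{n_A} \right) + \left( \sum B^2 - \dfrac{(\sum B)^2}{n_B} \right)}{n_A + n_B - 2} \right] \left[ \dfrac{1}{n_A} + \dfrac{1}{n_B} \right]}}$$
where $\bar{x}_A$ and $\bar{x}_B$ are the sample means of the two classes. Substituting the values:
$$t = \frac{-6.5333}{\sqrt{\left[ \dfrac{3633.733 + 2515.333}{28} \right] \times 0.13333}} = -1.207368562$$
P-value = 0.237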
Since the p-value (0.237) exceeds α = 0.05, we do not reject the null hypothesis: there is no significant difference in mean results between Prof A's class and Prof B's class.
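The pooled (equal-variance) t-statistic can be reproduced from the quantities substituted into the formula. This is a sketch under stated assumptions: both classes are taken to have 15 students (consistent with the 28 degrees of freedom and the factor 0.13333 = 1/15 + 1/15), and the two sums of squared deviations are assigned to classes A and B in the order they appear in the formula. The critical value 2.048 is the standard two-tailed t-table entry for α = 0.05 with 28 degrees of freedom.

```python
import math

# Values substituted into the pooled t formula above.
n_a, n_b = 15, 15            # assumed class sizes (implied by df = 28)
mean_diff = -6.5333          # difference between the two class means
ss_a = 3633.733              # sum of squared deviations, class A (assumed order)
ss_b = 2515.333              # sum of squared deviations, class B (assumed order)

df = n_a + n_b - 2
pooled_var = (ss_a + ss_b) / df
std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
t_stat = mean_diff / std_error

t_critical = 2.048           # two-tailed, alpha = 0.05, df = 28 (t-table)
reject_null = abs(t_stat) > t_critical

print(f"df = {df}, t = {t_stat:.4f}, reject H0: {reject_null}")
```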
Part (b)
No, the approaches are different: Question 1 compared a single group of students measured twice (a paired-samples test), whereas Question 2 compares two different groups of students (an independent-samples test).
Question 3
No, the statistical approach would not change, because the sample sizes do not affect the formula employed. What would affect it is the assumption of equal or unequal variances between the two groups. If we assumed unequal variances in this question, the formula would differ from the one used in Question 2.
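For reference, the unequal-variance case is usually handled with Welch's t-test. The formulas below are the standard textbook form rather than something taken from this assignment; $s_A^2$ and $s_B^2$ denote the sample variances of the two groups, which the worked solution does not report separately.

```latex
% Welch's t statistic: each group keeps its own variance (no pooling)
t = \frac{\bar{x}_A - \bar{x}_B}{\sqrt{\dfrac{s_A^2}{n_A} + \dfrac{s_B^2}{n_B}}}

% Welch-Satterthwaite approximation to the degrees of freedom
\nu \approx \frac{\left( \dfrac{s_A^2}{n_A} + \dfrac{s_B^2}{n_B} \right)^2}
                 {\dfrac{(s_A^2 / n_A)^2}{n_A - 1} + \dfrac{(s_B^2 / n_B)^2}{n_B - 1}}
```

Unlike the pooled formula, the degrees of freedom here are generally not a whole number and are at most $n_A + n_B - 2$.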
Question 4
Part (a)
[Figure: scatter plot titled "Relationship between Midterm and Final Scores". Axes: Grade in midterm exam (%) and Grade in final exam (%). Two points carry data labels: 40 and 70.]
Part (b)
There seems to be a fairly strong positive relationship between midterm and final exam scores. There are two outliers in the data, where the final exam score is 70% and 40%. Nevertheless, we can conclude that a student who gets a high score in the midterm is likely to get a high score in the final exam. Likewise, a low score in the midterm exam will likely be followed by a low score in the final exam.
Question 5
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.682929
R Square            0.466392
Adjusted R Square   0.425345
Standard Error      10.16102
Observations        15

ANOVA
             df   SS         MS         F          Significance F
Regression   1    1173.132   1173.132   11.36246   0.005016
Residual     13   1342.201   103.2463
Total        14   2515.333

                            Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept                   32.93105       10.63351         3.096912   0.008497   9.958748    55.90336    9.958748      55.90336
Grade in midterm exam (%)   0.568194       0.168562         3.370825   0.005016   0.204037    0.932351    0.204037      0.932351

Part (a)
There is a fairly strong positive relationship between the two variables, with a correlation coefficient of 0.68.
Part (b)
Given R-squared = 0.4664, only 46.64% of the variance in the dependent variable (final exam scores) can be explained by the independent variable (midterm exam scores).
Part (c)
Looking at the ANOVA results, the F-statistic (F = 11.36, Significance F = 0.005016) is significant at an alpha level of 0.05. Hence, the relationship between the two variables is statistically significant.
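The three answers above all follow from the ANOVA table, and the relationships can be verified numerically. This sketch uses only figures from the summary output; the variable names are illustrative. In simple regression with one predictor, R² is the regression sum of squares over the total, the correlation is its square root (positive here because the slope is positive), and the F-statistic equals the square of the slope's t-statistic.

```python
import math

# Figures taken from the regression summary output.
ss_regression = 1173.132
ss_residual = 1342.201
ss_total = 2515.333
df_regression, df_residual = 1, 13
slope_t_stat = 3.370825

r_squared = ss_regression / ss_total
multiple_r = math.sqrt(r_squared)      # correlation coefficient (slope is positive)
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

print(f"R^2 = {r_squared:.4f}, r = {multiple_r:.4f}")
print(f"F = {f_stat:.4f}, t^2 = {slope_t_stat ** 2:.4f}")
```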
Question 6
Using the regression model obtained in Question 5, we can forecast the final exam score Y given a midterm exam score X of 64.
$$Y = 32.93105 + 0.568194\,X$$
$$Y = 32.93105 + 0.568194 \times 64$$
$$Y \approx 69.2955 \text{, i.e. } 69.30\%$$
Final Exam Score is 69.30%
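The same forecast can be expressed as a small function, which makes it easy to try other midterm scores. This is a minimal sketch using the rounded coefficients from the summary output; the function name `predict_final_score` is hypothetical. Predictions are most trustworthy for midterm scores inside the range of the observed data.

```python
# Coefficients taken from the regression summary output (Question 5).
INTERCEPT = 32.93105
SLOPE = 0.568194


def predict_final_score(midterm_score: float) -> float:
    """Predict the final exam score (%) from a midterm score (%)."""
    return INTERCEPT + SLOPE * midterm_score


predicted = predict_final_score(64)
print(f"Predicted final exam score: {predicted:.2f}%")
```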
Question 7
Part (a)
The data would have to be placed in categories (e.g. first, second, third, and fourth quartile scores), with the two variables, midterm and final exam scores, as the rows. See the table below (not actual data). For example, the first-quartile column would give the number of students who scored between 0% and 25% in the midterm and in the final exam.
          First   Second   Third   Fourth
Midterm   2       2        5       7
Final     3       4        3       7
Part (b)
The major problem with using chi-square is that it discards the actual data points, working instead with the frequency of scores in each category. Moreover, a chi-square test does not identify which categories differ; it only indicates whether the variables differ overall.
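To make the mechanics concrete, the chi-square statistic for the counts in the table from Part (a) could be computed as sketched below. This is only an illustration of the calculation: expected counts are derived from the row and column totals in the usual way, and the variable names are illustrative. Several expected counts fall below 5, so the chi-square approximation would be rough for a table this sparse.

```python
# Counts copied from the table in Part (a).
observed = {
    "Midterm": [2, 2, 5, 7],
    "Final": [3, 4, 3, 7],
}

rows = list(observed.values())
row_totals = [sum(row) for row in rows]
col_totals = [sum(col) for col in zip(*rows)]
grand_total = sum(row_totals)

# Sum (observed - expected)^2 / expected over every cell.
chi_square = 0.0
for row, row_total in zip(rows, row_totals):
    for count, col_total in zip(row, col_totals):
        expected = row_total * col_total / grand_total
        chi_square += (count - expected) ** 2 / expected

df = (len(rows) - 1) * (len(col_totals) - 1)
print(f"chi-square = {chi_square:.3f}, df = {df}")
```

The result is a single number for the whole table, which illustrates the limitation noted above: it does not say which category drives any difference.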