Statistical Analysis Case Study: Graduation Rates and Study Time

Added on 2020/04/13

AI Summary

This case study analyzes two scenarios using statistical methods. Case 1 examines a college's graduation rate, employing a z-test to test a hypothesis about the proportion of graduating students. The analysis includes defining hypotheses, calculating the z-statistic, determining the p-value, establishing a decision rule, and drawing conclusions based on a 95% confidence interval. Case 2 investigates study time per week, using a t-test to evaluate the relationship between study hours and achieving above-average grades. This analysis involves defining hypotheses, calculating the t-statistic, determining the p-value, establishing a decision rule, and drawing conclusions based on a 95% confidence interval. Both cases provide a comprehensive application of statistical techniques to real-world scenarios, offering insights into data analysis and hypothesis testing.

STAT ANALYSIS CASE STUDY
ASSIGNMENT 3

Case 1: Analysing your College’s School Graduation Rate
Number of students in the sample = 200
Number of graduated students in the sample = 165
Claimed graduation rate = 77%
(a) (i) The case information shows that the sample size (n = 200) is well above 30, so the binomial distribution of the number of graduates can be approximated by a normal distribution.
Additionally,
$$n = 200, \quad \hat{p} = \frac{165}{200} = 0.825$$
Thus, $n\hat{p} = 165 > 10$
Also, $n\hat{p}(1-\hat{p}) = 28.875 > 10$
Hence, it is fair to conclude that the sampling distribution of the sample proportion is approximately normal, and a z test is used to test the hypothesis.
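As a quick check of the two conditions above, the following minimal sketch recomputes them from the case figures. The variable names are illustrative only, and the threshold of 10 is the rule of thumb used in this answer.

```python
# Check the large-sample conditions for a normal approximation to a proportion.
n = 200            # sample size (from the case data)
graduated = 165    # graduates in the sample (from the case data)
THRESHOLD = 10     # rule-of-thumb cut-off used in the text

p_hat = graduated / n
expected_successes = n * p_hat                  # n * p
variance_term = n * p_hat * (1 - p_hat)         # n * p * (1 - p)

normal_ok = expected_successes > THRESHOLD and variance_term > THRESHOLD
print(f"p_hat = {p_hat:.3f}, np = {expected_successes:.1f}, "
      f"np(1-p) = {variance_term:.3f}, normal approximation ok: {normal_ok}")
```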
(ii) In the given case, population proportion would be relevant instead of mean which may be
attributed to the fact that the focus of the given claim is not on the number of students that
would graduate but rather on ascertaining the percentage of candidates that would pass their
graduation.
(iii) The z statistic is computed as shown below:
Sample proportion $\hat{p} = 0.825$
Sample size $n = 200$
Hypothesized population proportion $p_0 = 0.77$
$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} = \frac{0.825 - 0.77}{\sqrt{\dfrac{0.77(1-0.77)}{200}}} = 1.848$$
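The same calculation can be sketched in a few lines of Python. The function name `one_proportion_z` is a hypothetical helper introduced here for illustration. It uses the hypothesized proportion in the standard error, as in the formula above.

```python
import math

def one_proportion_z(p_hat: float, p0: float, n: int) -> float:
    """z statistic for a one-sample proportion test (standard error based on p0)."""
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

# Figures from the case: 165 graduates out of 200, claimed rate 0.77.
z = one_proportion_z(p_hat=165 / 200, p0=0.77, n=200)
print(round(z, 3))
```

The result should agree with the hand calculation of about 1.848.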
(b) Hypothesis test steps to check the validity of the claim
1. Defining the hypotheses
Null hypothesis H0 : p=0.77
Alternative hypothesis H1 : p ≠ 0.77

2. The value of z statistic and significance level
$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} = \frac{0.825 - 0.77}{\sqrt{\dfrac{0.77(1-0.77)}{200}}} = 1.848$$
Significance level $\alpha = 0.01$
3. The p value for conclusion
The p value is determined from the z statistic. In the present case, the p value for z = 1.848 and a two-tailed test is 0.0646.
4. Defining decision rule
The null hypothesis is rejected only when the p value is less than the significance level. Conversely, the null hypothesis is not rejected when the p value is greater than the significance level.
5. Final conclusion
The p value (0.0646) is greater than the significance level (0.01), so the null hypothesis is not rejected and the alternative hypothesis is not accepted. Therefore, there is insufficient evidence to conclude that the percentage of graduating students differs from the claimed 77%.
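The five steps above can be traced in a short sketch that relies only on Python's standard library (`statistics.NormalDist` supplies the standard normal CDF). The variable names are illustrative, and the significance level is the 0.01 assumed in step 2.

```python
import math
from statistics import NormalDist

# Case figures and the significance level assumed in the text.
p_hat, p0, n = 165 / 200, 0.77, 200
alpha = 0.01

# Step 2: test statistic, with the standard error computed under H0.
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Step 3: two-tailed p value, P(|Z| >= |z|) for a standard normal Z.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Steps 4 and 5: reject H0 only if the p value is below alpha.
reject_h0 = p_value < alpha
print(f"z = {z:.3f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

The p value should come out close to the 0.0646 quoted in step 3, so the null hypothesis is not rejected at the 0.01 level.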
(c) 95% confidence interval (for population proportion of graduating students)
95% confidence interval $= \hat{p} \pm z^{*}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
z value for 95% confidence interval = 1.96
Upper limit $= \hat{p} + z^{*}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} = 0.825 + 1.96\sqrt{\dfrac{0.825(1-0.825)}{200}} = 0.878$
Lower limit $= \hat{p} - z^{*}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} = 0.825 - 1.96\sqrt{\dfrac{0.825(1-0.825)}{200}} = 0.772$
Hence, 95% confidence interval = [0.772, 0.878]
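A minimal sketch of the interval calculation follows. The helper name `proportion_ci` is hypothetical, and the critical value is obtained from `NormalDist().inv_cdf` rather than typed in as 1.96, so the limits may differ from the hand calculation in the third decimal place.

```python
import math
from statistics import NormalDist

def proportion_ci(p_hat: float, n: int, confidence: float = 0.95):
    """Normal-approximation confidence interval for a population proportion."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

lower, upper = proportion_ci(165 / 200, 200)
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
```

Note that the interval uses the sample proportion in the standard error, whereas the z test uses the hypothesized proportion.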

Case 2: Analysing Study Time per Week
(a) Name of class and the hours per week studying for 25 students are shown below:
(b) (i) The case information and the collected data set show that the standard deviation of the population is unknown. Moreover, the sample size (n = 25) is lower than 30, so the large-sample normal approximation behind the z statistic cannot be relied upon. Hence, a t test is used to test the hypothesis.
(ii) In the given problem, the focus is the population mean rather than the population proportion. This is because the claim concerns the mean hours required to score an above-average grade and does not pertain to the proportion of students who agree with it. Thus, an estimate of the mean hours is required, which necessitates the use of the population mean and not the population proportion.
(iii) The t statistic would be computed as shown below:

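$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
$\bar{x}$ = mean of sample = 8.28
$\mu$ = mean of population = 5
$s$ = standard deviation of sample = 2.836
$n$ = sample size = 25
$$t = \frac{8.28 - 5}{2.836/\sqrt{25}} = 5.782$$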
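For illustration, the t statistic can be recomputed from the summary figures listed above (sample mean 8.28, hypothesized mean 5, sample standard deviation 2.836, 25 students). The function name `one_sample_t` is a hypothetical helper, not part of any library.

```python
import math

def one_sample_t(sample_mean: float, mu0: float, sample_sd: float, n: int) -> float:
    """t statistic for a one-sample test of the mean."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - mu0) / standard_error

t_stat = one_sample_t(sample_mean=8.28, mu0=5, sample_sd=2.836, n=25)
df = 25 - 1  # degrees of freedom, n - 1
print(f"t = {t_stat:.2f}, df = {df}")
```

Because the summary figures are rounded, the third decimal place of the result may differ slightly from a value computed on the raw data.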
(iv) Hypothesis test steps to check the validity of the claim
1. Defining the hypotheses
Null hypothesis H0 : μ ≤5 hr
Alternative hypothesis H1 : μ> 5 hr
2. The value of t statistic
$t = 5.782$
Degrees of freedom $= n - 1 = 25 - 1 = 24$
Assuming significance level $\alpha = 0.05$
3. The p value for conclusion
The p value is determined from the t statistic, the degrees of freedom and whether the test is one-tailed or two-tailed. In the present case, the p value for degrees of freedom = 24, t statistic = 5.782 and a one-tailed test is less than 0.0001.
4. Defining decision rule
The null hypothesis is rejected only when the p value is less than the significance level. Conversely, the null hypothesis is not rejected when the p value is greater than the significance level.
5. Final conclusion
The p value is less than the significance level, so the null hypothesis is rejected in favour of the alternative hypothesis. Therefore, the data support the claim that the mean study time exceeds 5 hours per week, that is, “students who study more than 5 hours per week get above-average grades in any subject.”
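The one-tailed p value in step 3 can be approximated without a statistics package. The sketch below is one possible approach, not necessarily how the figure above was obtained: it integrates the Student's t density numerically with Simpson's rule, and the helper names `t_pdf` and `t_upper_tail` are hypothetical.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """P(T > t) for t >= 0: 0.5 minus the Simpson's-rule integral of the pdf on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Summary figures and significance level from the text.
t_stat = (8.28 - 5) / (2.836 / math.sqrt(25))
df = 24
alpha = 0.05

p_value = t_upper_tail(t_stat, df)   # one-tailed, H1: mu > 5
reject_h0 = p_value < alpha
print(f"t = {t_stat:.3f}, one-tailed p = {p_value:.2e}, reject H0: {reject_h0}")
```

The computed p value is far below 0.05, so the decision rule leads to rejecting the null hypothesis.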
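(c) 95% confidence interval (for population mean number of hours studied per week)
95% confidence interval $= \bar{x} \pm t^{*}\left(\dfrac{s}{\sqrt{n}}\right)$
Critical t value for 95% confidence and 24 degrees of freedom = 2.064
Upper limit $= \bar{x} + t^{*}\left(\dfrac{s}{\sqrt{n}}\right) = 8.28 + \dfrac{2.064 \times 2.836}{\sqrt{25}} = 9.45$
Lower limit $= \bar{x} - t^{*}\left(\dfrac{s}{\sqrt{n}}\right) = 8.28 - \dfrac{2.064 \times 2.836}{\sqrt{25}} = 7.11$
Hence, 95% confidence interval = [7.11, 9.45]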
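A 95% interval for a mean is conventionally built from the two-sided critical value of the t distribution (about 2.064 for 24 degrees of freedom, taken here from a standard t table) rather than from the test statistic. The sketch below computes the interval that way from the summary figures in the case. The helper name `mean_ci` and the constant name are hypothetical.

```python
import math

def mean_ci(sample_mean: float, sample_sd: float, n: int, t_crit: float):
    """Confidence interval for a population mean, given a t critical value."""
    margin = t_crit * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Assumption: two-sided 95% critical value for df = 24 from a standard t table.
T_CRIT_95_DF24 = 2.064

lower, upper = mean_ci(sample_mean=8.28, sample_sd=2.836, n=25, t_crit=T_CRIT_95_DF24)
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```

The whole interval lies above 5 hours, which is consistent with rejecting the null hypothesis in the one-tailed test.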