Scientific Statistics - Analysis and Solutions

Added on 2023/06/08

AI Summary

This article provides solutions to various statistical problems related to binomial distribution, t-test, proportion test, and hypothesis testing. It also includes a comparison of sugar content in Coke Zero and Diet Coke.

Running head: SCIENTIFIC STATISTICS
Scientific Statistics
Name of the Student:
Name of the University:
Course ID:

Table of Contents
Answer 1
Answer 1. a)
Answer 1. b)
Answer 1. c)
Answer 2
Answer 2. a)
Answer 2. b)
Answer 2. c)
Answer 2. d)
Answer 2. e)
Answer 2. f)
Answer 2. g)
Answer 3
Answer 3. a)
Answer 3. b)
Answer 3. c)
Answer 3. d)
Answer 3. e)
Answer 3. f)
Answer 4
Answer 4. a)
Answer 4. b)
Answer 4. c)
Answer 4. d)
Answer 4. e)
Answer 5
Answer 5. a)
Answer 5. b)
Answer 5. c)
References
Appendix

Answer 1.
According to the 2016 census, 25% of the people living in NSW have no religion. A random
sample of 120 people living in NSW is selected.
Answer 1. a)
The CDF of the binomial distribution is given by
$$F(k; n, p) = \Pr(X \le k) = \sum_{i=0}^{k} \binom{n}{i} p^{i} (1-p)^{n-i}$$
Here, n = 120 and p = 0.25. "No more than 20% of the selected people" means at most
0.20 × 120 = 24 people, so the required probability is
$$\Pr(X \le 24) = F(24; 120, 0.25) \approx 0.12$$
Therefore, the probability that no more than 20% of the chosen people have no religion is
approximately 0.12.
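As a cross-check, the exact binomial probability can be evaluated numerically. The following is a minimal sketch in R, assuming that "no more than 20%" of 120 people means at most 24 people; the variable names are illustrative and are not part of the original solution.

```r
# Exact binomial probability Pr(X <= k) for X ~ Binomial(n, p)
n <- 120          # sample size from the question
p <- 0.25         # population proportion with no religion
k <- floor(n / 5) # 20% of the sample, i.e. 24 people (assumed reading)

# Built-in binomial CDF
p_exact <- pbinom(k, size = n, prob = p)

# The same quantity from the CDF formula: sum of the probability masses 0..k
p_by_sum <- sum(dbinom(0:k, size = n, prob = p))

print(c(k = k, p_exact = p_exact, p_by_sum = p_by_sum))
```

Both routes should agree, since `pbinom()` is the cumulative sum of `dbinom()`.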
Answer 1. b)
The conditions that are necessary for the normal approximation to the binomial are-
1) The sample size must be sufficiently large. A common rule of thumb requires np ≥ 10 and
n(1 − p) ≥ 10. Here np = 120 × 0.25 = 30 and n(1 − p) = 120 × 0.75 = 90, so the condition is
satisfied.
2) The observations must be independent Bernoulli trials with the same success probability p.
The central limit theorem then shows that the standardised sum of the trials, and equivalently
the standardised sample proportion, is approximately standard normal (Feller 2015):
$$Z = \frac{\sum X_i - np}{\sqrt{np(1-p)}} = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \xrightarrow{d} N(0, 1)$$

Answer 1. c)
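The approximated probability using a normal distribution is $\Pr(\hat{p} \le 0.20)$, where
p = 0.25 is the population proportion and n = 120:
$$Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} = \frac{0.20 - 0.25}{\sqrt{0.25(1-0.25)/120}} = \frac{-0.05}{0.03953} = -1.265$$
Hence, $\Pr(\hat{p} \le 0.20) \approx \Pr(Z \le -1.265) \approx 0.103$.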
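The normal approximation can be sketched in R as follows, assuming the event of interest is a sample proportion of at most 0.20 with p = 0.25 and n = 120. The continuity-corrected variant is an addition for comparison and is not part of the original answer.

```r
# Normal approximation to Pr(p_hat <= 0.20) for n = 120, p = 0.25
n <- 120
p <- 0.25
p_hat <- 0.20   # threshold proportion (assumed reading of the question)

# Rule-of-thumb check before using the approximation
stopifnot(n * p >= 10, n * (1 - p) >= 10)

se <- sqrt(p * (1 - p) / n)   # standard error of the sample proportion
z <- (p_hat - p) / se         # standardised threshold
p_normal <- pnorm(z)          # plain normal approximation

# Continuity correction: treat X <= 24 as X <= 24.5 on the count scale
z_cc <- (p_hat * n + 0.5 - n * p) / sqrt(n * p * (1 - p))
p_normal_cc <- pnorm(z_cc)

print(c(z = z, p_normal = p_normal, z_cc = z_cc, p_normal_cc = p_normal_cc))
```

The continuity-corrected value is usually closer to the exact binomial probability than the uncorrected one.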
Answer 2.
Answer 2. a)
The hypotheses are-
Null hypothesis (H0): The mean sugar content of ‘Diet Coke’ is equal to 0 gm (μ = 0).
Alternative hypothesis (HA): The mean sugar content of ‘Diet Coke’ is greater than 0 gm (μ > 0).
The hypotheses are tested using the sample of 25 selected cans.
Answer 2. b)
‘One Sample t-test’ is used in this context.
Answer 2. c)
The decision rule is that-
If the calculated p-value is greater than the level of significance (0.05), the null
hypothesis cannot be rejected. If the calculated p-value is less than the level of
significance, the null hypothesis is rejected.
Answer 2. d)
Output of One-sample t-test:

Answer 2. e)
The null hypothesis is rejected because the calculated p-value (2.2e-16) is less than the
level of significance (5%).
Answer 2. f)
It can be concluded that the mean sugar content of ‘Diet Coke’ is greater than 0 gm.
Therefore, the sample of 25 randomly selected cans provides strong evidence against the Diet
Coke slogan ‘No sugar, No Calories’.
Answer 2. g)
The assumptions for the ‘One sample t-test’ are-
• The observations of the numeric variable are independent of one another.
• The undertaken variable must be continuous in nature.
• The undertaken variable must be normally distributed.
• The variable should not contain any outliers.
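The normality and outlier assumptions listed above can be checked informally in R. The sketch below uses simulated values as a stand-in for the real measurements; the vector `sugar` here is hypothetical data generated for illustration only and is not the data set used in Answer 2.

```r
# Informal checks of the one-sample t-test assumptions on hypothetical data
set.seed(42)                                # reproducible illustration
sugar <- rnorm(25, mean = 0.2, sd = 0.05)   # hypothetical measurements, n = 25

# Shapiro-Wilk test: a small p-value suggests a departure from normality
sw <- shapiro.test(sugar)

# Points beyond 1.5 * IQR from the hinges are flagged as potential outliers
outliers <- boxplot.stats(sugar)$out

cat("Shapiro-Wilk p-value:", sw$p.value, "\n")
cat("Number of flagged outliers:", length(outliers), "\n")
```

With the real data, `sugar` would be replaced by the column read from the CSV file in the appendix.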
Answer 3.
Answer 3. a)
A one-sample t-test with 8 observations at the 5% level of significance has-
H0: μ = 0.
H1: μ > 0.
With n − 1 = 7 degrees of freedom, the critical right-tailed t-statistic = 1.894579.
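The critical value can be looked up with R's `qt()`. The sketch below prints the right-tailed 5% critical value for both 7 and 8 degrees of freedom, because a one-sample t-test on n observations uses n − 1 degrees of freedom; the variable names are illustrative.

```r
# Right-tailed critical values of the t distribution at the 5% level
alpha <- 0.05
n <- 8                                     # number of observations

t_crit_df7 <- qt(1 - alpha, df = n - 1)    # df = n - 1 for a one-sample t-test
t_crit_df8 <- qt(1 - alpha, df = n)        # shown only for comparison

print(c(df7 = t_crit_df7, df8 = t_crit_df8))
```

The null hypothesis is rejected when the observed t-statistic exceeds the critical value for the appropriate degrees of freedom.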

Answer 3. b)
If a null hypothesis can be rejected at the 1% level of significance, its p-value is less
than 0.01 and is therefore also less than any higher level of significance. As the 10% level
is greater than the 1% level, the null hypothesis can also be rejected at the 10% level of
significance.
Answer 3. c)
If a null hypothesis cannot be rejected at the 5% level of significance, then it cannot be
rejected at the 1% level of significance either. Failure to reject at the 5% level means that
the p-value is greater than 0.05, so it is also greater than 0.01. The 1% level demands
stronger evidence against the null hypothesis than the 5% level. Therefore, the null hypothesis
is certainly not rejected at the 1% level of significance.
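The reasoning in Answers 3. b) and 3. c) follows from comparing one fixed p-value with different significance levels. A small sketch in R, using a hypothetical helper `decide()` and hypothetical p-values chosen only for illustration:

```r
# Decision rule: reject H0 exactly when the p-value is below alpha
decide <- function(p_value, alpha) {
  if (p_value < alpha) "reject H0" else "fail to reject H0"
}

p_small <- 0.004   # hypothetical p-value that is rejected at the 1% level
p_mid <- 0.08      # hypothetical p-value that is not rejected at the 5% level

# Rejected at 1% implies rejected at 10%
print(c(at_1pct = decide(p_small, 0.01), at_10pct = decide(p_small, 0.10)))

# Not rejected at 5% implies not rejected at 1%
print(c(at_5pct = decide(p_mid, 0.05), at_1pct = decide(p_mid, 0.01)))
```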
Answer 3. d)
The two-sample proportion test at the 0.05 level of significance provides the test-statistic =
(-1.21). The hypotheses are-
H0: p1-p2 = 0.
H1: p1-p2 ≠ 0.
The two-tailed p-value is calculated as 0.226279. The calculated p-value is greater than the
level of significance (5%). Therefore, the null hypothesis cannot be rejected at the 5% level
of significance.
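The two-tailed p-value for a given Z-statistic can be reproduced with `pnorm()`. A minimal sketch in R, using the test-statistic quoted above; the variable names are illustrative:

```r
# Two-tailed p-value for a standard normal test statistic
z <- -1.21                           # test statistic from Answer 3. d)
alpha <- 0.05

p_two_sided <- 2 * pnorm(-abs(z))    # probability in both tails
reject <- p_two_sided < alpha        # TRUE would mean rejecting H0

print(c(p_value = p_two_sided, reject = reject))
```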
Answer 3. e)
As per the data, the 95% confidence interval for the true mean is (0.53, 0.87). Here,
H0: μ = 0.5 and H1: μ ≠ 0.5.
The upper confidence limit is $\bar{X} + Z_{1-\alpha/2} \cdot S.E.(\bar{X}) = 0.87$ (Steiger and Fouladi 2016).
The lower confidence limit is $\bar{X} - Z_{1-\alpha/2} \cdot S.E.(\bar{X}) = 0.53$.
The calculated mean is
$$\bar{X} = \frac{0.53 + 0.87}{2} = \frac{1.4}{2} = 0.7$$
Here, $Z_{1-\alpha/2} = 1.96$, where α = 0.05. Therefore,
$$S.E.(\bar{X}) = \frac{0.87 - 0.7}{1.96} = \frac{0.17}{1.96} = 0.0867$$

The calculated Z-statistic is
$$Z = \frac{\bar{X} - \mu}{S.E.(\bar{X})} = \frac{0.7 - 0.5}{0.0867} = \frac{0.2}{0.0867} = 2.30588$$
The two-tailed p-value obtained from the calculated Z-statistic is 0.021119. It is less than
the level of significance (5%). Therefore, the null hypothesis is rejected at the 5% level of
significance.
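The steps of Answer 3. e) can be retraced in R: recover the mean and standard error from the interval, then compute the Z-statistic and its two-tailed p-value. This is a sketch under the assumption, made in the answer above, that the interval is a normal-theory (Z-based) interval; the variable names are illustrative.

```r
# Recover the mean, standard error, Z-statistic and p-value from a 95% CI
lower <- 0.53
upper <- 0.87
mu0 <- 0.5                          # hypothesised mean under H0

z_crit <- qnorm(0.975)              # about 1.96 for a 95% interval
x_bar <- (lower + upper) / 2        # midpoint of the interval
se <- (upper - x_bar) / z_crit      # half-width divided by the critical value

z <- (x_bar - mu0) / se
p_value <- 2 * pnorm(-abs(z))       # two-tailed p-value

print(c(x_bar = x_bar, se = se, z = z, p_value = p_value))
```

The same conclusion can be read directly from the interval, because the hypothesised value 0.5 lies outside (0.53, 0.87).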
Answer 3. f)
Welch’s two-sample t-test is used for testing the equality of means of two independent
samples, particularly when the variances and sample sizes are unequal. However, the average
weights of the 10 experimental rats before and after treatment are better compared with the
paired t-test. Here, both sets of measurements come from the same rats in two different
situations, so the samples are dependent and of equal size.
Therefore, the paired t-test is preferable to Welch’s two-sample t-test.
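The difference between the two tests can be illustrated in R. The weights below are hypothetical values invented for illustration (they are not the data from the question); each rat gains a few grams, while the rats differ considerably from one another.

```r
# Paired versus Welch t-test on hypothetical before/after weights of 10 rats
before <- c(250, 262, 241, 270, 255, 248, 266, 259, 244, 252)   # hypothetical
gain <- c(4, 6, 3, 7, 5, 2, 8, 5, 4, 6)                          # hypothetical
after <- before + gain

# Paired test: works on the 10 within-rat differences, df = 10 - 1 = 9
paired <- t.test(after, before, paired = TRUE)

# Welch test: treats the two sets as independent samples
welch <- t.test(after, before, var.equal = FALSE)

print(c(paired_df = unname(paired$parameter), paired_p = paired$p.value))
print(c(welch_df = unname(welch$parameter), welch_p = welch$p.value))
```

Because the paired test removes the rat-to-rat variation, it detects the consistent gain that the Welch test tends to miss in data of this kind.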
Answer 4.
Answer 4. a)
The test statistic to estimate the proportion of all families that would find the program
effective is Z-statistic.
The calculated value of the Z-statistic is
$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} = \frac{0.54 - 0.5}{\sqrt{0.5(1-0.5)/120}} = \frac{0.04}{0.04564} = 0.876$$
Note that, here, $\hat{p} = x/n = 65/120 = 0.54$ is the calculated proportion, $p_0 = 0.5$ is the
hypothetical proportion and n = 120 is the sample size.
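The Z-statistic can be recomputed in R. The sketch below evaluates it twice, once with the proportion rounded to 0.54 as in the answer above and once with the unrounded 65/120, to show how much the rounding matters; the variable names are illustrative.

```r
# One-sample Z-statistic for a proportion
x <- 65      # families that found the program effective
n <- 120     # sample size
p0 <- 0.5    # hypothesised proportion

se0 <- sqrt(p0 * (1 - p0) / n)      # standard error under H0

z_rounded <- (0.54 - p0) / se0      # using the rounded proportion 0.54
z_exact <- (x / n - p0) / se0       # using the unrounded proportion 65/120

print(c(se0 = se0, z_rounded = z_rounded, z_exact = z_exact))
```

Both values lead to the same decision at the 5% level, although they differ in the second decimal place.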

Answer 4. b)
The hypotheses are-
H0: p ≥ 0.5
H1: p < 0.5
The calculated one-tailed p-value of the Z-statistic (0.876) is 0.190515 for the upper tail
(0.809485 for the lower tail stated in H1); both are greater than 0.05. Therefore, the
null hypothesis cannot be rejected at the 5% level of significance. Hence, the company’s claim
for the program ‘Sleeping well’ is consistent with the data.
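Which tail the p-value comes from depends on the direction of the alternative hypothesis. A short sketch in R that evaluates both tails for the Z-statistic quoted above; the variable names are illustrative:

```r
# One-tailed p-values for the Z-statistic of Answer 4. a)
z <- 0.876
alpha <- 0.05

p_upper <- pnorm(z, lower.tail = FALSE)   # used when H1 is p > p0
p_lower <- pnorm(z)                       # used when H1 is p < p0

print(c(p_upper = p_upper, p_lower = p_lower))
print(c(reject_upper = p_upper < alpha, reject_lower = p_lower < alpha))
```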
Answer 4. c)
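The margin of error for the 95% confidence interval must not be greater than 0.03.
$$M.E. = Z \sqrt{\frac{\hat{p}(1-\hat{p})}{n^*}} = 0.03$$
Or, $$\sqrt{\frac{\hat{p}(1-\hat{p})}{n^*}} = \frac{0.03}{1.96} = 0.0153$$
Or, $$\frac{\hat{p}(1-\hat{p})}{n^*} = 0.00023427738$$
Or, $$n^* = \frac{0.54(1-0.54)}{0.00023427738} = 1060.28$$
As the sample size must be a whole number and the margin of error must not exceed 0.03, the
value is rounded up. The number of families needed to construct a 95% confidence interval for
a similar trial so that the margin of error is not greater than 0.03 is 1061.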
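The sample-size calculation can be sketched in R as follows. The sketch uses Z = 1.96 and the planning proportion 0.54 from the answer above and rounds the result up, on the reasoning that rounding down would leave the margin of error slightly above 0.03; the variable names are illustrative.

```r
# Sample size needed for a given margin of error of a proportion
p_hat <- 0.54     # planning value for the proportion
margin <- 0.03    # maximum margin of error
z <- 1.96         # critical value for 95% confidence

n_raw <- p_hat * (1 - p_hat) * (z / margin)^2   # solves M.E. = margin for n
n_required <- ceiling(n_raw)                    # round up to a whole family

# Check that the achieved margin of error does not exceed the target
me_check <- z * sqrt(p_hat * (1 - p_hat) / n_required)

print(c(n_raw = n_raw, n_required = n_required, me_check = me_check))
```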
Answer 4. d)
The ‘Sleeping well’ program has proportion $\hat{p}_1 = 65/120 = 0.54$.
The ‘Sleeping well 2’ program has proportion $\hat{p}_2 = 0.665$ (reported for 135 of 200 families).
The pooled variance of the two sample proportions is
$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$$
where
$$S_1^2 = \frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} = \frac{0.54(1 - 0.54)}{120} = 0.00207$$
$$S_2^2 = \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2} = \frac{0.665(1 - 0.665)}{200} = 0.001113875$$
$$S_p^2 = \frac{(120 - 1) \times 0.00207 + (200 - 1) \times 0.001113875}{120 + 200 - 2} = 0.0014716702$$
The calculated pooled variance = 0.0014716702.
Answer 4. e)
We set the hypotheses-
H0: The effectiveness of ‘Sleeping well 2’ ($p_2$) is equal to the effectiveness of ‘Sleeping
well’ ($p_1$).
HA: The effectiveness of ‘Sleeping well 2’ ($p_2$) is higher than the effectiveness of
‘Sleeping well’ ($p_1$).
The calculated t-statistic is
$$t = \frac{\hat{p}_2 - \hat{p}_1}{\sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{0.665 - 0.54}{\sqrt{0.0014716702 \left( \frac{1}{120} + \frac{1}{200} \right)}} = 28.2186$$
(De Winter 2013).
Degrees of freedom = $n_1 + n_2 - 2 = 120 + 200 - 2 = 318$.
The calculated p-value of T (28.2186, 318) = 0.00001.
As the calculated p-value (0.00001) is less than 0.05, the null hypothesis of equality of the
two proportions is rejected in favour of the alternative hypothesis. It can be interpreted that
the effectiveness of ‘Sleeping well 2’ is higher than the effectiveness of ‘Sleeping well’. The
hypothesis test indicates an improvement in effectiveness.
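As a cross-check, the comparison can also be run as a textbook pooled two-proportion z-test. This is a different calculation from the t-statistic used above, sketched here under the assumption that the counts are 65 of 120 and 135 of 200 as quoted in Answer 4. d); with these counts the second sample proportion evaluates to 135/200 = 0.675. The variable names are illustrative.

```r
# Pooled two-proportion z-test, one-sided (H1: p2 > p1)
x1 <- 65;  n1 <- 120    # 'Sleeping well' (counts as quoted, assumed correct)
x2 <- 135; n2 <- 200    # 'Sleeping well 2' (counts as quoted, assumed correct)

p1 <- x1 / n1
p2 <- x2 / n2
p_pool <- (x1 + x2) / (n1 + n2)    # pooled proportion under H0

# Standard error of the difference under H0
se <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z <- (p2 - p1) / se
p_value <- pnorm(z, lower.tail = FALSE)   # upper-tail p-value

print(c(p1 = p1, p2 = p2, p_pool = p_pool, z = z, p_value = p_value))
```

Under these assumptions the z-test also rejects the null hypothesis at the 5% level, so the direction of the conclusion is unchanged.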
Answer 5.
Answer 5. a)
Null Hypothesis (H0): The average sugar content of Coke Zero and Diet Coke does not
differ.
Alternative Hypothesis (HA): The average sugar content of Coke Zero and Diet Coke
differs.
The two-sample t-test is applied to test the difference between the average sugar content of
Coke Zero and Diet Coke. The calculated t-statistic (0.95939, d.f. = 24) gives the p-value =
0.3469. The calculated p-value is greater than 0.1 (assumed level of significance).
Hence, the null hypothesis cannot be rejected at the 10% level of significance. The difference
between the average sugar levels of the two kinds of Coke is not statistically significant.
Therefore, the data provide no evidence that the average sugar level of Coke Zero differs from
the average sugar level of Diet Coke.
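The reported p-value can be reproduced from the t-statistic and its degrees of freedom with `pt()`. A minimal sketch in R, using only the figures quoted above; the variable names are illustrative:

```r
# Two-tailed p-value from a t-statistic and its degrees of freedom
t_stat <- 0.95939
df <- 24
alpha <- 0.10                             # assumed level of significance

p_value <- 2 * pt(-abs(t_stat), df = df)  # probability in both tails
reject <- p_value < alpha                 # TRUE would mean rejecting H0

print(c(p_value = p_value, reject = reject))
```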
Answer 5. b)
The statistical assumptions of the two-sample t-test are-

• Both populations must be normal in nature.
• The standard deviations of both populations must be equal. Hence, the variances would be
homogeneous.
• Both samples have to be randomly drawn, independently of each other.
• The two samples must be reasonably large in size.
The assumption of normality appears to hold for both variables. The samples are also selected
randomly and are independent of each other. However, the standard deviations of the two
variables are not equal. Also, the sample sizes are not adequately large (n = 25). Hence, some
assumptions are satisfied here and some are violated.
Answer 5. c)
A paired-sample t-test compares two means of the same variable measured on the same or
matched units, for example at two different times (pre-test and post-test), and tests whether
the mean difference is equal to 0.
On the other hand, the two-sample t-test compares the means of independent samples drawn
from different populations and tests whether the difference between the two means is equal to
0.
As ‘Coke Zero’ and ‘Diet Coke’ come from two different populations, the two-sample t-test is
preferable to the paired-sample t-test.

References:
De Winter, J.C., 2013. Using the Student's t-test with extremely small sample sizes. Practical Assessment, Research & Evaluation, 18(10).
Feller, W., 2015. On the normal approximation to the binomial distribution. In Selected Papers I (pp. 655-665). Springer, Cham.
Steiger, J.H. and Fouladi, R.T., 2016. Noncentrality interval estimation and the evaluation of statistical models. What if there were no significance tests, pp.197-229.

Appendix:
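R-code for Answer 2.
# Read the sugar measurements of the sampled Diet Coke cans
Dataset <- read.table("C:/Users/HP/Downloads/790841/CokeQ2.csv",
  header=TRUE, sep="", na.strings="NA", dec=".", strip.white=TRUE)
# One-sample, right-tailed t-test of H0: mu = 0 against HA: mu > 0
with(Dataset, t.test(sugar, alternative='greater', mu=0.0,
  conf.level=.95))
R-code for Answer 5.
# Read the sugar measurements of Diet Coke and Coke Zero
Dataset <- read.table("C:/Users/HP/Downloads/790841/Comparison.csv",
  header=TRUE, sep=",", na.strings="NA", dec=".", strip.white=TRUE)
# Two-sided t-test at the 90% confidence level.
# paired=TRUE tests the within-row differences, so d.f. = n - 1.
with(Dataset, t.test(Diet_Coke_sugar, Coke_Zero_sugar,
  alternative='two.sided', conf.level=.90, paired=TRUE))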