# Statistics Study Material

Statistics and Probability
Question 1
(a) 95% confidence interval for the mean annual expenditure of all reader households in the USA
Mean = \$95.50
Standard deviation = \$50
Sample size = 100
Standard error = Standard deviation/sqrt (Sample size) = 5
The t value for a 95% confidence interval (99 degrees of freedom) = 1.98
Margin of error = t value * Standard error = 1.98*5 = 9.9
Lower limit of 95% confidence interval = Mean - Margin of error = 95.50 - 9.9 = 85.6
Upper limit of 95% confidence interval = Mean + Margin of error = 95.50 + 9.9 = 105.4
95% confidence interval = [85.6, 105.4]
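The arithmetic in part (a) can be checked with a short script. This is a minimal sketch: the function name `mean_confidence_interval` is illustrative, and the critical value 1.98 is taken from the working above rather than computed from a t distribution.

```python
import math

def mean_confidence_interval(mean, sd, n, critical_value):
    """Return (lower, upper) limits of a confidence interval for a mean."""
    standard_error = sd / math.sqrt(n)          # 50 / sqrt(100) = 5
    margin = critical_value * standard_error    # 1.98 * 5 = 9.9
    return mean - margin, mean + margin

# Values from part (a); 1.98 is the t value used in the working above.
lower, upper = mean_confidence_interval(mean=95.50, sd=50, n=100, critical_value=1.98)
print(f"95% confidence interval = [{lower:.1f}, {upper:.1f}]")
```

Passing the critical value as an argument keeps the sketch free of external libraries. A different confidence level only needs a different t value.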
(b) The interval remains valid because it is based on the t statistic rather than the z
statistic, and the sample is large (n = 100). With a sample of this size, the central limit
theorem makes the sampling distribution of the mean approximately normal even if household
expenditure is not normally distributed, so the validity of the confidence interval is not
compromised.
(c) Number of households in the US = 120 million
Proportion of reader households in the sample, p = 100/1000 = 0.1
95% confidence interval for the proportion
The z value for 95% confidence interval = 1.96
Standard error = sqrt (p*q/n) = sqrt (0.1*0.9/1000) = 0.009487, where q = 1 - p = 0.9
Margin of error = z value * Standard error = 1.96*0.009487 = 0.01859
Lower limit of 95% confidence interval = p - Margin of error = 0.1 - 0.01859 = 0.0814
Upper limit of 95% confidence interval = p + Margin of error = 0.1 + 0.01859 = 0.1186
95% confidence interval for population proportion = [0.0814, 0.1186]
Considering that there are 120 million households, the 95% confidence interval for the
number of reader households in the US = (0.0814*120 million, 0.1186*120 million) =
(9.768, 14.232) million.
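The proportion interval in part (c) and its scaling to the household count can be reproduced as follows. This is a minimal sketch: the function name `proportion_confidence_interval` is illustrative, and z = 1.96 is the value used in the working above.

```python
import math

def proportion_confidence_interval(successes, n, z):
    """Return (lower, upper) limits of a normal-approximation interval for a proportion."""
    p = successes / n
    standard_error = math.sqrt(p * (1 - p) / n)
    margin = z * standard_error
    return p - margin, p + margin

# Values from part (c): 100 reader households in a sample of 1000.
p_low, p_high = proportion_confidence_interval(successes=100, n=1000, z=1.96)

# Scale the proportion limits to the 120 million US households.
households_millions = 120
count_low = p_low * households_millions
count_high = p_high * households_millions

print(f"Proportion: [{p_low:.4f}, {p_high:.4f}]")
print(f"Reader households (millions): ({count_low:.2f}, {count_high:.2f})")
```

The script scales the unrounded proportion limits, so its household counts can differ in the third decimal place from the hand calculation, which rounds the limits to four decimals before multiplying.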
(d) Minimum sample size needs to be computed
Margin of error = 5
Confidence level = 95%
Standard deviation = 50
The z value for 95% confidence interval = 1.96
Minimum sample size = (z value * Standard deviation / Margin of error)^2
Minimum sample size = (1.96*50/5)^2 = 384.16, which is rounded up to 385
Additional units required = 385 - 100 = 285
An additional 285 households from the US population must therefore be sampled to
satisfy the requirement.
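The sample size calculation in part (d) can be sketched in a few lines. The function name `minimum_sample_size` is illustrative. The result is rounded up because a fractional household cannot be sampled, and rounding down would leave the margin of error slightly above the target.

```python
import math

def minimum_sample_size(z, sd, margin_of_error):
    """Smallest n for which z * sd / sqrt(n) does not exceed the margin of error."""
    return math.ceil((z * sd / margin_of_error) ** 2)

# Values from part (d); 100 households have already been sampled.
required = minimum_sample_size(z=1.96, sd=50, margin_of_error=5)
additional = required - 100

print(f"Minimum sample size = {required}")
print(f"Additional households required = {additional}")
```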
Question 2
(a) Percentage of variation in commercial cost that is explained by variation in Nielsen Rating
From the correlation matrix, the correlation coefficient between X1 and Y is 0.715.
Coefficient of determination (R square) = (Correlation coefficient)^2 = (0.715)^2 = 0.511
Hence, about 51.1% of the variation in commercial cost is explained by variation in Nielsen Rating.
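The conversion from correlation coefficient to explained variation can be checked with the sketch below. It relies on the fact that, for a simple linear regression with a single predictor, R square equals the square of the correlation coefficient. The variable names are illustrative.

```python
# Correlation between X1 (Nielsen Rating) and Y (commercial cost), from the working above.
r = 0.715

# With one predictor, R square is the squared correlation coefficient.
r_squared = r ** 2

print(f"R square = {r_squared:.3f} ({r_squared:.1%} of variation explained)")
```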
