Statistics Assignment: Experiments, Surveys, and Confidence

Added on 2022/12/16

AI Summary

This statistics assignment solution covers a range of topics including experimental design, sampling techniques, and statistical inference. The assignment begins with an experiment analysis, focusing on the manipulation of variables and the plausibility of observed differences. It then delves into survey methodology, exploring random sampling, non-response bias, selection bias, and behavioral considerations. The solution also examines bootstrap confidence intervals for median wait times and the differences between coffee types. Furthermore, the assignment includes the calculation and interpretation of confidence intervals for mean delivery times, as well as an analysis of sampling situations and the calculation and interpretation of confidence intervals for proportions. The document provides detailed explanations and justifications for each answer, making it a comprehensive resource for statistics students.

Statistics
Student Name:
Instructor Name:
Course Number:
28 April 2019

Question 1:
a) Briefly explain why this study is an experiment.
Answer
The study is an experiment because the researchers manipulate a variable (the speed
setting, set to either 1 or 3) in order to study its effect.
b)
i) Randomization test
Answer
ii) When chance is acting alone, estimate how unusual it would be to get a difference
between the two group means at least as big as the observed difference?
Answer
It would be 13.83
iii) Is it plausible that the observed difference between the two group means can be
explained by “chance acting alone”? Briefly justify your answer.
Answer

Yes, it is plausible that the observed difference between the two group means can
be explained by “chance acting alone”, since both the treatment effect (speed
setting) and chance could have produced the observed difference.
iv) Can we conclude that changing the speed setting from 1 to 3 causes the average
survival time to decrease? If so, justify why with two reasons. If not, what can we
conclude?
Answer
Yes, we can conclude that changing the speed setting from 1 to 3 causes the
average survival time to decrease. This is because:
• Speed setting 1 has a clearly higher mean survival time than speed
setting 3.
• The probability is small, indicating a significant treatment effect.
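The randomization test used in part b) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the survival times below are hypothetical (the assignment's data are not reproduced here), and the function name `randomization_test` is invented for this sketch. It repeatedly re-randomises the group labels and counts how often chance alone gives a difference in means at least as large as the observed one.

```python
import random
from statistics import mean

def randomization_test(group_a, group_b, n_shuffles=10_000, seed=0):
    """Return (observed difference, tail proportion) for mean(a) - mean(b)."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # re-randomise which unit gets which label
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:  # at least as big as what was observed
            count += 1
    return observed, count / n_shuffles

# Hypothetical survival times, NOT the assignment's data.
speed_1 = [12.1, 15.3, 14.8, 16.0, 13.5, 15.9]
speed_3 = [11.0, 12.4, 13.1, 10.8, 12.9, 11.7]
observed, tail = randomization_test(speed_1, speed_3)
print(f"observed difference = {observed:.2f}, tail proportion = {tail:.4f}")
```

A small tail proportion means the observed difference would be unusual under chance alone; a large one means chance alone remains a plausible explanation.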
Question 2:
a) Briefly explain how you could use a table of random digits to select 50 out of the 953 areas to
survey.
Answer
We need to select numbers between 1 and 953. Label the areas 001 to 953 and pick a
random starting point in the table. Read the digits in groups of three; each number
between 001 and 953 identifies an area to be surveyed (for example, #941). Move down
the column, skipping 000, numbers above 953, and repeats, until 50 areas are
selected.
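The selection procedure can be mimicked in Python as a rough sketch. The assumption here is that a seeded pseudo-random generator stands in for the printed table of random digits; the function name `select_areas` and the seed are illustrative only.

```python
import random

def select_areas(n_areas=953, n_select=50, seed=1):
    """Pick n_select distinct area labels from 1..n_areas using 3-digit draws."""
    rng = random.Random(seed)
    chosen = []
    while len(chosen) < n_select:
        number = rng.randrange(1000)  # stands in for the next three digits (000-999)
        # Skip 000, labels above n_areas, and labels already chosen.
        if 1 <= number <= n_areas and number not in chosen:
            chosen.append(number)
    return chosen

areas = select_areas()
print(sorted(areas))
```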
b) Is non-response bias a potential problem with the survey? Briefly justify your answer.
Answer

Yes. Non-response bias is possible because some areas might be difficult to reach,
so no data would be collected from them.
c) Is selection bias a potential problem with the survey? Briefly justify your answer.
Answer
No. Selection bias is not a potential problem with the given survey because the areas
are randomly selected hence reducing the selection bias.
d) Are behavioural considerations a potential problem with the survey? Briefly justify
your answer.
Answer
Behavioural considerations are not a potential problem in this study. This is based on
the fact that we are not dealing with human beings in the study nor living things that
might be influenced by the change in behavioural issues.
e) Can the results from this survey be applied to another desert 300 kilometres away?
Briefly justify your answer.
Answer
Yes the results from this survey can be applied to another desert 300 kilometres away
since a scientific method was applied. This scientific method/approach allows for
replicability of the survey in other areas.
f) Is the survey subject to sampling errors? Briefly justify your answer.
Answer

Yes, the survey is subject to sampling error. The survey uses a sample rather than
the entire population, so the sample statistics may differ from the corresponding
population values.
Question 3:
a)
i) Why would the median be a better estimate of the centre of this data than the
mean?
Answer
The median would be a better estimate of the centre of this data than the mean
because the data are skewed rather than normally distributed, and the median is
less affected by extreme values.
ii) Generate a bootstrap confidence interval for the median wait time of customers.
Answer
iii) What is the parameter we are estimating using this bootstrap confidence interval?
Answer

The parameter we are estimating using this bootstrap confidence interval is the
median wait time of all customers. That is, the median time customers have to
wait for their coffee after placing an order.
iv) Do we know the true value of this parameter?
Answer
No, we do not know the true value of this parameter. The confidence interval only
gives a range of plausible values for it.
v) Interpret the bootstrap confidence interval.
Answer
We are 95% confident that the true population median wait time for coffee after
placing an order is between 45 and 110.
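A percentile bootstrap interval for a median can be sketched as follows. The wait times are hypothetical (the assignment's data are not reproduced here), and the function name `bootstrap_median_ci` and the choice of 5,000 resamples are assumptions made for illustration.

```python
import random
from statistics import median

def bootstrap_median_ci(data, n_resamples=5_000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the median."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement, same size as the original sample.
    medians = sorted(median(rng.choices(data, k=n)) for _ in range(n_resamples))
    tail = (1 - level) / 2
    lower = medians[round(tail * n_resamples)]
    upper = medians[round((1 - tail) * n_resamples) - 1]
    return lower, upper

# Hypothetical wait times, NOT the assignment's data.
wait_times = [35, 42, 48, 50, 55, 61, 70, 85, 96, 120, 150, 240]
lower, upper = bootstrap_median_ci(wait_times)
print(f"95% bootstrap interval for the median: ({lower}, {upper})")
```

The interval describes plausible values for the population median, not for individual wait times.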
b)
i) Generate a bootstrap confidence interval for the difference in the median wait
time between regular and fancy coffees.
Answer

ii) What is the parameter we are estimating using this bootstrap confidence interval?
Answer
The parameter we are interested in estimating is the difference in the median wait time
for coffee after placing an order based on the type of coffee (either Fancy or Regular).
iii) Interpret the bootstrap confidence interval.
Answer
The wait time for fancy coffee is noticeably longer: the median wait time for fancy
coffee is between 100.06 (lower limit) and 128.14 (upper limit), whereas the median
wait time for regular coffee is between 42.12 (lower limit) and 52.81 (upper limit).
iv) Based on the bootstrap confidence interval, is it plausible that the median wait
time for regular coffees is the same as the median wait time for the fancy coffees?
Briefly justify your answer.
Answer
It is not plausible that the median wait time for regular coffees is the same as the
median wait time for the fancy coffees. This is because zero is not in the bootstrap
confidence interval, so a difference of zero – i.e. no difference – is not believable
in this case.
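The "is zero inside the interval?" reasoning can be illustrated with a short sketch of a bootstrap interval for a difference in medians. Both samples are hypothetical stand-ins for the fancy and regular wait times, and the function name `bootstrap_median_diff_ci` is invented for this example.

```python
import random
from statistics import median

def bootstrap_median_diff_ci(group_a, group_b, n_resamples=5_000, level=0.95, seed=0):
    """Percentile bootstrap interval for median(group_a) - median(group_b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group separately, with replacement.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(median(resample_a) - median(resample_b))
    diffs.sort()
    tail = (1 - level) / 2
    lower = diffs[round(tail * n_resamples)]
    upper = diffs[round((1 - tail) * n_resamples) - 1]
    return lower, upper

# Hypothetical wait times, NOT the assignment's data.
fancy = [98, 105, 110, 112, 118, 121, 127, 133]
regular = [39, 42, 44, 46, 47, 49, 52, 55]
lower, upper = bootstrap_median_diff_ci(fancy, regular)
zero_is_plausible = lower <= 0 <= upper
print(f"interval = ({lower}, {upper}), zero plausible: {zero_is_plausible}")
```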

Question 4:
The data are displayed below;
17.9, 20.3, 28.4, 27.2, 19.6, 32.9, 16.3, 29.7, 18.5, 27.4
Summary Statistics: x̄ = 23.82, s = 5.89
a) Calculate and interpret a 95% confidence interval for the mean delivery time.
Answer
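\[ \text{C.I.}: \quad \bar{x} \pm z_{\alpha/2} \times \frac{s}{\sqrt{n}} \]
We have:
\[ z_{\alpha/2} = 1.96, \quad \bar{x} = 23.82, \quad s = 5.89, \quad n = 10 \]
\[ \bar{x} \pm z_{\alpha/2} \times \frac{s}{\sqrt{n}} = 23.82 \pm 1.96 \times \frac{5.89}{\sqrt{10}} = 23.82 \pm 3.65066 \]
Lower limit: 23.82 - 3.65066 = 20.16934
Upper limit: 23.82 + 3.65066 = 27.47066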
From the above results, we are 95% confident that the true population mean time for
delivering pizza from the local pizzeria is between 20.169 and 27.471 minutes.
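The interval can be checked with a short Python sketch that uses the ten delivery times listed in the question. It follows the same z-based formula as the answer; the variable names are illustrative. Because it works with unrounded values of s and z, the limits may differ from the hand calculation in the later decimal places.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Delivery times from the question (minutes).
times = [17.9, 20.3, 28.4, 27.2, 19.6, 32.9, 16.3, 29.7, 18.5, 27.4]

n = len(times)
x_bar = mean(times)
s = stdev(times)                 # sample standard deviation (n - 1 divisor)
z = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval
margin = z * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(f"{x_bar:.2f} +/- {margin:.3f} -> ({lower:.3f}, {upper:.3f})")
```

With only n = 10 observations, a t critical value (about 2.262 for 9 degrees of freedom) is often used in place of 1.96, which would give a somewhat wider interval.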
b) After being shown the confidence interval, a friend of the statistics student pointed out
that the latest pizza they ordered arrived quicker than the lower value of the confidence
interval and suggests the company is improving. Explain what was wrong with the
thinking.
Answer

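The confidence interval estimates the mean delivery time, not the time of an individual
delivery, so a single quick delivery can easily fall below the lower limit. The sample
used by the friend of the statistics student (one pizza) is too small to warrant such a
conclusion. A larger, randomly selected sample would be needed to make such a
conclusion.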
Question 5:
a) State the sampling situation (a, b or c) for calculating the standard error of the difference
in the following scenarios:
i) For people who had researched the country they visited before going overseas,
estimating the difference between the proportion of females who researched
currency and the proportion of females who researched shopping/food.
Answer
Situation (b): One sample of size n, several response categories
ii) For people who had travelled overseas, estimating the difference between the
proportion of people under 40 who thought children were the biggest irritation
while travelling and the proportion of people 40 or over who thought children
were the biggest irritation while travelling.
Answer
Situation (a): Proportions from two independent samples
iii) For people who had travelled overseas, estimating the difference between the
proportion who researched the country they visited before going overseas and the
proportion of who would take out travel insurance.
Answer
Situation (c): One sample of size n, many yes/no items

b) For people who would take out travel insurance, calculate and interpret a 95% confidence
interval for the difference between the proportion of females who did so to cover the cost
of emergency medical treatment and the proportion of males who did so to cover the cost
of emergency medical treatment.
Answer
Proportion of males: \( \hat{p}_1 = \frac{571}{2613} = 0.2185 \)
Proportion of females: \( \hat{p}_2 = \frac{897}{4148} = 0.2162 \)
The confidence interval is computed as follows:
\[ (\hat{p}_1 - \hat{p}_2) \pm Z_{1-\alpha/2} \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \]
\[ \hat{p}_1 = 0.2185, \quad n_1 = 2613, \quad \hat{p}_2 = 0.2162, \quad n_2 = 4148 \]
Therefore we have:
\[ (0.2185 - 0.2162) \pm 1.96 \sqrt{\frac{0.2185(1-0.2185)}{2613} + \frac{0.2162(1-0.2162)}{4148}} \]
\[ = 0.0023 \pm 1.96 \times 0.010305 = 0.0023 \pm 0.020199 \]
Lower limit: 0.0023 - 0.020199 = -0.017899
Upper limit: 0.0023 + 0.020199 = 0.022499
From the above calculations, we are 95% confident that the true population difference
between the proportion of males and the proportion of females who would take out travel
insurance to cover the cost of emergency medical treatment is between -0.017899 and
0.022499.
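The calculation for two independent samples can be reproduced with a small Python sketch. The counts (571 of 2613 and 897 of 4148) come from the answer above; the function name `two_proportion_ci` is an assumption for illustration. Since the sketch uses unrounded proportions, its limits differ slightly from the hand calculation in the later decimal places.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, level=0.95):
    """Confidence interval for p1 - p2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    # Standard error of the difference for independent samples.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    diff = p1 - p2
    return diff - z * se, diff + z * se

lower, upper = two_proportion_ci(571, 2613, 897, 4148)
print(f"95% CI for the difference: ({lower:.6f}, {upper:.6f})")
```

Because the interval includes zero, a difference of zero between the two proportions remains plausible.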