SIT191: Problem Solving Task 2 - Probability, Hypothesis, Confidence

Added on 2023/06/06

AI Summary

This document presents a comprehensive solution to Problem Solving Task 2 for the SIT191 course. The assignment covers various statistical concepts, including probability calculations related to blood types, sampling techniques, and simulation. It explores binomial and normal distributions, applying them to problems involving learner drivers and the heights of boys. Furthermore, the solution delves into hypothesis testing, constructing and interpreting confidence intervals, and analyzing data related to pain relief and asbestos exposure. The document provides detailed explanations and calculations for each problem, demonstrating a strong understanding of statistical principles and their application in real-world scenarios. The solution addresses topics like experimental design, observational studies, and the interpretation of statistical significance, making it a valuable resource for students studying statistics and probability.

PROBLEM SOLVING
TASK 2
SIT191

2.1
Proportion of the population with blood type O = 49%
Proportion of the population with blood type A = 38%
Proportion of the population with blood type B = 10%
Proportion of the population with blood type AB = 3%
(a) Random numbers from 00 to 99 are a suitable technique to represent the different blood
groups, with each group assigned a range of numbers in proportion to its frequency.
Blood group O 00 – 48
Blood group A 49 – 86
Blood group B 87 – 96
Blood group AB 97 – 99
(b) Simulation to estimate the number of donors required to ensure that at least three
people with blood group O and one person with blood group A are included
Each simulation trial needs a minimum of four donors.
The final estimate would be calculated as shown below.
Therefore, it can be concluded that approximately 8 donors are required to ensure that at
least three people with blood group O and one person with blood group A are included.
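The simulation described in (a) and (b) can be sketched in Python as follows. This is a minimal illustration rather than the method used for the manual trials: the function names, the number of trials and the seed are assumptions. The long-run average it prints depends on those choices, so it need not match an estimate obtained from a handful of manual trials.

```python
import random


def blood_type(number):
    """Map a random number from 0 to 99 to a blood group using the ranges in (a)."""
    if number <= 48:
        return "O"
    if number <= 86:
        return "A"
    if number <= 96:
        return "B"
    return "AB"


def donors_needed(rng):
    """Run one trial: count donors until at least three O and one A have appeared."""
    count_o = 0
    count_a = 0
    donors = 0
    while count_o < 3 or count_a < 1:
        group = blood_type(rng.randrange(100))
        donors += 1
        if group == "O":
            count_o += 1
        elif group == "A":
            count_a += 1
    return donors


def estimate_donors(trials=10_000, seed=191):
    """Average the number of donors over many trials (trials and seed are assumptions)."""
    rng = random.Random(seed)
    return sum(donors_needed(rng) for _ in range(trials)) / trials


average_donors = estimate_donors()
print(f"Average number of donors needed: {average_donors:.2f}")
```

Each trial needs at least four donors, which agrees with the minimum stated in (b).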
2.2
a) The population consists of the residents of a particular city.
b) The proportion of Australians who consume some kind of multivitamin on a daily basis.
c) All the residents of that city.
d) Shoppers at the local supermarket.
e) Randomisation was not used in the sample, since only shoppers from the local
supermarket were surveyed.
f) People who do not shop at supermarkets, people who shop at other supermarkets, and
people who cannot afford to shop at a supermarket would be excluded from the survey.
g) With regard to multivitamin consumption, it is quite possible that the consumption
pattern of the shoppers visiting the supermarket differs from that of people who do not
visit the supermarket.
2.3
a) The given study would be classified as experimental and not observational, since the
treatment is controlled by the researchers. This is apparent since 50% of the volunteers who
participated were given the trial drug while the remainder were given a placebo.
b) The subjects studied are the people suffering from breast cancer.
c) The factor is the trial drug, with two levels: drug received and placebo received.
d) There are two treatments in the given experiment, namely receiving the trial drug and
receiving the placebo.
e) The response variable is the healing of the patients suffering from breast cancer.
f) The design is completely randomised, provided that the volunteers were allocated at random
to the trial drug group and the placebo group.
g) The experiment is double blind since neither the researchers nor the volunteers are aware of
whether a given volunteer is receiving the trial drug or the placebo.
h) The given experiment would allow us to understand the healing capacity of the trial drug with
regard to breast cancer patients.
2.4
a) For the given context, statistically significant means that, at the given level of
significance, the difference in fitness level between people who dance regularly and those who
do not dance at all is too large to be attributed to chance alone.
b) The given study is observational since the people are selected according to whether they dance
regularly or do not dance. No treatment has been given to the participants; instead, their
fitness levels are observed.
c) It is important to note that the study does not show that dancing leads to fitness. It has
only been observed that people who dance on a regular basis are, on average, fitter than people
who do not dance at all. It is quite possible for the fitness level of a person who does not
dance regularly to be high, but the proportion of such people would be smaller.
d) Yes, an experiment can be designed to investigate a possible link between dancing and
fitness. In such a study, there would be two treatments, namely regular dancing and no dancing
at all.
2.5
Probability calculation
Let us define the events
C = Chicken
B = Beef
A = Bacon
F = Fish
Now,
(a) Probabilities when John buys one meat type only
(i) Buys chicken
(ii) Buys chicken or bacon
(iii) Buys neither chicken nor bacon
(b) Probabilities when John buys three meat types at random
(i) Buys all chicken
(ii) Buys no chicken
(iii) Buys at least one fish
(iv) Has not bought three fish
2.6
Proportion of households with at least one dog: P(D) = 36%
Proportion of households with at least one cat: P(C) = 23%
Proportion of households with both animals
(a) Probability that a randomly selected household has a dog but not a cat
(b) Probability that a randomly selected household has neither animal
(c) Probability that a randomly selected household has a cat or a dog
(d) Probability that a randomly selected household has a cat, given that it has a dog
(e) Owning dogs and cats are not mutually exclusive because some households have both animals
(f) Owning dogs and cats would be independent events if
$P(D \cap C) = P(D) \times P(C) = 0.36 \times 0.23 = 0.0828$
(g) As $P(D \cap C) \neq P(D) \times P(C)$, owning dogs and cats would not be considered
independent events.
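The rules used in parts (a) to (g) can be sketched as a small Python helper. The proportion of households with both animals is not legible in this copy, so `P_BOTH_PLACEHOLDER` below is a purely hypothetical value chosen for illustration and is not the figure from the task; the function name is also an assumption.

```python
import math


def pet_probabilities(p_dog, p_cat, p_both):
    """Apply the basic probability rules to two events D (dog) and C (cat)."""
    p_dog_or_cat = p_dog + p_cat - p_both  # general addition rule
    return {
        "dog_not_cat": p_dog - p_both,
        "neither": 1 - p_dog_or_cat,  # complement of "dog or cat"
        "dog_or_cat": p_dog_or_cat,
        "cat_given_dog": p_both / p_dog,  # conditional probability
        "mutually_exclusive": p_both == 0,
        "independent": math.isclose(p_both, p_dog * p_cat),
    }


# Hypothetical value for P(D and C); replace it with the figure given in the task.
P_BOTH_PLACEHOLDER = 0.10

result = pet_probabilities(p_dog=0.36, p_cat=0.23, p_both=P_BOTH_PLACEHOLDER)
for name, value in result.items():
    print(name, value)
```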
2.7
Probability distribution is highlighted below.
a) Mean number of tomatoes on a branch
b) Standard deviation of tomatoes on a branch
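For reference, the mean and standard deviation of a discrete probability distribution are given by the standard formulas below. The symbol $X$ for the number of tomatoes on a branch is an assumption, since the probability table itself is not reproduced in this copy.

```latex
\mu = E(X) = \sum_{x} x \, P(X = x)
\qquad
\sigma = \sqrt{\sum_{x} (x - \mu)^{2} \, P(X = x)}
```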
2.8
Proportion of learner drivers who pass the driving test: p = 80% = 0.80
(a) Let x represent the total number of learner drivers who pass the test on the first
attempt when there are 6 learner drivers.
Binomial distribution
(i) Probability of a successful driving test outcome for all 6 drivers
(ii) Probability of a successful driving test outcome for 3 or 4 drivers
(iii) Probability of a successful driving test outcome for at least 2 drivers
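The three binomial probabilities in part (a) can be checked with a short Python sketch. It assumes a sample of 6 learner drivers, as implied by "all 6 drivers" in (i); the helper name `binomial_pmf` and the constant names are assumptions made for illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


N_DRIVERS = 6  # assumption: sample size implied by part (i)
P_PASS = 0.80  # probability of passing on the first attempt

# (i) all 6 drivers pass
p_all = binomial_pmf(N_DRIVERS, N_DRIVERS, P_PASS)

# (ii) exactly 3 or exactly 4 drivers pass
p_three_or_four = binomial_pmf(3, N_DRIVERS, P_PASS) + binomial_pmf(4, N_DRIVERS, P_PASS)

# (iii) at least 2 drivers pass, via the complement of "0 or 1 pass"
p_at_least_two = 1 - binomial_pmf(0, N_DRIVERS, P_PASS) - binomial_pmf(1, N_DRIVERS, P_PASS)

print(f"P(all pass)      = {p_all:.4f}")
print(f"P(3 or 4 pass)   = {p_three_or_four:.4f}")
print(f"P(at least 2)    = {p_at_least_two:.4f}")
```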
(b) If 150 total learner drivers are chosen
Normal distribution
(i) Number of learners who would pass the test
Let y be the number of learner drivers, out of the 150 chosen, who pass the test on their
first attempt.
(ii) Probability that more than 125 drivers would pass the test
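Part (b) can be sketched in Python using the normal approximation to the binomial distribution with mean np and standard deviation sqrt(np(1 - p)). The variable names are assumptions, and the continuity-corrected value is included only as an optional refinement; the worked solution may or may not have used it.

```python
from math import sqrt
from statistics import NormalDist

N_CHOSEN = 150  # learner drivers chosen
P_PASS = 0.80  # probability of passing on the first attempt

# (i) expected number of learners who pass, and the spread around it
expected_passes = N_CHOSEN * P_PASS
sd_passes = sqrt(N_CHOSEN * P_PASS * (1 - P_PASS))

# (ii) normal approximation to P(more than 125 pass)
approx = NormalDist(mu=expected_passes, sigma=sd_passes)
p_more_than_125 = 1 - approx.cdf(125)

# Optional refinement (assumption): continuity correction for a discrete count
p_more_than_125_cc = 1 - approx.cdf(125.5)

print(f"Expected passes: {expected_passes:.0f}, standard deviation: {sd_passes:.3f}")
print(f"P(Y > 125) without correction: {p_more_than_125:.4f}")
print(f"P(Y > 125) with correction:    {p_more_than_125_cc:.4f}")
```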
2.9
Average height of 6-year-old boys = 115 cm
Standard deviation = 3 cm
Mean height of 20 boys of the same age is denoted by $\bar{x}$
(a) Percentage of 6-year-old boys who are taller than 117 cm
Therefore, 25.24% of 6-year-old boys are taller than 117 cm.
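(b) Sampling distribution of the mean
The sample mean $\bar{x}$ is approximately normally distributed with mean $\mu$ and standard
deviation $\sigma/\sqrt{n}$.
Where $\mu = 115$ cm, $\sigma = 3$ cm and $n = 20$, so that $\sigma/\sqrt{n} \approx 0.671$ cm.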
(c) Probability that the mean height of the class is higher than 117 cm
Hence, there is a 0.001435 probability that the mean height of the class is higher than 117 cm.
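Parts (a) and (c) can be reproduced with a brief Python sketch, assuming heights are normally distributed as the solution does. The variable names are assumptions; because the code uses unrounded z values, its results may differ slightly in the last decimal place from figures read off a z table.

```python
from math import sqrt
from statistics import NormalDist

MEAN_HEIGHT = 115  # cm, average height of 6-year-old boys
SD_HEIGHT = 3  # cm, standard deviation of individual heights
CLASS_SIZE = 20  # number of boys in the sample
THRESHOLD = 117  # cm

# (a) proportion of individual boys taller than the threshold
individual = NormalDist(mu=MEAN_HEIGHT, sigma=SD_HEIGHT)
p_boy_taller = 1 - individual.cdf(THRESHOLD)

# (b) sampling distribution of the mean: same centre, smaller spread
standard_error = SD_HEIGHT / sqrt(CLASS_SIZE)
sample_mean = NormalDist(mu=MEAN_HEIGHT, sigma=standard_error)

# (c) probability that the class mean exceeds the threshold
p_mean_taller = 1 - sample_mean.cdf(THRESHOLD)

print(f"P(one boy taller than 117 cm) = {p_boy_taller:.4f}")
print(f"Standard error of the mean    = {standard_error:.4f} cm")
print(f"P(class mean above 117 cm)    = {p_mean_taller:.6f}")
```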
2.10
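Total patients n = 38
Number of patients who reported a reduction in pain = 27
(a) 95% confidence interval
Proportion of patients $\hat{p} = 27/38 \approx 0.7105$
Let’s assume normal distribution and hence,
Standard deviation $= \sqrt{\hat{p}(1-\hat{p})/n} = \sqrt{0.7105 \times 0.2895/38} \approx 0.07357$
Z value for 95% confidence interval = 1.96
Lower level of 95% confidence interval $= 0.7105 - 1.96 \times 0.07357 \approx 0.5663$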
Upper level of 95% confidence interval $= 0.7105 + 1.96 \times 0.07357 \approx 0.8547$
Hence, the 95% confidence interval for the true proportion of sufferers of the condition
whose symptoms would be improved by the injection is [0.5663, 0.8547]. We can be 95%
confident that the true proportion of patients who experience a reduction in pain lies within
this interval.
(b) Claim: The injection would relieve pain in at least 75% of patients
The claimed proportion of 75% falls within the range of the 95% confidence interval.
Therefore, the data are consistent with the claim that the injection relieves pain in at least
75% of patients, and the claim cannot be rejected.
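The interval in part (a) and the check in part (b) can be sketched in Python. The function name `proportion_ci` is an assumption, and the code uses the exact normal quantile (about 1.96) rather than the rounded table value, so the last decimal place may differ slightly from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


lower, upper = proportion_ci(successes=27, n=38)
claim = 0.75

print(f"95% confidence interval: [{lower:.4f}, {upper:.4f}]")
print(f"Claimed proportion {claim} inside interval: {lower <= claim <= upper}")
```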
2.11
Hypothesis testing
Random sample (number of children) n = 3216
Number of children who had unknowingly played = 600
(a) Hypotheses
Null hypothesis: Rate of exposure has not increased on the 1997 figure, i.e. p = 0.18
Alternative hypothesis: Rate of exposure has increased on the 1997 figure, i.e. p > 0.18
(b) Assumptions and conditions
• The sample has been taken through random sampling from the population.
• The sample size is higher than 30.
• n*p and n*(1-p) should both be higher than 10. As n*p = 3216*0.18 = 578.88 and
n*(1-p) = 3216*(1-0.18) = 2637.12 are both higher than 10, the condition is satisfied.
• Level of significance = 5%
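(c) The z stat and p value
Population proportion p = 0.18
Sample proportion $\hat{p} = 600/3216 \approx 0.1866$
Thus,
$z = \dfrac{\hat{p} - p}{\sqrt{p(1-p)/n}} = \dfrac{0.1866 - 0.18}{\sqrt{0.18 \times 0.82/3216}} \approx 0.97$
Therefore, the value of z stat is 0.97.
The p value is calculated based on the z stat.
$P(Z > 0.97) = 0.16602$
Hence, the p value comes out to be 0.16602.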
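The z stat and p value can be checked with a short Python sketch of a one-sided, one-proportion z test. The function name is an assumption. The code keeps the unrounded z value, so its p value may differ slightly from 0.16602, which corresponds to z rounded to 0.97.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_proportion_z_test(successes, n, p0):
    """Test H0: p = p0 against H1: p > p0 using the normal approximation."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)  # computed under the null hypothesis
    z = (p_hat - p0) / standard_error
    p_value = 1 - NormalDist().cdf(z)  # upper-tail probability
    return p_hat, z, p_value


p_hat, z_stat, p_value = one_sided_proportion_z_test(successes=600, n=3216, p0=0.18)
alpha = 0.05

print(f"Sample proportion: {p_hat:.4f}")
print(f"z stat: {z_stat:.2f}")
print(f"p value: {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```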
(d) Explanation of p value and relevancy
The p value is the measure of significance of the result. When p value is lower than the level of
significance, then sufficient evidence is present to reject the null hypothesis and to accept the
alternative hypothesis.
(e) Conclusion
It can be said that p value comes out to be higher than the level of significance (0.16602 >0.05)
and thus, insufficient evidence present to reject the null hypothesis. Thus, alternative hypothesis
would not be accepted and the conclusion can be drawn that rate of exposure has not increased
on the 1997 figure.
(f) 95% confidence interval of the true proportion of the children exposed to asbestos.
The z value for 95% confidence interval is 1.96.
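A minimal Python sketch of this interval is given below, using the sample proportion plus or minus 1.96 standard errors. The variable names are assumptions, and the printed bounds are computed by the code rather than taken from the original working, which is not reproduced in this copy.

```python
from math import sqrt

N_CHILDREN = 3216  # children in the random sample
N_EXPOSED = 600  # children recorded as exposed
Z_95 = 1.96  # z value for a 95% confidence interval

p_hat = N_EXPOSED / N_CHILDREN
# For an interval, the standard error is based on the sample proportion
standard_error = sqrt(p_hat * (1 - p_hat) / N_CHILDREN)

lower = p_hat - Z_95 * standard_error
upper = p_hat + Z_95 * standard_error

print(f"Sample proportion: {p_hat:.4f}")
print(f"95% confidence interval: [{lower:.4f}, {upper:.4f}]")
```

If the 1997 figure of 0.18 lies inside the printed interval, that is consistent with the conclusion in (e).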