Statistics Project
Added on 2023/03/30
AI Summary
This statistics project covers topics such as probability distribution, hypothesis testing, and A/B testing. It includes calculations and explanations for finding the theoretical probability, estimating the range of population mean, testing a claim, and conducting A/B testing. The project is suitable for students studying statistics or related subjects.
Student Name:
Instructor Name:
Course Number:
31 May 2019
Problem Statement
Comprehension
The pharmaceutical company Sun Pharma is manufacturing a new batch of painkiller drugs,
which are due for testing. Around 80,000 new products are created and need to be tested for their
time of effect (which is measured as the time taken for the drug to completely cure the pain), as
well as the quality assurance (which tells you whether the drug was able to do a satisfactory job
or not).
Question 1:
The quality assurance checks on the previous batches of drugs found that a drug is 4 times more
likely to produce a satisfactory result than not.
Given a small sample of 10 drugs, you are required to find the theoretical probability that at
most, 3 drugs are not able to do a satisfactory job.
a.) Propose the type of probability distribution that would accurately portray the above scenario,
and list out the three conditions that this distribution follows.
It follows a binomial distribution, as it satisfies the following three conditions:
i. Each trial has only two outcomes,
ii. The trials are independent of each other, and
iii. The number of trials is fixed.
b.) Calculate the required probability.
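A drug is 4 times more likely to be satisfactory than not, so the probability that a drug is not
satisfactory is π = 1/(4 + 1) = 0.2. With N = 10 drugs, the probability that at most 3 are not
satisfactory is
$$P(X \le 3) = \sum_{x=0}^{3} \frac{N!}{x!\,(N-x)!}\,\pi^{x}(1-\pi)^{N-x}$$
$$P(0) = \frac{10!}{0!\,(10-0)!}\,0.2^{0}\,(1-0.2)^{10-0} = 0.107374$$
$$P(1) = \frac{10!}{1!\,(10-1)!}\,0.2^{1}\,(1-0.2)^{10-1} = 0.268435$$
$$P(2) = \frac{10!}{2!\,(10-2)!}\,0.2^{2}\,(1-0.2)^{10-2} = 0.301990$$
$$P(3) = \frac{10!}{3!\,(10-3)!}\,0.2^{3}\,(1-0.2)^{10-3} = 0.201327$$
$$P(X \le 3) = 0.107374 + 0.268435 + 0.301990 + 0.201327 = 0.879126$$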
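The binomial sum can be cross-checked numerically. The sketch below is only an illustration: it assumes that "4 times more likely to be satisfactory than not" means the probability of an unsatisfactory drug is 1/(4 + 1) = 0.2, and the function and variable names are made up for this example.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x 'successes' in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Probability of at most k 'successes' in n independent trials."""
    return sum(binomial_pmf(x, n, p) for x in range(k + 1))


n_drugs = 10
# Assumption: odds of 4:1 in favour of a satisfactory result,
# so an unsatisfactory result has probability 1/5.
p_unsatisfactory = 1 / (4 + 1)

prob_at_most_3 = binomial_cdf(3, n_drugs, p_unsatisfactory)
print(round(prob_at_most_3, 6))  # → 0.879126
```

Here a "success" is counted as an unsatisfactory drug, because that is the event the question asks about.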
Question 2:
For the effectiveness test, a sample of 100 drugs was taken. The mean time of effect was 207
seconds, with the standard deviation coming to 65 seconds. Using this information, you are
required to estimate the range in which the population mean might lie — with a 95% confidence
level.
a.)Discuss the main methodology using which you will approach this problem. State all the
properties of the required method. Limit your answer to 150 words.
Since the sample size (n = 100) is greater than 30, we build the confidence interval from the Z
(standard normal) distribution.
Characteristics:
i. Works for data with a sample size greater than 30,
ii. Data points should be independent of each other,
iii. Assumes normally distributed data, though this matters little for large samples,
iv. Items should be randomly selected,
v. Sample sizes should be equal where possible
b.) Find the required range.
$$\text{standard error} = \frac{\sigma}{\sqrt{n}} = \frac{65}{\sqrt{100}} = \frac{65}{10} = 6.5$$
$$\text{margin of error} = 6.5 \times 1.96 = 12.74$$
$$\mu \text{ range} = 207 \pm 12.74 = (194.26,\ 219.74)$$
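The interval can be reproduced with a few lines of Python. This is a minimal sketch that treats the sample standard deviation as a stand-in for σ (reasonable for n = 100); the function name is hypothetical, and the critical value comes from the standard library's `statistics.NormalDist` rather than a table.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(mean, sd, n, confidence=0.95):
    """Two-sided z interval for a population mean."""
    # Critical value, about 1.96 for 95% confidence.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sd / sqrt(n)  # margin of error
    return mean - margin, mean + margin


low, high = z_confidence_interval(mean=207, sd=65, n=100)
print(round(low, 2), round(high, 2))  # → 194.26 219.74
```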
Question 3:
a) The painkiller drug needs to have a time of effect of at most 200 seconds to be considered
as having done a satisfactory job. Given the same sample data (size, mean, and standard
deviation) of the previous question, test the claim that the newer batch produces a
satisfactory result and passes the quality assurance test. Utilize 2 hypothesis testing
methods to make your decision. Take the significance level at 5 %. Clearly specify the
hypotheses, the calculated test statistics, and the final decision that should be made for
each method.
Answer
The following hypotheses are to be tested:
1. Null hypothesis (H0): The average time is equal to 200 seconds.
2. Alternative hypothesis (HA): The average time is greater than 200 seconds.
This can be written symbolically as follows:
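$$H_0 : \mu = 200$$
$$H_A : \mu > 200$$
The test statistic to be computed is the z score.
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{207 - 200}{65 / \sqrt{100}} = \frac{7}{6.5} = 1.0769$$
Method 1 (p-value): the computed z score is 1.0769. For this right-tailed test the p-value is
$$P(z > 1.0769) = 1 - 0.859 = 0.141$$
Method 2 (critical value): at the 5% level of significance the right-tailed critical value is
1.645, and the computed z score of 1.0769 is below it.
Decision
Since the p-value (0.141) is greater than the 5% level of significance, and equivalently the z
score does not exceed the critical value, we fail to reject the null hypothesis under both
methods. There is not enough evidence that the mean time of effect exceeds 200 seconds, hence the
newer batch is taken to produce a satisfactory result.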
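The test can be run both ways in code: by p-value and by critical value. The sketch below is illustrative only. It assumes a right-tailed test of H0: μ = 200 against HA: μ > 200, computes z = (x̄ − μ)/(σ/√n) with x̄ = 207 so that z is positive, and uses a hypothetical function name.

```python
from math import sqrt
from statistics import NormalDist


def right_tailed_z_test(sample_mean, mu0, sd, n, alpha=0.05):
    """Return the z statistic, p-value and critical value for HA: mu > mu0."""
    z = (sample_mean - mu0) / (sd / sqrt(n))
    p_value = 1 - NormalDist().cdf(z)             # area to the right of z
    z_critical = NormalDist().inv_cdf(1 - alpha)  # about 1.645 for alpha = 0.05
    return z, p_value, z_critical


z, p_value, z_critical = right_tailed_z_test(207, 200, 65, 100)

# Method 1: reject H0 if the p-value is below alpha.
reject_by_p_value = p_value < 0.05
# Method 2: reject H0 if z exceeds the critical value.
reject_by_critical_value = z > z_critical

print(round(z, 4), round(z_critical, 3))
print(reject_by_p_value, reject_by_critical_value)
```

Both flags are `False`, so neither method rejects the null hypothesis at the 5% level.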
b) You know that two types of errors can occur during hypothesis testing — namely Type-I
and Type-II errors — whose probabilities are denoted by α and β respectively. For the
current hypothesis test conditions (sample size, mean, and standard deviation), the value
of α and β come out to 0.05 and 0.45 respectively.
Now, a different sampling procedure is proposed so that when the same hypothesis test is
conducted, the values of α and β are controlled at 0.15 each. Explain under what
conditions would either method be more preferred than the other.
Answer
The two procedures trade the two error risks against each other. The current procedure has a low
probability of committing a type I error (α = 0.05) but a high probability of committing a type II
error (β = 0.45), so its power is only 1 − β = 0.55. The proposed procedure has a higher
probability of committing a type I error (α = 0.15) but a much lower probability of committing a
type II error (β = 0.15), so its power rises to 0.85 (Kimball, 2011).
Here a type I error means rejecting a batch that is actually satisfactory, while a type II error
means passing a batch whose mean time of effect actually exceeds 200 seconds. The current
procedure is therefore preferred when wrongly rejecting a good batch is the costlier mistake, for
example when discarding or reworking the batch is very expensive. The proposed procedure is
preferred when passing an unsatisfactory batch is the costlier mistake, for example when an
ineffective painkiller would harm patients or the company's reputation, so that a higher rate of
false rejections is an acceptable price (Lubin, 2012).
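One rough way to make the comparison concrete is to weigh each error probability by what that error costs. The sketch below is purely illustrative. The α and β values come from the question, but the cost figures, the prior probability that the batch is satisfactory, and all names are hypothetical assumptions, not part of the problem.

```python
# (alpha, beta) for each sampling procedure, taken from the question.
PROCEDURES = {"current": (0.05, 0.45), "proposed": (0.15, 0.15)}


def expected_error_cost(alpha, beta, cost_type1, cost_type2, prob_h0_true=0.5):
    """Expected cost of a wrong decision (hypothetical cost model).

    A type I error can only happen when H0 is true, and a type II error
    only when H0 is false, so each is weighted by that probability.
    """
    return (prob_h0_true * alpha * cost_type1
            + (1 - prob_h0_true) * beta * cost_type2)


def preferred_procedure(cost_type1, cost_type2, prob_h0_true=0.5):
    """Name of the procedure with the lower expected error cost."""
    return min(
        PROCEDURES,
        key=lambda name: expected_error_cost(
            *PROCEDURES[name], cost_type1, cost_type2, prob_h0_true
        ),
    )


# Hypothetical case 1: rejecting a good batch is 10x as costly.
print(preferred_procedure(cost_type1=10, cost_type2=1))  # → current
# Hypothetical case 2: passing a bad batch is 10x as costly.
print(preferred_procedure(cost_type1=1, cost_type2=10))  # → proposed
```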
Question 4:
Now, once the batch has passed all the quality tests and is ready to be launched in the market, the
marketing team needs to plan an effective online ad campaign for its existing subscribers. Two
taglines were proposed for the campaign, and the team is currently divided on which option to
use.
Explain why and how A/B testing can be used to decide which option is more effective. Give a
stepwise procedure for the test that needs to be conducted.
Stepwise procedure
A/B testing is well suited to comparing the two taglines, since it is a procedure for comparing
two versions of a webpage or app (here, two versions of the ad) to ascertain which one is more
effective.
1. Create two versions of the same page, with the tagline being the only thing that changes.
2. Present each tagline version to one half of your visitors, assigned at random.
3. Run both versions over the same period and record how viewers respond to each one, for example
through likes and comments; this shows which one was preferred over the other.
4. Adopt the version with the better response, and make any improvements to it that the comments
suggest.
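Before adopting a tagline it helps to check that the difference in response is larger than chance alone would produce. One common choice, not stated in the answer above, is a two-proportion z-test. The sketch below assumes the response is a yes/no outcome per viewer (for example, liked or did not like); the counts and the function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two response rates."""
    rate_a = successes_a / n_a
    rate_b = successes_b / n_b
    # Pooled rate under H0: both taglines have the same response rate.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 1000 viewers per tagline, 120 vs 150 positive responses.
z_ab, p_ab = two_proportion_z_test(120, 1000, 150, 1000)
print(round(z_ab, 2), round(p_ab, 3))
```

A small p-value suggests the taglines really differ in effectiveness; a large one suggests that more data is needed before choosing.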
References
Kimball, A., 2011. Errors of the Third Kind in Statistical Consulting. Journal of the American
Statistical Association, 52(278), p. 133–142.
Lubin, A., 2012. The Interpretation of Significant Interaction. Educational and Psychological
Measurement, 21(4), p. 807–817.