Introduction to Biostatistics: Assignment 2 Solutions


Added on  2023/06/07


Running head: INTRODUCTION TO BIOSTATISTICS 1
401077 Introduction to Biostatistics, Spring 2018
Assignment 2
Due Sunday September 23, 2018
When submitting your assignment to Turnitin you are implicitly ticking these statements:
I retain a backup file of this assignment in case the original file is lost or damaged.
I hereby certify that no part of this assignment or product has been copied from any
other student’s work or from any other source except where due acknowledgement is
made in the assignment.
I hereby certify that no part of this assignment or product has been submitted by me in
another (previous or current) assessment.
I hereby certify that no part of the assignment has been written or produced by any
other person.
I hereby certify that no part of this assignment has been made available to any other
student.
I am aware that this work will be reproduced and submitted to plagiarism detection
software for the purpose of detecting possible plagiarism. This software may retain a
copy of this assignment on its database for future plagiarism detection.
I understand that failure to uphold this declaration may result in academic proceedings
in line with the UWS Student Academic Misconduct Policy.
Your name: Mary Spandana Pudota
Your student number: 19134332

Question 1
a. Point estimate of proportion of females.
According to the data, there are 128 females and 143 males, giving 271 respondents in total. The point estimate of the proportion of females is
$\hat{p} = \frac{128}{271} \approx 0.47$
The 95% confidence interval is
$\text{CI} = \hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
$= 0.47 \pm 1.96 \sqrt{\frac{0.47 \times 0.53}{271}}$
$= 0.47 \pm 1.96 \times 0.030$
$= 0.47 \pm 0.06$, or $0.41 < p < 0.53$
The same result is obtained in R Commander:
Statistics, then Proportions, then Single-sample proportion test; select sex and click OK.
The point estimate for the proportion of females is 0.47.
The 95% confidence interval ranges from 0.4136887 to 0.5317344.
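The interval can also be checked outside R Commander. The sketch below is not part of the original solution; it assumes n = 271 is the total number of respondents (128 females plus 143 males, as in the tables for Questions 2 and 4). It computes the hand-calculated (Wald) interval and, for comparison, the Wilson score interval, which appears to be the interval behind the limits reported above since it agrees with them to about four decimal places.

```python
from math import sqrt
from statistics import NormalDist

# Assumption: 128 females out of 271 respondents in total.
females, n = 128, 271
p_hat = females / n
z = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value, about 1.96

# Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)
se = sqrt(p_hat * (1 - p_hat) / n)
wald = (p_hat - z * se, p_hat + z * se)

# Wilson score interval (no continuity correction)
denom = 1 + z**2 / n
centre = (p_hat + z**2 / (2 * n)) / denom
half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
wilson = (centre - half_width, centre + half_width)

print(f"point estimate: {p_hat:.4f}")
print(f"Wald interval:   {wald[0]:.4f} to {wald[1]:.4f}")
print(f"Wilson interval: {wilson[0]:.4f} to {wilson[1]:.4f}")
```

Both intervals round to 0.41 and 0.53, so the choice of method does not change the interpretation here.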
b. The estimated proportion of females in the sample is 0.47. The confidence interval tells
us that, given the data we have, we are 95% confident that the true proportion of
females in the population lies between 0.41 and 0.53. The point estimate lies within
the confidence interval.
c. The result in part (b) is consistent with the statement that 50% of 17-year-olds in NSW
are female, since 0.50 lies within the confidence interval of 0.41 to 0.53.
Question 2
a.
Fig 1. Histogram
Based on the histograms above, the most common range of MVPA hours for both males and
females is 0-3 hours, with a frequency of 60. The least common range is 20-25 hours for
females and 17-20 hours for males. The graphs show the distribution of the number of
hours of MVPA for each gender.
Sex      Mean      SD        IQR   Skewness  0%   25%  50%   75%   100%  MVPA:n
Female   3.857031  3.668759  3.90  2.292525  0.3  1.3  2.85  5.20  23.0  128
Male     4.546154  4.136215  5.45  1.152940  0.2  1.2  3.10  6.65  17.7  143
From the summary table above, the distributions of self-reported hours of MVPA per week for
17-year-old males and females are both positively skewed (skewness 2.29 for females and
1.15 for males). The histograms agree: both distributions are skewed to the right.
b. Hypothesis testing
As this is a hypothesis test, we follow the 5-step method to answer the question. We can use R
Commander to help with the calculations as follows:
Open R Commander
Load the survey data file
Select ‘Statistics’, then ‘Nonparametric tests’, then ‘Two-sample Wilcoxon test’
Select sex as the group variable and MVPA as the response variable
Under Options, select two-sided and normal approximation
Click OK
The output gives W = 8537 and a p-value of 0.3396.
We can now proceed with the hypothesis test using the 5-step method.
Step 1: set up hypothesis and determine level of significance:
Null hypothesis (H0): Average self-reported hours of moderate to vigorous physical
activity (MVPA) per week are equal between males and females in the population of NSW
17-year-olds.
Alternative hypothesis (H1): Average self-reported hours of moderate to vigorous physical
activity (MVPA) per week are not equal between males and females in the population of NSW
17-year-olds.
α = 0.05
Step 2: Selection of appropriate test statistics

Because the MVPA distributions for both groups are skewed, we apply the two-sample Wilcoxon
test using R Commander.
Step 3: set up decision criteria
We shall reject null hypothesis if the P value calculated by R commander is less than the α =
0.05 significance level
Step 4: compute the test statistic
The test statistic from R commander is W= 8537 and the associated p value is 0.3396
Step 5: conclusion
Since the p-value of 0.3396 is greater than 0.05, we cannot reject the null hypothesis at
the 5% significance level. There is insufficient evidence to conclude that average
self-reported hours of moderate to vigorous physical activity (MVPA) per week differ between
males and females in the population of NSW 17-year-olds.
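The mechanics of the two-sample Wilcoxon test with a two-sided normal approximation can be sketched as follows. This is an illustration only: the survey data are not reproduced here, so the two small samples `x` and `y` are hypothetical, and the function name `rank_sum_test` is my own. The sketch applies a tie correction to the variance and no continuity correction; whether R Commander's settings match this exactly is an assumption.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test using the normal approximation."""
    n1, n2 = len(x), len(y)
    # W counts the (x, y) pairs in which x exceeds y; tied pairs count as 0.5.
    w = sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
            for xi in x for yj in y)
    n = n1 + n2
    # Tie correction: sum of t^3 - t over groups of tied values in the pooled sample.
    tie_term = sum(t**3 - t for t in Counter(list(x) + list(y)).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (w - n1 * n2 / 2) / sqrt(variance)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return w, p_value

# Hypothetical weekly MVPA hours for two small groups (not the survey data).
x = [1.0, 2.5, 3.0, 4.5, 6.0]
y = [0.5, 2.0, 3.5, 5.0, 7.5, 8.0]
w_stat, p_value = rank_sum_test(x, y)
alpha = 0.05
print(f"W = {w_stat}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The final line applies the decision criterion from Step 3: the null hypothesis is rejected only when the p-value falls below α.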
Question 3
a. It is a one-sided hypothesis test, since the researcher is interested in knowing whether
emissions from aluminum smelters have decreased since the introduction of the new laws.
b. It is the Wilcoxon signed-rank test. This is because we want to compare two related
(paired) sets of measurements on a single sample to assess whether their population mean
ranks differ, and thus the Wilcoxon signed-rank test is applicable in this case (Gyorfi et al. 2013).
Question 4
a. Contingency table between license status and gender.
Counts:
License status     Female   Male
Not licensed           36     33
Learners permit        30     45
Licensed               62     65

Row percentages by sex:
License status     Female   Male   Total   Count
Not licensed         52.2   47.8     100      69
Learners permit      40.0   60.0     100      75
Licensed             48.8   51.2     100     127
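The row percentages can be recomputed directly from the counts. The following is a minimal sketch, not part of the original R Commander output; the dictionary layout and variable names are my own.

```python
# Observed counts from the contingency table: (female, male) for each license status.
counts = {
    "Not licensed": (36, 33),
    "Learners permit": (30, 45),
    "Licensed": (62, 65),
}

row_percentages = {}
for status, (female, male) in counts.items():
    total = female + male
    # Percentage of each license group that is female or male, to one decimal place.
    row_percentages[status] = (
        round(100 * female / total, 1),
        round(100 * male / total, 1),
        total,
    )

for status, (pct_f, pct_m, total) in row_percentages.items():
    print(f"{status:<16} {pct_f:5.1f} {pct_m:5.1f} {total:5d}")
```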
b. There is little evidence of an association between gender and license status in this
sample of NSW 17-year-olds. Females make up a slightly larger share of the not licensed
group (52.2%), whereas males make up the larger share of the learners permit (60.0%) and
licensed (51.2%) groups, but these differences are modest.
c. The requirements for a chi-square test are met, since the sample contains more than 45
observations and all expected counts are greater than 5.
d. Step 1: Setting up the hypotheses
Null hypothesis (H0): Driver’s license status does not differ by gender in the population of NSW 17-year-olds.
Alternative hypothesis (H1): Driver’s license status differs by gender in the population of NSW 17-year-olds.
Significance level α = 0.05
Step 2: Selection of appropriate test statistics
Because both variables are categorical, we apply the chi-square test of independence.
Step 3: set up decision criteria:
We reject the null hypothesis if the computed P-value is less than 0.05
Step 4: Computation of the test statistics in R
Row percentages by sex:
License status     Female   Male   Total   Count
Not licensed         52.2   47.8     100      69
Learners permit      40.0   60.0     100      75
Licensed             48.8   51.2     100     127

X-squared = 2.3783, df = 2, p-value = 0.3045

Expected counts:
License status     Female      Male
Not licensed       32.59041    36.40959
Learners permit    35.42435    39.57565
Licensed           59.98524    67.01476
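The expected counts and the test statistic can be reproduced from the observed counts in part (a). The sketch below is an independent check, not the R Commander code. It uses the fact that for 2 degrees of freedom the chi-square upper-tail probability has the closed form exp(-x/2), so no statistics library is needed; for other table sizes a library routine would be required.

```python
from math import exp

# Observed counts: rows are license status, columns are (female, male).
observed = [
    [36, 33],  # Not licensed
    [30, 45],  # Learners permit
    [62, 65],  # Licensed
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_squared = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# The closed form below is valid only when df == 2.
assert df == 2
p_value = exp(-chi_squared / 2)

print(f"X-squared = {chi_squared:.4f}, df = {df}, p-value = {p_value:.4f}")
print(f"smallest expected count = {min(min(row) for row in expected):.2f}")
```

The smallest expected count also confirms the requirement in part (c) that all expected counts exceed 5.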
Step 5: Conclusion
As the computed p-value of 0.3045 is greater than the 0.05 significance level, we do not
have enough evidence to reject the null hypothesis. There is insufficient statistical
evidence to conclude that driver’s license status differs by gender in the population of
NSW 17-year-olds.
Question 5

a. Different studies have different objectives and different target groups, and therefore
require different sample sizes.
b. Using the online calculator, the following inputs are applied:
The required margin of error is E = 0.05
The estimated standard deviation of the difference is σ = 3.0
To produce 95% confidence, we use z = 1.96
Power (1-β) = 0.90
True mean (μ) = 2
Null hypothesis mean = 1.5
The required sample size is 755 girls and 755 boys to achieve 90% power under the
conditions given above.
c. A sample size of 40 is small. This would make the margin of error large and reduce the
precision of the estimates, making the results less reliable.
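As a rough cross-check of part (b), the textbook normal-approximation formula for comparing two means can be evaluated directly. The sketch makes several assumptions that the solution does not state explicitly: two independent groups of equal size, a common standard deviation of 3.0, a difference to detect of 0.5 hours (2 minus 1.5), and a two-sided α of 0.05. The online calculator may use a slightly different formula or rounding, so the result lands near, but not necessarily exactly on, the 755 per group quoted above.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(sigma, delta, alpha=0.05, power=0.90):
    """Sample size per group: n = 2 * (sigma * (z_alpha/2 + z_beta) / delta)^2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    return ceil(2 * (sigma * (z_alpha + z_beta) / delta) ** 2)

# Assumed inputs: SD of 3.0 and a difference of 2 - 1.5 = 0.5 hours.
required_n = n_per_group(sigma=3.0, delta=2.0 - 1.5)
print(f"required sample size per group: {required_n}")
```

Halving the difference to detect roughly quadruples the required sample size, which is one reason studies with different objectives need different sample sizes.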
Reference
Gyorfi Laszlo et al. (2013). A Distribution-Free Theory of Nonparametric Methods. Springer.