Comparison of Myopia Rates among Young Adult Males and Females
Added on 2019/11/19
AI Summary
The study investigates whether there is a significant difference in the proportion of young adult males and females with myopia. The results show that the proportion of female participants (0.5806) is slightly higher than that of male participants (0.5767), but the difference is not statistically significant (p-value = 0.9404). Additionally, the study examines whether there is an association between myopia status and highest education level achieved in young adult Australians. The results show a significant association between the two variables (Chi-squared test p-value < 0.05).
Running Head: INTRODUCTION TO BIOSTATISTICS 1
Introduction to Biostatistics
Course code: 401077
Name
Institution
Instructor
Spring 2017
Date
Question 1 (10 marks)
Research question: Does the average time spent in moderate to vigorous physical activity
(MVPA) differ by gender in young adult Australians?
Use the assignment data set assigned to you: Variables to analyse: ‘MVPA’ and ‘sex’.
Note: Each student will get different answers as the data sets differ.
a) Using R Commander draw histograms of MVPA by sex. Add reasonable axis labels.
(1 mark)
Solution
Note that MVPA has a strong positive skew. Possible responses include:
i/ Use a parametric approach (as the sample size is large enough for the Central
Limit Theorem to apply) or
ii/ Use a non-parametric approach.
b) Address the research question applying option i/ above. Please use R Commander to
do all calculations but format your answer following the 5 step method. (6 marks)
Solution
STATE: We will test the claim that the average time spent in moderate to
vigorous physical activity (MVPA) differs by gender in young adult Australians.
FORMULATE: We will test the following hypotheses at the 5% significance level.
An independent two-sample t-test (Welch, unequal variances) will be used.
Difference in means: Male – Female = μd
H0 : μd = 0
H1 : μd ≠ 0
α = 0.05
SOLVE: We first check the requirements. Assume that participants were randomly
selected. The sample is large (n = 349 > 30), so by the Central Limit Theorem the
sample means are approximately normally distributed despite the skew in MVPA, and a
parametric test was performed.
We performed an independent t-test.
The following R output has been obtained:
Table 1: Independent T-Test and CI: Male, Female
> t.test(MVPA~sex, alternative='two.sided', conf.level=.95, var.equal=FALSE,
+ data=shortsight)
Welch Two Sample t-test
data: MVPA by sex
t = 1.9738, df = 273.492, p-value = 0.04941
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.00227584 1.74477212
sample estimates:
mean in group male mean in group female
          4.585890             3.712366
DECISION:
From the output in Table 1, the t-test statistic is 1.9738; under H0 it follows a t
distribution with approximately 273.49 degrees of freedom (Welch approximation). The
corresponding P-value is 0.04941.
Since the P-value = 0.04941 < 0.05, H0 is rejected.
CONCLUDE: At the 5% significance level, there is enough statistical evidence to
conclude that the average time spent in moderate to vigorous physical activity
(MVPA) differs by gender in young adult Australians. The average for males (M =
4.59) is higher than that for females (M = 3.71).
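The decision in part b) rests on the P-value and confidence interval in the R output. As a rough check, both can be recomputed from the reported statistic, degrees of freedom and group means. The sketch below uses only the figures printed in Table 1; the variable names are illustrative, and the standard error is backed out from the reported t value rather than taken from the raw data.

```r
# Values copied from the Welch t-test output in Table 1.
t_stat      <- 1.9738
df          <- 273.492
mean_male   <- 4.585890
mean_female <- 3.712366

# Two-sided P-value: probability of a statistic at least this extreme under H0.
p_value <- 2 * pt(-abs(t_stat), df)

# Standard error implied by the reported statistic: SE = difference / t.
diff_means <- mean_male - mean_female
se <- diff_means / t_stat

# 95% confidence interval for the difference in means (male - female).
ci <- diff_means + c(-1, 1) * qt(0.975, df) * se

print(round(p_value, 5))
print(round(ci, 4))
```

Because the interval lies just above 0 and the P-value sits just under 0.05, the parametric result is borderline, which is worth keeping in mind when comparing it with the non-parametric approach in part c).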
c) Address the research question applying option ii/ above. Please use R Commander to
do all calculations but format your answer following the 5 step method. (3 marks)
Solution
STATE: We will test the claim that the time spent in moderate to
vigorous physical activity (MVPA) differs by gender in young adult Australians.
FORMULATE: We will test the following hypotheses at the 5% significance level.
A two-sample Wilcoxon rank sum test will be used.
H0 : the location shift between the male and female MVPA distributions is 0
H1 : the location shift between the male and female MVPA distributions is not 0
α = 0.05
SOLVE: We first check the requirements. Assume that participants were randomly
selected. MVPA has a strong positive skew, so the normality assumption is not met and
a non-parametric test was performed.
We performed a two-sample Wilcoxon test. The following R output has been obtained:
Table 2: Wilcoxon test: Male, Female
> with(shortsight, tapply(MVPA, sex, median, na.rm=TRUE))
  male female
   3.1    3.0
> wilcox.test(MVPA ~ sex, alternative="two.sided",
+ data=shortsight)
Wilcoxon rank sum test with continuity correction
data: MVPA by sex
W = 16155, p-value = 0.2896
alternative hypothesis: true location shift is not equal to 0
DECISION:
From the output in Table 2, the Wilcoxon statistic is W = 16155; the corresponding
P-value is 0.2896. Since the P-value = 0.2896 > 0.05, H0 cannot be rejected.
CONCLUDE: At the 5% significance level, there is not enough statistical evidence to
conclude that the time spent in moderate to vigorous physical activity
(MVPA) differs by gender in young adult Australians. The median for males (Mdn =
3.1) is not significantly different from that for females (Mdn = 3.0).
Question 2 (3 marks)
Research question: On average, how much heavier is the first born twin than the second born
twin among twins born at full term through vaginal delivery in Australia?
The following table shows birthweight (in grams) of a random sample of 10 Australian sets
of twins born at full term through vaginal delivery.
ID of mother         Birthweight of first   Birthweight of second   How much heavier the first
                     born twin (grams)      born twin (grams)       born is than the second (grams)
1                    2018                   2843                    -825
2                    3217                   2476                    741
3                    2204                   2861                    -657
4                    1166                   2300                    -1134
5                    2715                   2582                    133
6                    2530                   1886                    644
7                    1802                   2004                    -202
8                    2913                   2416                    497
9                    1917                   2399                    -482
10                   1202                   1996                    -794
Sample size          10                     10                      10
Mean                 2168.4                 2376.3                  -207.9
Standard deviation   686.7                  339.4                   674.8
The following table shows critical values of the t-distribution to be used when calculating a
95% confidence interval
df= 6 7 8 9 10 11 12 13 14
t= 2.447 2.365 2.306 2.262 2.228 2.201 2.179 2.160 2.145
df= 15 16 17 18 19 20 21 22 23
t= 2.131 2.120 2.110 2.101 2.093 2.086 2.080 2.074 2.069
Use the sample size, mean, standard deviation and t-value provided to calculate a 95%
confidence interval for the mean difference in birthweights in Australian twins. Please
assume that the data are normally distributed or that the conditions of the Central Limit
Theorem have been met. Write your answer to the research question in a sentence. This is a
manual calculation – do not use R Commander – and you need to show your working to get any
marks.
When writing equations you are welcome to use the following simplifications:
x̄ can be written xbar
± can be written +-
A₁ can be written A_1
a fraction with a over b can be written a/b
√a can be written sqrt(a)
Solution
The two birthweights in each row come from the same mother, so the data are paired
and the interval is based on the column of differences (first born – second born).
From the table: n = 10, dbar = -207.9 grams, s_d = 674.8 grams.
The degrees of freedom (df) = n - 1 = 10 - 1 = 9. From the t-table, t = 2.262.
The 95% confidence interval for the mean difference in birthweights is:
dbar +- t × s_d/sqrt(n)
Substituting:
-207.9 +- 2.262 × 674.8/sqrt(10)
-207.9 +- 2.262 × 213.39
Then simplifying further:
-207.9 +- 482.69
So, the 95% confidence interval for the mean difference is (-690.59, 274.79) grams.
Therefore, in this sample the first born twin is on average 207.9 grams lighter than
the second born twin. We are 95% confident that, among twins born at full term
through vaginal delivery in Australia, the mean difference lies between the first born
being 690.6 grams lighter and 274.8 grams heavier than the second born. Because the
interval includes 0, the data do not show that either twin is heavier on average.
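Each row of the birthweight table is one set of twins, so the difference column can be analysed directly as paired data. The sketch below is a minimal base R check of a paired 95% confidence interval, using the ten pairs of birthweights from the table. The object names are illustrative, and `qt()` stands in for the printed t-table, so the last decimals may differ slightly from a hand calculation.

```r
# Birthweights (grams) copied from the table, one entry per mother.
first_born  <- c(2018, 3217, 2204, 1166, 2715, 2530, 1802, 2913, 1917, 1202)
second_born <- c(2843, 2476, 2861, 2300, 2582, 1886, 2004, 2416, 2399, 1996)

# Paired analysis: work with the within-pair differences.
d     <- first_born - second_born
n     <- length(d)
d_bar <- mean(d)   # mean difference
s_d   <- sd(d)     # standard deviation of the differences

# Critical value for a 95% interval with n - 1 degrees of freedom.
t_crit <- qt(0.975, df = n - 1)

# dbar +- t * s_d / sqrt(n)
ci <- d_bar + c(-1, 1) * t_crit * s_d / sqrt(n)

print(c(mean_difference = d_bar, sd_difference = s_d))
print(round(ci, 1))
```

The same interval is returned by `t.test(first_born, second_born, paired = TRUE)$conf.int`, which can serve as a second check.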
Question 3 (4 marks)
Research question: How different is the proportion of people with myopia between young
adult females and young adult males in Australia?
Use the assignment data set assigned to you: Variables to analyse: ‘myopia’ and ‘sex’
Note: Each student will get different answers as the data sets differ.
a) Using R Commander, calculate the 95% confidence interval for the difference in
proportion of young adult males and young adult females with myopia. (1 mark)
Solution
The 95% confidence interval for the difference in the proportion
of young adult males and young adult females with myopia
(male – female) is [-0.1078, 0.0999].
b) Carefully write, in words, the answer to the research
question. Be sure to identify which group has the
higher rate of myopia. (2 marks).
Solution
The proportion of young adult females with myopia (0.5806) is slightly
higher than the proportion of young adult males with myopia (0.5767).
However, the 95% confidence interval for the difference includes 0, and
there is no significant difference between the proportions of the two
groups (p = 0.9404).
c) Have the assumptions of this confidence interval
been met? Explain why or why not. (1 mark)
Solution
> library(abind, pos=15)
> local({ .Table <- xtabs(~sex+myopia, data=shortsight)
+ cat("\nPercentage table:\n")
+ print(rowPercents(.Table))
+ prop.test(.Table, alternative='two.sided', conf.level=.95, correct=FALSE)
+ })
Percentage table:
      myopia
sex    myopia normal Total Count
  male   57.7   42.3   100   163
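Yes, the assumptions of this confidence interval have been met. The participants are
assumed to be a random sample, and each group is large enough for the normal
approximation: np ≥ 10 and n(1 − p) ≥ 10. For males, np = 163 × 0.577 ≈ 94 ≥ 10 and
n(1 − p) = 163 × 0.423 ≈ 69 ≥ 10; for females (n = 349 − 163 = 186),
np = 186 × 0.5806 ≈ 108 ≥ 10 and n(1 − p) ≈ 78 ≥ 10.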
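The interval and the sample-size check can be reproduced without the raw data once the group counts are known. The sketch below assumes counts reconstructed from the reported figures: 163 males of whom 57.7% (94) have myopia, and 349 − 163 = 186 females of whom a proportion of 0.5806 (108) have myopia. These counts are inferred rather than read from the data set, and the object names are illustrative.

```r
# Reconstructed counts (an assumption, derived from the reported percentages):
# males:   163 participants, 57.7% with myopia          -> 94
# females: 349 - 163 = 186 participants, 0.5806 myopic  -> 108
myopia_counts <- c(male = 94, female = 108)
group_sizes   <- c(male = 163, female = 186)

# Large-sample condition: successes and failures in each group should be >= 10.
cell_counts <- c(myopia_counts, group_sizes - myopia_counts)
print(cell_counts)

# Two-sample test of proportions without continuity correction,
# matching the options used in the R Commander call.
res <- prop.test(myopia_counts, group_sizes,
                 alternative = "two.sided", conf.level = 0.95, correct = FALSE)

print(res$estimate)   # sample proportions, male then female
print(res$conf.int)   # interval for (male - female)
print(res$p.value)
```

With the groups entered in this order, the interval is for the male proportion minus the female proportion, which is why a higher female rate appears as a negative point estimate.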
Question 4 (8 marks)
Research question: Does the proportion of young adults with myopia differ by highest
education level achieved in young adult Australians?
Use the assignment data set assigned to you: Variables to analyse: ‘myopia’ and ‘educ’
Note: Each student will get different answers as the data sets differ.
a) Show the relationship between myopia status and highest education level achieved
using a two way contingency table. Include either row or column percentages. Obtain
the results using R Commander but then type and label the table yourself with
appropriate description and headings. An R Commander screenshot will not be
accepted. (1 mark)
Solution
Education level        Myopia               Normal
                       Count    Percent     Count    Percent
Less                   38       18.8%       43       29.3%
Completed Secondary    70       34.7%       57       38.8%
Completed Tertiary     94       46.5%       47       32.0%
Total                  202      100.0%      147      100.0%
b) Present the expected frequencies for the above table if the null hypothesis were true.
Obtain the results using R Commander but then type and label the table yourself with
appropriate description and headings. An R Commander screenshot will not be
accepted. (1 mark)
Solution
Expected Counts
Education level        Myopia     Normal
Less                   46.8825    34.1175
Completed Secondary    73.5072    53.4928
Completed Tertiary     81.6103    59.3897
c) Are the requirements for a Chi-square (χ²) test of independence met? Explain why or
why not. (1 mark)
Solution
Yes, the requirements of the Chi-Square test are met. The observations are
independent of each other (each participant contributes to exactly one cell of the
table) and no expected count is below 5 (the smallest is 34.12).
d) Irrespective of your answer in part c) address the research question using a Chi-square
test on the provided data. Please use R Commander for all calculations but format
your answer following the 5 step method. (5 marks)
Solution
STATE: We will test the claim that there is an association between myopia
status and highest education level.
FORMULATE: We will test the following hypotheses at the 5% significance level.
A Chi-Square test of independence will be used.
H0 : There is no association between myopia and education level
H1 : There is an association between myopia and education level
α = 0.05
SOLVE: We first check the requirements. All expected counts are greater than 5 and
the observations are independent of each other.
We performed a Chi-Square test of independence.
The following R output has been obtained:
Pearson's Chi-squared test
data: .Table
X-squared = 8.8584, df = 2, p-value = 0.01192
DECISION:
From the above output, the Chi-Square statistic is 8.8584; under H0 it follows a
Chi-Square distribution with 2 degrees of freedom. The corresponding P-value is
0.01192. Since the P-value = 0.01192 < 0.05, H0 is rejected.
CONCLUDE: At the 5% significance level, there is enough statistical evidence to
conclude that there is an association (relationship) between myopia status and
highest education level.
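The expected counts in part b) and the test in part d) follow directly from the observed counts in the contingency table. The sketch below is a minimal base R reproduction that enters the six observed counts by hand; the object and dimension names are illustrative and do not come from the assignment data set.

```r
# Observed counts from the contingency table in part a).
# matrix() fills column by column: Myopia first, then Normal.
observed <- matrix(
  c(38, 70, 94,    # Myopia: Less, Completed Secondary, Completed Tertiary
    43, 57, 47),   # Normal: same row order
  nrow = 3,
  dimnames = list(
    educ   = c("Less", "Completed Secondary", "Completed Tertiary"),
    myopia = c("Myopia", "Normal")
  )
)

# Pearson Chi-squared test of independence (no continuity correction
# is applied for tables larger than 2 x 2).
res <- chisq.test(observed)

# Expected counts under H0: row total * column total / grand total.
print(round(res$expected, 4))

# Requirement check: every expected count should be at least 5.
print(all(res$expected >= 5))

print(res)
```

The degrees of freedom are (rows − 1) × (columns − 1) = (3 − 1) × (2 − 1) = 2, which is the value R reports.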
Question 5 (5 marks)
Suppose the natural sleep cycle for Australians is normally distributed with a mean length of
8 hours and a standard deviation of 0.5 hours. Suppose a researcher wishes to test the
hypothesis that the natural sleep cycle for Australian men is longer than for Australian
women. The minimum difference in mean sleep cycle length which they are interested in
detecting is 8.1 hours for males against 7.9 hours for females.
a) What is the minimum sample size required to detect a difference of this size with
α = 0.05 and power = 0.90 (β = 0.10)? Present your answer as a sentence which
summarises the required sample size to achieve what power subject to what
conditions. (3 marks)
Solution
A minimum sample size of 263 is required to detect a difference of 0.2 hours in mean
sleep cycle length between males and females (8.1 versus 7.9 hours, standard
deviation 0.5 hours) with a power of 0.90 at a significance level of 0.05.
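The report does not show how the figure of 263 was obtained. One calculation that gives a value of this size is the normal-approximation formula for comparing two means, n per group = 2 × (z_α + z_β)² × σ² / δ², with the result counted across both groups. The sketch below makes two assumptions that are not stated in the answer: a two-sided α and a total (not per-group) sample size. The variable names are illustrative.

```r
alpha <- 0.05        # significance level
power <- 0.90        # 1 - beta
sigma <- 0.5         # standard deviation of sleep cycle length (hours)
delta <- 8.1 - 7.9   # minimum difference to detect (hours)

# Assumption: two-sided alpha. For a one-sided test use qnorm(1 - alpha).
z_alpha <- qnorm(1 - alpha / 2)
z_beta  <- qnorm(power)

# Normal-approximation sample size for comparing two independent means.
n_per_group <- 2 * (z_alpha + z_beta)^2 * sigma^2 / delta^2

# Assumption: the reported figure counts both groups together.
n_total <- 2 * n_per_group

print(n_per_group)
print(ceiling(n_total))
```

Rounding each group up separately gives 132 males and 132 females. A one-sided α, which matches the directional hypothesis in the question, would need fewer participants, and `power.t.test()` in base R gives a slightly larger answer because it uses the t distribution.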
b) Suppose the researcher could not afford such a large sample size. Suggest two
changes which they could make to their research question or study design which
would reduce the required sample size. (2 marks)
Solution
The researcher could make either of the following changes:
Reduce the power from 0.9 to, say, 0.8.
Increase the minimum difference in mean sleep cycle length to be detected.
Either change will reduce the required sample size.