Statistics in Practice Project Assignment - May 2019, STAT 193
Added on 2022/11/28
Homework Assignment

STATISTICS IN PRACTICE
STUDENT ID:
[Pick the date]

Question 1
Variable Name    Type
Gender           Categorical Nominal
Age              Numerical Discrete
Ethnicity        Categorical Nominal
Marital          Categorical Nominal
Qualification    Categorical Nominal
PostSchool       Categorical Nominal
Hours            Numerical Discrete
Income           Numerical Discrete
Question 2
(a) Histogram of weekly income

[Figure: Histogram : Weekly Income. X-axis: Weekly Income ($), in classes 11 to 233, 233 to 456,
456 to 678, 678 to 900, 900 to 1122, 1122 to 1345, 1345 to 1567, 1567 to 1789. Y-axis: Frequency,
scaled from 0 to 60.]
The above histogram shows a right skew in the weekly income data: the tail to the right of the
mean is longer than the tail to the left. Because the distribution is asymmetric, the data
cannot be treated as normally distributed. The weekly income of most individuals is
concentrated in the first few classes, but the sample also includes a small number of very high
weekly incomes, and these values introduce the right skew.
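The skew read off the histogram can also be checked numerically. The sketch below is only an illustration: the income values are hypothetical (they are not the assignment data), and the function name sample_skewness is an assumption introduced here. It computes the moment-based skewness coefficient, which is positive for a right-skewed sample, and compares the mean with the median.

```python
from statistics import mean, median


def sample_skewness(values):
    """Moment-based skewness: third central moment / (second central moment ** 1.5)."""
    n = len(values)
    centre = mean(values)
    m2 = sum((v - centre) ** 2 for v in values) / n
    m3 = sum((v - centre) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical weekly incomes ($); NOT the assignment data set.
incomes = [120, 250, 310, 400, 450, 520, 600, 700, 1200, 1750]

skew = sample_skewness(incomes)
print("skewness:", round(skew, 3))
print("mean:", mean(incomes), "median:", median(incomes))
```

A positive coefficient together with a mean above the median is the numerical counterpart of the longer right tail described above.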
(b) Point estimates and 99% confidence interval
Point estimates
99% confidence interval

(c) The appropriate general distribution of the variable X would be Student's t, with mean =
30.68 and standard deviation = 8.68.
This distribution is suitable because the data are skewed, so a normal distribution is not
appropriate for describing the continuous variable X.
(d) The relevant hypothesis test evaluates the claim that the mean weekly income in
Australia in 2007 was NZD 986. The test has been performed in Excel and the requisite
output is indicated below. Because the population standard deviation is not known, the
relevant test statistic is t.
t value = (Sample mean - Hypothesised mean) / Standard error
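The formula above can be sketched in a few lines of Python. This is an illustration only: it assumes that the sample figures quoted in part (e) (mean $667, standard deviation $237.50, n = 200) are the ones behind the Excel output, which the surviving text does not confirm. The function name one_sample_t is hypothetical.

```python
import math


def one_sample_t(sample_mean, hypothesised_mean, sample_sd, n):
    """t = (sample mean - hypothesised mean) / (sample SD / sqrt(n))."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesised_mean) / standard_error


# Assumed inputs, taken from part (e) of this solution (not from the Excel output).
t_value = one_sample_t(sample_mean=667, hypothesised_mean=986, sample_sd=237.50, n=200)
degrees_of_freedom = 200 - 1

print("t value:", round(t_value, 2))
print("degrees of freedom:", degrees_of_freedom)
```

Under these assumed inputs the t value is very far from zero, which is in line with a p value that rounds to 0.00.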

The relevant formula view for the above output is shown below.
As the p value (0.00) is lower than the significance level (0.05), the null hypothesis is
rejected in favour of the alternative hypothesis. This implies that the average weekly wage in
Australia in 2007 is significantly different from NZD 986.

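(e) The z score corresponding to $710 must first be computed. Because the standard deviation
is divided by the square root of the sample size, the probability refers to the sample mean.
Standard deviation = $237.50, Sample mean = $667, Sample size = 200
Z score = (710 - 667)/(237.50/200^0.5) = 2.56
P(sample mean > 710) = 1 - P(sample mean ≤ 710) = 1 - NORMSDIST(2.56) = 1 - 0.9948 = 0.0052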
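The same calculation can be reproduced outside Excel. The sketch below uses NormalDist from the Python standard library as a stand-in for NORMSDIST; the variable names are illustrative. It works with the unrounded z score, so the final digits may differ slightly from the hand calculation.

```python
import math
from statistics import NormalDist

# Figures quoted in part (e).
sample_mean = 667.0
sample_sd = 237.50
n = 200
threshold = 710.0

# Standard error of the mean, then the z score of the threshold.
standard_error = sample_sd / math.sqrt(n)
z = (threshold - sample_mean) / standard_error

# Upper-tail probability under the standard normal distribution.
upper_tail = 1 - NormalDist().cdf(z)

print("z score:", round(z, 2))
print("upper-tail probability:", round(upper_tail, 4))
```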
(f) To compare relative standing, the z score is found for the randomly selected
New Zealander and for the randomly selected Australian.
Z score for randomly selected New Zealander = (850 - 667)/237.50 = 0.77
Z score for randomly selected Australian = (950 - 986)/245.70 = -0.147
Comparing the above values, the randomly selected New Zealander has the higher standing
relative to the respective population, as the corresponding standardised score is greater
than the score for the randomly selected Australian.
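The comparison in part (f) amounts to two standardisations, which can be sketched as follows. The helper name z_score is illustrative; the figures are the ones quoted above.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd


# Figures quoted in part (f).
z_new_zealander = z_score(850, 667, 237.50)
z_australian = z_score(950, 986, 245.70)

print("New Zealander:", round(z_new_zealander, 2))
print("Australian:", round(z_australian, 3))
print("New Zealander has higher relative standing:", z_new_zealander > z_australian)
```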
Question 3
(a) Since the proportions of marital status differ markedly across the different ethnicities,
it would be fair to conclude that there does seem to be an association between the two
variables.
(b) The requisite hypotheses are as stated below.
H0: Marital status and Ethnicity are independent of each other
Ha: Marital status and Ethnicity are not independent of each other
(c) The requisite summary table is shown below.

If there is no association between marital status and ethnicity, then the expected number of
Pacific people who are married = 10*64/200 = 3.2
(d) (i) Percentage of people who never married and are Maori = 13.19%
(ii) Percentage of surveyed Maori people who have never married = (12/22) = 54.55%
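The expected count in part (c) and the conditional percentage in part (d)(ii) follow from simple ratios of table totals. A minimal sketch, using only the totals quoted above (10, 64, 200, 12 and 22) and a hypothetical helper name expected_count:

```python
def expected_count(row_total, column_total, grand_total):
    """Expected cell count under independence: row total * column total / grand total."""
    return row_total * column_total / grand_total


# Totals quoted in part (c): 10 and 64 are the two marginal totals, 200 the sample size.
expected_pacific_married = expected_count(10, 64, 200)

# Counts quoted in part (d)(ii): 12 never-married Maori respondents out of 22 Maori respondents.
pct_maori_never_married = 12 / 22 * 100

print("expected count:", expected_pacific_married)
print("percentage:", round(pct_maori_never_married, 2))
```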

(e) The hypothesis test is based on the following output. The relevant summary of the key
data from the output is given below.
Since the p value is lower than the assumed level of significance (5%), the available
evidence is sufficient to reject the null hypothesis in favour of the alternative
hypothesis. Hence, it can be concluded that marital status and ethnicity are dependent on one
another.
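The mechanics of the chi-square test of independence can be sketched as below. The 2x2 table of counts is hypothetical and far smaller than the marital status by ethnicity table used in the assignment; the function name is also an assumption. The statistic is compared with 3.841, the standard 5% critical value of the chi-square distribution with 1 degree of freedom.

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected

    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof


# Hypothetical counts; NOT the assignment data.
observed = [[30, 10],
            [20, 40]]

statistic, dof = chi_square_statistic(observed)
CRITICAL_VALUE_5PCT_DF1 = 3.841  # chi-square critical value for alpha = 0.05, df = 1
reject_null = statistic > CRITICAL_VALUE_5PCT_DF1

print("chi-square:", round(statistic, 3), "df:", dof, "reject H0:", reject_null)
```

Rejecting H0 when the statistic exceeds the critical value is equivalent to rejecting it when the p value is below the 5% significance level.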

(f) The given conclusion in part (e) did not surprise me as my conclusion in part (a) was the
same as that statistically derived in part (e).
Question 4
(a) Side by side box plot of income by qualification is shown below.

Yes, the above boxplot does indicate that income levels vary according to qualification, as the
median weekly income differs noticeably between the qualification groups.
(b) Null and alternative hypotheses
H0: The mean weekly income is the same for all four qualification groups.
Ha: The mean weekly income of at least one of the four qualification groups differs from the
others.
(c) For the above hypotheses, an ANOVA test has been performed as shown below. The key
output is summarised below.
Since the p value is less than the significance level, the null hypothesis is rejected in
favour of the alternative hypothesis. This implies that the mean weekly income cannot be
assumed to be the same for all four qualification groups.

(d) Number of treatments (qualifications in this case) = 4
Hence, degrees of freedom = 4 - 1 = 3
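The F statistic behind a one-way ANOVA, and the treatment degrees of freedom computed in part (d), can be sketched as follows. The incomes below are hypothetical (three made-up values per qualification group, not the assignment data) and the function name one_way_anova is an assumption.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F statistic, between-groups df, within-groups df) for a list of samples."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(group) * (mean(group) - grand_mean) ** 2 for group in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((value - mean(group)) ** 2 for group in groups for value in group)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within


# Hypothetical weekly incomes ($) per qualification group; NOT the assignment data.
groups = [
    [300, 350, 400],  # None
    [400, 450, 500],  # School
    [500, 550, 600],  # Vocational
    [800, 850, 900],  # Degree
]

f_statistic, df_between, df_within = one_way_anova(groups)
print("F:", f_statistic, "df between:", df_between, "df within:", df_within)
```

With four groups the between-groups degrees of freedom come out as 3, matching the value stated in part (d).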
(e) The relevant output is shown below.

The difference in mean income is significant for those pairs of groups where the p value is
lower than the level of significance, i.e. 0.05. Based on the above output, these pairs and
their respective p values are listed below.
1) None and Degree (p value = 0.00, hence the difference in mean income is significant)
2) School and Degree (p value = 0.00, hence the difference in mean income is significant)
3) Vocational and Degree (p value = 0.02, hence the difference in mean income is significant)
4) Vocational and None (p value = 0.00, hence the difference in mean income is significant)
5) Vocational and School (p value = 0.00, hence the difference in mean income is significant)
(f) The assumptions which must be satisfied for the ANOVA test are given below.
The residuals are normally distributed.
The variances of all the groups are assumed to be the same.
The cases are assumed to be independent of each other.
(g) The residuals should be scattered randomly, with no systematic pattern, which is
consistent with the normality assumption. The requisite residual plot is shown below.

It is evident from the above residual plot that the assumptions listed in part (f) are satisfied.
(h) For a one-way ANOVA test, each residual is the difference between the observed value and
the mean of the group to which that value belongs. If the observed value is higher than its
group mean, the residual is positive; if it is lower, the residual is negative.
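The residual calculation for a one-way ANOVA can be sketched as below, taking each residual against the mean of the observation's own group. The incomes are hypothetical, only two of the qualification groups are shown, and the function name anova_residuals is an assumption.

```python
from statistics import mean


def anova_residuals(groups):
    """Map each group name to its residuals: observed value minus that group's mean."""
    residuals = {}
    for name, values in groups.items():
        group_mean = mean(values)
        residuals[name] = [value - group_mean for value in values]
    return residuals


# Hypothetical weekly incomes ($); NOT the assignment data.
groups = {
    "School": [400, 450, 500],
    "Degree": [800, 850, 900],
}

residuals = anova_residuals(groups)
for name, values in residuals.items():
    print(name, values)
```

Within each group the residuals sum to zero, so positive and negative residuals balance around the group mean.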