Statistics in Practice Project Assignment - May 2019, STAT 193
Added on 2022/11/28
Homework Assignment
AI Summary
This document presents a comprehensive solution to a Statistics in Practice (STAT 193) project assignment. The solution encompasses various statistical concepts and techniques, including descriptive statistics, hypothesis testing, confidence intervals, and ANOVA. It begins by analyzing data variables, including income, and presents a histogram to assess skewness. The assignment then delves into hypothesis testing, comparing the mean weekly income to a hypothesized value and calculating z-scores for relative standing comparisons. Furthermore, the solution explores the association between marital status and ethnicity using a chi-square test, followed by an ANOVA test to examine income variations across different qualification levels. The document includes the construction of graphs, tables, and detailed explanations of each step, demonstrating a strong understanding of statistical principles and their practical application.

STATISTICS IN PRACTICE
STUDENT ID:
[Pick the date]

Question 1
Variable Name    Type
Gender           Categorical Nominal
Age              Numerical Discrete
Ethnicity        Categorical Nominal
Marital          Categorical Nominal
Qualification    Categorical Nominal
PostSchool       Categorical Nominal
Hours            Numerical Discrete
Income           Numerical Discrete
Question 2
(a) Histogram of weekly income

[Figure: Histogram: Weekly Income. x-axis: Weekly Income ($), classes 11 to 233, 233 to 456, 456 to 678, 678 to 900, 900 to 1122, 1122 to 1345, 1345 to 1567, 1567 to 1789; y-axis: Frequency, scale 0.0 to 60.0]
Based on the above histogram, the weekly income data are right-skewed: the tail to the right
of the mean is longer than the one on the left. The distribution is also asymmetric, which
implies that weekly income is not normally distributed. The weekly income of most individuals
is concentrated in the first few classes, but the sample includes some individuals whose
weekly income is quite high, and these values introduce the right skew.
(b) Point estimates and 99% confidence interval
Point estimates
99% confidence interval
(c) The appropriate general distribution for the variable X would be Student's t, with mean =
30.68 and standard deviation = 8.68.
This distribution is suitable because the data are skewed, so a normal distribution is not
appropriate for capturing the continuous variable X.
(d) The relevant hypothesis test evaluates the claim that the mean weekly income in
Australia in 2007 was NZD 986. The test has been performed in Excel and the requisite
output is indicated below. Because the population standard deviation is not known, the
relevant test statistic is t.
t value = (Sample mean – Hypothesised mean)/Standard Error
The relevant formula view for the above output is shown below.
As the p value (0.00) is lower than the significance level (0.05), the null hypothesis is
rejected in favour of the alternative hypothesis. This implies that the average weekly wage in
Australia in 2007 is significantly different from NZD 986.
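The t statistic defined in part (d) can also be reproduced outside Excel. The following is a minimal sketch, assuming that the sample figures quoted in part (e) (mean $667, standard deviation $237.50, n = 200) are the ones used in this test; the function name `one_sample_t` is illustrative only and is not part of the original solution.

```python
import math

def one_sample_t(sample_mean, hypothesised_mean, sample_sd, n):
    """Return (t statistic, degrees of freedom) for a one-sample t test."""
    standard_error = sample_sd / math.sqrt(n)
    t_value = (sample_mean - hypothesised_mean) / standard_error
    return t_value, n - 1

# Assumed inputs: sample figures from part (e), hypothesised mean from part (d).
t_value, dof = one_sample_t(667, 986, 237.50, 200)
print(f"t = {t_value:.2f}, df = {dof}")
```

Under these assumed inputs the magnitude of t is very large relative to a t distribution with 199 degrees of freedom, which is consistent with a p value that rounds to 0.00.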

(e) The z score corresponding to $710 needs to be computed.
Standard deviation = $237.50, Sample mean = $667, Sample size = 200
Z score = (710-667)/(237.50/200^0.5) = 2.56
P(X>710) = 1 - P(X≤710) = 1 - NORMSDIST(2.56) = 1 - 0.9948 = 0.0052
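The arithmetic in part (e) can be checked with Python's standard library. This is a sketch only: it uses `statistics.NormalDist` as a stand-in for Excel's NORMSDIST, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Values quoted in part (e).
sample_mean, sample_sd, n = 667, 237.50, 200
threshold = 710

standard_error = sample_sd / math.sqrt(n)
z = (threshold - sample_mean) / standard_error
# Upper-tail area of the standard normal, i.e. 1 - NORMSDIST(z) in Excel.
upper_tail = 1 - NormalDist().cdf(z)
print(f"z = {z:.2f}, upper-tail probability = {upper_tail:.4f}")
```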
(f) In order to compare the relative standing, the corresponding z score is found for
the randomly selected New Zealander and Australian.
Z score for randomly selected New Zealander = (850-667)/237.50 = 0.77
Z score for randomly selected Australian = (950-986)/245.70 = -0.147
Comparing the above values, the randomly selected New Zealander has the higher standing
relative to the respective population, as the corresponding standardised score is greater
than the score for the randomly selected Australian.
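The comparison in part (f) amounts to standardising each income against its own population. A short sketch using the figures quoted above; the helper name `z_score` is hypothetical.

```python
def z_score(value, mean, sd):
    """Standardise a value against a population mean and standard deviation."""
    return (value - mean) / sd

z_nz = z_score(850, 667, 237.50)   # New Zealander
z_au = z_score(950, 986, 245.70)   # Australian

# The larger z score indicates the higher relative standing.
higher = "New Zealander" if z_nz > z_au else "Australian"
print(f"z_nz = {z_nz:.2f}, z_au = {z_au:.3f}, higher standing: {higher}")
```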
Question 3
(a) As the proportions of marital status differ noticeably across the different ethnicities,
it would be fair to conclude that there does seem to be an
association between the two variables.
(b) The requisite hypotheses are as stated below.
H0: Marital status and Ethnicity are independent of each other
Ha: Marital status and Ethnicity are not independent of each other
(c) The requisite summary table is shown below.
If there is no association between marital status and ethnicity, then the expected count of
Pacific people who are married = 10*64/200 = 3.2
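The expected count above follows the usual rule for a test of independence: one marginal total times the other, divided by the grand total. A minimal sketch, assuming 10 is the Pacific total, 64 is the married total and 200 is the sample size; the function name is illustrative.

```python
def expected_count(row_total, column_total, grand_total):
    """Expected cell count under independence of the two variables."""
    return row_total * column_total / grand_total

# Assumed reading of the figures: 10 Pacific people, 64 married, 200 surveyed.
expected_pacific_married = expected_count(10, 64, 200)
print(expected_pacific_married)
```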
(d) (i) Percentage of people who never married and are Maori = 13.19%
(ii) Percentage of surveyed Maori people who have never married = (12/22) = 54.55%
(e) The hypothesis test can be carried out using the following output. The relevant
summary of the key data from the output is given below.
Since the p value is lower than the assumed level of significance (5%), the available
evidence is sufficient to reject the null hypothesis in favour of the alternative
hypothesis. Hence, it can be concluded that marital status and ethnicity are dependent on one
another.
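For readers who want to see how a chi-square statistic of this kind is built up, the sketch below sums (observed - expected)^2 / expected over every cell. The counts are hypothetical and chosen only for illustration; they are NOT the assignment's marital status by ethnicity table, which is not reproduced in this text.

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof

# Hypothetical 2x2 counts for illustration only, not the assignment's data.
hypothetical_table = [[10, 20], [30, 40]]
chi2, dof = chi_square_statistic(hypothetical_table)
print(f"chi-square = {chi2:.4f}, df = {dof}")
```

The statistic is then compared with the chi-square distribution on the stated degrees of freedom to obtain the p value.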
(f) The given conclusion in part (e) did not surprise me as my conclusion in part (a) was the
same as that statistically derived in part (e).
Question 4
(a) Side by side box plot of income by qualification is shown below.
Yes, the above boxplot does indicate that income levels vary according to qualification, as the
median values of weekly income tend to differ noticeably across the different qualifications.
(b) Null and alternative hypotheses
H0: The mean weekly income is the same for all four qualification groups.
Ha: The mean weekly income of at least one of the four qualification groups differs from
the others.
(c) For the above hypothesis, ANOVA Test has been performed as shown below. The key
output is summarised below.
Since the p value is lower than the significance level, the null hypothesis is rejected
in favour of the alternative hypothesis. This implies that the mean weekly income for
all four qualification groups cannot be assumed to be the same.
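The F statistic behind a one-way ANOVA is the ratio of the between-group mean square to the within-group mean square. The sketch below shows that calculation on hypothetical numbers; the three small groups are invented for illustration and are NOT the assignment's income data, and `one_way_anova_f` is an illustrative name.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical groups for illustration only, not the assignment's data.
hypothetical_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(hypothetical_groups)
print(f"F = {f_stat:.2f}, df between = {df_b}, df within = {df_w}")
```

With four qualification groups, the same function would report 3 between-group degrees of freedom.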

(d) Number of treatments (qualifications in this case) = 4
Hence, degrees of freedom = 4-1 = 3
(e) The relevant output which is useful in this regard is shown below.