
Added on 2022/11/28


AI Summary

This document discusses the practical application of statistics. It covers topics such as variables, histograms, hypothesis testing, and the association between marital status and ethnicity. The document also includes a side-by-side box plot of income by qualification. The content is suitable for students studying statistics.


STATISTICS IN PRACTICE

STUDENT ID:

[Pick the date]


Question 1

Variable Name    Type
Gender           Categorical Nominal
Age              Numerical Discrete
Ethnicity        Categorical Nominal
Marital          Categorical Nominal
Qualification    Categorical Nominal
PostSchool       Categorical Nominal
Hours            Numerical Discrete
Income           Numerical Discrete

Question 2

(a) Histogram of weekly income


[Figure: Histogram : Weekly Income. Horizontal axis: Weekly Income ($), in classes 11 to 233, 233 to 456, 456 to 678, 678 to 900, 900 to 1122, 1122 to 1345, 1345 to 1567 and 1567 to 1789. Vertical axis: Frequency, scaled from 0 to 60.]

Based on the above histogram, the weekly income data are right skewed, as the tail on the right of the distribution is longer than the one on the left. The shape of the distribution is asymmetric, which implies that weekly income is not normally distributed. The weekly income of most individuals is concentrated in the first few classes, but the sample also includes some values where the weekly income is quite high, and these introduce the right skew.
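The class boundaries on the histogram axis are consistent with eight equal-width classes running from 11 to 1789. The sketch below reproduces them, assuming that equal-width classes were used (the source does not state how the classes were chosen); the function name `class_edges` is hypothetical.

```python
import math


def class_edges(low, high, classes):
    """Return the boundaries of `classes` equal-width classes from low to high."""
    width = (high - low) / classes
    return [low + i * width for i in range(classes + 1)]


# 11 and 1789 are the outer labels on the histogram axis; 8 is the number of classes.
edges = class_edges(11, 1789, 8)

# Round half up so the boundaries can be compared with the printed axis labels.
labels = [math.floor(edge + 0.5) for edge in edges]
print(labels)
```

With a class width of 222.25, the rounded boundaries line up with the labels shown on the axis.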

(b) Point estimates and 99% confidence interval

Point estimates

99% confidence interval
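The computed values for this part are not reproduced here. As a rough sketch only, a 99% interval for the mean can be formed from the summary statistics quoted in part (e) (sample mean $667, standard deviation $237.50, sample size 200). Two assumptions are made: that those statistics apply to this part, and that a normal critical value is an acceptable approximation for a sample of 200. A t critical value would give a slightly wider interval.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics quoted in part (e); assumed to apply here as well.
sample_mean = 667.0
sample_sd = 237.50
n = 200

confidence = 0.99
standard_error = sample_sd / sqrt(n)

# Two-sided critical value from the standard normal distribution (an approximation).
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z_crit * standard_error
ci_lower = sample_mean - margin
ci_upper = sample_mean + margin
print(round(ci_lower, 2), round(ci_upper, 2))
```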


(c) The appropriate general distribution for the variable X is Student's t, with mean = 30.68 and standard deviation = 8.68.

This distribution is considered suitable because the data are skewed, so a normal distribution is not appropriate for modelling the continuous variable X.

(d) A hypothesis test is required to evaluate the claim that the mean weekly income in Australia in 2007 was NZD 986. The test has been performed in Excel and the requisite output is indicated below. As the population standard deviation is not known, the relevant test statistic is t.

T value = (Sample mean – Hypothesised mean)/Standard Error
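The formula for the test statistic can be sketched as follows. The Excel output is not reproduced here, so the inputs are an assumption: the sample mean ($667), standard deviation ($237.50) and sample size (200) are taken from part (e), and the function name `t_statistic` is hypothetical.

```python
from math import sqrt


def t_statistic(sample_mean, hypothesised_mean, sample_sd, n):
    """One-sample t statistic: (sample mean - hypothesised mean) / standard error."""
    standard_error = sample_sd / sqrt(n)
    return (sample_mean - hypothesised_mean) / standard_error


# Inputs assumed from part (e); the hypothesised mean is the claimed NZD 986.
t_value = t_statistic(667.0, 986.0, 237.50, 200)
degrees_of_freedom = 200 - 1
print(round(t_value, 2), degrees_of_freedom)
```

A t value this far from zero is in line with the very small p value reported for the test.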


The relevant formula view for the above output is shown below.

As the p value (0.00) is lower than the significance level (0.05), the null hypothesis is rejected in favour of the alternative hypothesis. This implies that the average weekly wage in Australia in 2007 is significantly different from NZD 986.


(e) The first step is to compute the z score corresponding to $710.

Standard deviation = $237.50, Sample mean = $667, Sample size = 200

Z score = (710 - 667)/(237.50/200^0.5) = 2.56

P(X > 710) = 1 - P(X ≤ 710) = 1 - NORMSDIST(2.56) = 1 - 0.9948 = 0.0052
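The calculation in part (e) can be checked with a short sketch. It uses `statistics.NormalDist` from the Python standard library in place of Excel's NORMSDIST, and it does not round z before looking up the probability, so the result may differ from the figure above in the later decimal places.

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 667.0
sample_sd = 237.50
n = 200
threshold = 710.0

# The standard deviation is divided by the square root of the sample size (standard error).
z = (threshold - sample_mean) / (sample_sd / sqrt(n))

# Upper-tail probability under the standard normal distribution.
p_upper = 1 - NormalDist().cdf(z)
print(round(z, 2), round(p_upper, 4))
```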

(f) To compare relative standing, the z score is found for both the randomly selected New Zealander and the randomly selected Australian.

Z score for randomly selected New Zealander = (850-667)/237.50 = 0.77

Z score for randomly selected Australian = (950-986)/245.70 = -0.147

Comparing these values, the randomly selected New Zealander has the higher standing relative to the respective population, as the corresponding standardised score is greater than that of the randomly selected Australian.
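The comparison in part (f) amounts to standardising each income against its own population. A minimal sketch, using the figures quoted above (the helper name `z_score` is hypothetical):

```python
def z_score(value, mean, sd):
    """Number of standard deviations that `value` lies above (or below) `mean`."""
    return (value - mean) / sd


z_nz = z_score(850, 667, 237.50)   # randomly selected New Zealander
z_au = z_score(950, 986, 245.70)   # randomly selected Australian

# The person with the larger z score has the higher relative standing.
higher = "New Zealander" if z_nz > z_au else "Australian"
print(round(z_nz, 2), round(z_au, 3), higher)
```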

Question 3

(a) As the proportions of marital status differ considerably across the ethnicities, it is fair to conclude that there does seem to be an association between the two variables.

(b) The requisite hypotheses are stated below.

H0: Marital status and Ethnicity are independent of each other

Ha: Marital status and Ethnicity are not independent of each other

(c) The requisite summary table is shown below.


If there is no association between marital status and ethnicity, then the expected number of Pacific people who are married = 10*64/200 = 3.2
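Under independence, the expected count for a cell is the product of its two marginal totals divided by the sample size. The sketch below repeats the calculation above, taking 10 and 64 as the two marginal totals and 200 as the sample size; the summary table itself is not reproduced here, and the function name `expected_count` is hypothetical.

```python
def expected_count(row_total, column_total, grand_total):
    """Expected cell count when the row and column variables are independent."""
    return row_total * column_total / grand_total


# Marginal totals and sample size as used in the calculation above.
expected_pacific_married = expected_count(10, 64, 200)
print(expected_pacific_married)
```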

(d) (i) Percentage of people who never married and are Maori = 13.19%

(ii) Percentage of Maori people surveyed who have never married = (12/22) = 54.55%


(e) The hypothesis test can be carried out using the following output. The relevant summary of the key data from the output is given below.

Since the p value is lower than the assumed level of significance (5%), the available evidence is sufficient to reject the null hypothesis in favour of the alternative hypothesis. Hence, it can be concluded that marital status and ethnicity are dependent on one another.
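The output for this test is not reproduced here. As an illustration only, the sketch below shows how a chi-square statistic for independence and its degrees of freedom are obtained from a table of observed counts. The 2 x 2 table is hypothetical and is not the marital status by ethnicity table from the assignment; the function name is also hypothetical.

```python
def chi_square_independence(observed):
    """Return the chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under independence: row total * column total / grand total.
            exp = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - exp) ** 2 / exp

    degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, degrees_of_freedom


# Hypothetical counts, for illustration only.
hypothetical_counts = [[10, 20], [30, 40]]
chi2, chi2_df = chi_square_independence(hypothetical_counts)
print(round(chi2, 4), chi2_df)
```

The p value would then come from the chi-square distribution with the stated degrees of freedom, which is the step the statistical output performs.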


(f) The conclusion in part (e) did not surprise me, as my conclusion in part (a) was the same as that statistically derived in part (e).

Question 4

(a) Side by side box plot of income by qualification is shown below.


Yes, the above boxplot does indicate that income levels vary according to qualification, as the median weekly income tends to differ noticeably across the different qualifications.

(b) Null and alternative hypotheses

H0: The mean weekly income does not differ across the four qualification groups and can therefore be assumed to be the same.

Ha: The mean weekly income for at least one of the four qualification groups differs from the others.

(c) For the above hypotheses, an ANOVA test has been performed as shown below. The key output is summarised below.

Since the p value is less than the significance level, the null hypothesis is rejected in favour of the alternative hypothesis. This implies that the mean weekly income cannot be assumed to be the same for all four qualification groups.
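The ANOVA output is not reproduced here. The following sketch shows how the F statistic of a one-way ANOVA is built from the between-group and within-group sums of squares. The income values are hypothetical and only the four group names follow the assignment; the function name `one_way_anova` is likewise an assumption.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F statistic, between-group df, within-group df) for a dict of samples."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical weekly incomes, for illustration only.
hypothetical_income = {
    "None": [100, 200, 300],
    "School": [200, 300, 400],
    "Vocational": [300, 400, 500],
    "Degree": [600, 700, 800],
}
f_stat, df_between, df_within = one_way_anova(hypothetical_income)
print(f_stat, df_between, df_within)
```

With four groups the between-group degrees of freedom are 4 - 1 = 3, whatever the sample sizes.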


(d) Number of treatments (qualifications in this case) = 4

Hence, degrees of freedom = 4 - 1 = 3

(e) The relevant output in this regard is shown below.


The difference in mean income is significant for those pairs of groups where the p value is lower than the level of significance, i.e. 0.05. Based on the above output, these pairs and their respective p values are listed below.

1) None and Degree (p value = 0.00, hence the difference in mean income is significant)

2) School and Degree (p value = 0.00, hence the difference in mean income is significant)

3) Vocational and Degree (p value = 0.02, hence the difference in mean income is significant)

4) Vocational and None (p value = 0.00, hence the difference in mean income is significant)

5) Vocational and School (p value = 0.00, hence the difference in mean income is significant)

(f) The assumptions which must be satisfied for the ANOVA test are given below.

The residuals are normally distributed.

The variances of all the groups are assumed to be the same.

Also, the cases are assumed to be independent of each other.

(g) The residuals should be scattered in a random manner, which would be consistent with a normal distribution. The requisite residual plot is shown below.


It is evident from the above residual plot that the assumptions listed in part (f) are satisfied.

(h) For a one-way ANOVA test, each residual is the difference between the observed value and the mean of the group to which that value belongs. If the observed value is higher than its group mean, the residual is positive; otherwise it is negative.
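A minimal sketch of the residual calculation for a one-way ANOVA, where each residual is the observation minus the mean of its own group. The income values and the function name `residuals_by_group` are hypothetical and serve only to show the sign convention.

```python
from statistics import mean


def residuals_by_group(groups):
    """Residual of each observation: the value minus the mean of its own group."""
    return {
        name: [x - mean(values) for x in values]
        for name, values in groups.items()
    }


# Hypothetical weekly incomes, for illustration only.
hypothetical_income = {"School": [100, 200, 300], "Degree": [600, 700, 800]}
residuals = residuals_by_group(hypothetical_income)
print(residuals)
```

Values above their group mean give positive residuals, values below give negative ones, and the residuals within each group sum to zero.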
