BUS708 Statistical Modelling: Gender Pay Gap Analysis in Australia
Added on 2023/06/12
Report
AI Summary
This report investigates the gender pay gap in Australia using statistical modeling techniques applied to two datasets. Dataset 1, a secondary dataset from the ATO, reveals a potential gender pay disparity, while Dataset 2, a primary dataset collected by the researcher, does not support this claim, though its reliability is questioned due to sampling bias. The analysis includes descriptive statistics, graphical representations, and hypothesis testing to assess gender representation across occupations and salary levels. Confidence intervals are calculated for female representation in high-paying occupations. The report concludes that Dataset 1 provides evidence of a gender pay gap, but further research is needed to understand the underlying causes, particularly the role of gender representation in different occupations and potential gender-based salary discrimination. The report also highlights the skewed gender representation in certain occupations as a concern.

STATISTICAL MODELLING
STUDENT ID:
[Pick the date]

Section 1: Introduction
a) A prominent issue facing Australia's labour force is the difference in pay between the two
genders. By some estimates, the wages and salaries drawn by females are about 15% lower than
those drawn by males. If this trend persists or worsens, it could adversely affect the
participation of females in the workplace and have implications for workforce diversity.
Some argue that the difference in salary levels arises from the different representation of
the two genders across occupations, while others point out that the wage differential is
also present in occupations where female representation exceeds 50%. Further, this
inequality in pay, known as the gender pay gap (hereafter the gender gap), continues to
exist despite legislation which forbids any differentiation based on gender (Livsey, 2017).
The central research question is therefore whether a gender gap is evident in the dataset
provided, taking into account the underlying information about occupations.
b) Dataset 1 is a sample of 1000 taxpayers from Australia. It is regarded as secondary data
because the researcher did not collect it: the ATO originally collected the data, and a
sample drawn from that data forms Dataset 1 (Flick, 2015). The dataset comprises four
variables: gender, occupation code, salary amount and gift amount claimed as a deduction.
The first two are categorical variables and the latter two are quantitative variables. The
categorical variables are measured on a nominal scale, whereas the quantitative variables
are measured on at least an interval scale (strictly a ratio scale, since both amounts have
a true zero) (Hair et al., 2015).
The first five cases of this dataset are illustrated as follows.
c) Dataset 2 comprises only 30 observations and has two variables, namely salary level and
gender. It is appropriately labelled primary data because it was collected by the researcher.
Although it is primary data, it is inferior to the secondary Dataset 1 because of issues with
the collection methodology. First, the underlying sampling technique is not probability
based but driven by my convenience, so the odds of the data being biased are significant.
Second, the data may be inaccurate because respondents may have overstated their salary
level owing to their personal relationship with me. The collection was limited to two
variables because these are the minimum required to explore the research question about a
possible gender gap (Eriksson and Kovalainen, 2015).
Section 2: Descriptive Statistics
a) The appropriate graphical illustration for the given relationship is the following column
chart.
The above graph shows the difference in gender proportions observed across different
occupations. The difference in the representation of the two genders is very stark in some
occupations. A clear example is occupation code 7, where female representation is dismal:
less than 5% of the individuals employed in this occupation in the sample are
females. A similar situation is seen in occupation code 3, even though female representation
there is slightly better than in occupation code 7. It is also noteworthy that males are not
subject to minority representation to such an extent in any of the occupations shown.
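As a minimal sketch of how the proportions behind such a column chart could be computed, the following Python snippet tallies the female share within each occupation code. The records and the function name female_share_by_occupation are hypothetical and are not taken from Dataset 1.

```python
from collections import Counter

# Hypothetical (gender, occupation_code) records; NOT taken from Dataset 1.
records = [
    ("F", 1), ("M", 1), ("M", 1),
    ("F", 2), ("F", 2), ("M", 2),
    ("M", 7), ("M", 7), ("M", 7), ("M", 7),
]

def female_share_by_occupation(rows):
    """Return {occupation_code: proportion of rows whose gender is 'F'}."""
    totals = Counter(code for _, code in rows)
    females = Counter(code for gender, code in rows if gender == "F")
    # Counter returns 0 for codes with no female records.
    return {code: females[code] / totals[code] for code in sorted(totals)}

shares = female_share_by_occupation(records)
for code, share in shares.items():
    print(f"occupation {code}: {share:.1%} female")
```

Plotting these shares side by side with the corresponding male shares would give a column chart of the kind described above.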
b) The appropriate graphical illustration for the given relationship is the following bar
chart.
The graph above indicates that females are the dominant gender at lower salary levels. As
salary levels increase, the share of males keeps increasing and females become the minority
gender. More than half of the females included in the sample had an annual salary below
$40,000. An interesting question that arises from these data is whether the gender gap
results from the over-representation of females in low-paying jobs or from females being
paid less than males for the same job. This is a pertinent question which needs to be
explored further.
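The observation about the $40,000 threshold amounts to a simple proportion. The sketch below uses made-up salaries rather than Dataset 1 and computes, for each gender, the share of individuals earning below a chosen threshold. The helper name share_below is illustrative only.

```python
def share_below(salaries, threshold):
    """Proportion of salaries strictly below the threshold."""
    return sum(1 for s in salaries if s < threshold) / len(salaries)

# Hypothetical annual salaries in dollars; NOT taken from Dataset 1.
female_salaries = [28000, 35000, 39000, 52000, 41000]
male_salaries = [38000, 45000, 62000, 90000, 71000]

female_low = share_below(female_salaries, 40000)
male_low = share_below(male_salaries, 40000)
print(f"below $40,000: {female_low:.0%} of females, {male_low:.0%} of males")
```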
c) The numerical summary for the given variables is presented in the table below.
The table further highlights the gender gap in the Australian context: female concentration
at lower salary levels is higher than the corresponding male concentration. Further, as
salary levels rise, the gap becomes wider, which implies that the proportion of females at
high salary levels is quite low, assuming that the sample in Dataset 1 is representative of
the population of interest. The underlying reasons for these observations need further
research for better clarity on the issue of the gender gap.
d) The relationship between the two quantitative variables can be explored with the help of
the scatter plot below.
The points in the scatter plot indicate the absence of any meaningful relationship between
the salary amount and the gift deduction amount. This observation is supported by the
coefficient of determination, which is almost zero. Because the magnitude of the correlation
coefficient is the square root of the coefficient of determination, the
coefficient of correlation is also approximately zero, which indicates that the two variables
are not linearly related (Eriksson and Kovalainen, 2015).
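The link between the two coefficients can be illustrated with a short calculation. For a straight-line fit with one predictor, the coefficient of determination equals the square of the Pearson correlation coefficient, so a value near zero for one implies a value near zero for the other, while the sign of the correlation follows the slope. The sketch below uses hypothetical salary and gift amounts, not Dataset 1, and the function name pearson_r is an assumption for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical salary and gift-deduction amounts; NOT taken from Dataset 1.
salary = [30000, 45000, 60000, 75000, 90000]
gift = [120, 40, 200, 40, 120]

r = pearson_r(salary, gift)
r_squared = r ** 2  # equals R^2 of a simple linear regression of gift on salary
print(f"r = {r:.4f}, R^2 = {r_squared:.4f}")
```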
Section 3: Inferential Statistics
a) The objective of this task is to estimate the gender representation in the occupations
with the highest median salary based on the data provided. Using the attached Excel
workbook, these occupations have been identified as codes 1, 2, 3 and 7. For each of these
occupation codes, a confidence interval for the female representation has been computed at
a confidence level of 95% (Hillier, 2016).
Based on the above Excel output, one can be 95% confident that the female representation in
occupation code 1 lies between 27.39% and 47.61%.
Based on the above Excel output, one can be 95% confident that the female representation in
occupation code 2 lies between 54.12% and 67.99%.
Based on the above Excel output, one can be 95% confident that the female representation in
occupation code 3 lies between 7.28% and 21.01%.
Based on the above Excel output, one can be 95% confident that the female representation in
occupation code 7 lies between 0.00% and 9.07%.
From the above confidence intervals, it would be fair to conclude that females are
under-represented in high-paying jobs, as only in one occupation code, namely 2, are females
in a slight majority. In occupation codes 3 and 7 there is severe under-representation of
females, which is clearly an unwelcome observation and needs rectification going ahead.
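A minimal sketch of how a 95% confidence interval for a proportion could be computed is given below. It assumes the usual normal-approximation interval, the sample proportion plus or minus z times sqrt(p(1 - p)/n), truncated to the range 0 to 1. The report does not state which formula the Excel workbook used, and the counts here are hypothetical rather than taken from Dataset 1.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    # A proportion cannot fall outside [0, 1], so truncate the bounds.
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)

# Hypothetical counts: 30 females among 80 sampled workers in one occupation.
low, high = proportion_ci(30, 80)
print(f"95% CI for female share: {low:.2%} to {high:.2%}")

# With very few females the lower bound is truncated at zero.
low_small, high_small = proportion_ci(1, 50)
```

The truncation step shows how a lower boundary of exactly 0.00% can arise when the sample proportion is very small.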
b) Hypothesis Testing
H0 (Null Hypothesis): p ≤ 0.8, implying that male representation in occupation code 7 is at
most 80% (0.80).
H1 (Alternative Hypothesis): p > 0.8, implying that male representation in occupation code 7
is greater than 80% (0.80).
The test statistic of choice for testing the above hypothesis is z. The alternative
hypothesis indicates that the relevant test is right tailed. Taking the significance level
for the given hypothesis test as 5%, the test results derived from Excel are pasted below.
The p value obtained from the above test is 0.0009. This is lower than the significance
level taken for this test, which indicates that the current evidence is sufficient to reject
the null hypothesis (Flick, 2015). Thus, at the 5% significance level, it can be concluded
that the data support the claim that the representation of males in the given occupation
exceeds 80%.
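The right-tailed z test described above can be sketched as follows. The standard error is computed under the null value p0 = 0.8, and the p value is the upper-tail area of the standard normal distribution. The counts are hypothetical and are not the figures from Dataset 1, so the resulting p value differs from the 0.0009 reported here.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """Right-tailed z test of H0: p <= p0 against H1: p > p0."""
    p_hat = successes / n
    # Standard error uses the hypothesised proportion p0, not p_hat.
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 1 - NormalDist().cdf(z)  # upper-tail probability
    return z, p_value

# Hypothetical counts: 95 males among 100 sampled workers in the occupation.
z, p_value = one_proportion_z_test(95, 100, 0.80)
alpha = 0.05
print(f"z = {z:.2f}, p value = {p_value:.5f}, reject H0: {p_value < alpha}")
```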
c) Hypothesis Testing
The test statistic of choice for testing the above hypothesis is t, as the population
standard deviation is unknown for the given variables. The alternative hypothesis indicates
that the relevant test is two tailed. Taking the significance level for the given hypothesis
test as 5%, the test results derived from Excel are pasted below.
The p value obtained from the above test is reported as 0.000, that is, below 0.001. This is
lower than the significance level taken for this test, which indicates that the current
evidence is sufficient to reject the null hypothesis (Hair et al., 2015). Thus, at the 5%
significance level, it can be concluded that the claim about the existence of a gender gap
is supported by Dataset 1.
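As a rough sketch of a two-sample t test comparing mean salaries, the snippet below computes Welch's t statistic and its approximate degrees of freedom. This is an assumption for illustration: the report does not say whether the Excel test assumed equal or unequal variances, and the salaries are hypothetical rather than taken from either dataset.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # statistics.variance is the sample variance (divides by n - 1).
    se_sq_a = variance(sample_a) / n_a
    se_sq_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_sq_a + se_sq_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se_sq_a + se_sq_b) ** 2 / (
        se_sq_a ** 2 / (n_a - 1) + se_sq_b ** 2 / (n_b - 1)
    )
    return t, df

# Hypothetical annual salaries in dollars; NOT taken from Dataset 1 or 2.
male = [52000, 61000, 58000, 70000, 66000]
female = [48000, 50000, 55000, 47000, 53000]

t_stat, df = welch_t(male, female)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

The two-tailed p value is the probability that a Student's t variable with these degrees of freedom exceeds |t| in absolute value. The Python standard library has no t distribution, so in practice that step would come from a statistics package or, as in this report, from Excel.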
d) Hypothesis Testing
The test statistic of choice for testing the above hypothesis is t, as the population
standard deviation is unknown for the given variables. The alternative hypothesis indicates
that the relevant test is two tailed. Taking the significance level for the given hypothesis
test as 5%, the test results derived from Excel are pasted below.
The p value obtained from the above test is 0.9103. This is higher than the significance
level taken for this test, which indicates that the current evidence is insufficient to
reject the null hypothesis (Hillier, 2016). Thus, at the 5% significance level, it can be
concluded that the claim about the existence of a gender gap is not supported by Dataset 2.
Section 4: Conclusion
a) Dataset 1 provides evidence of the existence of a gender gap in the Australian context.
Dataset 2 does not support the same conclusion, but this is not given much weight because
of the potential issues with the data identified in Section 1. The key question that
remains unanswered is whether this gender gap is on account of the high representation of
females in low-paying jobs or due to gender-based discrimination with regard to salary.
Besides, the extremely skewed gender representation witnessed in certain occupations is
clearly a matter of concern going ahead.
b) Further research should be undertaken to lend clarity on the questions that this study
has raised. In particular, the focus should not be on whether the salary levels of females
are lower than those of males but on the underlying reasons, especially against the
backdrop of differing gender representation across occupations.

References
Eriksson, P. and Kovalainen, A. (2015) Quantitative methods in business research. 3rd ed.
London: Sage Publications.
Flick, U. (2015) Introducing research methodology: A beginner's guide to doing a research
project. 4th ed. New York: Sage Publications.
Hair, J. F., Wolfinbarger, M., Money, A. H., Samouel, P. and Page, M. J. (2015) Essentials
of business research methods. 2nd ed. New York: Routledge.
Hillier, F. (2016) Introduction to Operations Research. 6th ed. New York: McGraw Hill
Publications.
Livsey, A. (2017) Australia's gender pay gap: why do women still earn less than men?
[online] Available at:
https://www.theguardian.com/australia-news/datablog/2017/oct/18/australia-gender-pay-gap-why-do-women-still-earn-less-than-men
[Accessed 21 May 2018]