Statistical Modeling and Data Analysis Report for BUS708, Trimester 1


Added on  2021/05/27

AI Summary
This report presents a statistical modeling assignment focusing on data collection, analysis, interpretation, and inference. The assignment utilizes two datasets: one examining the relationship between gender and salary, and another exploring sustainable economic development across different countries. The report is divided into four sections: introduction, descriptive statistics, inferential statistics, and discussion and conclusion. Descriptive statistics, including bar graphs, scatter plots, and numerical summaries, are used to analyze the data, revealing insights into salary disparities and the relationship between salary and gifts. Inferential statistics involve hypothesis testing and ANOVA to draw conclusions about population parameters, such as gender-based salary differences and the proportion of male machine operators and drivers. The final section discusses the findings, including pay parity, salary-gift relationships, and educational levels in Africa, and suggests further research directions. The report demonstrates the application of statistical methods to derive meaningful insights and inform data-driven decision-making.
Running head: DATA ANALYSIS 1
BUS708 Statistics and Data Analysis
Statistical Modeling Assignment
Trimester 1, 2018
Student Name
Institution
Statistical Modeling
Section 1: Introduction
This is a statistical modelling assignment. It tests data collection and analysis, the
interpretation of analysis results and the making of meaningful statistical inferences. In short,
this is a data-driven decision-making assignment. The assignment is divided into four major
sections: introduction, descriptive statistics, inferential statistics, and discussion and
conclusion.
Two data sets are used in completing this assignment, dataset1 and dataset2. Dataset1
covers the gender, occupation and salary of individuals. It has four variables: gender,
occupation code, salary/wage amount and gift amount. This is secondary data extracted from the
database of the Australian Taxation Office (ATO). The first five cases of this dataset are displayed
below.
Gender Occ_code Sw_amt Gift_amt
Male 0 0 0
Male 7 4310 0
Female 2 70839 131
Female 3 79996 383
Male 9 0 0
The research question associated with this dataset is whether there is a relationship between
the amount of salary and the gender of an individual. Several statistical analyses are used to
address this question, including descriptive statistics, inferential statistics and hypothesis
testing.
Dataset2 is a sample of countries across Africa, America and Asia and their
development index. This is secondary data extracted from the United Nations website (Public,
2010). The data describe the level of sustainable economic development across these countries
(Public, 2010). The variables in this data set are continent, access to improved sanitation
facilities, mobile telephone subscriptions and women's average years in school. The first five cases
of this dataset are displayed below.
Continent   Access to improved      Mobile telephone   Women's average
            sanitation facilities   subscriptions      years in school
AFRICA      87.60                   106.38             7.74
AFRICA      51.59                   60.84              5.31
AFRICA      19.72                   85.64              2.73
AFRICA      63.43                   169.00             8.71
AFRICA      19.73                   80.64              1.86
This research is interested in the level of education among women in these
countries. To assess it, the study examines the average number of years women spend in
school, using descriptive statistics (i.e. frequency table, graphical display and summary
statistics).
Section 2: Descriptive Statistics
The descriptive statistics in this section are based entirely on dataset1. Descriptive
statistics give the characteristics of the data in the form of graphical displays of the variables
and summary statistics (David & David, 2000). Graphical displays of data include pie charts, bar
graphs and line graphs (Krishnamoorthy, 2005).
Likewise, summary statistics are the numerical characteristics of the data, such as the
mean, median and mode (Knight, 2000). Four descriptive analyses are carried out: three graphical
displays and one numerical summary. They are outlined below.
The first is a graphical display describing the relationship between the variables
Gender and Occ_code, using a bar graph. The bar graph below shows the number of workers in each
occupation category by gender (male or female). The length of each bar represents the number of
individuals of that gender in the occupation (Tim, 2005). From the graph, it is clear that there
are more males than females in occupation codes 0, 2, 3, 7 and 8.
[Figure: A Bar Graph of Number of Individuals in Occupation Against Gender. x-axis: Occ_code (0 to 9); y-axis: Frequency (0 to 120); series: Male, Female.]
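The counts behind a grouped bar graph of this kind can be sketched as follows. This is a minimal illustration, assuming only the five cases displayed in the introduction; the full dataset is not reproduced here, so the counts do not match the graph. The function name `counts_by_occupation_and_gender` is hypothetical.

```python
from collections import Counter

# Illustrative rows only: (gender, occ_code) for the five cases shown in the
# introduction. The full dataset is NOT reproduced here.
rows = [
    ("Male", 0),
    ("Male", 7),
    ("Female", 2),
    ("Female", 3),
    ("Male", 9),
]


def counts_by_occupation_and_gender(records):
    """Return a Counter keyed by (occ_code, gender); each value is a bar height."""
    return Counter((occ, gender) for gender, occ in records)


counts = counts_by_occupation_and_gender(rows)

# One line per occupation code, giving the male and female bar heights.
for occ in sorted({occ for _, occ in rows}):
    print(occ, "Male:", counts[(occ, "Male")], "Female:", counts[(occ, "Female")])
```

Comparing the two counts for each occupation code is what supports a statement such as "more males than females in code 7".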
The second descriptive analysis is a graphical display describing the relationship
between the variables Gender and Sw_amt. A scatter plot has been developed to compare salary
amounts across gender (males and females). Individual earnings are plotted, and individuals with
higher salary amounts lie further from the x-axis (Krishnamoorthy, 2005).
From the plot below, it is clear that, on average, males earn more than their female
counterparts. It is also clear that the majority of individuals (males and females) earn
comparatively low salaries.
[Figure: Scatter Plot of Salary across the Gender. x-axis ticks: 0 to 600; y-axis: Salary (0 to 900000); series: Males, Females.]
The third descriptive analysis describes the relationship between the variables
Gender and Sw_amt using a numerical summary. It compares the salary amounts earned by male and
female employees. The numerical summaries used are the mean, maximum and minimum
(Krishnamoorthy, 2005).
From the table below, the mean salary for males is 53933.88 while the mean salary for
their female counterparts is 36044.18. This indicates that, on average, males earn more
than females. The minimum salary is 0 for both males and females.
On the other hand, the maximum salary earned by a male employee is 839840,
while the maximum salary earned by a female employee is 308183. This is a further indication
that males generally earn higher salaries than females.
           Males       Females
Mean       53933.88    36044.18182
Minimum    0           0
Maximum    839840      308183
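A grouped numerical summary of this kind can be sketched as below. The sketch assumes only the five cases displayed in the introduction, so its output will not reproduce the values in the table; the helper name `summarize_by_gender` is hypothetical.

```python
from statistics import mean

# Illustrative rows only: (gender, Sw_amt) for the five cases shown in the
# introduction. The reported table is based on the full dataset.
rows = [
    ("Male", 0),
    ("Male", 4310),
    ("Female", 70839),
    ("Female", 79996),
    ("Male", 0),
]


def summarize_by_gender(records):
    """Group salaries by gender and return mean, minimum and maximum per group."""
    groups = {}
    for gender, salary in records:
        groups.setdefault(gender, []).append(salary)
    return {
        gender: {"mean": mean(values), "min": min(values), "max": max(values)}
        for gender, values in groups.items()
    }


summary = summarize_by_gender(rows)
for gender, stats in summary.items():
    print(gender, stats)
```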
The fourth and last descriptive analysis in this section describes the
relationship between the variables Sw_amt and Gift_amt. It outlines how the
amount of salary paid to an individual relates to the amount of gift given to that individual. A
plot has been developed to describe this relationship. From the plot, it is clear that the amount
of salary is generally higher than the amount of gift.
[Figure: A Scatter Plot of Salary Amount and Gift Amount. Axis ticks: 0 to 1200 and 0 to 900000; axis label: Amount; series: Sw_amt, Gift_amt.]
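One way to quantify the relationship between salary and gift amounts is a Pearson correlation coefficient. The sketch below is an illustration only, not part of the reported analysis: it assumes the five cases displayed in the introduction, and the function name `pearson_r` is hypothetical.

```python
from math import sqrt

# Illustrative (Sw_amt, Gift_amt) pairs: the five cases shown in the
# introduction, not the full dataset.
pairs = [(0, 0), (4310, 0), (70839, 131), (79996, 383), (0, 0)]


def pearson_r(points):
    """Pearson correlation: co-deviation sum over the product of deviation norms."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in points)
    dev_x = sum((x - mean_x) ** 2 for x, _ in points)
    dev_y = sum((y - mean_y) ** 2 for _, y in points)
    return co_dev / sqrt(dev_x * dev_y)


r = pearson_r(pairs)
print(f"Pearson r on the five sample rows: {r:.3f}")
```

A value near 1 would indicate that higher salaries tend to go with larger gifts; five rows are far too few to draw a conclusion about the full dataset.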
Section 3: Inferential Statistics
This section uses both dataset1 and dataset2 in making statistical inferences. Statistical
inference uses the outcome of a sample analysis to describe the population from which
the sample was drawn (Knight, 2000). Inferences are made from the outcomes of descriptive
statistics, graphical displays or summary statistics (Tim, 2005). The inferences made are
outlined below.
The first inference is based on the median salary. The median salary is 34788.5. This is the
value separating the higher half of the salary amounts from the lower half.
Sw_amt
Mean             44811.134
Standard Error   1694.223927
Median           34788.5
Therefore, the top four occupations based on the median are those with salaries equal
or very close to the median. The top four occupations are: professionals, clerical and
administrative workers, community and personal service workers, and machinery operators and
drivers. Among these cases there are two males and two females, which is a fair representation of
gender.
Gender Occ_code Sw_amt Gift_amt
Male 5 34916 0
Female 2 34890 40
Male 7 34645 11
Female 4 34687 186
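As a quick check on how a median is obtained when the number of values is even, the sketch below takes the four salaries listed in the table above and averages the two middle values. It uses only those four rows, not the full dataset.

```python
from statistics import median

# The four Sw_amt values listed in the table above.
salaries = [34916, 34890, 34645, 34687]

# With an even number of values, the median is the mean of the two middle
# values after sorting: here 34687 and 34890.
ordered = sorted(salaries)
middle_pair = ordered[len(ordered) // 2 - 1 : len(ordered) // 2 + 1]
m = median(salaries)

print("middle pair:", middle_pair)
print("median:", m)
```

The median of these four cases coincides with the reported median of 34788.5, which is consistent with their being the cases closest to the centre of the salary distribution.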
The second statistical inference is a hypothesis test to ascertain whether the proportion of
machinery operators and drivers who are male is more than 80%. To test this, we first
state the following hypotheses:
H0: p = 0.8
H1: p > 0.8
The rejection rule is that we reject the null hypothesis whenever the p-value is less than
the alpha value, which is 0.05. The test statistic is calculated from the following quantities:
Sample proportion, p̂ = 0.49
Sample size, n = 44
Sample SD = 65891.4
Therefore, the test statistic is calculated as
\[ T = \frac{0.4908}{65891.4 / \sqrt{44}} = 0.000047 \]
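The reported p-value is 0.002458, which is less than the alpha value. However, the sample
proportion (0.49) lies below the hypothesised value of 0.8, so the sample cannot support the
one-sided alternative p > 0.8. We conclude that there is insufficient evidence that more than 80%
of machinery operators and drivers in the population are male (Knight, 2000).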
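For comparison, the standard one-proportion z-test can be sketched as follows. This is a different statistic from the one reported above: it assumes the normal approximation with the standard error computed under H0, and it uses only the sample proportion (0.49), the sample size (44) and the hypothesised proportion (0.8) stated in the text. The function name `one_proportion_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(p_hat, p0, n):
    """Upper-tailed z-test for H0: p = p0 against H1: p > p0."""
    standard_error = sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / standard_error
    p_value = 1 - NormalDist().cdf(z)  # area to the right of z
    return z, p_value


z, p_value = one_proportion_z_test(p_hat=0.49, p0=0.8, n=44)
reject_h0 = p_value < 0.05

print(f"z = {z:.4f}, one-sided p-value = {p_value:.6f}, reject H0: {reject_h0}")
```

Under these assumptions the statistic is negative and the one-sided p-value is close to 1, so H0 is not rejected. This agrees with the conclusion that there is insufficient evidence of a male proportion above 80%.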
The third inference is a hypothesis test to ascertain whether there is a difference in
salary amount between the genders. This is done with a single-factor ANOVA
test. A single-factor ANOVA tests whether there is a significant difference between the
means of two or more groups; here the groups are males and females, each with more than 30
observations (Tim, 2005).
The following hypotheses are used in conducting this test:
H0: There is no difference in salary amount between males and females
H1: There is a difference in salary amount between males and females
The output of the single-factor ANOVA gives a p-value of 0.000000202.
The p-value is less than the alpha value, 0.05. Hence we reject the null hypothesis that there is
no difference in salary amount between males and females. We conclude that there is a
statistically significant difference in salary amount between males and females.
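The F statistic can be re-derived from the sums of squares and degrees of freedom reported in the ANOVA output. The sketch below uses the rounded values from that output (SS of 7.74E+10 and 2.73E+12, df of 1 and 966, F crit of 3.851103), so the result matches the reported F of 27.4072 only approximately. The function name `f_statistic` is hypothetical.

```python
# Rounded values taken from the single-factor ANOVA output.
ss_between, df_between = 7.74e10, 1
ss_within, df_within = 2.73e12, 966
f_crit = 3.851103  # critical value reported in the output


def f_statistic(ssb, dfb, ssw, dfw):
    """F = (SS_between / df_between) / (SS_within / df_within)."""
    ms_between = ssb / dfb
    ms_within = ssw / dfw
    return ms_between / ms_within


f_value = f_statistic(ss_between, df_between, ss_within, df_within)
reject_h0 = f_value > f_crit  # equivalent to p-value < alpha at the same level

print(f"F = {f_value:.2f}, F crit = {f_crit}, reject H0: {reject_h0}")
```

Because F is well above the critical value, the decision to reject the null hypothesis follows from the critical-value rule as well as from the p-value.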
The fourth and final analysis is done on dataset2. A numerical summary is used to assess the
level of sustainable economic development in Africa, by analysing the average number of years
that women spend in school. The table below shows the summary output.
ANOVA
Source of Variation   SS         df    MS         F         P-value   F crit
Between Groups        7.74E+10   1     7.74E+10   27.4072   2.02E-7   3.851103
Within Groups         2.73E+12   966   2.83E+09
Total                 2.81E+12   967
From the output below, the average number of years women spend in school is 5.76.
The minimum is 1.41 years and the maximum is 11.48 years. This indicates that women spend a
relatively short time in education (OECD, 2004), possibly because the majority of them do not
obtain tertiary education. This suggests that there is still a low level of sustainable economic
development in Africa (Mark, 2009).
Women's average years in school
Mean      5.76
Minimum   1.41
Maximum   11.48
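The same three summary values can be computed for any sample of the schooling variable. The sketch below assumes only the five African cases displayed in the introduction, so its output differs from the reported summary, which is based on the full sample. The helper name `describe` is hypothetical.

```python
from statistics import mean

# Women's average years in school for the five African cases shown in the
# introduction (illustrative subset only).
years_in_school = [7.74, 5.31, 2.73, 8.71, 1.86]


def describe(values):
    """Return the mean, minimum and maximum of a list of numbers."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}


summary = describe(years_in_school)
print(summary)
```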
Section 4: Discussion and Conclusion
A number of conclusions can be drawn from this study. There is evidence of a
pay disparity between males and females: males earn more than females. Further
research could be conducted to find out whether there is pay parity between males and
females for the same job. Research could also be done to establish the root cause of
this pay disparity, and to establish whether there is a correlation
between male and female salaries.
It has also been observed that there is a difference between the amount of salary and the
amount of gift. The amount of salary is far higher than the amount of gift.
Research could be conducted to establish whether there is a significant difference between the
amount of salary one earns and the amount of gift.
Women in Africa spend a short period of time in education. This is an indication that the
majority of African women do not pursue higher education. This could suggest a low level of
economic development among African women (OECD, 2004). Research can be done in
this area to establish the causes of low education levels among women in Africa.
References
David, J. S., & David, S. (2000). Handbook of Parametric and Nonparametric Statistical Procedures.
Knight, K. (2000). Mathematical Statistics. Texts in Statistical Science Series. Chapman and Hall.
Krishnamoorthy, K. (2005). Handbook of Statistical Distributions with Applications.
Mark, H. (2009). Economic Development, Education and Transnational Corporations (Routledge Studies in Development Economics).
OECD. (2004). Economic, Environmental and Social Aspects (OECD Sustainable Development Studies). OECD Publishing.
Public, U. N. (2010). Auditing for Social Change: A Strategy for Citizen Engagement in Public Sector Accountability (Economic & Social Affairs).
Tim, S. (2005). Mastering Statistical Process Control: A Handbook for Performance Improvement Using Cases.