FIN60003: Business Modelling and Analytics Comprehensive Report

Added on  2023/04/03

AI Summary
This report provides an analysis of the Social Progress Index data, focusing on key indicators like child mortality rate, access to electricity, adult literacy rate, life expectancy, political rights, and discrimination against minorities. The analysis employs descriptive statistics, confidence intervals, and hypothesis tests to compare different continents and assess the factors influencing social progress. The report includes a stratified sample of 77 observations from 182 countries, with a focus on Africa and Asia due to their significant representation in the dataset. Key findings highlight disparities in various indicators across continents, with Europe and America generally showing higher levels of social progress compared to Africa and Asia. The report concludes with confidence intervals and hypothesis tests to validate statements about access to basic knowledge and political rights, providing insights for businesses and policymakers.
Modelling and Analytics. 1
Business Modelling and Analytics.
Name
ID number
Date and Time
Professor.
Executive Summary
In the business world, many factors contribute to the achievement of set goals. As a business
manager, bear in mind that a business without specific goals is a failing business. This calls for
proven measures, ranging from technology-driven analytics to forward-looking forecasts, that have
to be carried out before a decision is reached. The world has now agreed on such goals and, as
Michael Green says, an objective without the statistics behind it can never be attained. It is a
fallacy to ignore the variables that must be accounted for in any field that requires optimal
decision-making. To be precise, we need to employ current business analytics in order to achieve
the SDGs (YouTube, 2019) that we insist on day after day. Those who set the world's objectives
should sit down with the analysts so that the SDGs become a matter of walking the talk and not
mere talk. This calls us to look at the Social Progress Index data with close scrutiny. The
dataset has 182 countries, and the largest shares of countries come from Africa and Asia, at
about 26% and 27% respectively. Furthermore, there are about 12 categories to look at, with 45
sub-category items.
As the Social Progress Index data shows, many businesses can use the data to reflect on current
world trends, so that managers know where to act. These include trends in the economic, social
and political set-up of the whole world. Diversification is something all businesses need to
employ in order to increase their profit-making capabilities. Thus, the continent in which you
set up your business matters a lot. You need to observe closely the consumption levels in each of
the 182 countries. Analysis is therefore an important tool to apply before objective setting.
Introduction.
Analytics is the key to modelling. This is the focus of this report, which works with the
stratified sample drawn from the Social Progress Index dataset. The main analysis will be on
the following sub-categories of the data:
- Child mortality rate
- Access to electricity
- Adult literacy rate
- Life expectancy at 60
- Political rights
- Discrimination and violence against minorities
These factors were deliberately chosen from the Social Progress Index data, a survey carried out
to report on progress towards the SDGs. As business practitioners and analysts, we look into this
data with a sample of 77 observations, which is what remained after removing the blanks from the
required sample of 100 observations. The only filter applied was the removal of blanks. The
purpose of the analysis was to find the major characteristics of the dataset after stratified
sampling. The strata were the continents, sampled in proportion to their number of countries.
Africa (26%) and Asia (27%) had the largest sample proportions due to their many countries, while
Oceania had the smallest. About 17% of the countries sampled were from the American continents,
whereas about 24% were from Europe. This can be verified from table 1 and fig 1 in the appendix.
We can say with some confidence that the sampling error incurred was minimal, as about 75% of the
sampled data was retained after the data cleaning process. The sampled dataset can be found in
the attached Excel file called SocialProgressIndex).xlsx, under the sheet named Stratified
Sample Dataset. The analytical process goes on as follows.
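The proportional allocation described above can be sketched in Python as follows. This is only an illustration, not the procedure used in the Excel file: the per-continent country counts are back-calculated from the shares in table 1 (for example, 26.4% of 182 is 48), and the function names are hypothetical.

```python
import random

# Country counts per continent, back-calculated from the shares in table 1.
# These counts are an assumption of this sketch, not figures quoted in the report.
POPULATION = {"AFRICA": 48, "AMERICA": 31, "ASIA": 50, "EUROPE": 43, "OCEANIA": 10}


def proportional_allocation(strata_sizes, sample_size):
    """Give each stratum a share of the sample equal to its share of the population."""
    total = sum(strata_sizes.values())
    return {name: round(size * sample_size / total) for name, size in strata_sizes.items()}


def stratified_sample(strata_sizes, sample_size, seed=0):
    """Draw row indices without replacement inside each stratum."""
    rng = random.Random(seed)
    allocation = proportional_allocation(strata_sizes, sample_size)
    return {
        name: sorted(rng.sample(range(strata_sizes[name]), k))
        for name, k in allocation.items()
    }


allocation = proportional_allocation(POPULATION, 100)
sample = stratified_sample(POPULATION, 100)
print(allocation)
```

Because each stratum is rounded separately, the allocated total can differ from the target by one or two observations; here it comes to 99 rather than 100.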
Descriptive Statistics.
From the sampling procedure, the first six and last six rows of the dataset were extracted to
give a general view of the data. The data contained 8 variables, plus an additional variable
called random, a random number generated for sampling (see table 2 & 3 in appendix). As
previously stated, out of a population of 182 observations, the sample was only 77 observations.
This represents about 42.3% of the overall data. However, the data was found to contain many
blanks, which introduce disparity and reduce data consistency (YouTube, 2019). Of the 9
variables, continent and country name are nominal, and random is only a sampling key. The
remaining six variables are continuous and can be ranked, which makes them easy to analyze. In
other words, six variables were continuous and two were categorical, leaving out the random
column. The random and country name variables were not used anywhere in the analysis and were
therefore dropped.
Looking at each of the seven remaining variables, the grouping factor that gave the data
meaning was the continent, of which there are five: Africa, Oceania, Asia, America and Europe.
The first sub-category analyzed was the average child mortality rate in each continent. The
breakdown shows Africa at 38%, Oceania at 37%, Asia at 12%, America at 10% and Europe at only
3%. This implies that child mortality is far higher in Africa and Oceania than in Europe. The
mean child mortality rate was 38.53, with a standard deviation of 35.90. The median was 25.50
and the minimum was 2.70. The range between the lowest and highest rates was about 154.20. This
is a large spread, which is also reflected in the sample variance of 1289.12.
Moving on to average access to electricity, Europe has the highest average at 100, while Africa
and Oceania have the lowest at 38.50 and 33.88 respectively. The overall sample mean is 72.96
and the sample standard deviation is 32.92. The distribution is skewed, with a reported skewness
figure of 0.7942. The median is 96. The difference between the minimum and maximum access to
electricity is 93.60. Figure 2 in the appendix shows a histogram of the data.
In addition to access to electricity, we can look at the average adult literacy rate. Table 2
in the appendix shows that Europe has the highest average adult literacy rate at 98.74, which is
a 25% share of the summed continental averages, while Africa and Oceania still lag behind. The
mean of this variable is 82.20 and the median is 82.22. The standard deviation is about 19.4,
and the difference between the maximum and minimum rates is 69.53. We can report that the data
is fairly uniform in nature, with a 34.3% account. However, the distribution is skewed to the
left, with a skewness of -1.15, so most values sit at the high end.
Turning to the health and wellness category, we look at life expectancy at 60. Figure 3 and
table 3 in the appendix depict this well. The mean life expectancy at 60 is 18.85 and the median
is 18.72. The continent with the highest average is America, followed by Europe, while Oceania
and Africa record the lowest averages. The sample variance of life expectancy at 60 is 7.103 and
the standard deviation is 2.665. Taking the difference between the maximum of 25.08 and the
minimum of 14.41, we get a range of 10.67. This difference is fairly large.
Looking at the average of political rights, the highest index is in Europe, followed by America.
The lowest is recorded by Asia at 12%, followed by Africa at 15%. This may be due to the poor
basic level of knowledge depicted by the variable in category 5. The mean of the political
rights variable is 20.27, the median is 22 and the mode is 4. The sample variance is 152.85 and
the skewness is -0.1206, a slight skew to the left. The minimum indicates that at least one
country scores zero on political rights. This can be seen in table 4 and figure 4 in the
appendix.
Finally, the average score for discrimination and violence against minorities is 6.55. Oceania
records the highest rate of discrimination, followed by Asia and then Africa, while the lowest
is in America, with Europe next lowest. The standard error is 0.1972 and the median is 6.7. The
mode is 6 and the sample variance is 2.99, or roughly 3. At 95% confidence, the margin of error
around the mean is +/-0.3927. Figure 5 and table 5 best describe this case.
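The measures quoted in this section (mean, median, variance, standard deviation, standard error, range) can be reproduced with Python's standard library. The sketch below is illustrative only: the full 77-row sample is not listed in this report, so it uses the 24 Asian political rights scores from Data 2 in the appendix, and `describe` is a hypothetical helper name.

```python
import math
import statistics

# Political rights scores for the 24 sampled Asian countries (Data 2 in the appendix).
asia_political_rights = [10, 16, 4, 2, 20, 28, 7, 11, 1, 9, 27, 35,
                         31, 7, 12, 12, 5, 0, 32, 13, 13, 1, 14, 18]


def describe(values):
    """Return the descriptive measures discussed in this section."""
    n = len(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return {
        "count": n,
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sample_variance": statistics.variance(values),
        "standard_deviation": sd,
        "standard_error": sd / math.sqrt(n),
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
    }


summary = describe(asia_political_rights)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The mean and sample variance printed here agree with the Asian column of the F-test output in Data 2 (13.67 and 108.23).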
Confidence Intervals
After the descriptive analysis, confidence intervals were computed for two variables, one from
category 3 and the other from category 11. We are 95% confident that the average access to
electricity, a sub-category in category 3, is between 65.5 and 80.4. The margin of error is
about +/-7.4 around the sample mean. This interval describes where the population mean is likely
to lie, so an individual observation that falls outside it is not necessarily an outlier. This
is shown in table 6 in the appendix.
For category 11, focusing on the variable Discrimination and violence against minorities, we
found that the margin of error at the 95% confidence level is about 0.3927. Thus, we are 95%
confident that the average discrimination and violence against minorities lies between 6.1552
and 6.9408 (refer to table 7 in appendix). Confidence intervals are of great importance in
analysis, as one can never be exactly sure of a specific average value but can always
approximate it based on the level of confidence chosen. This avoids overstating precision when
reporting data.
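The interval for category 11 can be rebuilt from the summary figures in table 5 as mean +/- t x (standard deviation / sqrt(n)). The following is a minimal sketch under one assumption: the critical t value for 76 degrees of freedom (about 1.9917) is taken from standard t tables and is not quoted in this report. The function name is hypothetical.

```python
import math


def mean_confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper, margin) for a t-based confidence interval on a mean."""
    standard_error = sd / math.sqrt(n)
    margin = t_crit * standard_error
    return mean - margin, mean + margin, margin


# Assumed two-sided 95% critical value of Student's t with 76 degrees of freedom.
T_CRIT_95_DF76 = 1.9917

# Mean, standard deviation and count from table 5 in the appendix.
lower, upper, margin = mean_confidence_interval(6.548052, 1.730425, 77, T_CRIT_95_DF76)
print(f"margin = {margin:.4f}, interval = ({lower:.4f}, {upper:.4f})")
```

Under that assumption the margin agrees with the "Confidence Level(95.0%)" row of table 5 to about three decimal places.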
Hypothesis Tests
Some statements were set up to be checked for their truth during the analysis. This called for
hypothesis tests, and the test used for these statements was the F-test. The sample variance was
the key quantity under test, and it was judged that the t-test and z-test did not fit the
analysis as well.
The first statement under consideration was whether the average access to basic knowledge is
higher in American countries than in African countries. The sample size for Africa was 25 and
that for America was 13. The degrees of freedom for America (12) were half those for Africa
(24), and the F-statistic was 7.864. The one-tail p-value (0.0003) was below 0.05, and the
F-statistic exceeded the one-tail critical F-value of 2.505. The decision therefore was that, at
the 95% confidence level, American countries have a higher adult literacy rate than African
countries. This is shown in Data 1 in the appendix.
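The F-statistic in Data 1 is the ratio of the two sample variances. A short sketch that recomputes it from the values listed in the appendix is given below; the helper name is hypothetical, and the critical value is simply copied from the Data 1 output rather than derived.

```python
import statistics

# Adult literacy rates from Data 1 in the appendix.
africa = [79.61, 71.16, 38.45, 88.22, 37.75, 85.50, 88.47, 74.99, 36.75,
          40.02, 78.14, 77.22, 79.31, 43.27, 73.85, 49.03, 83.24, 55.57,
          76.58, 59.77, 78.02, 79.36, 47.60, 64.66, 65.96]
america = [98.09, 82.78, 95.14, 92.59, 96.63, 94.58, 97.65, 99.71, 92.47,
           94.52, 87.65, 79.07, 87.54]


def variance_ratio_f(sample_a, sample_b):
    """Return the F statistic (variance of a over variance of b) and both df."""
    f_stat = statistics.variance(sample_a) / statistics.variance(sample_b)
    return f_stat, len(sample_a) - 1, len(sample_b) - 1


F_CRITICAL_ONE_TAIL = 2.505481547  # copied from the Data 1 output

f_stat, df_africa, df_america = variance_ratio_f(africa, america)
reject_equal_variances = f_stat > F_CRITICAL_ONE_TAIL
print(f"F = {f_stat:.3f} on ({df_africa}, {df_america}) df, reject: {reject_equal_variances}")
```

Note that this statistic compares the spread of the two groups; the difference in averages is visible separately in the group means (about 66.10 for Africa and 92.19 for America).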
The second statement was whether there is a difference in personal rights between Asian and
European countries. The sample sizes were 24 and 13 respectively, with 23 and 12 degrees of
freedom. The F-statistic was 1.142, which corresponds to a one-tail p-value of 0.4187. The
F-statistic was far below the one-tail critical F, so on the F-test alone the variances of the
two groups do not differ significantly. The conclusion drawn was nevertheless that political
rights are higher in European countries than in Asian countries, which agrees with the averages
in table 4 (30.38 against 13.67). See Data 2 in the appendix.
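The statements in this section concern averages, whereas the F-test compares variances. As a hedged cross-check only, and not the method used in this report, a Welch two-sample t statistic can be computed from the Data 2 values. The function name is hypothetical, and no p-value is computed because the standard library has no t distribution.

```python
import math
import statistics

# Political rights scores from Data 2 in the appendix.
asia = [10, 16, 4, 2, 20, 28, 7, 11, 1, 9, 27, 35,
        31, 7, 12, 12, 5, 0, 32, 13, 13, 1, 14, 18]
europe = [28, 5, 21, 33, 37, 38, 38, 35, 29, 36, 36, 38, 21]


def welch_t(sample_a, sample_b):
    """Welch's t statistic for mean(a) - mean(b), with Welch-Satterthwaite df."""
    va = statistics.variance(sample_a) / len(sample_a)
    vb = statistics.variance(sample_b) / len(sample_b)
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1))
    return t_stat, df


t_stat, df = welch_t(europe, asia)
print(f"t = {t_stat:.3f} on about {df:.1f} df")
```

A t statistic of this size is well above the two-sided 5% critical value of roughly 2.06 for 26 degrees of freedom (a value assumed from standard t tables), which would support a difference in mean political rights between the two groups.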
After looking at the first two statements, we also had to check whether there was a difference
in health and wellness between European and American countries. This test gave a different
picture from the first two. With 12 degrees of freedom for both Europe and America, from samples
of 13 observations each, the difference noted was too small to matter. The mean life expectancy
of the two continents was about the same, and the sample variances differed only slightly. The
test reports an F-value of 0.2244, which is less than the critical F of 0.3722, hence we decided
to accept that there is a slight variation in health and wellness between Europe and America.
Correlation and Regression
Finally, the analysis turned to correlation and regression. Figures 6 and 7 show the scatter
plots. There was a negative relationship between child mortality and life expectancy. Some key
outliers were annotated on the graph, and the relationship shown was fairly linear. The
regression equation can be written as:
Life expectancy = 20.898 - 0.0533 × Child mortality.
In this equation, the intercept of about 20.898 is the predicted life expectancy at 60 when the
child mortality rate is zero. The slope means that each one-unit increase in the child mortality
rate is associated with a decrease of 0.0533 in life expectancy at 60. The correlation
coefficient is about -0.717, a fairly strong negative relationship between the child mortality
rate and life expectancy at 60. The coefficient of determination shows that about 51.4% of the
variation in life expectancy can be accounted for by the child mortality rate. The reverse is
also true. This is quite a good model and can be used for prediction. However, we need to run
the regression to see the adjusted coefficient of determination. In the summary output of
regression found in table 9 in the appendix, the adjusted R squared is 50.8%, a difference of
just 0.6 percentage points.
Looking at the overall significance of the model, we can note from Anova table 1 in the appendix
a Significance F of 2.13E-13, which is far below the significance level of 0.05 (95%
confidence). The total degrees of freedom for this model are 76; the regression has 1 degree of
freedom and the residuals have 75. Thus, this is a significant model. In the coefficients table,
the p-values of the two parameters, 1.67E-68 and 2.13E-13, are both far below 0.05. This model
is therefore a good fit.
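The adjusted R squared and the F statistic for this model can be re-derived from the figures in Summary output 1 and Anova table 1, and the fitted equation can be used for prediction. The sketch below assumes the rounded coefficients quoted in the text; the function names are hypothetical.

```python
def adjusted_r_squared(r_squared, n, predictors=1):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - predictors - 1)."""
    return 1 - (1 - r_squared) * (n - 1) / (n - predictors - 1)


def f_statistic(ss_regression, df_regression, ss_residual, df_residual):
    """ANOVA F = mean square of regression / mean square of residuals."""
    return (ss_regression / df_regression) / (ss_residual / df_residual)


def predict_life_expectancy(child_mortality_rate):
    """Predicted life expectancy at 60 from the rounded fitted line in the text."""
    return 20.898 - 0.0533 * child_mortality_rate


adj_r2 = adjusted_r_squared(0.514684, n=77)
f_value = f_statistic(277.8714, 1, 262.0155, 75)
print(f"adjusted R^2 = {adj_r2:.4f}, F = {f_value:.2f}")
print(f"prediction at a child mortality rate of 50: {predict_life_expectancy(50):.2f}")
```

Both derived values agree with the Excel output in the appendix (adjusted R squared of about 0.5082 and F of about 79.54).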
In our Excel file, the second analysis appears as regression model 2, and in figure 7 of the
appendix. Its aim was to find the relationship between access to basic knowledge and personal
rights. The scatter plot revealed a lot of inconsistency in the data, but a linear relationship
was fitted by the regression line:
Political rights = 10.327 + 0.121 × Adult literacy rate.
This positive linear relationship is weak. The data points lie far from the regression line,
hence a poor model.
The correlation coefficient is only 0.1899, a weak linear relationship between political rights
and the adult literacy rate. The coefficient of determination confirms a poor model, with only
3.61% of the variation in political rights accounted for by the adult literacy rate. From the
output summary, the adjusted R-squared is even lower at 2.3%. In addition, Anova table 2 shows
that on 76 total df, 1 regression df and 75 residual df, the model is not significant: the
Significance F of 0.098086 is greater than the significance level of 0.05. The p-values in the
coefficients table also exceed 0.05, which implies that the independent variable should be
dropped.
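As a quick consistency check on the two regression outputs: in a simple regression with one predictor, the square of the slope's t statistic equals the ANOVA F statistic, which is why the slope's p-value matches Significance F in both models. Using the values reported in the appendix tables:

```latex
\[
t_{\text{slope}}^{2} = F:
\qquad (-8.91844)^{2} \approx 79.539 \quad \text{(model 1)},
\qquad (1.675065)^{2} \approx 2.806 \quad \text{(model 2)}
\]
```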
As modelers, we find that regression output says a lot, and so the whole output is in the appendix.
Conclusion and Limitations
In conclusion, this sample dataset reveals a good deal about the characteristics of the whole
population. Stratified sampling made the data easier to work on than the whole dataset, and the
data procedures involved were easier to carry out (Little and Rubin, 2019). The major problem
was the cleaning stage, as it takes a lot of time to clean and filter data according to the
required characteristics. Also, some output of the report could differ from real-world results.
This could lead to serious prediction errors that would result in poor decision making. However,
it is always good to analyze data after collecting it in order to obtain meaning from it.
References
YouTube. (2019). The global goals we've made progress on -- and the ones we haven't | Michael Green. [online] Available at: https://www.youtube.com/watch?v=N3SQlrmV1cE [Accessed 30 May 2019].
Little, R.J. and Rubin, D.B., 2019. Statistical analysis with missing data (Vol. 793). Wiley.
Kiron, D. and Shockley, R., 2011. Creating business value with analytics. MIT Sloan Management Review, 53(1), p.57.
Asllani, A., 2014. Business Analytics with Management Science Models and Methods. FT Press.
Appendix.
1) Table 1
Row Labels     Count of Countries Names
AFRICA          26.4%
AMERICA         17.0%
ASIA            27.5%
EUROPE          23.6%
OCEANIA          5.5%
Grand Total    100.0%
2) Fig 1
[Pie chart: share of countries by continent. AFRICA 26%, AMERICA 17%, ASIA 27%, EUROPE 24%, OCEANIA 5%.]
3) Fig 2
[Histogram of Access to Electricity. x-axis: Bins, 10 to 100 in steps of 10, plus More; left y-axis: Frequency, 0 to 45; right y-axis: Cumulative %, 0% to 120%.]
4) Table 2
Row Labels     Average of Adult literacy rate
EUROPE         98.73503308
AMERICA        92.18436462
ASIA           87.51337583
AFRICA         66.0994024
OCEANIA        47.269465
Grand Total    82.19866468

Min      Max
30.47    100.00
5) Table 3
Continent      Average of Life expectancy at 60
AMERICA        21.42967462
EUROPE         21.10583846
ASIA           18.7763625
OCEANIA        16.913965
AFRICA         16.5480648
Grand Total    18.84576519

Min      Max
14.41    25.08
6) Fig 3
[Pie chart: share of average life expectancy at 60 by continent. AMERICA 23%, EUROPE 22%, ASIA 20%, OCEANIA 18%, AFRICA 17%.]
7) Table 4
Continent      Average of Political rights
EUROPE         30.38461538
AMERICA        28.76923077
OCEANIA        23
AFRICA         16.72
ASIA           13.66666667
Grand Total    20.27272727
8) Figure 4
[Pie chart: share of average political rights by continent. EUROPE 27%, AMERICA 26%, OCEANIA 20%, AFRICA 15%, ASIA 12%.]
9) Table 5
Discrimination and violence against minorities
Mean                       6.548052
Standard Error             0.1972
Median                     6.7
Mode                       6
Standard Deviation         1.730425
Sample Variance            2.994371
Kurtosis                   -1.05961
Skewness                   -0.07683
Range                      6.7
Minimum                    3.1
Maximum                    9.8
Sum                        504.2
Count                      77
Confidence Level(95.0%)    0.392758
10) Figure 5
[Bar chart: average discrimination and violence against minorities by continent, in the order OCEANIA, ASIA, AFRICA, EUROPE, AMERICA; y-axis from 0 to 9.]
11) Table 6
Confidence Interval for Category 3
12) Table 7
Confidence Intervals For Category 11
13) Data 1
Continent   Adult literacy rate(Africa)    Continent   Adult literacy rate(America)
AFRICA      79.61                          AMERICA     98.09
AFRICA      71.16                          AMERICA     82.78
AFRICA      38.45                          AMERICA     95.14
AFRICA      88.22                          AMERICA     92.59
AFRICA      37.75                          AMERICA     96.63
AFRICA      85.50                          AMERICA     94.58
AFRICA      88.47                          AMERICA     97.65
AFRICA      74.99                          AMERICA     99.71
AFRICA      36.75                          AMERICA     92.47
AFRICA      40.02                          AMERICA     94.52
AFRICA      78.14                          AMERICA     87.65
AFRICA      77.22                          AMERICA     79.07
AFRICA      79.31                          AMERICA     87.54
AFRICA      43.27
AFRICA      73.85
AFRICA      49.03
AFRICA      83.24
AFRICA      55.57
AFRICA      76.58
AFRICA      59.77
AFRICA      78.02
AFRICA      79.36
AFRICA      47.60
AFRICA      64.66
AFRICA      65.96

F-Test Two-Sample for Variances
                       Adult literacy rate(Africa)
Mean                   66.0994024
Variance               303.9853315
Observations           25
df                     24
F                      7.86469618
P(F<=f) one-tail       0.00031764
F Critical one-tail    2.505481547
14) Data 2
Continent   Political rights (Asian Countries)    Continent   Political rights (European Countries)
ASIA        10.00                                 EUROPE      28.00
ASIA        16.00                                 EUROPE      5.00
ASIA        4.00                                  EUROPE      21.00
ASIA        2.00                                  EUROPE      33.00
ASIA        20.00                                 EUROPE      37.00
ASIA        28.00                                 EUROPE      38.00
ASIA        7.00                                  EUROPE      38.00
ASIA        11.00                                 EUROPE      35.00
ASIA        1.00                                  EUROPE      29.00
ASIA        9.00                                  EUROPE      36.00
ASIA        27.00                                 EUROPE      36.00
ASIA        35.00                                 EUROPE      38.00
ASIA        31.00                                 EUROPE      21.00
ASIA        7.00
ASIA        12.00
ASIA        12.00
ASIA        5.00
ASIA        0.00
ASIA        32.00
ASIA        13.00
ASIA        13.00
ASIA        1.00
ASIA        14.00
ASIA        18.00

F-Test Two-Sample for Variances
                       Political rights (Asian Countries)
Mean                   13.66666667
Variance               108.2318841
Observations           24
df                     23
F                      1.142211738
P(F<=f) one-tail
F Critical one-tail
15) Figure 6
[Scatter plot: Relationship between Child Mortality and Life Expectancy at 60. x-axis: Child Mortality Rate, 0 to 180; y-axis: Life Expectancy, 0 to 30. Fitted line: f(x) = -0.0532559840849086 x + 20.8976421816195, R² = 0.514684427498106. Annotated points: 15.75 and 25.08.]
16) Figure 7
[Scatter plot: Relationship Between Access to basic Knowledge and Personal Rights. x-axis: Adult Literacy Rate, 20 to 110; y-axis: Political Rights, 0 to 40. Fitted line: f(x) = 0.120998039098399 x + 10.3268500305062, R² = 0.036062110780492.]
17) Summary output 1
SUMMARY OUTPUT
Regression Statistics
Multiple R           0.717415
R Square             0.514684
Adjusted R Square    0.508214
Standard Error       1.869102
Observations         77
18) Anova table 1
ANOVA
              df    SS          MS          F           Significance F
Regression    1     277.8714    277.8714    79.53862    2.13E-13
Residual      75    262.0155    3.49354
Total         76    539.8869
19) Coefficient table 1
                           Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%
Intercept                  20.89764        0.313534          66.65197    1.67E-68    20.27305     21.52223
Child mortality rate(1)    -0.05326        0.005971          -8.91844    2.13E-13    -0.06515     -0.04136
20) Summary output 2
SUMMARY OUTPUT
Regression Statistics
Multiple R           0.1899
R Square             0.036062
Adjusted R Square    0.02321
Standard Error       12.21929
Observations         77
21) Anova table 2
ANOVA
              df    SS          MS          F           Significance F
Regression    1     418.9434    418.9434    2.805843    0.098086
Residual      75    11198.33    149.3111
Total         76    11617.27
22) Coefficient table 2
                          Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%
Intercept                 10.32685        6.098711          1.693284    0.094551    -1.82241     22.47611
Adult literacy rate(5)    0.120998        0.072235          1.675065    0.098086    -0.0229      0.264897