World Health and Population Analysis


Added on 2023/06/12

AI Summary

This data analysis presents health and population statistics for East Asian and Pacific countries. The report focuses on health issues and factors that might affect the total fertility of a woman from 2001 to 2015. The report includes one variable, two variable, cluster and linear regression statistical techniques to present and extract information from the data.

WORLD HEALTH AND POPULATION ANALYSIS
Name
Course Number
Date
Faculty Name

World Health and Population Analysis
1. Introduction
In this data analysis, health and population statistics will be presented for East Asian and
Pacific countries. It is an open report without fixed objectives, but it will focus on
health issues and factors that might affect the total fertility of a woman from 2001 to 2015. The
development of a country is determined by various factors, which should be improved as a whole.
As a result, countries are making policies that focus on general development as
opposed to dedicated approaches. Further, research has taken an important role in
supporting informed decision-making by governments and non-governmental organisations. The
target group for this report is government agencies, researchers and business legislatures.
The main limitation of this report is that it only includes information from the East Asia and
Pacific region, hence it may not be generalizable to other areas of the world.
Also, it might not be feasible to cross-check the validity of the data because we depend only on
data from the World Bank. The data will be downloaded from the World Bank development
indicators database, then extracted and filtered using MS Excel to fit the criteria of this
report. Using the advanced filter of MS Excel, the filter names selected for this report will be
used to select the rows that contain the data for analysis. Also, only data from 2001 to 2015 will
be extracted, as stated in the assignment requirements.
One-variable, two-variable, cluster and linear regression statistical techniques will be used to
present and extract information from the data. For the one- and two-variable analyses, appropriate graphs
will be used to present the data, with explanations of the distributions and
relationships. Further, cluster analysis and k-means techniques will be briefly explained. The
cluster analysis will be performed to understand and explain some groupings. Lastly, simple linear
regression will be used to explain some relationships between variables, with linear plots created
to depict their correlation.
2. Data Setup
Data pre-processing is required to create a tidy dataset that can be analysed in R.
First, it is important to understand the variables to be analysed, which will guide
the analysis stage. This allows a focused analysis of the East Asia and Pacific region data as
a whole, while individual countries are used to illustrate cluster analysis by adding more
categories. In this analysis, we will focus on the total fertility rate, expressed as
births per woman, the adolescent fertility rate (births per 1000 women aged 15 to 19 years) and
the female unemployment rate, expressed as a percentage of the female labour force (an
International Labour Organisation estimate).
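library(cluster)                  # cluster analysis functions
setwd("E:/Documents/740362")      # folder that holds mydata.csv
getwd()
## [1] "E:/Documents/740362"
mydata <- read.csv("mydata.csv")
dim(mydata)                       # number of rows and columns
## [1] 166 5
names(mydata)
# Assumed step, not shown in the original: the later code refers to columns
# such as Country and Adol_Fert_rate by bare name, which requires attach().
attach(mydata)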
The working directory was set to the location of the dataset to allow easy import and a
traceable working environment. The dataset has 166 rows and 5 variables: the country, the year and the
three variables to be analysed, as mentioned above. The cluster package has been loaded to allow for
cluster and k-means analysis. Overall, the report presents statistics and graphs for health and
population data for the East Asia and Pacific region (Everitt et al., 2011).
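The analysis code in this report refers to columns such as Country and Adol_Fert_rate by bare name, which only works when the data frame has been attached or the columns otherwise exist in the workspace. As a minimal sketch of an alternative, assuming the column names used later in the report, the regional rows can be selected once with subset(). The data frame toy, the column name Year and all values below are invented for illustration and are not the World Bank data.

```r
# Toy stand-in for mydata.csv; every value here is made up for illustration.
toy <- data.frame(
  Country = rep(c("East Asia & Pacific", "Japan"), each = 3),
  Year = rep(2001:2003, times = 2),                  # column name assumed
  Adol_Fert_rate = c(21.0, 20.8, 20.5, 5.6, 5.5, 5.4),
  Total_Fert_Rate = c(1.80, 1.80, 1.79, 1.33, 1.32, 1.29),
  Female_Unemployment = c(3.7, 3.8, 3.9, 4.7, 5.1, 4.9)
)

# Keep only the rows for the regional aggregate.
eap <- subset(toy, Country == "East Asia & Pacific")

dim(eap)                       # rows and columns of the regional subset
summary(eap$Adol_Fert_rate)    # same summary() call, without attach()
```

Working from a named subset such as eap keeps each later call short and avoids repeating the Country filter inside every expression.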
3. Exploratory Data Analysis
3.1 One Variable Analysis
3.1.1 Adolescent Fertility Rate
summary(Adol_Fert_rate[Country == "East Asia & Pacific"])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 17.93 18.39 19.20 19.40 20.53 21.05
hist(Adol_Fert_rate[Country == "East Asia & Pacific"],
main = "Adolescent Fertility Rate Births Per 1000 Women (15 - 19)",
xlab = "East Asia & Pacific Adolescent Fertility Rate")

Figure 1: Adolescent fertility rate of East Asia & Pacific Region
The histogram above shows the distribution of the adolescent fertility rate (births per
1000 women aged 15 to 19 years) from 2001 to 2015. The most frequent bin is
around 20.5 to 21, which contains about three of the yearly values. In most of the years the
adolescent fertility rate was between 17.5 and 20 births per 1000 women aged 15 to 19 years (Scott, 2010).
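Reading bin counts off a histogram by eye is imprecise, so it can help to print them. The sketch below shows one way to do that with hist(..., plot = FALSE), which returns the bins instead of drawing them. The vector rates and the bin width of 0.5 are assumptions chosen for illustration; the values are invented and are not the World Bank data.

```r
# Invented yearly values for illustration only.
rates <- c(17.9, 18.1, 18.3, 18.4, 18.6, 18.9, 19.1, 19.2,
           19.5, 19.8, 20.3, 20.6, 20.8, 20.9, 21.0)

# plot = FALSE returns the bin edges and counts without drawing anything.
h <- hist(rates, breaks = seq(17.5, 21.5, by = 0.5), plot = FALSE)

# One row per bin: lower edge, upper edge and number of years in the bin.
bins <- data.frame(from = head(h$breaks, -1),
                   to = tail(h$breaks, -1),
                   years = h$counts)
print(bins)

# The bin that holds the most years.
bins[which.max(bins$years), ]
```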
3.1.2 Total Fertility Rate
summary(Total_Fert_Rate[Country == "East Asia & Pacific"])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.780 1.787 1.796 1.794 1.800 1.806
boxplot(Total_Fert_Rate[Country == "East Asia & Pacific"],
main = "Total Fertility Rate - Number of births per Woman")

Figure 2: Boxplot of Total Fertility rate
The above boxplot presents the distribution of the total fertility rate from 2001 to
2015 for the East Asia and Pacific region. The distribution does not appear to be normal: the
number of years with a total fertility rate above the mean is not equal to the number of years
below it. The median total fertility rate is approximately 1.796 births per woman.
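The comparison of years above and below the mean can be checked directly in R. The sketch below counts both groups, prints the five-number summary that a boxplot draws, and runs a Shapiro-Wilk test as one possible formal check of normality. The test is an addition for illustration rather than part of the original analysis, it has little power with only 15 observations, and the vector tfr holds invented values, not the World Bank data.

```r
# Invented yearly values for illustration only.
tfr <- c(1.780, 1.783, 1.786, 1.788, 1.790, 1.793, 1.795, 1.796,
         1.797, 1.798, 1.800, 1.800, 1.801, 1.803, 1.806)

m <- mean(tfr)
above <- sum(tfr > m)   # years above the mean
below <- sum(tfr < m)   # years below the mean
c(mean = m, median = median(tfr), above = above, below = below)

# Lower whisker, lower hinge, median, upper hinge, upper whisker.
boxplot.stats(tfr)$stats

# Shapiro-Wilk test: a small p-value would count as evidence against normality.
sw <- shapiro.test(tfr)
sw$p.value
```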
3.1.3 Female unemployment rate
summary(Female_Unemployment[Country == "East Asia & Pacific"])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 3.617 3.660 3.829 3.821 3.954 4.098
boxplot(Female_Unemployment[Country == "East Asia & Pacific"],
main = "Women Unemployment Rate")

Figure 3: Boxplot of female unemployment rate
According to the summary statistics and the boxplot above, the female unemployment
rate is approximately normally distributed for the years 2001 to 2015 (the mean, 3.821, is close
to the median, 3.829). The rate stayed between 3.62% and 4.10% throughout, which suggests that
not much was done to reduce unemployment over this period. A boxplot was chosen because the
variable is continuous (Cohen, Manion and Morrison, 2011).
3.2 Two Variable Analysis
3.2.1 Adolescent fertility rate and Total fertility rates
plot(Adol_Fert_rate[Country == "East Asia & Pacific"], Total_Fert_Rate[Country ==
"East Asia & Pacific"],
main = "Adolescent fertility rate by Total fertility rate",
xlab = "Adolescent fertility rate", ylab = "Total Fertility rate")

Figure 4: Adolescent fertility by total fertility rate
There is a positive correlation between the adolescent fertility rate and the total fertility rate.
A scatter plot was chosen because both variables are quantitative and the aim is to examine the
nature of their association (Roberts, 2013).
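A scatter plot shows the direction of an association, and a correlation coefficient puts a number on its strength. As a minimal sketch, cor() returns the Pearson coefficient and cor.test() tests whether the true correlation is zero. The vectors adol and tfr below are invented for illustration and are not the World Bank data.

```r
# Invented paired yearly values for illustration only.
adol <- c(21.0, 20.7, 20.5, 20.1, 19.8, 19.4, 19.2, 18.9, 18.6, 18.3)
tfr  <- c(1.805, 1.801, 1.803, 1.799, 1.797, 1.795, 1.796, 1.790, 1.788, 1.785)

r <- cor(adol, tfr)            # Pearson correlation coefficient, between -1 and 1
test <- cor.test(adol, tfr)    # tests the null hypothesis of zero correlation

r
test$p.value
```

A coefficient near 1 indicates a strong positive linear association, and the p-value indicates whether an association that strong would be surprising if the true correlation were zero.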
3.2.2 Female unemployment rate and total fertility rate
Figure 5: Female unemployment rate and total fertility rate

There is a negative association between the female unemployment rate and the total
fertility rate. Again, a scatter plot was chosen because it is well suited to showing the
relationship between two continuous variables.
4. Advanced Analysis
4.1 Clustering
Cluster analysis and k-means group observations using a combination of variables, so that
each observation is assigned to the category it most plausibly belongs to. The variables are
first standardised, and each observation is then compared with the central (mean) values of the
groups. These central values need not coincide with any observed value, but they can still
describe the groups effectively. In this report,
female unemployment rates and adolescent and total fertility rates have been used to group the
countries using cluster analysis.
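The report does not show the clustering code, so the following is only a sketch of the steps described above, using scale() to standardise the three variables and kmeans() from base R to form the groups. The data frame toy, the six country labels A to F, the choice of two clusters and all values are assumptions made for illustration.

```r
set.seed(1)  # k-means starts from random centres, so fix the seed

# Invented country-level averages for illustration only.
toy <- data.frame(
  Adol_Fert_rate      = c(5, 6, 4, 45, 50, 48),
  Total_Fert_Rate     = c(1.3, 1.4, 1.2, 2.9, 3.1, 3.0),
  Female_Unemployment = c(4.0, 4.5, 3.8, 2.0, 2.5, 2.2),
  row.names = c("A", "B", "C", "D", "E", "F")
)

# Standardise each column to mean 0 and standard deviation 1 so that no
# variable dominates the distance calculation because of its units.
z <- scale(toy)

# Two clusters assumed here; nstart = 10 tries several random starts.
km <- kmeans(z, centers = 2, nstart = 10)

km$cluster   # cluster label for each country
km$centers   # cluster centres, in standardised units
```

The rows of km$centers are the means of the standardised variables within each cluster, which is why they generally do not match any single country.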
Figure 6: Cluster plot for country groupings
The cluster plot above indicates that the two components created
from the three variables explain 93.44% of the point variability. This suggests that the
statistics differ between the 12 countries (Everitt et al., 2011).
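A figure such as "93.44% of the point variability" is the share of total variance captured by the first two principal components of the standardised variables. The sketch below shows how such a share can be computed with prcomp(). It assumes, without confirmation from the report, that the plot was based on principal components, and the data frame toy holds invented values.

```r
# Invented country-level values for illustration only.
toy <- data.frame(
  Adol_Fert_rate      = c(5, 6, 4, 45, 50, 48, 20, 25),
  Total_Fert_Rate     = c(1.3, 1.4, 1.2, 2.9, 3.1, 3.0, 1.8, 2.4),
  Female_Unemployment = c(4.0, 4.5, 3.8, 2.0, 2.5, 2.2, 5.0, 3.1)
)

# Principal components of the centred and standardised variables.
pca <- prcomp(toy, center = TRUE, scale. = TRUE)

# Proportion of total variance carried by each component.
explained <- pca$sdev^2 / sum(pca$sdev^2)

# Percentage explained by the first two components together.
share_two <- 100 * sum(explained[1:2])
round(share_two, 2)
```

With three input variables there are three components in total, so a high share for the first two means that little information is lost by plotting the countries in two dimensions.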

4.2 Linear Regression
4.2.1 Adolescent and total fertility
fit_1 <- lm(Adol_Fert_rate[Country == "East Asia & Pacific"] ~ Total_Fert_Rate[Country
== "East Asia & Pacific"])
summary(fit_1)
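# fit_1 models adolescent fertility (y) on total fertility (x), so the plot
# uses the same axes; otherwise abline(fit_1) would not match the points.
plot(Total_Fert_Rate[Country == "East Asia & Pacific"], Adol_Fert_rate[Country ==
"East Asia & Pacific"], xlab = "Total fertility rate",
ylab = "Adolescent fertility rate",
main = "Relationship between Adolescent Fertility and Total fertility rate")
abline(fit_1, col = 2)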
## Call:
## lm(formula = Adol_Fert_rate[Country == "East Asia & Pacific"] ~
## Total_Fert_Rate[Country == "East Asia & Pacific"])
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.5957 -0.6786 0.1023 0.7308 1.3998
##
## Coefficients:
## Estimate Std. Error
## (Intercept) -136.46 52.71
## Total_Fert_Rate[Country == "East Asia & Pacific"] 86.88 29.38
## t value Pr(>|t|)
## (Intercept) -2.589 0.0225 *
## Total_Fert_Rate[Country == "East Asia & Pacific"] 2.957 0.0111 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9379 on 13 degrees of freedom
## Multiple R-squared: 0.4021, Adjusted R-squared: 0.3562
## F-statistic: 8.744 on 1 and 13 DF, p-value: 0.01112
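There is a significant positive association between adolescent fertility rate and total fertility
rate (slope = 86.88, p = 0.011). The model explains about 40% of the variation in the adolescent
fertility rate (R-squared = 0.402) (Pandis, 2016).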
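The quantities in the regression output are related to one another, and the relationships can be verified from the printed numbers. The sketch below recomputes the slope's t value, the F statistic, the adjusted R-squared and the p-value from the rounded figures reported above, taking n = 15 yearly observations (13 residual degrees of freedom plus 2 estimated coefficients). Small differences from the printed output are expected because the inputs are rounded.

```r
# Values copied from the summary(fit_1) output above (already rounded).
estimate <- 86.88    # slope estimate
se       <- 29.38    # standard error of the slope
r2       <- 0.4021   # multiple R-squared
n        <- 15       # yearly observations, 2001 to 2015
p        <- 1        # number of predictors

t_value <- estimate / se                             # reported as 2.957
f_value <- t_value^2                                 # reported as 8.744
adj_r2  <- 1 - (1 - r2) * (n - 1) / (n - p - 1)      # reported as 0.3562
p_value <- 2 * pt(-abs(t_value), df = n - p - 1)     # reported as 0.0111

round(c(t = t_value, F = f_value, adj_r2 = adj_r2, p = p_value), 4)
```

In a simple linear regression with one predictor, the F statistic equals the square of the slope's t value, which is why the two tests report the same p-value.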
4.2.2 Female unemployment rate and total fertility rate
fit_2 <- lm(Female_Unemployment[Country == "East Asia & Pacific"] ~
Total_Fert_Rate[Country == "East Asia & Pacific"])
summary(fit_2)
##
## Call:
## lm(formula = Female_Unemployment[Country == "East Asia & Pacific"] ~
## Total_Fert_Rate[Country == "East Asia & Pacific"])
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.07105 -0.04504 -0.01954 0.02821 0.14946
##
## Coefficients:
## Estimate Std. Error
## (Intercept) 35.170 3.547
## Total_Fert_Rate[Country == "East Asia & Pacific"] -17.474 1.977
## t value Pr(>|t|)
## (Intercept) 9.916 1.98e-07 ***
## Total_Fert_Rate[Country == "East Asia & Pacific"] -8.838 7.39e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.06311 on 13 degrees of freedom
## Multiple R-squared: 0.8573, Adjusted R-squared: 0.8464
## F-statistic: 78.12 on 1 and 13 DF, p-value: 7.388e-07
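# fit_2 models female unemployment (y) on total fertility (x), so the plot
# uses the same axes as the model for abline(fit_2) to match the points.
plot(Total_Fert_Rate[Country == "East Asia & Pacific"],
Female_Unemployment[Country == "East Asia & Pacific"],
xlab = "Total fertility rate", ylab = "Female unemployment rate",
main = "The relationship between female unemployment rate and total fertility rate")
abline(fit_2)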
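There is a significant negative association between female unemployment rate and total fertility
rate (slope = -17.47, p < 0.001). The model explains about 86% of the variation in the female
unemployment rate (R-squared = 0.857) (Zou, Tuncali and Silverman, 2003).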
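To see what the fitted line implies, the reported coefficients can be plugged into the equation unemployment = intercept + slope × total fertility rate. The sketch below does this at the smallest and largest total fertility rates from the summary in section 3.1.2. It uses the rounded coefficients from the output above, so the results are approximate.

```r
# Coefficients as reported (rounded) in the summary(fit_2) output above.
intercept <- 35.170
slope     <- -17.474

# Smallest and largest total fertility rates from the summary in section 3.1.2.
tfr <- c(min = 1.780, max = 1.806)

# Fitted female unemployment rate at each end of the observed range.
fitted_unemp <- intercept + slope * tfr
round(fitted_unemp, 3)
```

The fitted values at the two ends of the range (about 4.07 and 3.61) lie close to the observed maximum and minimum of the female unemployment rate (4.098 and 3.617), which is consistent with the high R-squared of the model.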
5. Conclusion
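In conclusion, adolescent fertility rates and female unemployment rates are both significantly
associated with total fertility rates in the East Asia and Pacific region from 2001 to 2015. The
adolescent fertility rate is positively associated with the total fertility rate, so reducing
adolescent fertility may help to decrease the total fertility rate. The female unemployment rate
is negatively associated with the total fertility rate, that is, years with higher female
unemployment had lower total fertility, so these results alone do not show that increasing the
employment rates of women would reduce fertility rates. Both findings rest on 15 yearly
observations and describe associations rather than causal effects.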
6. Reflections
I worked well on processing the data into an analysable tidy dataset in R. I did
not encounter any problems.

References
Cohen, L., Manion, L. and Morrison, K. (2011) ‘Descriptive Statistics’, in Research methods in
education, pp. 622–640. doi: 10.1213/ANE.0000000000002471.
Everitt, B. S. et al. (2011) Cluster Analysis, Quality and Quantity. doi: 10.1007/BF00154794.
Pandis, N. (2016) ‘Linear regression’, American Journal of Orthodontics and Dentofacial
Orthopedics, 149(3), pp. 431–434. doi: 10.1016/j.ajodo.2015.11.019.
Roberts, D. (2013) Statistics 2 - Correlation Coefficient and Coefficient of Determination,
MathBits.com. Available at: http://mathbits.com/MathBits/TISection/Statistics2/correlation.htm.
Scott, D. W. (2010) ‘Histogram’, Wiley Interdisciplinary Reviews: Computational Statistics,
2(1), pp. 44–48. doi: 10.1002/wics.59.
Zou, K. H., Tuncali, K. and Silverman, S. G. (2003) ‘Correlation and Simple Linear Regression’,
Radiology, 227(3), pp. 617–628. doi: 10.1148/radiol.2273011499.