Analysis of East Asia and Pacific Health and Population Data
Added on 2023/06/11
TABLE OF CONTENTS
1. INTRODUCTION
2. DATA SETUP
3. EXPLORATORY DATA ANALYSIS
   ONE VARIABLE ANALYSIS
   TWO VARIABLE ANALYSIS
4. ADVANCED ANALYSIS
   HIERARCHICAL AND K MEANS CLUSTERING
   LINEAR REGRESSION
5. CONCLUSION
6. REFLECTION
REFERENCES
1. INTRODUCTION
This report presents the findings of an analysis of World Bank data on health and population in the East Asia and Pacific region. The analysis focuses on two subsets of the data: Immunization Data and Alcohol Consumption Data.
Immunization is an important aspect of health care in any country. It influences the survival rate of infants and children below five years of age. Polio, for instance, can develop into a condition that affects an individual's entire life. Investing in immunization drives is therefore an important step in ensuring both the survival of infants and children aged five and below and a healthy life afterwards. This report compares immunization across countries in East Asia and Pacific to obtain a view of how immunization is distributed among the populations of the region. It also examines the relationship between immunization and the amount of money allocated to health in each country. The aim is to establish whether higher allocations for health go with higher immunization and whether lower allocations go with lower immunization. The findings will provide health sector experts in the various countries with information on the impact of health investment on immunization, information that can form a reliable basis for health planning and funding.
The report also relates alcohol consumption in East Asia and Pacific to income in each country. This comparison intends to establish whether the level of income has any influence on the amount of alcohol consumed. The analysis first compares alcohol consumption and Gross National Income (GNI) across the individual countries in East Asia and Pacific. These comparisons provide an understanding of the nature of each of the two variables in the region. The outcome of this analysis will be relevant to researchers and professionals interested in social data analysis.
2. DATA SETUP
The R code below imported and loaded the Health and Population data, HealthAndPopulation.csv, into RStudio for analysis.
#Importing the data for East Asia and Pacific as EAP
EAP <- read.csv("C:/Users/user/Documents/IDS/HealthAndPopulation.csv", header = TRUE)
#Data Dimensions
dim(EAP)
#Data Structures
str(EAP)
summary(EAP)
The structure and dimensions of the Health and Population Data were obtained using the str() and dim() functions in R.
The structure and dimensions show that the Health and Population Data has 967 entries and 19 variables. The entries, however, include five extra rows at the end of the data file that give information about the file itself. The correct number of entries is therefore 962, with the number of variables correctly recorded as 19. All 19 variables in the dataset are read as factors, and the Country Name variable has 37 levels (excluding the level representing the column name). The 37 levels represent the 37 countries whose data is in the dataset.
The preprocessing of the Health and Population Data involved the creation of two subsets from the HealthAndPopulation.csv dataset: Immunization Data.csv and Alcohol Consumption Data.csv. The Alcohol Consumption Data subset contains the Country Name, Alcohol Consumption and GNI variables for the year 2015. The Immunization Data subset contains the Country Name, Immunization BCG and Health Expenditure Total variables for the year 2014.
The preprocessing also omitted entries with missing values from the subsets. The code below imported and preprocessed the subsets in R.
#Data Preparation
#Specifying The Datasets for Analysis
#Alcohol Consumption Data (TacD)
TacD <- read.csv("C:/Users/user/Documents/IDS/Alcohol Consumption Data 2015.csv", header = TRUE)
#Recode the ".." missing-value marker as NA and drop incomplete rows
TacD[TacD == ".."] <- NA
TacD <- na.omit(TacD)
#Convert through character: as.numeric() on a factor returns level codes, not values
TacD$Alcohol.Consumption <- as.numeric(as.character(TacD$Alcohol.Consumption))
TacD$GNI <- as.numeric(as.character(TacD$GNI))
#Drop the levels of countries removed by na.omit()
TacD$Country.Name <- factor(TacD$Country.Name)
#Immunization Data (ImD)
ImD <- read.csv("C:/Users/user/Documents/IDS/Immunization Data 2014.csv", header = TRUE)
ImD[ImD == ".."] <- NA
ImD <- na.omit(ImD)
ImD$Immunization <- as.numeric(as.character(ImD$Immunization))
ImD$Health.Expenditure <- as.numeric(as.character(ImD$Health.Expenditure))
ImD$Country.Name <- factor(ImD$Country.Name)
The datasets were stored in the TacD and ImD data frames. The Alcohol Consumption, GNI, Immunization and Health Expenditure variables were also converted from factors to numeric values. The conversion goes through as.character() because as.numeric() applied directly to a factor returns its internal level codes rather than the recorded values.
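The missing-value handling and the factor-to-numeric conversion in the data preparation step can be tried on a small example. The sketch below is only an illustration: the data frame toy and its values are invented, and stringsAsFactors = TRUE is set on the assumption that the subsets were read with character columns stored as factors.

```r
# Toy data using the same ".." missing-value marker as the subsets.
# The values are invented for illustration only.
toy <- data.frame(
  Country.Name = c("A", "B", "C", "D"),
  GNI = c("1200", "..", "35000", "800"),
  stringsAsFactors = TRUE  # assumption: columns arrive as factors
)

# Mark ".." as missing and drop incomplete rows.
toy[toy == ".."] <- NA
toy <- na.omit(toy)

# Direct conversion returns the factor's internal level codes.
codes <- as.numeric(toy$GNI)

# Converting through character recovers the recorded values.
values <- as.numeric(as.character(toy$GNI))

print(codes)
print(values)
```

Comparing the two printed vectors shows why the conversion route matters: only the second one carries the recorded figures into later plots and models.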
3. EXPLORATORY DATA ANALYSIS
A. ONE VARIABLE ANALYSIS
I. ALCOHOL CONSUMPTION ANALYSIS FOR 2015
This analysis investigated the rates of alcohol consumption across the East Asia and Pacific region. The R code below plotted the results of this analysis.
#Total Alcohol Consumption Analysis
#With a factor on the x-axis, plot() draws one box per country
plot(TacD$Country.Name, TacD$Alcohol.Consumption,
     ylab = "Alcohol Consumption",
     main = "Alcohol Consumption in East Asia & Pacific (2015)")
The analysis produced the boxplot in Figure 1 below.
Figure 1
The plot shows Vietnam leading in alcohol consumption in East Asia and Pacific, followed by Thailand, Mongolia and China. Indonesia has the lowest alcohol consumption in the region.
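The leading and trailing countries can also be read from the data directly rather than from the plot. The following is a minimal sketch on invented figures; the data frame toy and its values are hypothetical and are not taken from the World Bank data.

```r
# Hypothetical consumption figures; not taken from the World Bank data.
toy <- data.frame(
  Country.Name = c("A", "B", "C", "D"),
  Alcohol.Consumption = c(4.2, 8.9, 0.6, 6.1)
)

# Sort countries from highest to lowest consumption.
ranked <- toy[order(toy$Alcohol.Consumption, decreasing = TRUE), ]

# Highest and lowest consumers read directly from the data.
highest <- as.character(toy$Country.Name[which.max(toy$Alcohol.Consumption)])
lowest  <- as.character(toy$Country.Name[which.min(toy$Alcohol.Consumption)])

# A bar chart draws one bar per country, which suits one value per country.
barplot(ranked$Alcohol.Consumption,
        names.arg = as.character(ranked$Country.Name),
        ylab = "Alcohol Consumption", las = 2)

print(ranked)
```

The same pattern applies to the GNI and immunization variables by swapping the column name.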
II. GROSS NATIONAL INCOME (GNI) ANALYSIS
This analysis compared the Gross National Income (GNI) levels across the East Asia and Pacific region. The GNI variable gives information on the average annual income per person in the country of interest. The R code below plotted the results of this analysis.
#GNI Analysis
plot(TacD$Country.Name, TacD$GNI,
     ylab = "GNI",
     main = "GNI in East Asia & Pacific (2015)")
The analysis produced the boxplot in Figure 2 below.
Figure 2
The plot shows China leading in GNI levels in East Asia and Pacific, followed by Tuvalu, Australia and Thailand. The country with the lowest GNI in the region is Cambodia.
III. IMMUNIZATION BCG DATA ANALYSIS
This analysis compared immunization BCG across the countries in the East Asia and Pacific region. The R code below plotted the results of this analysis.
#Immunization Analysis
plot(ImD$Country.Name, ImD$Immunization,
     ylab = "Immunization BCG",
     main = "Immunization BCG in East Asia & Pacific (2014)")
The analysis produced the boxplot in Figure 3 below.
Figure 3
The plot shows that up to eleven countries share the lead in immunization BCG in East Asia and Pacific. These countries include China, Thailand, Fiji and Tuvalu. Kiribati has the lowest immunization BCG in the region.
B. TWO VARIABLE ANALYSIS
I. ALCOHOL CONSUMPTION - GROSS NATIONAL INCOME (GNI) ANALYSIS
This analysis examined the relationship between alcohol consumption and Gross National Income in East Asia and Pacific. The R code below plotted the outcome of the analysis.
#Alcohol Consumption Data Analysis
plot(TacD$GNI, TacD$Alcohol.Consumption,
     xlab = "GNI", ylab = "Alcohol Consumption",
     main = "Alcohol Consumption - GNI Analysis in East Asia & Pacific")
The analysis produced the scatterplot in Figure 4 below.
Figure 4
The plot does not show any clearly definable relationship between alcohol consumption and GNI in the East Asia and Pacific region.
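A visual impression of "no definable relationship" can be backed by a correlation coefficient. The sketch below is one way to do this with base R's cor() and cor.test(); the vectors gni and alcohol hold invented values and are not the report's data.

```r
# Hypothetical values for illustration; not the report's data.
gni     <- c(1.1, 2.4, 3.0, 4.8, 6.5, 9.2)
alcohol <- c(5.0, 2.1, 7.3, 3.9, 6.8, 4.4)

# Pearson correlation: values near 0 indicate a weak linear relationship.
r <- cor(gni, alcohol)

# cor.test() adds a significance test for the same coefficient.
result <- cor.test(gni, alcohol)

print(r)
print(result$p.value)
```

A coefficient close to zero together with a large p-value would support the reading of the scatterplot given above.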
II. IMMUNIZATION BCG – HEALTH EXPENDITURE ANALYSIS
This analysis related the Immunization BCG and Health Expenditure variables to establish the relationship between them. The Health Expenditure data consisted of the combined totals of the government and private health sectors in the region. The R code below plotted the outcome of the analysis.
#Immunization Data Analysis
plot(ImD$Health.Expenditure, ImD$Immunization,
     xlab = "Health Expenditure",
     ylab = "Immunization BCG",
     main = "Immunization BCG - Health Expenditure Analysis in East Asia & Pacific")
The analysis produced the scatterplot in Figure 5 below.
Figure 5
The scatterplot in Figure 5 shows a widely spread pattern, with high immunization BCG values occurring across all health expenditure levels. However, the lowest immunization BCG falls in the lower half of the health expenditure range, and the country with the highest health expenditure also has the highest immunization BCG in the region.
4. ADVANCED ANALYSIS
A. HIERARCHICAL AND K MEANS CLUSTER ANALYSIS
Clustering groups together items in a dataset that are similar with respect to a predetermined condition or attribute (Galit, et al., 2018). It investigates the relationships and the nature of the data in multivariate data sets (Jon, 2006). This cluster analysis focused on the Gross National Income (GNI) in the East Asia and Pacific region for the year 2015. The analysis aimed to group the countries in the region according to their GNI levels. The GNI, as an economic indicator, gives a view of how the economies in the region compare and relate.
The R code below produced the plot for the hierarchical cluster analysis.
#Hierarchical Cluster plot for GNI in 2015
#Distances are computed on the numeric GNI column only
Hclust <- hclust(dist(TacD["GNI"]))
plot(Hclust, labels = TacD$Country.Name,
     ylab = "GNI",
     main = "GNI EAST ASIA & PACIFIC 2015")
The analysis produced the plot in Figure 6 below.
Figure 6
The hierarchical cluster plot in Figure 6 shows two main clusters at the top level. These two clusters represent the top half and the bottom half of the GNI levels in the region. The top half, on the left, splits into two further clusters, with the leftmost country, China, being the country with the highest GNI, followed by Tuvalu, Australia and Thailand. The bottom half also divides into two further clusters, with the leftmost sub-cluster of Myanmar, Cambodia and Malaysia containing the countries with the lowest GNI levels in the region.
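Reading cluster membership off a dendrogram can be made explicit with cutree(), which cuts the tree into a chosen number of groups. The sketch below uses an invented data frame toy with hypothetical GNI values; it illustrates the technique and is not the report's own procedure.

```r
# Hypothetical GNI values chosen to form two obvious groups.
toy <- data.frame(
  Country.Name = c("A", "B", "C", "D", "E", "F"),
  GNI = c(1.0, 1.5, 2.0, 20.0, 21.0, 22.5)
)

# Distances are computed on the numeric GNI column only.
tree <- hclust(dist(toy["GNI"]))

# Cut the dendrogram into two clusters and record each country's membership.
toy$Cluster <- cutree(tree, k = 2)

# Mean GNI per cluster shows which group is the high-income one.
cluster_means <- tapply(toy$GNI, toy$Cluster, mean)

print(toy)
print(cluster_means)
```

Listing the members and the mean GNI of each group avoids relying on the left-to-right position of leaves in the plot, which does not by itself encode rank.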
The R code below produced the plot for the K means cluster analysis.
#K means Cluster plot for GNI in 2015
library(cluster)  #provides clusplot()
set.seed(10)
#Partition the countries into three clusters by GNI
Kclust <- kmeans(TacD$GNI, centers = 3)
clusplot(TacD, Kclust$cluster, color = TRUE, shade = TRUE,
         labels = 2, lines = 0,
         main = "GNI EAST ASIA & PACIFIC 2015")
The analysis produced the plot in Figure 7 below.
Figure 7
The K means analysis grouped the East Asia and Pacific countries into three groups according to their level of GNI. The biggest cluster, labeled 3 and shown in red, consists of countries such as Malaysia, Myanmar, Nauru and Solomon Islands, indicated as 17, 21, 22 and 31 respectively. These are the countries with the lowest GNI, which suggests that a large number of countries in East Asia and Pacific have low GNI. The cluster of countries with the highest GNI, labeled 1, contains countries such as China and Australia, indicated as 2 and 5 respectively.
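Because K means assigns cluster numbers arbitrarily, the cluster centres are what identify the low, middle and high groups. The sketch below shows this on an invented vector gni; the values, and the choice of nstart = 50, are assumptions made for the illustration.

```r
# Hypothetical GNI values forming three well-separated groups.
gni <- c(1.0, 1.2, 1.5, 1.8, 10.0, 10.5, 11.0, 30.0, 31.0)

set.seed(10)  # k-means starts from random centres, so fix the seed
fit <- kmeans(gni, centers = 3, nstart = 50)

# Cluster numbers are arbitrary labels; sorting the centres gives them meaning.
centres <- sort(as.vector(fit$centers))

# Number of observations per cluster, listed from lowest to highest centre.
sizes <- fit$size[order(fit$centers)]

print(centres)
print(sizes)
```

Reporting centres and sizes alongside the plot makes statements such as "the biggest cluster has the lowest GNI" checkable from the fitted object.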
B. LINEAR REGRESSION
Linear regression modelling is an analysis method that generates a linear equation describing the nature of the relationship between two or more variables (Freedman, 2009). A simple linear regression is given by the equation
X_i = d_0 + d_1 v_i + e_i
for a random variable X, a controlled variable v and an error term e (Jorge, et al., 2013). The analysis involves the two variables in each of the subsets.
I. ALCOHOL CONSUMPTION – GROSS NATIONAL INCOME (GNI) MODEL
This model describes the relationship between alcohol consumption and Gross National Income (GNI) in the East Asia and Pacific region.
The R code below generated the Alcohol Consumption – Gross National Income (GNI) Model.
#Alcohol Consumption Data Model
AcModel <- lm(Alcohol.Consumption ~ GNI, data = TacD)
summary(AcModel)
plot(Alcohol.Consumption ~ GNI, data = TacD,
     main = "Alcohol Consumption Data Model")
The model produced the plot below.
Figure 8
Table 1 below represents the output of the Alcohol Consumption – Gross National Income (GNI)
Model
Call:
lm(formula = Alcohol.Consumption ~ GNI, data = TacD)

Residuals:
     Min       1Q   Median       3Q      Max
-11.2321  -4.5224  -0.9751   6.6943  12.3082

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  11.4760     3.1051   3.696  0.00126 **
GNI           0.1351     0.1786   0.756  0.45738
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 7.242 on 22 degrees of freedom
Multiple R-squared:  0.02535,  Adjusted R-squared:  -0.01895
F-statistic: 0.5723 on 1 and 22 DF,  p-value: 0.4574
Table 1
The results in Table 1 show that the linear equation describing the relationship between alcohol consumption and Gross National Income (GNI) in East Asia and Pacific is:
Alcohol Consumption = 11.4760 + [0.1351 * (GNI)]
The R-squared value for the model equals 0.02535; this means the model explains 2.535% of the variation in alcohol consumption. The GNI coefficient is not statistically significant (p = 0.457).
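The quantities quoted from a regression summary can be extracted from the fitted object instead of being copied by hand. The following sketch fits a model to an invented data frame toy; the values are hypothetical, and only the extraction pattern is the point.

```r
# Hypothetical data for illustration; not the report's data.
toy <- data.frame(
  GNI = c(1, 2, 3, 4, 5, 6, 7, 8),
  Alcohol.Consumption = c(3.1, 2.8, 4.0, 3.5, 4.9, 4.2, 5.6, 5.1)
)

fit <- lm(Alcohol.Consumption ~ GNI, data = toy)
fit_summary <- summary(fit)

# Intercept and slope of the fitted line.
coefs <- coef(fit)

# Share of the variance in the response explained by the model.
r_squared <- fit_summary$r.squared

# p-value of the slope; small values indicate a significant relationship.
slope_p <- fit_summary$coefficients["GNI", "Pr(>|t|)"]

# Scatterplot with the fitted regression line drawn on top.
plot(Alcohol.Consumption ~ GNI, data = toy)
abline(fit)

print(coefs)
print(r_squared)
print(slope_p)
```

With a single predictor, the R-squared value equals the squared correlation between the two variables, which gives a quick consistency check.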
II. IMMUNIZATION – HEALTH EXPENDITURE MODEL
This model describes the relationship between immunization and health expenditure in the East Asia and Pacific region.
The R code below generated the Immunization – Health Expenditure Model.
#Immunization Data Model
ImModel <- lm(Immunization ~ Health.Expenditure, data = ImD)
summary(ImModel)
plot(Immunization ~ Health.Expenditure, data = ImD,
     main = "Immunization - Health Expenditure Model")
The model produced the plot below.
Figure 9
Table 2 below represents the output of the Immunization – Health Expenditure Model
Call:
lm(formula = Immunization ~ Health.Expenditure, data = ImD)

Residuals:
    Min      1Q  Median      3Q     Max
-8.0109 -2.4853  0.1355  3.4416  6.0915

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)          6.8402     1.7531   3.902 0.000675 ***
Health.Expenditure   0.2585     0.1020   2.534 0.018216 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.022 on 24 degrees of freedom
Multiple R-squared:  0.2111,  Adjusted R-squared:  0.1782
F-statistic: 6.421 on 1 and 24 DF,  p-value: 0.01822
Table 2
The results in Table 2 show that the linear equation describing the relationship between immunization and health expenditure in East Asia and Pacific is:
Immunization = 6.8402 + [0.2585 * Health Expenditure]
The R-squared value for the model equals 0.2111; this means that the model explains 21.11% of the variation in immunization BCG across countries in the region. The health expenditure coefficient is statistically significant at the 5% level (p = 0.018).
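The figures in Table 2 are tied together by simple arithmetic, which can be reproduced as a check. The sketch below copies the estimate, standard error and residual degrees of freedom from the Health.Expenditure row of Table 2; the variable names are chosen here for illustration.

```r
# Values copied from the Health.Expenditure row of Table 2.
estimate  <- 0.2585
std_error <- 0.1020
df_resid  <- 24

# The t value is the estimate divided by its standard error.
t_value <- estimate / std_error

# Two-sided p-value from the t distribution.
p_value <- 2 * pt(-abs(t_value), df = df_resid)

# With a single predictor, F = t^2 and R-squared = F / (F + residual df).
f_value   <- t_value^2
r_squared <- f_value / (f_value + df_resid)

print(round(c(t = t_value, p = p_value, F = f_value, R2 = r_squared), 4))
```

Up to rounding of the inputs, the results agree with the t value, p-value, F-statistic and R-squared reported in Table 2.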
5. CONCLUSION
The analysis in this report indicates that China and Thailand are among the countries that lead in
both Gross National Income (GNI) and alcohol consumption. This suggests that, in these two
countries, higher GNI is associated with higher alcohol consumption.
The analysis also indicates that China and Tuvalu are among the countries leading in both Gross
National Income (GNI) and BCG immunization, suggesting that higher health expenditure goes
with higher BCG immunization.
China, Tuvalu, Australia and Thailand lead the East Asia and Pacific region in terms of GNI, and
the analysis suggests there could be an economic interrelation among the four nations. Myanmar,
Cambodia and Malaysia, on the other hand, have the lowest GNI in the region.
The analysis, however, does not find a strong linear relationship between alcohol
consumption and Gross National Income (GNI), or between Immunization and Health Expenditure,
for the region. Both linear models explain only a small fraction of the variation in the response:
2.535% for the Alcohol Consumption Data and 21.11% for the Immunization Data, although the
Health Expenditure coefficient in Table 2 is statistically significant at the 5% level (p = 0.018).
6. REFLECTIONS
The Health and Population dataset contained numerous missing entries. Data preprocessing
therefore had to omit these entries. Although this ensured complete data for the analysis, it
also reduced the number of observations and so limited the accuracy of the analysis.
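The omission step described above amounts to complete-case (listwise) deletion. A minimal sketch of the idea follows, in Python rather than the R used for the report's models. The records, field names and the helper complete_cases are hypothetical placeholders for illustration, not values or code from the World Bank dataset or the original analysis.

```python
# Hypothetical toy records; country labels and values are placeholders,
# not figures from the World Bank dataset.
records = [
    {"country": "A", "immunization": 9.0, "health_expenditure": 12.0},
    {"country": "B", "immunization": None, "health_expenditure": 8.5},
    {"country": "C", "immunization": 7.5, "health_expenditure": None},
    {"country": "D", "immunization": 11.0, "health_expenditure": 15.0},
]


def complete_cases(rows, fields):
    """Keep only the rows in which every listed field has a value."""
    return [row for row in rows if all(row[f] is not None for f in fields)]


kept = complete_cases(records, ["immunization", "health_expenditure"])
dropped = len(records) - len(kept)
print(f"kept {len(kept)} of {len(records)} rows, dropped {dropped}")
```

Reporting how many rows were dropped, as the last line does, makes the cost of the omission visible alongside the results.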
The alcohol consumption data, for instance, only had values for the year 2015, thereby
restricting the analysis to that year. The data on Health Expenditure had a similar
limitation, with data available only for the year 2015. Therefore, the analysis does not reflect
the most recent situation.
The missing entries also meant that the analysis did not cover every country in the East Asia and
Pacific region, and it is thus not completely representative of the region as a whole.
In conclusion, a more complete dataset would provide a more reliable analysis for the East Asia
and Pacific region.
Page | 20
Freedman, D. A., 2009.
Statistical Models: Theory and Practice. 1st ed. London: Cambridge
University Press.
Galit, S. et al., 2018.
Data Mining for Business Analytics. 1st ed. New Delhi: John Wiley & Sons,
Inc..
Jon, K. R., 2006. The Practice of Cluster Analysis.
Journal of Classification, 23(1), pp. 3-30.
Jorge, A. A., Angela, A. & Edson, Z. M., 2013. Robust Linear Regression Models: Use of Stable
Distribution for the Response Data.
Open Journal of Statistics, Volume 3, pp. 3-5.
Page | 20
1 out of 22
Related Documents
Your All-in-One AI-Powered Toolkit for Academic Success.
+13062052269
info@desklib.com
Available 24*7 on WhatsApp / Email
Unlock your academic potential
© 2024 | Zucol Services PVT LTD | All rights reserved.