University Data Analysis and Visualization: Brexit Referendum

Added on  2022/09/12

AI Summary
This report presents a comprehensive analysis of the 2016 UK Brexit referendum, utilizing data from the UK Electoral Commission and evaluation surveys. The analysis explores the correlation between leave vote percentages and voter turnout, employing descriptive and inferential statistics, including correlation coefficients and scatterplots generated using SAS software. The report examines regional variations in voting patterns, highlighting the positive correlation in London compared to negative correlations in other regions. Data visualizations, such as histograms and scatter graphs, are used to illustrate the findings, evaluating the claim that voter turnout is higher in areas favoring Brexit. The report concludes by discussing the limitations of correlation coefficients and summarizing the key findings, providing insights into the socio-economic and political impacts of the Brexit discussions in the UK.
Data analysis and visualization
<University>
Data analysis and visualization
<Author>
31 August 2024
<Professor’s name>
<Program of Study>
Table of Contents
Introduction
Understanding of the Brexit Referendum
Descriptive statistics
Summary of the Correlation Coefficient Values
Discussion on the Correlation Coefficient values
Table 1: UK Correlation Coefficient value as whole
Table 2: Correlation Coefficient Values for the regions
Data visualizations analysis
Evaluating the claim
Weakness of Correlation Coefficients
Conclusion
Abstract
This report analyses the Brexit debate using the results of the 2016 UK referendum and evaluation surveys. The United Kingdom held the referendum on 23 June 2016, and the question at stake was whether to leave the European Union or remain a member. The referendum dataset was obtained from the UK Electoral Commission website, which makes it a trusted and reliable source. The same dataset is available at https://www.kaggle.com/electoralcommission/brexit-results.
The correlation coefficients between leave vote percentage and turnout percentage are negative for every region except London, where the value is positive (0.7062) and represents a strong positive correlation. The negative coefficients range from -0.1373 for the East Midlands to -0.58506 for the East region. The results also indicate that the majority of participants were for Brexit.
Introduction
Every country faces a dominant public issue at any given time. In the United Kingdom (UK), the recent debate has centred on “Brexit”, and in particular on its socio-economic and political impacts (Moloney, 2017). This report therefore analyses the Brexit debate using the results of the 2016 UK referendum and evaluation surveys.
A commonly made claim is that voter turnout is higher in areas where more people voted to leave. The analysis is intended to support a decision on this claim. SAS software was used to compute the correlation coefficients and to produce the scatterplots from the referendum dataset. Both descriptive and inferential statistics were used to support the decision. Histograms were also used to visualize the results, because bars are easy to interpret at a glance.
Objective: why use the referendum data
To evaluate the claim that voter turnout is higher in areas where more people voted to leave.
Understanding of the Brexit Referendum
Before any statistical test is run on the claim stated above, it helps to be clear about what Brexit means, since the term has circulated widely on social networks. “Brexit” is a blend of “British exit” and refers to the exit of the United Kingdom from the European Union (Lalić-krstin and Silaški, p.3).
The United Kingdom held a referendum on 23 June 2016 on whether to leave the European Union or remain a member. Citizens who wanted the UK to leave the EU voted in favour of Brexit, while those of the contrary opinion voted against it. The results showed that 52% of voters were in favour of Brexit and 48% were against the UK leaving the European Union. In short, more than half of the participants held the opinion that the UK should leave the EU, while 48% saw no need to leave and wanted the UK to remain a member (Ford and Goodwin, p.17).
Descriptive statistics
The following tables present the descriptive statistics.
Statistic                   Pct_Leave   Pct_Turnout
Mean                        52.83651    73.77121
Standard Error              0.541162    0.259514
Median                      54          74.35
Mode                        56.37       80.03
Standard Deviation          10.56305    5.06552
Sample Variance             111.5781    25.65949
Range                       55.56       27.32
Minimum                     20          56.25
Maximum                     75.56       83.57
Confidence Level (95.0%)    1.064046    0.510264
Region                      Mean turnout %   Mean Leave vote %   Mean Remain vote %
East                        75.74            54.05               45.95
East Midlands               75.45            58.41               41.59
London                      70.31            41.86               58.06
North East                  69.13            59.48               40.52
North West                  70.74            56.12               43.88
Scotland                    68.62            43.09               56.91
South East                  77.05            52.17               47.83
South West                  77.34            52.13               47.87
Wales                       71.96            53.35               46.65
West Midlands               74.53            58.65               41.35
Yorkshire and The Humber    72.32            56.11               43.89
Whole UK                    73.77            52.84               47.16
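The report produced these figures in SAS. As a rough illustration only, descriptive statistics and regional means of this kind could be computed along the following lines in Python. The rows below are made-up values rather than the referendum data, and the helper names `describe` and `regional_means` are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median, stdev

# Hypothetical rows: (Region, Pct_Turnout, Pct_Leave).
# These values are invented for illustration; the real dataset has 381 rows.
rows = [
    ("East", 75.0, 55.0),
    ("East", 77.0, 53.0),
    ("London", 70.0, 40.0),
    ("London", 72.0, 44.0),
]


def describe(values):
    """Return a few of the descriptive statistics reported in the tables."""
    n = len(values)
    sd = stdev(values)  # sample standard deviation (n - 1 denominator)
    return {
        "mean": mean(values),
        "median": median(values),
        "std_dev": sd,
        "std_error": sd / n ** 0.5,
        "range": max(values) - min(values),
    }


def regional_means(rows):
    """Group rows by region and average the turnout and leave percentages."""
    grouped = defaultdict(list)
    for region, turnout, leave in rows:
        grouped[region].append((turnout, leave))
    return {
        region: (mean(t for t, _ in pairs), mean(l for _, l in pairs))
        for region, pairs in grouped.items()
    }


leave_stats = describe([leave for _, _, leave in rows])
means_by_region = regional_means(rows)
print(leave_stats)
print(means_by_region)
```

Each entry of `means_by_region` holds the mean turnout and mean leave percentage for one region, which corresponds to one row of the regional table above.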
Summary of the Correlation Coefficient Values
This section presents the correlation coefficients between two variables, the leave vote percentage and the turnout percentage. A correlation test requires at least two variables, and that requirement is met here. As mentioned earlier, the referendum dataset was obtained from the UK Electoral Commission website, which makes it a trusted and reliable source. The same dataset is available at https://www.kaggle.com/electoralcommission/brexit-results.
First, the correlation between the leave vote variable and the turnout variable was calculated for the whole UK. The coefficient is -0.0107, which is negative but so close to zero that it indicates almost no linear association. It is therefore prudent to conclude that, across the UK as a whole, there is at most a very weak negative relationship between the leave vote percentage and the turnout percentage.
Second, the correlation coefficient was calculated for each region of the United Kingdom. All the regional values are negative apart from the value for London, which is positive (0.7062) and represents a strong positive correlation. The negative coefficients range from -0.1373 for the East Midlands to -0.58506 for the East region.
Finally, Northern Ireland is reported with a correlation coefficient of zero. This is most likely because Northern Ireland contributes a single pair of data points, whereas a correlation needs at least two pairs. With only one pair the coefficient is undefined, so the zero should be read as "not available" rather than as evidence of no association.
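The report computed its coefficients in SAS. The sketch below shows, under stated assumptions, how a per-region correlation of this kind might be calculated in Python. It assumes the Pearson coefficient is the one intended, uses invented rows with placeholder region names, and the function name `pearson_r` is hypothetical. It returns `None` when the coefficient is undefined, which is the situation described for a region with a single data pair.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists; None if undefined."""
    n = len(xs)
    if n < 2:
        return None  # a single pair cannot define a correlation
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # no variation in one variable
    return sxy / sqrt(sxx * syy)


# Hypothetical rows: (region, Pct_Turnout, Pct_Leave). Not the referendum data.
rows = [
    ("Region A", 70.0, 60.0),
    ("Region A", 72.0, 58.0),
    ("Region A", 74.0, 56.0),
    ("Region B", 65.0, 40.0),
    ("Region B", 70.0, 45.0),
    ("Region B", 75.0, 50.0),
    ("Region C", 63.0, 44.0),  # only one area in this region
]

# Collect the turnout and leave values separately for each region.
by_region = defaultdict(lambda: ([], []))
for region, turnout, leave in rows:
    by_region[region][0].append(turnout)
    by_region[region][1].append(leave)

r_by_region = {
    region: pearson_r(turnout, leave)
    for region, (turnout, leave) in by_region.items()
}
print(r_by_region)
```

Returning `None` instead of 0 for the single-pair region keeps "no data to compute a coefficient" distinct from "no linear association".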
Discussion on the Correlation Coefficient values
For the two variables of interest, the leave vote percentage and the turnout percentage, the correlation coefficients for the whole United Kingdom and for each region are summarized in the tables below. Scatter plots and histograms are used alongside them to visualize the associations between these variables.
A correlation coefficient measures the strength and direction of the linear association between two variables, which in this case are the leave vote percentage and the turnout percentage. Its value ranges from -1 to +1, where -1 represents a perfect negative linear relationship and +1 represents a perfect positive linear relationship. A value of zero (0) indicates that there is no linear association between the two variables (Goodwin and Leech, p.252).
A coefficient with a negative sign means that the fitted line slopes downwards (a negative gradient), while a coefficient with a positive sign means that it slopes upwards (a positive gradient). The results indicate that London is the only region with a positive gradient, whereas the other regions, and the UK as a whole, have a negative gradient. As mentioned earlier, Northern Ireland has too few data points for a gradient to be generated for this region.
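For reference, and assuming the Pearson product-moment coefficient is the one reported (the report does not name the variant), the coefficient for n paired observations of turnout percentage x and leave vote percentage y can be written as:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \,\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is positive when the two variables tend to lie on the same side of their means and negative when they lie on opposite sides, which is what gives r its sign. With n = 1 both sums in the denominator are zero, so r is undefined.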
Table 1: UK Correlation Coefficient value as whole
Region                      Correlation Coefficient
United Kingdom              -0.01077
(Source: UK Electoral Commission)
Table 2: Correlation Coefficient Values for the regions
Region                      Correlation   Strength
East                        -0.58506      Moderate
East Midlands               -0.13731      Very weak
London                      0.706249      Strong
North East                  -0.5147       Moderate
North West                  -0.39718      Weak
Northern Ireland            0             None
Scotland                    -0.32524      Weak
South East                  -0.25209      Weak
South West                  -0.39053      Weak
Wales                       -0.26518      Weak
West Midlands               -0.52353      Moderate
Yorkshire and The Humber    -0.52354      Moderate
(Source: UK Electoral Commission)
Data visualizations analysis
Region Wise Scatter Graphs
(Source: UK Electoral Commission)
UK Scatter Graph (Source: UK Electoral Commission)
Mean Values Graph
(Source: UK Electoral Commission)
Evaluating the claim
The histogram in the figure above visualizes the mean turnout percentage, leave vote percentage and remain vote percentage. It shows that the majority of participants voted in favour of Brexit: across the whole UK the mean leave vote was 52.84% against a mean remain vote of 47.16%, and in the East region the figures were 54.05% and 45.95%. The results confirm that more than half of the participants were for the United Kingdom leaving the European Union.
The findings are similar in most regions within the UK. For instance, in the East Midlands, 58.41% of voters were for Brexit while only 41.59% were against the United Kingdom leaving the European Union. Similarly, in the North East, 59.48% of participants were for Brexit while 40.52% were against it. London (41.86% leave) and Scotland (43.09% leave) are the exceptions. On average, therefore, it is prudent to conclude that most voters within the United Kingdom were for Brexit.
Although the visualized results confirm that the majority of people in the UK were for Brexit, they do not fully establish the claim that the majority of people who turn out to vote will vote in favour of Brexit. Hence, a statistical significance test can be used to test the claim that voter turnout is higher in areas where more people voted to leave. The main statistical tool used to test the claim is the confidence interval. A confidence interval is an inferential statistic: a range of values, computed from the sample, within which the true mean is expected to lie at a chosen confidence level. The calculation uses a critical value (z) for the confidence level and the standard deviation (s), and it assumes that the data are approximately normally distributed.
On this note, the following formula can be used to test the claim:
Confidence interval = x ± z × s / √n
In this formula, n is the total number of observations in the dataset and x is the mean of the variable, either the leave vote percentage or the remain vote percentage. The letter z is the critical value, which is 1.96 for a 95% confidence interval, and the letter s is the standard deviation.
Substituting the calculated values into the formula gives:
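Quantity                        Pct_Remain   Pct_Leave
x (mean)                        47.15588     52.83651
z                               1.96         1.96
s (standard deviation)          10.54567     10.54918
n                               381          381
√n                              19.51922     19.51922
s / √n (standard error)         0.540271     0.540451
z × s / √n (margin of error)    1.058931     1.059284
Lower limit                     46.09695     51.77723
Upper limit                     48.21481     53.89579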
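The substitution can be checked numerically. The sketch below is a minimal Python illustration that assumes the interval is computed as x ± z × s / √n and uses the x, s and n values reported above. The function name `confidence_interval` is hypothetical and is not part of the report's SAS workflow.

```python
from math import sqrt


def confidence_interval(x_bar, s, n, z=1.96):
    """Return (lower, upper) for x_bar ± z * s / sqrt(n)."""
    margin = z * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Values taken from the report's substitution table.
leave_ci = confidence_interval(52.83651, 10.54918, 381)
remain_ci = confidence_interval(47.15588, 10.54567, 381)

# Two intervals overlap when each one starts before the other ends.
overlap = leave_ci[0] <= remain_ci[1] and remain_ci[0] <= leave_ci[1]

print([round(v, 1) for v in leave_ci])
print([round(v, 1) for v in remain_ci])
print(overlap)
```

Rounded to one decimal place, the two results agree with the lower and upper limits in the table, and the overlap check evaluates to `False`.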
With these substitutions, the 95% confidence interval for the leave vote percentage is [51.8, 53.9] and the 95% confidence interval for the remain vote percentage is [46.1, 48.2]. The fact that the two confidence intervals do not overlap is a clear indication that there is some form of statistical