MAT4MDS Assignment: Analysis of Refugee Migration Data Science Paper

Added on  2023/01/10

Introduction and general view of the paper in focus
Migration has in recent years been at the center of various debates, among them the ongoing
“British exit (Brexit)”. One article on the role of immigration in Brexit notes that “…EU
opponents saw immigration as a national issue, as it affected the internal life of the country, thus
many chose the “leave” option” (Mauldin & Friedman, 2016). This may be one of many reasons
why numerous researchers have sought to examine immigration, its causes and its effects. One
such paper is Najem & Faour (2018), which is the focus of this analysis. Here it is examined
from a mathematical perspective, looking at the statistics, models and data used to reach its
results and conclusions. The researchers apply Debye–Hückel theory to refugees’ migration and
find that the radiation model inspired by the Debye–Hückel theory predicts refugee mobility
better than the gravity model, which fails.
Specifically, we examine how mathematical models are used in the paper, including the model
constructs, their known strengths, and their possible influence on the results obtained. We also
examine the use of visual methods of analysis, i.e. graphs, covering the type of each graph, the
relevance that might have led to its use in the paper, and its shortcomings, and we determine the
source of the data and its validity. Finally, we examine the conclusions drawn from the data
analysis.
Article analysis
Mathematical models
Gravity Model
The paper’s mathematical models are largely based on the econometric gravity model. In
practice, the gravity model is mostly used in analyses of international statistics whose goal is to
estimate the interaction effects between, say, two cities or countries. Mathematically, the basic
interaction between two regions is represented by:
1
where G is a constant, F is the flow between the two places (Syrian and Lebanese cities in the
study’s case), D is the distance between the two cities, and M is the flux, i.e. the influence for
moving from city A to city B. Nowadays the distance between cities can be obtained with the
help of software such as the Google API. In keeping with the focus article’s use of mathematical
modelling, we examine two interaction models, the gravity model and the radiation model.
Unlike in the paper, let us examine a simpler form of the model, which describes the interaction
between two cities conducting trade, given by:
2
3
In comparison to model 3, model 2 implies that the pattern of aggregate bilateral trade flows
between any two countries A and B is “proportional to the gross national products of those
countries and inversely proportional to the distance between them” (Chaney, 2011). Therefore,
we can conclude that in an econometric gravity model the interaction between any two cities is
inversely related to distance: the larger the distance, the lower the interaction effect, where the
interaction might be trade, immigration, war, etcetera.
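
As a rough illustration of the inverse relationship described above, the following Python sketch computes a gravity-style flow. The functional form (the product of the two masses divided by distance raised to an exponent), the function name gravity_flow, its parameter names, and the numbers are all assumptions made for illustration. They are not taken from the focus article or from Chaney (2011).

```python
def gravity_flow(mass_a, mass_b, distance, g=1.0, gamma=1.0):
    """Gravity-style interaction between two places (illustrative form only).

    Assumed form: flow = g * mass_a * mass_b / distance ** gamma,
    where the masses could be populations or gross national products.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return g * mass_a * mass_b / distance ** gamma


# Hypothetical values: with gamma = 1, doubling the distance halves the flow.
near = gravity_flow(1000, 500, distance=50)
far = gravity_flow(1000, 500, distance=100)
print(near, far)
```

The exponent gamma is left as a parameter because the strength of the distance effect is usually estimated from data rather than fixed in advance.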
Radiation model
The researchers further use a “radiation model” because it is less static than the gravity
model. In econometrics, a radiation model, unlike the gravity model, accounts for the other
options available to refugees moving from point A to point B, so that they might choose cities
other than B (Simini, González, Maritan, & Barabási, 2012). Basically, the radiation model
defines the average flux between two cities:
4
In equation 4, the model is parameter-free: T is the total number of refugees leaving city i, and
mi and nj are the total populations of cities i and j respectively. In addition, Sij is the total
population within the circle that is centered on i and touches j, excluding the population of the
refugees’ origin as well as the destination population (Curiel, Pappalardo, Gabrielli, & Bishop,
2018). The radiation model is regarded as suitable for measuring human mobility, an argument
the authors support with their results, in which the radiation model’s estimates are significant
while the gravity model fails to estimate the mobility of the refugees.
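
To make the role of the intervening population concrete, here is a minimal Python sketch of the radiation flux in the form published by Simini et al. (2012). The function name radiation_flux, the argument names and the numbers are hypothetical, and the sketch is not the focus article’s own implementation.

```python
def radiation_flux(total_outflow, origin_pop, dest_pop, intervening_pop):
    """Expected flux from origin i to destination j under the radiation model.

    Form from Simini et al. (2012):
        T_i * (m * n) / ((m + s) * (m + n + s))
    m = population of the origin, n = population of the destination,
    s = population inside the circle centered on the origin that reaches
        the destination, excluding the origin and destination themselves.
    """
    m, n, s = origin_pop, dest_pop, intervening_pop
    return total_outflow * (m * n) / ((m + s) * (m + n + s))


# Hypothetical numbers: the same pair of cities, first with no one living
# in between and then with 100 people living in between.
no_alternatives = radiation_flux(1000, 100, 100, 0)
some_alternatives = radiation_flux(1000, 100, 100, 100)
print(no_alternatives, some_alternatives)
```

Distance does not appear explicitly. It enters only through the intervening population, which is why the expected flux drops as more people live between the two cities.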
Implementation of the gravity and radiation models
For both models, data are generated using formulas 2 and 3 and then used in a linear regression
model, which takes the form:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \varepsilon$$
where, for i = 1, 2, …, n, Xi are the explanatory variables, βi are the regression coefficients, Y is
the response variable and ε is the error term. The regression model is useful for predicting a
continuous variable by regressing it on exogenous variables, such as the number of refugees and
distance in this case. The regression model was implemented for the data from the radiation
model, which was predicted against refugee fluxes (Najem & Faour, 2018). Moreover, a
regression model can be fitted on its own, independent of the gravity and radiation models, as is
done in the article.
To interpret the results of a regression model, several statistics are used, including the F-test, the
t-test, the multiple R-Squared, and the adjusted R-Squared, which test the model’s goodness of fit.
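
The goodness-of-fit statistics mentioned above can be sketched in a few lines of plain Python. This is an illustrative single-predictor least-squares fit with hypothetical data, not the article’s regression. The function names are assumptions, and the adjusted R-Squared uses the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1) for n observations and p predictors.

```python
def fit_simple_ols(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variation in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_predictors):
    """R-squared penalized for the number of predictors."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
b0, b1 = fit_simple_ols(x, y)
r2 = r_squared(x, y, b0, b1)
print(b0, b1, r2, adjusted_r_squared(r2, len(x), 1))
```

Because the adjusted statistic penalizes extra predictors, it is the fairer of the two when models with different numbers of variables are compared.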
Data
Another key aspect of research is data. In any scientific research, data form the basis for all the
analyses, discussion and inferences drawn to address the study objective. In this study, the
datasets are obtained from a number of sources. The distances are generated using the Google
API and stored in distance.csv (the distance between each origin in Syria and destination in
Lebanon) and Syria–Syria-distance.csv (the pairwise distance matrix between cities in Syria),
with 15 observations for the 15 cities of Syria. The Density.csv dataset contains 26 variables
with 15 observations each, where an observation is “…the population density between every
Syrian city i and Lebanese city j” (Najem & Faour, 2018). The migration data are stored in
Migration.xlsx and were obtained from the Lebanese census on refugees. They contain the
number of refugees fleeing from a given city in Syria to a specific destination in Lebanon.
Graphs
Predominantly, three graphs are used to represent different distributions:
Chord graph
It is plotted using the Migration.xlsx data to visualize the possible destinations in Lebanon of
refugees from a given Syrian city. A chord diagram graphically represents the relationships
between the entries of a matrix, making it easy to determine which point is linked to which other
point, since it allows weighted relationships between variables to be visualized (Abel, 2018).
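
A chord diagram is drawn from an origin-by-destination matrix of weights. The sketch below shows one way such a matrix could be assembled in Python. The layout of Migration.xlsx is not described here, so the input format (origin, destination, count triples), the function name build_flow_matrix and the placeholder labels are all assumptions.

```python
def build_flow_matrix(records):
    """Aggregate (origin, destination, count) triples into a nested dict.

    The result maps each origin to a dict of destination -> total count,
    which is the weighted matrix a chord diagram is drawn from.
    """
    records = list(records)  # allow any iterable; we pass over it three times
    origins = sorted({origin for origin, _, _ in records})
    destinations = sorted({dest for _, dest, _ in records})
    matrix = {o: {d: 0 for d in destinations} for o in origins}
    for origin, dest, count in records:
        matrix[origin][dest] += count
    return matrix


# Placeholder labels and counts, not real data.
flows = build_flow_matrix([
    ("origin_A", "dest_X", 100),
    ("origin_A", "dest_X", 50),
    ("origin_A", "dest_Y", 20),
    ("origin_B", "dest_Y", 70),
])
print(flows)
```

Each non-zero entry would become one chord, with its width proportional to the count.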
Scatter Plot
The scatter plot in figure 2 is used to visualize the results of the model, that is, log Tij plotted
against the values given by the model equation. A scatter plot is typically adopted in data
analysis when there is a need to represent paired values of two variables x and y. In this case it
shows the predicted values against the observed values so as to examine the performance of the
models. In a good model, the data points should lie close to the regression line.
Map
Another visualization tool used is the map in figure 3, which shows the distances between the
different points of interest. A map is useful for estimating distances between places and for
showing how the places of interest lie in relation to each other. In the study, the map is used to
show the concentration in Lebanese cities of Syrian refugees from a given city.
Conclusions made
After the data are analyzed, the researchers use the multiple R-Squared and adjusted R-Squared
statistics from the three models to examine the performance of the gravity, radiation, and
independent regression models and determine which is the most suitable. In the analysis, the
regression model has the highest adjusted R-Squared, 0.8, indicating that it accounts for up to
80% of the variation in the data, which is relatively high. To understand the distribution of
Syrian refugees in Lebanon, the researchers draw on background information and conclude that
the distribution is influenced by the Syrians who arrived before the war and later acted as
contacts for those fleeing. In conclusion, the original research theory held up in determining the
mobility of refugees, with the regression model as the analysis tool, where both the radiation and
gravity models fell short.
References
Abel, G. J. (2018, Feb 9). CHORD DIAGRAM. Retrieved from Data to Visual:
https://www.data-to-viz.com/graph/chord.html
Chaney, T. (2011). The Gravity Equation in International Trade: An Explanation. Chicago:
University of Chicago.
Curiel, R. P., Pappalardo, L., Gabrielli, L., & Bishop, S. R. (2018). Gravity and scaling laws of
city to city migration. PLoS ONE, 13(7), 1-19. doi:10.1371/journal.pone.0199892
Mauldin, J., & Friedman, G. (2016, July 5). 3 Reasons Brits Voted For Brexit. Retrieved from
Forbes: https://www.forbes.com/sites/johnmauldin/2016/07/05/3-reasons-brits-voted-for-brexit/#5dbfc5b41f9d
Najem, S., & Faour, G. (2018). Debye–Hückel theory for refugees’ migration. EPJ Data
Science, 7(22), 1-9. doi:10.1140/epjds/s13688-018-0154-8
Simini, F., González, M. C., Maritan, A., & Barabási, A.-L. (2012). A universal model for
mobility and migration patterns. Nature, 484(7392), 96–100. doi:10.1038/nature10856