University Data Science Project: Earthquake Prediction (CIS 632)


Added on  2022/09/26

Project
AI Summary
This project proposal outlines a data science project focused on earthquake prediction. The project begins with an executive summary highlighting the importance of data science in addressing real-world problems. It includes a background review of earthquake occurrences and their impact, followed by a clear problem statement emphasizing the need for accurate earthquake forecasting. The proposed solution involves utilizing data science techniques, including machine learning, to analyze seismic data and predict earthquakes. The project objectives are clearly defined, focusing on improving prediction accuracy and mitigating the impact of earthquakes. The expected outcomes include the development of a predictive model and a framework for earthquake forecasting. The proposal also includes references to relevant research and studies. The project aims to provide insights into earthquake prediction using data science, offering a practical approach to a significant global challenge. The project also discusses various methods that can be utilized such as machine learning, regression trees as well as incorporation of training data that might be used for the purpose of having a better determination of the subduction zones.
Running head: EARTHQUAKE PREDICTION WITH DATA SCIENCE
EARTHQUAKE PREDICTION WITH DATA SCIENCE
Name of the Student:
Name of the University:
Author’s Note:
Executive Summary:-
Data science is an interdisciplinary field that applies scientific methods, processes, systems and
algorithms to extract knowledge and insights from structured and unstructured data. Today,
capable data professionals realize that they must advance beyond the traditional skills of
analysing large amounts of data, programming and data mining. To uncover useful intelligence
for their organizations, data scientists must master the full data science life cycle and have the
flexibility and understanding to maximize returns at every phase of the process. Data science
can now help to mitigate many kinds of real-world problems. This project report describes the
earthquake problem and how earthquakes might be predicted through data science.
Table of Contents
Background Review
Problem Statement
Proposed Solution
Project Objective
Expected Outcome
References
Background Review:-
Earthquakes are a significant problem, and even major nations such as Japan struggle with
their effects. An earthquake is the result of a sudden release of energy in the Earth's crust
that creates seismic waves (Melgar and Bock 2015). It poses a significant risk to natural and
human environments: people die, buildings collapse and cities are damaged. Forecasting and
mitigation are generally undertaken to reduce the effect of an earthquake on the environment.
Experts can forecast where major quakes are likely to happen, based on the movement of the
Earth's plates and the location of fault zones. They can also make general estimates about when
quakes might happen in a specific area, by looking at the history of earthquakes in the region
and detecting where pressure is building along fault lines.
Maps of seismic zones can provide warnings about the location and strength of future
earthquakes to an acceptable degree. Such zoning involves analysing the frequency of past
earthquakes and the locations of their hypocentres and epicentres. In areas that have not
experienced quakes, however, the problem is more complicated (Yamagiwa et al. 2015). The
geological structure of known seismic zones is studied closely against the geological structure
of the region in which quakes have never been recorded. If the geological features are found to
be similar, the seismic hazard is assumed to be roughly the same. In the absence of similar
structures, however, the region is considered aseismic.
Predicting earthquakes is one of the most important problems in Earth science because of
their devastating consequences. Current scientific studies related to earthquake prediction
focus on specific key points such as location, mechanism and timing. In the past, many
scientists have attempted to predict earthquakes with the help of geological knowledge and
precise mathematical calculations. In most cases, these calculations have failed to anticipate
the disaster. This project proposes that data science is a modern approach that can help to
predict earthquakes.
If the quakes in different clusters can be linked to the specific mechanisms that typically
produce quakes in a geothermal reservoir, and if experts can recognize what is happening in the
reservoir in near real time, they can experiment with controlling water flows to create smaller
cracks and, consequently, heated water to make steam and ultimately electricity. These
approaches could also help to reduce the probability of triggering larger earthquakes, at The
Springs and wherever else fluid is pumped underground, including at fracking-fluid disposal
sites. Lastly, such tools could help to recognize the warning signs of a big earthquake on its way.
Problem Statement:-
Earthquake forecasting is an area of research of high scientific and public interest. The
reason is not only that quakes can cause huge numbers of casualties in a short time, but also
that they can have an enormous social and economic impact on society (Glennie et al. 2014).
Earthquake prediction here means making deterministic statements about the time, place and
magnitude of an event. A prediction must specify the expected magnitude range, the
geographical area and the time interval within which the earthquake will occur, each with the
necessary accuracy. A fundamental problem for decision makers is that scientists' statements
about earthquake occurrence either contain very specific information but are very uncertain, or
contain very general information but are very certain. Earthquake risk can to a large extent be
reduced by formulating and enforcing suitable building codes. Additionally, scientists must
also assign a confidence level to every prediction. An optimal alert threshold can be derived if
costs can be assigned to maintaining a state of alert and to unpredicted earthquakes.
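The threshold idea can be illustrated with a simple cost-loss sketch. It assumes, hypothetically, that keeping a region on alert for one period costs alert_cost, that an earthquake arriving without warning costs miss_cost, and that an alert avoids that loss entirely. None of these names or figures come from the studies cited in this report. Under those assumptions, the expected cost is lower with an alert whenever the forecast probability exceeds alert_cost / miss_cost.

```python
def alert_threshold(alert_cost: float, miss_cost: float) -> float:
    """Probability above which issuing an alert has the lower expected cost."""
    if alert_cost <= 0 or miss_cost <= 0:
        raise ValueError("costs must be positive")
    # Alerting always costs alert_cost; staying silent costs p * miss_cost on average,
    # so alerting is cheaper when p > alert_cost / miss_cost.
    return min(1.0, alert_cost / miss_cost)


def should_alert(probability: float, alert_cost: float, miss_cost: float) -> bool:
    """Decide whether to issue an alert for one forecast probability."""
    return probability > alert_threshold(alert_cost, miss_cost)


# Hypothetical costs in arbitrary units (assumed for illustration only).
threshold = alert_threshold(alert_cost=2.0, miss_cost=100.0)
decisions = [should_alert(p, 2.0, 100.0) for p in (0.005, 0.02, 0.05)]
```

With these made-up costs, only forecasts above a probability of 0.02 would justify an alert. A real analysis would also need to account for alerts that only partly reduce losses.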
Conversely, if probabilities are specified for every region, then a forecasting scheme can
be tested using all of the probabilities, and there is no need to set a threshold for issuing
alerts. This separates the problem of testing scientific hypotheses from the policy problem of
how to respond to forecasts (Maeda et al. 2015). Rapid updating presents a practical problem,
because accurate information on the occurrence of important quakes may not be available
before substantial aftershocks happen. One solution is to specify in advance the rules by which
the hypothesis will be updated and let the updating be done automatically on the basis of
preliminary information. A more careful evaluation of the hypothesis may be carried out later
with revised information, as long as no changes are made except as specified by rules
established in advance. Likewise, the mechanism of seismic wave propagation in 3D elastic
media is well understood. However, given that the majority of the population threatened by
quakes lives in the developing world, it is clear that this understanding cannot simply be relied
upon. So, at face value, the problem would seem to be straightforward. The difficulties,
however, are considerable. The detailed three-dimensional structure of the Earth's crust is
only partly known, and it is impractical to collect enough information to describe it
adequately. For these reasons, earthquake forecasting is not only a scientific problem: it also
has a difficult political dimension.
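Testing a forecast through its full set of regional probabilities, with no alert threshold, can be done with a scoring rule. The sketch below uses the Brier score, which is one common choice and is not a method named in this report. The regional probabilities and outcomes are made-up values.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes):
        raise ValueError("one outcome is required per forecast probability")
    if not probabilities:
        raise ValueError("at least one forecast is required")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical forecast for four regions and what was later observed (1 = quake).
forecast = [0.8, 0.1, 0.3, 0.05]
observed = [1, 0, 0, 0]

# Reference forecast: the same probability everywhere (the observed base rate).
reference_forecast = [0.25] * 4

score = brier_score(forecast, observed)
reference_score = brier_score(reference_forecast, observed)
```

A lower score is better. Here the regional forecast beats the uniform reference, which is the kind of comparison that can be fixed in advance and applied automatically as new data arrive.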
Proposed Solution:-
Scientists describe earthquake prediction as one of the most difficult tasks. Current
methods can predict only a small portion of potential earthquake occurrences, not the whole
(Morra et al. 2019). However, although exact prediction is still not possible, scientists have
drawn on various advances to produce near-accurate predictions of such events.
Progress has been made with the help of complex simulations and various modelling
techniques that produce better earthquake forecasts. The methods that can be used include
machine learning, regression trees and the incorporation of training data for better
characterization of subduction zones (Hough 2016). This particular model used earthquakes
with magnitudes in the range Mw 6.2-8.3 at a particular location and estimated the timing and
the location at which they had already occurred. This is a primary contributor towards better
forecasting of earthquakes. Hence, it can be deployed in earthquake laboratories to predict
future occurrences with greater efficiency.
Figure-1: Earthquake prediction
(Source- Elshin and Tronin 2020)
The exact time of a particular earthquake is very difficult to predict. However, aftershocks
and other existing predictive results have proved easier to forecast (Elshin and Tronin 2020).
Tests indicate that neural network models, together with patterns associated with past
aftershocks, can greatly help to improve the prediction analysis of this natural calamity.
Machine learning has also brought other improvements to the field, such as early warning
systems that act once a prediction has been made. In this regard, it should be
kept in mind that signals generated from the oceans can be deceptive: an apparent approaching
earthquake may be a false alarm caused by some other natural calamity. Hence, P-waves
generated by past earthquakes can be used to train a machine learning algorithm to recognize
approaching earthquakes more reliably and to distinguish them from other natural calamities,
reducing the triggering of false alarms (Hajikhodaverdikhan et al. 2018). This has proven
useful for distinguishing the waves generated by earthquakes from those generated by other
natural calamities, with results reported to be 99% accurate.
While natural earthquakes are the main concern, induced earthquakes that can result from
fracking or wells are a second focus of the research. The likelihood of an induced earthquake
can be judged from sedimentology and lithology, which can have a strong effect on it (Mignan
and Broccardo 2019). A logistic regression model can be used to identify the hydrologic and
geologic features that combine to produce such earthquakes and to estimate the likelihood of
earthquakes occurring in the affected areas.
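A minimal sketch of such a logistic regression model is given here, assuming two hypothetical features scaled to the range 0-1: an injection rate (hydrologic) and a fault proximity index (geologic). The feature names, the eight data points and the plain gradient-descent fit are all illustrative assumptions and are not taken from the cited study.

```python
import math


def predict_probability(weights, features):
    """Logistic model: weights[0] is the bias, the rest multiply the features."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, learning_rate=0.5, epochs=2000):
    """Fit the weights by batch gradient descent on the mean log loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for features, label in zip(rows, labels):
            error = predict_probability(weights, features) - label
            gradient[0] += error
            for j, value in enumerate(features):
                gradient[j + 1] += error * value
        weights = [w - learning_rate * g / len(rows)
                   for w, g in zip(weights, gradient)]
    return weights


# Hypothetical sites: (injection rate, fault proximity index), both scaled to 0-1.
rows = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.4),
        (0.7, 0.8), (0.8, 0.6), (0.9, 0.9), (0.6, 0.9)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = induced earthquake observed (made up)

weights = fit_logistic(rows, labels)
risk_at_new_site = predict_probability(weights, (0.85, 0.7))
```

The fitted weights indicate which features raise the estimated likelihood, and predict_probability gives a likelihood for a new site. With real data, a tested library implementation and a held-out validation set would be preferable to this hand-written fit.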
Earthquakes are a major concern in highly populated regions. In the search for data
science tools to predict them, earthquakes have been forecast using data science methods. The
primary setup includes two hydraulic presses that apply pressure to a vertical plate (Mukherjee
and Das 2020). This setup approximates a tectonic plate and the relative movements close to a
fault line. In addition, the setup also creates
earthquakes at a consistent rate. Hence, this setup can be used to create an earthquake-like
environment in which to carry out the prediction of occurrences.
The seismic data from this setup are gathered and then used to predict an approaching
earthquake. The three steps that need to be carried out are:
i) Feature extraction or selection with the use of tsfresh.
ii) Model CV comparison.
iii) Grid search on LGBM (Wang et al. 2017).
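The cross-validation and grid search steps listed above can be sketched without any external library. The example swaps in a one-feature ridge regression for LGBM so that it stays self-contained, and it uses made-up, noise-free data. The function names, the penalty grid and the data are assumptions for illustration only.

```python
def k_fold_indices(n_samples, n_folds):
    """Split range(n_samples) into n_folds contiguous validation folds."""
    folds, start = [], 0
    for i in range(n_folds):
        size = n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_ridge_1d(xs, ys, penalty):
    """Closed-form one-feature ridge regression; returns (slope, intercept)."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / (sxx + penalty)
    return slope, mean_y - slope * mean_x


def cv_mae(xs, ys, penalty, n_folds=4):
    """Mean absolute error over all held-out samples of a k-fold split."""
    errors = []
    for fold in k_fold_indices(len(xs), n_folds):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        slope, intercept = fit_ridge_1d(train_x, train_y, penalty)
        errors.extend(abs(ys[i] - (slope * xs[i] + intercept)) for i in fold)
    return sum(errors) / len(errors)


def grid_search(xs, ys, penalties):
    """Score every candidate penalty by cross-validation and pick the best."""
    scores = {penalty: cv_mae(xs, ys, penalty) for penalty in penalties}
    return min(scores, key=scores.get), scores


# Made-up data: one extracted feature per segment and the time to failure.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [10.0 - x for x in xs]
best_penalty, scores = grid_search(xs, ys, penalties=[0.0, 1.0, 10.0])
```

Because the toy data are exactly linear, the unpenalized model wins here. The same loop structure applies when each grid point is a set of LGBM hyperparameters and the score is the cross-validated error of the fitted booster.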
For the experiment to be carried out, the training data consist of two primary features,
namely the acoustic data (the seismic signal) and the time to failure, the latter being the
quantity to be predicted.
Figure-2: Acoustic Data
(Source- Wang et al. 2017)
The figure above shows the acoustic data extracted from the training set. Each large
spike in the graph marks an earthquake that occurred during the experiment.
Figure-3: Acoustic Data and time failure
(Source- Asim et al. 2017)
The diagram above shows a zoomed view of two major earthquake events (Asim et al.
2017). The acoustic data are shown along with the time to failure for the earthquake events
that occurred during the experiment.
After this, feature extraction is carried out, in which the various quantities for
predicting occurrences are derived from the time series or the detected signals at hand. These
data are then used to solve the regression
problem, whose outputs should come close to predicting the occurrence of an earthquake
(Joffe et al. 2018). To obtain better results, however, proper use must be made of established
time series analysis, which helps the regression to produce predictions and to flag an
impending earthquake. Some of the features that have been used are the mean of a segment,
its standard deviation and its IQR. For better results, however, further features grounded in
time series analysis are needed.
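The three summary features named above can be computed per segment as follows. This is a minimal sketch using only the Python standard library. The segment length and the sample values are made up, and the interquartile range uses linear interpolation between data points, which is one of several common quartile conventions.

```python
import statistics


def split_into_segments(signal, segment_length):
    """Cut a signal into consecutive, non-overlapping segments (remainder dropped)."""
    last_start = len(signal) - segment_length
    return [signal[i:i + segment_length]
            for i in range(0, last_start + 1, segment_length)]


def segment_features(segment):
    """Mean, standard deviation and interquartile range of one segment."""
    q1, _, q3 = statistics.quantiles(segment, n=4, method="inclusive")
    return {
        "mean": statistics.fmean(segment),
        "std": statistics.pstdev(segment),  # population standard deviation
        "iqr": q3 - q1,
    }


# Made-up acoustic samples: a quiet stretch followed by a large spike.
signal = [4, 6, 5, 7, 40, -35, 5, 6]
features = [segment_features(s) for s in split_into_segments(signal, 4)]
```

Each dictionary becomes one row of the training table, paired with the time to failure for that segment. The segment containing the spike shows a much larger standard deviation and IQR than the quiet one.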
Extracting the required features calls for large EC2 instances. After the extraction
process, 4184 segments were gathered and placed in the training set. Multiple regressors were
tested, including OLS, Random Forest, CatBoost, SVM and the light gradient boosting method
(Martinelli 2018). For the regression, the following features need to be analysed with the help
of suitable tools. The features for the analysis are:
i) Peak numbers- this method counts the number of peaks that occur in a window.
ii) Autoregressive Model Coefficients- this model transforms the time series into a
regression problem on the earlier values of the time series, and the coefficients of the
regression model are an important part of the process.
iii) Fast Fourier Transform Variance- this method decomposes a signal, such as the
seismic signal, into its component frequencies. This is the procedure that specifically
comes into usage when