CIS8008 Assignment 3: Predictive Modeling and Data Visualization

Added on  2025/05/03

CIS8008
ASSIGNMENT 3
Student ID:
Student Name:
Table of Contents
List of Figures
Task 1
Task 1.1
Task 1.2
Task 1.3
Task 1.4
Task 3
Task 3.1
Task 3.2
Task 3.3
Task 3.4
Task 3.5
References
List of Figures
Figure 1: Design EDA
Figure 2: Results EDA
Figure 3: Result Stats
Figure 4: Bar Chart EDA
Figure 5: Scatter Chart EDA
Figure 6: Correlation Matrix Test
Figure 7: Decision Tree Model Process
Figure 8: Result
Figure 9: Decision Tree
Figure 10: Logistic Regression Process
Figure 11: Example set result
Figure 12: Graphs logistic regression result
Figure 13: cross validation process
Figure 14: cross validation decision tree result
Figure 15: Decision tree model Performance result
Figure 16: Cross validation decision tree model
Figure 17: Cross validation Decision tree scatter plot
Figure 18: cross validation model for the logistic regression
Figure 19: cross validation process logistic regression
Figure 20: Accuracy logistic regression
Figure 21: example set of the cross validation of the logistic regression
Figure 22: result 3.1
Figure 23: result 3.2
Figure 24: result 3.3
Figure 25: result 3.4
Figure 26: Dashboard tableau
Task 1
Task 1.1
The exploratory analysis of the data set, weatherAUS.csv, is carried out with the RapidMiner
tool. Many other tools exist, but RapidMiner was chosen because it is easy to use and does
not require particular training or a professional analyst to perform the analysis. It can be
used by a person who does not know the details of the data analysis process that runs in the
background. In this process, the data set file is selected from the repository and read with
the Read CSV operator. The Extract Statistics operator then pulls out the overall statistics,
which are written to another file for use in further analysis.
The image below shows the design of the exploratory data analysis.
Figure 1: Design EDA
After running the process, we get the result, which is the extracted statistics file. The
image below displays this result.
Figure 2: Results EDA
The variables that appear in the data analysis are as follows:
Name, Role, Type, Missing, Minimum, Maximum, Average, Deviation, Least, Most, Values,
Earliest and Latest Date, and Duration. The five most important variables that will be used
for predicting RainTomorrow are Minimum, Maximum, Average, Missing and Deviation.
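As a rough illustration of what such a statistics extract contains, the sketch below computes Missing, Minimum, Maximum, Average and Deviation for one numeric column using only the Python standard library. It is not the RapidMiner operator itself. The column name MinTemp, the "NA" marker for missing values, the use of the sample standard deviation, and the inline sample rows are all assumptions made for the example.

```python
import csv
import io
import math

# Hypothetical sample rows standing in for weatherAUS.csv (assumed layout).
SAMPLE_CSV = """Date,MinTemp,RainTomorrow
2008-12-01,10,No
2008-12-02,20,No
2008-12-03,NA,Yes
2008-12-04,30,Yes
"""


def column_statistics(rows, column, missing_marker="NA"):
    """Summarise one numeric column: missing count, min, max, mean, std. dev."""
    values = []
    missing = 0
    for row in rows:
        raw = (row.get(column) or "").strip()
        if raw in ("", missing_marker):
            missing += 1  # empty cells and the marker both count as missing
        else:
            values.append(float(raw))

    count = len(values)
    average = sum(values) / count if count else float("nan")
    if count > 1:
        # Sample standard deviation (n - 1 in the denominator); an assumption.
        variance = sum((v - average) ** 2 for v in values) / (count - 1)
        deviation = math.sqrt(variance)
    else:
        deviation = float("nan")

    return {
        "Missing": missing,
        "Minimum": min(values) if values else float("nan"),
        "Maximum": max(values) if values else float("nan"),
        "Average": average,
        "Deviation": deviation,
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
stats = column_statistics(rows, "MinTemp")
print(stats)
```

To try this on the real file, `io.StringIO(SAMPLE_CSV)` could be replaced with `open("weatherAUS.csv", newline="")`, provided the file uses the same column name and missing-value marker.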
Figure 3: Result Stats
The image displayed above consists of all the important variables needed for the prediction.
Figure 4: Bar Chart EDA
The bar chart from the exploratory data analysis is displayed above. A total of five variables
are included in the graph.
Figure 5: Scatter Chart EDA
The scatter chart from the exploratory data analysis is displayed above. A total of five
variables are included in the chart.
Correlation Test
Figure 6: Correlation Matrix Test
Task 1.2
A decision tree graphically represents all possible decisions based on a set of conditions. It
is called a decision tree because it begins from a single box, which branches off like a tree
into a number of solutions. The decision tree here is developed with RapidMiner. Various other
tools are available on the market, but RapidMiner is used for its ease of use.
Figure 7: Decision Tree Model Process
In the decision tree process, we first retrieve the weatherAUS.csv data set. Next, we set
roles for RainTomorrow, the label variable, and for Date. We then build a decision tree model
using the weather training data set and, finally, apply the decision tree model to the
weatherAUS.csv data set. The process is executed by clicking Run. Once the decision tree
process has finished running, we have a decision tree. Overall, applying a decision tree model
to a data set in this way is straightforward and quick to execute.
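To make the idea of a tree "branching off" from a single box more concrete, the following minimal sketch finds one split on one numeric attribute, which is the step a tree learner repeats at each node. It uses Gini impurity for simplicity; the criterion configured in the RapidMiner process is not stated in the text, so this is an assumption. The attribute (humidity) and the six sample rows are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best binary split.

    Candidate thresholds are midpoints between consecutive distinct values.
    The threshold is None if no split improves on the unsplit node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_impurity = None, gini(labels)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity


# Hypothetical humidity readings and RainTomorrow labels.
humidity = [30, 45, 50, 80, 85, 95]
rain_tomorrow = ["No", "No", "No", "Yes", "Yes", "Yes"]

threshold, impurity = best_split(humidity, rain_tomorrow)
print(threshold, impurity)
```

A full tree would apply `best_split` again to the rows on each side of the threshold, across all attributes, until a stopping rule is met.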
Figure 8: Result
Figure 9: Decision Tree
The image above displays the decision tree produced as the outcome of the overall rain
prediction.
Task 1.3
The next task is to develop the logistic regression model for the weatherAUS.csv data set.
The logistic model is a commonly used statistical model that, in its basic form, uses a
logistic function to model a binary dependent variable; many more complicated extensions
exist. Logistic regression estimates the parameters of the logistic model from the data being
analysed.
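In symbols, the basic logistic model for a binary label can be written as below. The notation is an assumption introduced here, not taken from the assignment: y = 1 stands for RainTomorrow = Yes, the x values are the predictor attributes, and the beta values are the parameters that logistic regression estimates.

```latex
\[
P(y = 1 \mid x) = \sigma(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k),
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}
\]
```

Because the logistic function maps any real number into the interval (0, 1), the output can be read as a probability and turned into a Yes/No prediction with a cut-off such as 0.5.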
Figure 10: Logistic Regression Process
In the logistic regression process, we first retrieve the weatherAUS.csv data set. Next, we
set roles for RainTomorrow, the label variable, and for Date. We then build a logistic
regression model using the weather training data set and, finally, apply the logistic
regression model to the weatherAUS.csv data set. The process is executed by clicking Run.
Once the logistic regression process has finished running, we have a logistic regression
model. Overall, applying a logistic regression model to a data set in this way is
straightforward and quick to execute. We also add a Performance operator to measure how well
the model predicts.
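The same sequence of steps (build the model, apply it, measure performance) can be sketched in plain Python. This is a simplified stand-in, not the RapidMiner implementation: it fits a single feature by batch gradient descent, and the feature values, learning rate and epoch count are hypothetical choices for the example. As in the process described above, the model is scored on the same rows it was trained on.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, learning_rate=0.1, epochs=2000):
    """Estimate intercept b0 and slope b1 by gradient descent on log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(b0 + b1 * x) - y  # predicted probability minus label
            grad0 += error
            grad1 += error * x
        b0 -= learning_rate * grad0 / n
        b1 -= learning_rate * grad1 / n
    return b0, b1


def predict(x, b0, b1, cutoff=0.5):
    """Return 1 (rain) if the predicted probability reaches the cut-off."""
    return 1 if sigmoid(b0 + b1 * x) >= cutoff else 0


def accuracy(xs, ys, b0, b1):
    """Share of rows whose predicted label matches the true label."""
    correct = sum(1 for x, y in zip(xs, ys) if predict(x, b0, b1) == y)
    return correct / len(xs)


# Hypothetical standardised feature values; 1 means RainTomorrow = Yes.
xs = [-1.5, -1.0, -0.5, 0.5, 1.0, 1.5]
ys = [0, 0, 0, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
train_accuracy = accuracy(xs, ys, b0, b1)
print(b0, b1, train_accuracy)
```

Accuracy measured on the training rows tends to be optimistic, which is why a separate validation step is normally used to judge the model.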
Figure 11: Example set result
Figure 12: Graphs logistic regression result