CIS8008 Assignment 3: Data Analysis and Visualization Project


Added on  2025/05/03

CIS8008
ASSIGNMENT 3
Student ID:
Student Name:
Table of Contents
List of Figures
Task 1
Task 1.1
Task 1.2
Task 1.3
Task 1.4
Task 3
Task 3.1
Task 3.2
Task 3.3
Task 3.4
Task 3.5
References
List of Figures
Figure 1: Exploratory Data Analysis Process
Figure 2: Example set EDA process
Figure 3: Example set EDA process statistics
Figure 4: Extract stats results
Figure 5: Scatter Plot
Figure 6: Correlation Test Process
Figure 7: Correlation Test Result
Figure 8: DT process Model
Figure 9: Example set - DT process
Figure 10: DT process stats
Figure 11: Decision Tree
Figure 12: Decision Tree model scatter plot
Figure 13: Logistic Regression Model Implementation
Figure 14: Example set apply model logistic regression
Figure 15: Example set stats LR
Figure 16: Scatter plot logistic regression result
Figure 17: Cross Validation Operator Implementation
Figure 18: Cross Validation with Decision Tree
Figure 19: Decision Tree after cross validation
Figure 20: Example Set of the cross validation
Figure 21: Kappa metric of the cross validation of the Decision tree
Figure 22: Scatter plot of the cross validation result
Figure 23: Cross validation implementation for logistic regression
Figure 24: Example set cross validation logistic regression
Figure 25: Stats example set logistic cross validation
Figure 26: Precision of the logistic regression after the cross validation
Figure 27: Result scatter plot logistic
Figure 28: Task 3.1 Tableau worksheet
Figure 29: Task 3.2 Tableau worksheet
Figure 30: Task 3.3 Tableau worksheet
Figure 31: Task 3.4 Tableau worksheet
Figure 32: Tableau Dashboard Image
Task 1
Task 1.1
Exploratory data analysis (EDA) is a method for analysing data sets to summarise their key
features, mostly by visual methods. A statistical model may or may not be used, but EDA is
mainly used to see what the data can reveal beyond formal modelling or hypothesis testing.
The data we are given is weather data for Australia, and we will carry out the EDA with the
software RapidMiner. Other software can also perform EDA, but we chose RapidMiner because
it is easy to use. The EDA of the data set will enable us to assess what needs to be sorted
out in the data before the decision tree analysis and the logistic regression analysis.
Figure 1: Exploratory Data Analysis Process
The image above shows the EDA process for the given data set.
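RapidMiner is a graphical tool, so the report itself contains no code. As a rough illustration only, the kind of per-attribute statistics an EDA step produces can be sketched in plain Python. The rows and the attribute names MinTemp and Rainfall are hypothetical and are not taken from the assignment's weather data; only the Rain Tomorrow variable is mentioned in the report.

```python
import statistics

# Hypothetical sample rows; MinTemp and Rainfall are illustrative names,
# not confirmed attributes of the assignment's weather data set.
rows = [
    {"MinTemp": 13.4, "Rainfall": 0.6, "RainTomorrow": "No"},
    {"MinTemp": 7.4, "Rainfall": 0.0, "RainTomorrow": "No"},
    {"MinTemp": 12.9, "Rainfall": None, "RainTomorrow": "Yes"},
    {"MinTemp": 9.2, "Rainfall": 1.0, "RainTomorrow": "No"},
    {"MinTemp": 17.5, "Rainfall": 15.6, "RainTomorrow": "Yes"},
]


def summarise(rows, column):
    """Return count, missing count, min, max and mean for a numeric column."""
    values = [r[column] for r in rows if r[column] is not None]
    return {
        "count": len(values),
        "missing": len(rows) - len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
    }


def class_counts(rows, column):
    """Count how often each nominal value occurs."""
    counts = {}
    for r in rows:
        counts[r[column]] = counts.get(r[column], 0) + 1
    return counts


for col in ("MinTemp", "Rainfall"):
    print(col, summarise(rows, col))
print("RainTomorrow", class_counts(rows, "RainTomorrow"))
```

The missing-value count is the part most relevant to the later tasks, since it indicates which attributes may need cleaning before a model is trained.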
Figure 2: Example set EDA process
The image above shows the result of executing the EDA process.
Figure 3: Example set EDA process statistics
The image above shows the statistics produced by the EDA process.
Figure 4: Extract stats results
Figure 5: Scatter Plot
The image above shows the scatter plot produced by the EDA process.
Figure 6: Correlation Test Process
The correlation matrix determines the correlation among all attributes and, based on these
correlations, can generate a weight vector. It is a statistical computation that shows whether
pairs of attributes are related and how strongly.
Figure 7: Correlation Test Result
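To make the idea of a correlation matrix concrete, here is a minimal sketch that computes pairwise Pearson correlations in plain Python. It assumes the Pearson coefficient is the measure of interest; the attribute names and values are hypothetical, and the weight vector mentioned above is not reproduced here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise correlations for a dict of {attribute name: list of values}."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical values for three illustrative attributes.
data = {
    "MinTemp": [8.0, 10.0, 12.0, 14.0, 16.0],
    "MaxTemp": [18.0, 21.0, 23.0, 27.0, 30.0],
    "Humidity": [80.0, 72.0, 65.0, 55.0, 50.0],
}

matrix = correlation_matrix(data)
for a in data:
    print(a, [round(matrix[a][b], 2) for b in data])
```

Values close to 1 or -1 indicate a strong positive or negative linear relationship, and values near 0 indicate little linear relationship.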
Task 1.2
A decision tree is a tree-like collection of nodes designed to make a decision about class
membership or to estimate a numerical target value. Each node represents a splitting rule
for one particular attribute. This model is created with the software RapidMiner. Other
software can also build a decision tree model, but we chose RapidMiner because it is easy
to use.
We start with the implementation of the decision tree model. The image below shows the
decision tree model process.
Figure 8: DT process Model
Figure 9: Example set - DT process
In this process, the data set is first retrieved. After retrieval, the role is set for the
Rain Tomorrow variable. The decision tree model is then trained, and after that the Apply
Model operator is executed. The process is run and the results are shown. The second image
shows the example set produced by applying the model in the decision tree process.
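The splitting rule held by a single tree node can be sketched as follows. This is an illustration under stated assumptions, not the report's RapidMiner process: it uses Gini impurity as the split criterion (the criterion used in the report is not stated), considers a single numeric attribute, and runs on hypothetical humidity readings.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_threshold(values, labels):
    """Find the threshold on one numeric attribute with the lowest weighted Gini."""
    best = (None, float("inf"))
    candidates = sorted(set(values))
    # Try the midpoint between each pair of neighbouring distinct values.
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical humidity readings and next-day rain labels.
humidity = [35, 42, 50, 58, 71, 80, 88, 93]
rain_tomorrow = ["No", "No", "No", "No", "Yes", "Yes", "Yes", "Yes"]

threshold, impurity = best_threshold(humidity, rain_tomorrow)
print(f"split: Humidity <= {threshold} (weighted Gini {impurity:.2f})")
```

A full decision tree repeats this search over every attribute and then recurses into the left and right subsets until a stopping rule is met.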
Figure 10: DT process stats
Figure 11: Decision Tree
Figure 12: Decision Tree model scatter plot
The above image shows the scatter plot of the decision tree model.
Task 1.3
Logistic regression is comparable to linear regression, but it is used when the response
variable, or label, is binomial. A binomial response variable has two classes, e.g. Yes
or No. The next step is to implement the logistic regression model. This process model is
used to make predictions from the data set provided.
Figure 13: Logistic Regression Model Implementation
This process is implemented with the Retrieve operator, which retrieves the data set, and
the data is then converted from nominal to binomial. The role is set for the Rain Tomorrow
variable, the logistic regression model is trained, and the Apply Model operator is then
used. The process is run and the results are shown. The image below shows the example set
produced by applying the model in the logistic regression process.
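The same sequence of steps (convert the two-class label, fit the model, apply it) can be sketched in plain Python. This is a minimal illustration, not the RapidMiner implementation: it assumes a single standardised numeric attribute, fits the model by batch gradient descent, and uses hypothetical data. The function names, learning rate, epoch count and 0.5 cutoff are all choices made for this example.

```python
import math


def to_binary(labels, positive="Yes"):
    """Map a two-class nominal label to 1 (positive class) and 0 (other class)."""
    return [1 if label == positive else 0 for label in labels]


def sigmoid(z):
    """Logistic function mapping any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit weight and bias for one attribute by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict(xs, w, b, cutoff=0.5):
    """Apply the fitted model: "Yes" when the probability reaches the cutoff."""
    return ["Yes" if sigmoid(w * x + b) >= cutoff else "No" for x in xs]


# Hypothetical standardised humidity values with overlapping classes.
humidity = [-1.5, -1.0, -0.5, -0.2, 0.1, 0.4, 1.0, 1.5]
rain_tomorrow = ["No", "No", "No", "Yes", "No", "Yes", "Yes", "Yes"]

w, b = fit_logistic(humidity, to_binary(rain_tomorrow))
print("weight:", round(w, 2), "bias:", round(b, 2))
print(predict([-1.5, 1.5], w, b))
```

The fitted weight is positive here because higher humidity values go with the Yes class in the sample data, so the predicted probability of rain rises with humidity.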
Figure 14: Example set apply model logistic regression
Figure 15: Example set stats LR