Weka-based Data Analysis: Iris and Diabetes Datasets

Added on  2025/06/23

Assessment item 2
Table of Contents
Task 1
Task 2
Task 3
List of figures
Figure 1: opening of Weka
Figure 2: Weka workbench
Figure 3: opening of file
Figure 4: opening of diabetes.arff file
Figure 5: file opened
Figure 6: histogram of age
Figure 7: iris.arff file opened
Figure 8: petal_width removed
Figure 9: saved as iris.3D.arff
Figure 10: histogram of sepal_length
Figure 11: histogram of sepal_width
Figure 12: histogram of petal_length
Figure 13: histogram of class
Figure 14: scatter plot
Figure 15: plot for sepal_length and petal_length
Figure 16: plot for sepal_width and sepal_length
Task 1
Figure 1: opening of Weka
Figure 2: Weka workbench
Figure 3: opening of file
Figure 4: opening of diabetes.arff file
Figure 5: file opened
a) How many instances and attributes (including the class attribute) does this dataset
have?
The diabetes.arff dataset has 768 instances and 9 attributes (8 predictor attributes plus the class attribute).
b) How many classes are present in the dataset and how many instances are there for each
class?
The dataset has a single class attribute with two classes, tested_negative and
tested_positive. They contain 500 and 268 instances respectively.
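
The counts reported above can also be cross-checked outside the Weka workbench. The following is a minimal sketch, not Weka's own loader: it assumes a dense ARFF file with unquoted attribute names and the class as the last attribute. The function name summarize_arff and the three-row SAMPLE are made up for illustration and are not the real diabetes data.

```python
from collections import Counter


def summarize_arff(text):
    """Return (attribute names, instance count, class counts) for simple ARFF text.

    Assumptions: dense (non-sparse) data rows, unquoted attribute names,
    and the class attribute stored in the last column.
    """
    attributes, rows, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        lower = line.lower()
        if in_data:
            rows.append([value.strip() for value in line.split(",")])
        elif lower.startswith("@attribute"):
            attributes.append(line.split()[1])  # second token is the name
        elif lower.startswith("@data"):
            in_data = True
    class_counts = Counter(row[-1] for row in rows)
    return attributes, len(rows), class_counts


# Hypothetical toy sample, used only to exercise the function.
SAMPLE = """\
@relation toy_diabetes
@attribute plas numeric
@attribute age numeric
@attribute class {tested_negative,tested_positive}
@data
148,50,tested_positive
85,31,tested_negative
89,21,tested_negative
"""

attributes, n_instances, class_counts = summarize_arff(SAMPLE)
print(len(attributes), "attributes,", n_instances, "instances")
print(dict(class_counts))
```

Reading the contents of diabetes.arff into the same function should reproduce the figures shown in the Weka Preprocess panel, provided the file matches the assumptions listed in the docstring.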
c) Use histograms (with default settings) to show which age group has the highest number
of samples?
Figure 6: histogram of age
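
The idea behind reading the busiest age group off a histogram can be sketched with equal-width bins. This is an illustration only: the bin count of 4 is an assumption and is not the rule Weka uses for its default histogram, and the ages list is made-up data rather than values from diabetes.arff.

```python
def equal_width_histogram(values, n_bins):
    """Split [min, max] into n_bins equal-width bins and count values per bin."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [lo, hi], [len(values)]  # degenerate case: all values identical
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value would land one past the end, so clamp it
        # into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


# Hypothetical ages, for illustration only.
ages = [21, 22, 22, 25, 29, 31, 33, 41, 50, 61]
edges, counts = equal_width_histogram(ages, 4)

# Index of the bin holding the most samples.
busiest = max(range(len(counts)), key=counts.__getitem__)
print(counts)
print("busiest bin:", edges[busiest], "to", edges[busiest + 1])
```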
Task 2
a) Open the iris.arff file from ~/weka/data/ folder in a text editor, then remove
‘petal_width’ attribute and save it as iris.3D.arff. Please make sure that the Attribute-
Relation File Format (.arff) is correctly preserved.
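
Removing an attribute by hand means deleting both its @attribute declaration and the matching column in every data row; otherwise the file no longer loads. The edit can be sketched in code as follows. This is a hedged illustration, not part of the assignment: drop_attribute is a hypothetical helper, it assumes dense rows with no quoted values containing commas, and the attribute name follows this document's spelling (petal_width), which should be matched to the spelling in the actual file.

```python
def drop_attribute(arff_text, name):
    """Return ARFF text with attribute `name` and its data column removed.

    Assumptions: dense data rows and no quoted values containing commas.
    """
    out, attr_index, drop_at, in_data = [], 0, None, False
    for raw in arff_text.splitlines():
        line = raw.strip()
        lower = line.lower()
        if in_data and line and not line.startswith("%"):
            values = line.split(",")
            del values[drop_at]  # remove the column of the dropped attribute
            out.append(",".join(values))
        elif lower.startswith("@attribute"):
            if line.split()[1] == name:
                drop_at = attr_index  # remember the column position
            else:
                out.append(raw)
            attr_index += 1
        else:
            if lower.startswith("@data"):
                if drop_at is None:
                    raise ValueError(f"attribute {name!r} not found")
                in_data = True
            out.append(raw)
    return "\n".join(out) + "\n"


# Hypothetical two-row excerpt in the shape of the iris data.
SAMPLE = """\
@relation iris
@attribute sepal_length numeric
@attribute sepal_width numeric
@attribute petal_length numeric
@attribute petal_width numeric
@attribute class {Iris-setosa,Iris-versicolor,Iris-virginica}
@data
5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
"""

result = drop_attribute(SAMPLE, "petal_width")
print(result)
```

The header and the data section stay consistent because the column index is taken from the position of the declaration itself.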
Figure 7: iris.arff file opened
Figure 8: petal_width removed
Figure 9: saved as iris.3D.arff
b) Load this file in workbench and include a screenshot of the histograms (with default
setting) for each attribute in this dataset.
Figure 10: histogram of sepal_length
Figure 11: histogram of sepal_width
Figure 12: histogram of petal_length
Figure 13: histogram of class
Task 3
a) Load the file (iris.3D.arff) that you have created in the previous task in workbench and
generate a scatter plot using the ‘visualize’ menu option to show data distribution for
each pair of attributes in a two-dimensional visualisation.
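
As a quick check on how many distinct two-attribute plots to expect, the pairs can be enumerated directly. This is a small sketch that assumes the three numeric attributes left in iris.3D.arff, named as in this document; the class attribute is left out here because it is used to colour the points.

```python
from itertools import combinations

# Numeric attributes remaining after petal_width was removed
# (names follow this document's spelling).
attributes = ["sepal_length", "sepal_width", "petal_length"]

# Every unordered pair corresponds to one distinct scatter plot.
pairs = list(combinations(attributes, 2))
for x_name, y_name in pairs:
    print(x_name, "vs", y_name)
```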
Figure 14: scatter plot
b) Visually compare the plots for (sepal_length, sepal_width) and (sepal_length,
petal_length) and comment on which one of them shows a better class separability in this
dataset. Justify your answer with screenshots.
Figure 15: plot for sepal_length and petal_length
Figure 16: plot for sepal_width and sepal_length