
Data Mining and Visualization: Performance Comparison of Classification Algorithms

   

Added on  2023-06-05

Data mining and visualization
Student Name:
Instructor Name:
Course Number:
Introduction
The aim of this report is to present a data mining analysis that evaluates the performance of
different classification algorithms. The data set (vote.arff) was loaded into Weka, and the
performance of three classification algorithms on the data set was compared. The three
classification algorithms are:
Decision Tree
Naive Bayes
k-Nearest Neighbour (KNN)
Decision Tree
A decision tree is a classifier expressed as a recursive partition of the instance space. The
decision tree consists of nodes that form a rooted tree, meaning it is a directed
tree with a node called "root" that has no incoming edges (Kamiński, Jakubczyk, & Szufel,
2017). Every other node has exactly one incoming edge. A node with outgoing edges is
called an internal or test node. In this section, we ran a classification based on the
Decision Tree approach. Using a 60% split, with 60% of the data in
the training set and 40% in the testing set, we ran a decision tree test. The results, shown
in the screenshot below (figure 1), indicate that 171 instances were correctly classified
using this method while only 3 instances were incorrectly classified. This gives the
approach an accuracy of 98.2759% (171 of 174 test instances).
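The reported percentage can be checked with a few lines of arithmetic. The sketch below is not Weka output; it only recomputes the accuracy from the counts quoted above. The total of 435 instances is an assumption (the size of the vote.arff file commonly distributed with Weka) and is not stated in this report.

```python
# Counts quoted in the report for the decision tree run.
correct = 171
incorrect = 3

# Assumption: vote.arff contains 435 instances (not stated in the report).
TOTAL_INSTANCES = 435
TRAIN_FRACTION = 0.60


def accuracy_percent(n_correct: int, n_incorrect: int) -> float:
    """Percentage of test instances that were classified correctly."""
    return 100.0 * n_correct / (n_correct + n_incorrect)


# Size of the test set implied by the reported counts.
test_size = correct + incorrect

# Size of the test set implied by a 60/40 split of the assumed total.
expected_test_size = round(TOTAL_INSTANCES * (1 - TRAIN_FRACTION))

print(f"test instances (from counts): {test_size}")
print(f"test instances (from split):  {expected_test_size}")
print(f"accuracy: {accuracy_percent(correct, incorrect):.4f}%")
```

Under the stated assumption, both routes give the same test set size, which suggests the counts refer to test instances rather than attributes.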
Figure 1: Classification using decision tree approach
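The tree structure described in this section (a root, internal test nodes, and leaves) can be illustrated with a minimal sketch. This is not the tree Weka built for the report; the `Node` class, the attribute names `vote_a` and `vote_b`, and the class labels are all hypothetical and chosen only to show how an instance is routed from the root to a leaf.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """A tree node: internal (test) nodes set `attribute`, leaves set `label`."""
    attribute: Optional[str] = None
    label: Optional[str] = None
    # One outgoing edge per attribute value; empty for a leaf.
    children: Dict[str, "Node"] = field(default_factory=dict)

    def is_leaf(self) -> bool:
        return not self.children


def classify(node: Node, instance: Dict[str, str]) -> Optional[str]:
    """Follow the edge matching each tested attribute until a leaf is reached."""
    while not node.is_leaf():
        node = node.children[instance[node.attribute]]
    return node.label


# Hypothetical two-level tree; names and labels are invented for illustration.
root = Node(attribute="vote_a", children={
    "y": Node(label="class_1"),
    "n": Node(attribute="vote_b", children={
        "y": Node(label="class_1"),
        "n": Node(label="class_2"),
    }),
})

print(classify(root, {"vote_a": "n", "vote_b": "n"}))
```

The root is the only node with no incoming edge, and every other node is reached through exactly one entry in its parent's `children` mapping, which mirrors the definition given above.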
Naïve Bayes
This is a machine learning algorithm from the family of simple "probabilistic classifiers"
based on applying Bayes' theorem with strong (naïve) independence
assumptions between the features (Niculescu-Mizil & Caruana, 2005). Naïve Bayes classifiers
are highly scalable, requiring a number of parameters linear in the number of variables
(features/predictors) in a learning problem (Karimi & Hamilton, 2011). In this section, we
ran a classification based on the Naïve Bayes
approach. The best performance for this method was obtained with a 65% split, with
65% of the data in the training set and 35% in the testing set. The results, shown in the screenshot
below (figure 2), indicate that 139 instances were correctly classified using this
