Comparative Exploration of KNN, J48 and Lazy IBK Classifiers in Weka


Abstract
Classification is a data mining technique that predicts the group to which the data instances in a record belong, subject to certain constraints. The data classification problem arises in several areas of data mining (Xhemali, Hinde, & Stone, 2009): given a number of characteristic variables, the task is to predict an objective variable. Customer orientation, medical diagnostics, social network analysis, and artificial intelligence are some of its areas of application. This article focuses on different classification techniques and their pros and cons. The J48 (decision tree), k-nearest neighbor, and naive Bayes classification algorithms have been used, and a comparative evaluation of the three in connection with voting preferences has been performed. The results of the comparison presented in this document relate to classification accuracy and cost analysis, and they reveal the efficacy and precision of the classifiers.
Table of Contents
Abstract
Introduction
K-Nearest Neighbor Classification
Naive Bayes Classification
Decision Tree
Comparison among K-NN (IBK), Decision Tree (J48) and Naive Bayes Techniques
Instrument for the Comparison
Data Exploration
Performance Investigation of the Classifiers
Classification by Naïve Bayes
K-Nearest Neighbor Classification
Decision Tree (J48) Classification
Conclusion
References
Introduction
In data classification, data are organized into categories according to similarity and specificity, so that objects in different groups are dissimilar, and the algorithm assigns each instance so as to minimize the error (Brijain, Patel, Kushik, & Rana, 2014). Labels are categorized and data are classified under the constraints of a model created from the associated data set and its class labels. Classification uses a supervised approach that can be divided into two steps: first, training combined with a pre-processing phase, which constructs the classification model; second, the application of the model to an experimental data set with class variables (Jadhav, & Channe, 2016). The current article focuses on exploring three different classification techniques in data mining, comparing K-NN classification, the decision tree, and the naive Bayes classifier on the precision of each algorithm. This comparative guide may help future researchers to develop innovative algorithms in the field of data mining (Islam, Wu, Ahmadi, & Sid-Ahmed, 2007).
K-Nearest Neighbor Classification
The KNN algorithm is an instance-based learning method that assumes similar samples lie close to one another. Classifiers of this type are also known as lazy learners: KNN builds its classifier simply by storing all the samples, without constructing an explicit model. Lazy learning algorithms therefore spend little computation time during the learning phase and classify by similarity. That is, to classify a data sample X, the algorithm searches for its K nearest neighbors and then assigns X to the class to which the majority of those neighbors belong. The performance of the k-nearest neighbor algorithm is influenced by the choice of the value of k.

Naive Bayes Classification
The naive Bayes classifier is based on the Bayesian statistical model and is called naive because it rests on the assumption that all factors contribute to the classification independently of one another. This hypothesis is called simple Bayes, or class-conditional independence. The naïve classifier predicts likelihoods of class membership, such as the likelihood that a particular element carries an exact class label. A naive Bayesian classifier assumes that the presence of a particular attribute in a given class is unrelated to the presence of any other attribute. The naive Bayes technique is typically used when the dimensionality of the input is high. The basis of Bayesian classification is Bayes' theorem on conditional probability, which is used specifically to compute posterior probabilities such as p(c | x) from the likelihood p(x | c).
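Written out, Bayes' theorem and the naive independence assumption give, for a class c and an attribute vector x = (x_1, ..., x_n):

```latex
p(c \mid x) = \frac{p(x \mid c)\, p(c)}{p(x)},
\qquad
p(x \mid c) = \prod_{i=1}^{n} p(x_i \mid c)
```

The classifier assigns x to the class c that maximizes the posterior p(c | x); since p(x) is the same for every class, only the numerator needs to be compared.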
Decision Tree
The decision tree is a predictive model that associates observations about an element with the element's objective value. The decision tree algorithm is an induction technique for data extraction that segregates records using a depth-first or breadth-first approach. A decision tree consists of a root node, internal nodes, and leaf nodes, and its resemblance to an organization chart is easy to notice: each internal node specifies a test condition on an attribute, each result of the test condition is represented as a branch, and each leaf node receives a class label. The root node is generally the topmost node of the decision tree, and each path from the root to a leaf corresponds to a classification rule. In general, the algorithm uses the depth-first approach in two phases, viz. growth and pruning. Tree formation takes place in a top-down fashion: the structure is recursively divided until the data elements in each partition belong to the same class label. Pruning is then used to improve the prediction and generalization of the algorithm by minimizing the overfitting of the tree (Taruna, & Pandey, 2014).
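During the growth phase, the attribute to split on at each node is chosen by an entropy-based criterion. As a simplified illustration, the sketch below computes plain information gain (J48, Weka's implementation of C4.5, actually uses the related gain ratio); the class counts in main are hypothetical.

```java
// Simplified illustration of a decision-tree split criterion:
// entropy of a label distribution and the information gain of
// a candidate split. (J48/C4.5 itself uses the gain ratio.)
public class SplitCriterion {

    // Shannon entropy of class counts, in bits.
    static double entropy(int[] counts) {
        int total = 0;
        for (int c : counts) total += c;
        double h = 0;
        for (int c : counts) {
            if (c == 0) continue;
            double p = (double) c / total;
            h -= p * (Math.log(p) / Math.log(2));
        }
        return h;
    }

    // Information gain = parent entropy minus the weighted
    // average entropy of the child nodes after the split.
    static double informationGain(int[] parent, int[][] children) {
        int total = 0;
        for (int c : parent) total += c;
        double childEntropy = 0;
        for (int[] child : children) {
            int n = 0;
            for (int c : child) n += c;
            childEntropy += ((double) n / total) * entropy(child);
        }
        return entropy(parent) - childEntropy;
    }

    public static void main(String[] args) {
        // Hypothetical split of 10 democrats / 10 republicans by a
        // binary (yes/no) attribute into two branches.
        int[] parent = {10, 10};
        int[][] children = {{9, 2}, {1, 8}};
        System.out.printf("gain = %.3f bits%n", informationGain(parent, children));
    }
}
```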
Comparison among K-NN (IBK), Decision Tree (J48) and Naive Bayes Techniques
Instrument for the Comparison
The Waikato Environment for Knowledge Analysis (WEKA) was used as the statistical tool for exploring and analyzing the performance of the three classifiers. Its collection of visualization tools and classification algorithms was sufficient for the purposes of the current investigation (Dan, Lihua, & Zhaoxin, 2013). The Vote.ARFF file was imported in the Weka Preprocess window, and the classification algorithms were applied in the Classify tab.
Data Exploration
The Vote.ARFF file contained a class attribute with two values, Democrat and Republican, and 435 sets of responses from people about their choices. Opinions on sixteen attributes were present in the file, among which a number of missing values/responses were noticed. Weka offers various options for the treatment of missing values; this investigation opted for the 'ReplaceMissingValues' filter, which replaced each missing value with the modal value of the respective attribute. The representation of the class and the attributes is provided in Figure 1 and Figure 2.
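The same filter can be applied programmatically; a sketch under the same file-name assumption as above:

```java
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.ReplaceMissingValues;

public class FillMissing {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("vote.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // ReplaceMissingValues substitutes the mode for nominal
        // attributes (and the mean for numeric ones).
        ReplaceMissingValues filter = new ReplaceMissingValues();
        filter.setInputFormat(data);
        Instances filled = Filter.useFilter(data, filter);

        System.out.println("Missing values left in attribute 1: "
                + filled.attributeStats(0).missingCount);
    }
}
```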

Figure 1: Database Class Division Opened in Weka
Figure 2: Visualization of All the Attributes of Classification
Performance Investigation of the Classifiers
In the Weka environment, the classifiers of the study were applied with 10-fold cross-validation. The cross-validation technique has been found to be statistically sound for the investigation of classifiers (Thornton, Hutter, Hoos, & Leyton-Brown, 2013). The numbers of correctly and incorrectly classified instances were found for the KNN, J48, and naïve Bayes algorithms, followed by cost analysis and classification precision based on the voting choices of the two classes of people. The confusion matrix for all three classifiers was examined for actual versus predicted categorization. The terms related to the confusion matrix were identified as "True Positive" (the actual and predicted values agree on a positive outcome), "False Positive" (the predicted value is positive but the actual value is negative), "Precision" (a measure of quality, or exactness), and "Recall" (a measure of quantity, or completeness).
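A sketch of the corresponding evaluation harness, running the same 10-fold cross-validation over all three classifiers. The random seed 1 mirrors the Weka GUI default, and class indices 0 and 1 are assumed here to correspond to Democrat and Republican.

```java
import java.util.Random;
import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayes;
import weka.classifiers.lazy.IBk;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class CompareClassifiers {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("vote.arff");
        data.setClassIndex(data.numAttributes() - 1);

        Classifier[] classifiers = {new NaiveBayes(), new IBk(5), new J48()};
        for (Classifier c : classifiers) {
            // 10-fold cross-validation, as in the Weka GUI.
            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(c, data, 10, new Random(1));
            System.out.println(c.getClass().getSimpleName());
            System.out.printf("  accuracy:  %.2f%%%n", eval.pctCorrect());
            System.out.printf("  precision: %.2f / %.2f%n",
                    eval.precision(0), eval.precision(1));
            System.out.printf("  recall:    %.2f / %.2f%n",
                    eval.recall(0), eval.recall(1));
        }
    }
}
```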
Classification by Naïve Bayes
The output of the naïve Bayes algorithm is provided in Figure 3. The correctly classified instances numbered 392 (90.11%), whereas 43 instances were classified incorrectly (9.89%). The correctly predicted results were 238 for Democrats and 154 for Republicans. Precision for predicting the responses of Democrats (0.94) was higher than that for Republicans (0.91). From the cost analysis of the algorithm, it was found that the correctly predicted proportion of Democrats was 61.38% and that of Republicans was 38.62%.
Figure 3: Naive Bayes Output from Weka
Figure 4: Cost Analysis Curves of Naive Bayes for Democrats

Figure 5: Cost Analysis Curves of Naive Bayes for Republicans
K-Nearest Neighbor Classification
The output of the KNN algorithm with k = 1 is provided in Figure 6. The correctly classified instances numbered 407 (93.56%), whereas 28 instances were classified incorrectly (6.44%). The correctly predicted results were 250 for Democrats and 157 for Republicans. Precision for predicting the responses of Democrats (0.96) was higher than that for Republicans (0.90). From the cost analysis of the algorithm, it was found that the correctly predicted proportion of Democrats was 61.38% and that of Republicans was 38.62%. With k = 5, the algorithm correctly classified 409 instances (94.02%), whereas 26 instances were classified incorrectly (5.98%); of the values tested, the lowest misclassification rate was thus achieved at k = 5.
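Since the result depends on k, the search for a good value can be reproduced with a simple loop over candidate k values; a sketch under the same assumptions as the earlier snippets:

```java
import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.lazy.IBk;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class TuneK {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("vote.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // Cross-validate IBk for odd k from 1 to 9 and report accuracy.
        for (int k = 1; k <= 9; k += 2) {
            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(new IBk(k), data, 10, new Random(1));
            System.out.printf("k = %d: %.2f%% correct%n", k, eval.pctCorrect());
        }
    }
}
```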
Figure 6: IBK (KNN) Output from Weka for K = 1
Figure 7: IBK (KNN) Output from Weka for K = 5
Figure 8: Cost Analysis Curves of KNN for Democrats
Figure 9: Cost Analysis Curves of KNN for Republicans

Decision Tree (J48) Classification
The output of the J48 algorithm is provided in Figure 10. The correctly classified instances numbered 419 (96.32%), whereas only 16 instances were classified incorrectly (3.68%). The correctly predicted results were 259 for Democrats and 160 for Republicans. Precision for predicting the responses of Democrats (0.97) was higher than that for Republicans (0.95). From the cost analysis of the algorithm, it was found that the correctly predicted proportion of Democrats was 61.38% and that of Republicans was 38.62%. The decision tree of the model is provided in Figure 13.
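For reference, the tree itself can be built and printed programmatically; printing a trained J48 model (Weka's implementation of the C4.5 algorithm) emits the same tree in textual form that Figure 13 shows graphically. A sketch:

```java
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class BuildTree {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("vote.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // Train J48 with its default pruning options and print
        // the resulting tree as text.
        J48 tree = new J48();
        tree.buildClassifier(data);
        System.out.println(tree);
    }
}
```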
Figure 10: J48 (Decision Tree) Output from Weka
Figure 11: Cost Analysis of Decision Tree for Democrats
Figure 12: Cost Analysis of Decision Tree for Republicans
Figure 13: Decision Tree of J48 Classifier
Result Comparison of the KNN (k = 5), J48, and Naive Bayes Classifiers

Parameter                            KNN (k = 5)   J48       Naive Bayes
Correctly classified instances       94.02%        96.32%    90.11%
Incorrectly classified instances     5.98%         3.68%     9.89%
Relative error                       46.88%        38.08%    61.91%
True positives (Democrat)            251           259       236
True negatives (Republican)          158           160       154

The best classifier was identified as J48 (decision tree).

Conclusion
Based on the research and comparative analysis of the three data mining classification algorithms, it was shown that all three algorithms were highly accurate and had low error rates. The decision tree presents its knowledge in the form of rules that are easier for people to understand (Sheikh, Karthick, Malathi, Sudarsan, & Arun, 2016). The results of the implementation in Weka with the same data set showed that the decision tree yielded better results than the KNN and Bayesian classifications (Frank et al., 2009). Though the accuracy of the other two classifiers was quite high, KNN was observed to have better accuracy as a classification algorithm compared to naïve Bayes. The comparative study showed that every algorithm has its advantages and disadvantages, as well as its own range of application; no algorithm can meet all constraints and criteria, and depending on the application and requirements, a specific algorithm can be selected. Although naive Bayes can surpass more sophisticated classification methods for certain complex data sets, the present research found it to be the least exact. In this experiment, the decision tree surpassed both naive Bayes and the k-nearest neighbor classifier. The reason for the good performance of the decision tree is not the absence of dependencies among the data attributes; rather, the good performance is caused by the accuracy and explicitness of the structural nature of its classification function (Anyanwu, & Shiva, 2009). In other instances, where dependencies between the attributes are distributed across the classes, the naive Bayes classification might be affected positively (COE, 2012). The time taken to build the model was zero for all three classifiers, indicating that the data set was not complex in nature. In conclusion, it can be inferred that for the current data set the decision tree performed better. In the future, a similar comparison on different software platforms with various data sets could also yield interesting results (Salama, Abdelhalim, & Zeid, 2012; Solanki, 2014).
References

Anyanwu, M. N., & Shiva, S. G. (2009). Comparative analysis of serial decision tree classification algorithms. International Journal of Computer Science and Security, 3(3), 230-240.

Brijain, M., Patel, R., Kushik, M., & Rana, K. (2014). A survey on decision tree algorithm for classification.

COE, J. (2012). Performance comparison of Naïve Bayes and J48 classification algorithms. International Journal of Applied Engineering Research, 7(11).

Dan, L., Lihua, L., & Zhaoxin, Z. (2013, January). Research on text categorization on WEKA. In Intelligent System Design and Engineering Applications (ISDEA), 2013 Third International Conference on (pp. 1129-1131). IEEE.

Frank, E., Hall, M., Holmes, G., Kirkby, R., Pfahringer, B., Witten, I. H., & Trigg, L. (2009). Weka: a machine learning workbench for data mining. In Data mining and knowledge discovery handbook (pp. 1269-1277). Springer, Boston, MA.

Islam, M. J., Wu, Q. J., Ahmadi, M., & Sid-Ahmed, M. A. (2007, November). Investigating the performance of naive-Bayes classifiers and k-nearest neighbor classifiers. In Convergence Information Technology, 2007. International Conference on (pp. 1541-1546). IEEE.

Jadhav, S. D., & Channe, H. P. (2016). Comparative study of K-NN, naive Bayes and decision tree classification techniques. International Journal of Science and Research, 5(1).

Salama, G. I., Abdelhalim, M., & Zeid, M. A. E. (2012). Breast cancer diagnosis on three different datasets using multi-classifiers. Breast Cancer (WDBC), 32(569), 2.

Sheikh, F., Karthick, S., Malathi, D., Sudarsan, J. S., & Arun, C. (2016). Analysis of data mining techniques for weather prediction. Indian Journal of Science and Technology, 9(38).

Solanki, A. V. (2014). Data mining techniques using WEKA classification for sickle cell disease. International Journal of Computer Science and Information Technologies, 5(4), 5857-5860.

Taruna, S., & Pandey, M. (2014, February). An empirical analysis of classification techniques for predicting academic performance. In Advance Computing Conference (IACC), 2014 IEEE International (pp. 523-528). IEEE.

Thornton, C., Hutter, F., Hoos, H. H., & Leyton-Brown, K. (2013, August). Auto-WEKA: Combined selection and hyperparameter optimization of classification algorithms. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 847-855). ACM.

Xhemali, D., Hinde, C. J., & Stone, R. G. (2009). Naïve Bayes vs. Decision Trees vs. Neural Networks in the classification of training web pages. Retrieved from https://dspace.lboro.ac.uk/dspace-jspui/handle/2134/5394